Plotting multiple groups from a dataframe as lines with datashader

Posted 2024-05-23 19:09:10


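I am trying to make plots with datashader. The data itself is a time series of points in a polar coordinate system. I managed to convert the points to Cartesian coordinates (with equally spaced pixel axes), and I can plot them with datashader.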

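Where I am stuck: if I plot them with line() instead of points(), datashader simply connects the whole dataframe as a single line. I would like to draw the data of each group in the dataframe (the groups are the names in list_of_names) onto the canvas as a separate line.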

The data can be found here.

I get this kind of image with datashader.

This is a zoomed-in view of the plot generated with points() instead of line(). The goal is to produce the same plot, but with connected lines instead of points.


import datashader as ds, pandas as pd, colorcet
import numpy as np
import matplotlib.pyplot as plt

df = pd.read_csv('file.csv')


print(df)

starlink_name = df.loc[:,'Name']
starlink_alt = df.loc[:,'starlink_alt']
starlink_az = df.loc[:,'starlink_az']

name = starlink_name.values
alt = starlink_alt.values
az = starlink_az.values
print(name)
print(df['Name'].nunique())
df['Date'] = pd.to_datetime(df['Date'])

for name, df_name in df.groupby('Name'):
    print(name)

df_grouped = df.groupby('Name')



list_of_names = list(df_grouped.groups)
print(len(list_of_names))
#########################################################################################

#i want this kind of plot with connected lines with datashader

#########################################################################################
fig = plt.figure()
ax = fig.add_axes([0.1,0.1,0.8,0.8], polar=True)
# ax.invert_yaxis()
ax.set_theta_zero_location('N')
ax.set_rlim(90, 60, 1)
# Note: you must set the end of arange to be slightly larger than 90 or it won't include 90
ax.set_yticks(np.arange(0, 91, 15))
ax.set_rlim(bottom=90, top=0)


for name in list_of_names:
    df2 = df_grouped.get_group(name)
    ax.plot(np.deg2rad(df2['starlink_az']), df2['starlink_alt'], linestyle='solid', marker='.',linewidth=0.5, markersize=0.1)
plt.show()


print(df)
#########################################################################################

#transformation to Cartesian coordinates

#########################################################################################
df['starlink_alt'] = 90 -  df['starlink_alt']

df['x'] = df.apply(lambda row: np.deg2rad(row.starlink_alt) * np.cos(np.deg2rad(row.starlink_az)), axis=1)
df['y'] = df.apply(lambda row: -1 * np.deg2rad(row.starlink_alt) * np.sin(np.deg2rad(row.starlink_az)), axis=1)

#########################################################################################

# this is what i want but as lines group per group

#########################################################################################

cvs = ds.Canvas(plot_width=2000, plot_height=2000)
agg = cvs.points(df, 'y', 'x')
img = ds.tf.shade(agg, cmap=colorcet.fire, how='eq_hist')


#########################################################################################

#here i am stuck

#########################################################################################


for name in list_of_names:
    df2 = df_grouped.get_group(name)
    cvs = ds.Canvas(plot_width=2000, plot_height=2000)
    agg = cvs.line(df2, 'y', 'x')
    img = ds.tf.shade(agg, cmap=colorcet.fire, how='eq_hist')
    #plt.imshow(img)
plt.show()




1 answer
User
#1 · posted 2024-05-23 19:09:10

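To do this, you have a couple of options. One is to insert rows of NaN into your dataframe as breakpoints when using cvs.line. You need datashader to "pick up the pen", which you get by inserting a row of NaNs after each group. It is not the slickest approach, but it is currently the recommended solution.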

A very simple, hacky example:

In [17]: df = pd.DataFrame({
    ...:     'name': list('AABBCCDD'),
    ...:     'x': np.arange(8),
    ...:     'y': np.arange(10, 18),
    ...: })

In [18]: df
Out[18]:
  name  x   y
0    A  0  10
1    A  1  11
2    B  2  12
3    B  3  13
4    C  4  14
5    C  5  15
6    D  6  16
7    D  7  17

This block groups on the 'name' column, then reindexes each group to be one element longer than the original data:

In [20]: res = df.set_index('name').groupby('name').apply(
    ...:     lambda x: x.reset_index(drop=True).reindex(np.arange(len(x) + 1))
    ...: )

In [21]: res
Out[21]:
          x     y
name
A    0  0.0  10.0
     1  1.0  11.0
     2  NaN   NaN
B    0  2.0  12.0
     1  3.0  13.0
     2  NaN   NaN
C    0  4.0  14.0
     1  5.0  15.0
     2  NaN   NaN
D    0  6.0  16.0
     1  7.0  17.0
     2  NaN   NaN

You can pass this reindexed dataframe to datashader to get multiple disconnected lines in the result.
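Applied to a layout like the one in the question, the same idea can be sketched with plain NumPy and pandas. This is only a minimal sketch: the helper name add_nan_breaks and the small demo frame are made up for illustration, and the datashader call is shown as a comment rather than executed.

```python
import numpy as np
import pandas as pd


def add_nan_breaks(df, group_col, value_cols):
    """Stack the value columns of every group, adding one all-NaN row after each group.

    Hypothetical helper for illustration; it is not part of datashader.
    """
    blocks = []
    for _, group in df.groupby(group_col, sort=False):
        blocks.append(group[value_cols].to_numpy(dtype=float))
        # The NaN row is the "pen up" marker between two groups.
        blocks.append(np.full((1, len(value_cols)), np.nan))
    return pd.DataFrame(np.vstack(blocks), columns=value_cols)


# Small made-up frame standing in for the real data.
demo = pd.DataFrame({
    "Name": list("AABBB"),
    "x": [0.0, 1.0, 5.0, 6.0, 7.0],
    "y": [0.0, 1.0, 2.0, 3.0, 4.0],
})
broken = add_nan_breaks(demo, "Name", ["x", "y"])
print(broken)

# Assumed usage with datashader (not executed here):
# cvs = ds.Canvas(plot_width=2000, plot_height=2000)
# agg = cvs.line(broken, x="x", y="y")
```

One aggregate is then computed for the whole frame, so there is no need to create a new Canvas per group as in the loop from the question.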

This is still an open issue on the datashader repo, which includes further examples and boilerplate code: https://github.com/holoviz/datashader/issues/257

The other option is to restructure your data to fit one of the other formats accepted by cvs.line. From the cvs.line docstring:


def line(self, source, x=None, y=None, agg=None, axis=0, geometry=None,
         antialias=False):
    Parameters
         
    source : pandas.DataFrame, dask.DataFrame, or xarray.DataArray/Dataset
        The input datasource.
    x, y : str or number or list or tuple or np.ndarray
        Specification of the x and y coordinates of each vertex
        * str or number: Column labels in source
        * list or tuple: List or tuple of column labels in source
        * np.ndarray: When axis=1, a literal array of the
          coordinates to be used for every row
    agg : Reduction, optional
        Reduction to compute. Default is ``any()``.
    axis : 0 or 1, default 0
        Axis in source to draw lines along
        * 0: Draw lines using data from the specified columns across
             all rows in source
        * 1: Draw one line per row in source using data from the
             specified columns

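There are a number of other examples in the cvs.line docstring. With axis=1, you can pass lists of column labels as the x and y arguments so that multiple columns form each line, or you can use a dataframe with ragged array values.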
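For the axis=1 form, the data has to be reshaped so that each group occupies one row. The sketch below shows one way to do that with pandas. The helper name to_wide, the generated column names x0, x1, ... and the demo frame are assumptions made for illustration. Shorter groups are padded with NaN here, and how datashader treats that padding should be checked against the docstring.

```python
import numpy as np
import pandas as pd


def to_wide(df, group_col, x_col, y_col):
    """Reshape to one row per group with columns x0..xN and y0..yN (NaN-padded).

    Hypothetical helper for illustration; it is not part of datashader.
    """
    # Number the points inside each group: 0, 1, 2, ...
    stepped = df.assign(_step=df.groupby(group_col, sort=False).cumcount())
    xs = stepped.pivot(index=group_col, columns="_step", values=x_col)
    ys = stepped.pivot(index=group_col, columns="_step", values=y_col)
    x_names = [f"x{i}" for i in xs.columns]
    y_names = [f"y{i}" for i in ys.columns]
    xs.columns = x_names
    ys.columns = y_names
    return pd.concat([xs, ys], axis=1), x_names, y_names


# Small made-up frame standing in for the real data.
demo = pd.DataFrame({
    "Name": list("AABBB"),
    "x": [0.0, 1.0, 5.0, 6.0, 7.0],
    "y": [0.0, 1.0, 2.0, 3.0, 4.0],
})
wide, x_names, y_names = to_wide(demo, "Name", "x", "y")
print(wide)

# Assumed usage with datashader (not executed here):
# cvs = ds.Canvas(plot_width=2000, plot_height=2000)
# agg = cvs.line(wide, x=x_names, y=y_names, axis=1)
```

The wide layout works best when the groups have similar lengths. With very uneven groups, the NaN-row approach or ragged arrays avoid carrying a lot of padding.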

See this pull request adding the line options (h/t to @James-a-bednar in the comments) for a discussion of their use.
