How to order nodes in a Sankey diagram (Plotly)

Published 2024-05-19 20:26:49


I want to build a visualization that shows how often one state changes into another, where the states are represented by numbers. My data looks like this and is called df_sankey.

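I am thinking of a Sankey diagram that follows the example in the documentation. I want one column for state A, with nodes I1, I2, ..., I20, and another column for state B, with nodes F1, F2, ..., F20. The frequency of each pair of values would then be drawn as a weighted link, as follows.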
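The label layout described above can be sketched in plain Python. This is only an illustration of the indexing scheme, assuming states are numbered from 1 to 20; the helper names `build_labels` and `link_indices` and the sample transitions are hypothetical and not part of the original code.

```python
def build_labels(n_states=20):
    """Return node labels I1..In followed by F1..Fn (hypothetical helper)."""
    initial = [f"I{i}" for i in range(1, n_states + 1)]
    final = [f"F{i}" for i in range(1, n_states + 1)]
    return initial + final


def link_indices(state_a, state_b, n_states=20):
    """Map 1-based state numbers to 0-based indices into the label list."""
    source = [a - 1 for a in state_a]             # I-nodes occupy 0..n-1
    target = [b - 1 + n_states for b in state_b]  # F-nodes occupy n..2n-1
    return source, target


labels = build_labels()
# Made-up transitions, used only to show the mapping: 1->3, 2->2, 20->1
source, target = link_indices([1, 2, 20], [3, 2, 1])
```

Each link then refers to its nodes by position in `labels`, which is the same offset that `df_sankey['State_B']+20-1` applies in the attempt further down.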

However, I cannot get the nodes within each column ordered by state number. This is what I want to achieve.

This is what I have tried:

import numpy as np
import pandas as pd
import plotly.graph_objects as go

#Create Labels
source = pd.DataFrame(np.arange(1,21), columns = ['source'])['source'].apply(lambda x: 'I' + str(x))
target = pd.DataFrame(np.arange(1,21), columns = ['target'])['target'].apply(lambda x: 'F' + str(x))
labels = pd.concat([source, target], axis=0).reset_index(drop=True)

#X-node
x_node = np.concatenate((np.ones(int(len(source)))*0.1, np.ones(int(len(target)))), axis = None)

#Y-node
y_node = np.tile(np.linspace(0,100,len(source)),2)

#Create Dataframe
df_nodes = pd.DataFrame(data = {'label': labels, 'X': x_node, 'Y': y_node})

#Plot

fig = go.Figure(data=[go.Sankey(
    arrangement='snap',
    node = dict(
      pad = 15,
      thickness = 20,
      line = dict(color = "black", width = 0.5),
      label = df_nodes['label'],
      color = "blue",
      x = df_nodes['X'],
      y = df_nodes['Y']
    ),
    link = dict(
      source = df_sankey['State_A']-1, #Indices correspond to labels, eg A1, A2, A1, B1, ...
      target = df_sankey['State_B']+20-1,
      value = df_sankey['Freq']
  ))])

fig.update_layout(title_text="Basic Sankey Diagram", font_size=10)
fig.show()

Any ideas?
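One direction that may be worth checking: as far as I understand, Plotly expects the Sankey `node.x` and `node.y` values to be normalized to the range 0 to 1, while the attempt above passes `y` values from 0 to 100. The sketch below computes evenly spaced coordinates inside that range. It is an assumption about the cause, not a confirmed fix; the helper name `node_positions` and the small margins are hypothetical, chosen to stay away from exactly 0 and 1.

```python
def node_positions(n_states=20, left=0.001, right=0.999, margin=0.001):
    """Return (x, y) lists for two columns of n_states nodes each.

    All values lie strictly between 0 and 1 (assumed Plotly convention).
    """
    span = 1.0 - 2 * margin
    step = span / (n_states - 1) if n_states > 1 else 0.0
    # Same vertical slots for both columns, in state order
    column_y = [margin + i * step for i in range(n_states)]
    x = [left] * n_states + [right] * n_states
    y = column_y + column_y
    return x, y


x_node, y_node = node_positions()
```

These lists could replace `df_nodes['X']` and `df_nodes['Y']` in the `node` dict. Since `arrangement='snap'` lets Plotly adjust node positions, trying `'fixed'` or `'freeform'` instead might also help keep the requested order.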

