Creating an igraph graph from a numpy or pandas adjacency matrix

Posted 2024-06-02 05:47:09


I have an adjacency matrix stored as a pandas.DataFrame:

import pandas as pd

node_names = ['A', 'B', 'C']
a = pd.DataFrame([[1,2,3],[3,1,1],[4,0,2]],
    index=node_names, columns=node_names)
# DataFrame.as_matrix() was removed in pandas 1.0; to_numpy() replaces it
a_numpy = a.to_numpy()

I would like to create an igraph.Graph from either the pandas or the numpy adjacency matrix. Ideally, the nodes would be named as expected.

Is this possible? The tutorial seems to be silent on this question.


2 Answers

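Strictly speaking, an adjacency matrix is boolean, with 1 indicating the presence of a connection and 0 indicating its absence. Since many of the values in your a_numpy matrix are > 1, I will assume that they correspond to edge weights in your graph.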

import igraph
import numpy as np

# get the row, col indices of the non-zero elements in your adjacency matrix
conn_indices = np.where(a_numpy)

# get the weights corresponding to these indices
weights = a_numpy[conn_indices]

# a sequence of (i, j) tuples, each corresponding to an edge from i -> j
# (wrapped in list() because zip returns a one-shot iterator in Python 3)
edges = list(zip(*conn_indices))

# initialize the graph from the edge sequence
G = igraph.Graph(edges=edges, directed=True)

# assign node names and weights to be attributes of the vertices and edges
# respectively
G.vs['label'] = node_names
G.es['weight'] = weights

# I will also assign the weights to the 'width' attribute of the edges. this
# means that igraph.plot will set the line thicknesses according to the edge
# weights
G.es['width'] = weights

# plot the graph, just for fun
igraph.plot(G, layout="rt", labels=True, margin=80)
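
To see what the np.where and zip steps above produce, here is a minimal pure-Python sketch that works on the question's matrix written as nested lists. The variable names are illustrative only, and the loop stands in for the numpy calls rather than reproducing them. It shows that edges and weights come out in the same row-major order, which is why the weights can be assigned to the edges directly.

```python
# Weighted adjacency matrix from the question, as plain nested lists.
matrix = [[1, 2, 3],
          [3, 1, 1],
          [4, 0, 2]]

edges = []    # (i, j) pairs, one per non-zero entry, meaning an edge i -> j
weights = []  # the matrix value at each of those positions

# Walk the matrix row by row (row-major order), skipping zero entries.
for i, row in enumerate(matrix):
    for j, value in enumerate(row):
        if value != 0:
            edges.append((i, j))
            weights.append(value)

print(edges)
print(weights)
```

The zero at row 2, column 1 is the only entry skipped, so the sketch yields 8 edges for the 3 x 3 matrix.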


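In igraph you can use igraph.Graph.Adjacency to create a graph from an adjacency matrix without having to use zip. There are a few things to be aware of when a weighted adjacency matrix is used and stored in an np.array or pd.DataFrame.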

  • igraph.Graph.Adjacency can't take an np.array as an argument, but that is easily solved using tolist.

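  • Integers in the adjacency matrix are interpreted as the number of edges between nodes rather than as weights; this is solved by passing the adjacency matrix as boolean values.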
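
To make the second point concrete, the following pure-Python sketch compares the two readings of the question's matrix. It does not call igraph. It only counts how many edges would result if each integer were read as an edge count, as described above, versus one edge per non-zero entry. The variable names are illustrative.

```python
# Weighted adjacency matrix from the question, as plain nested lists.
matrix = [[1, 2, 3],
          [3, 1, 1],
          [4, 0, 2]]

# Reading each integer as "number of parallel edges" gives the sum of all entries.
edges_if_counts = sum(sum(row) for row in matrix)

# Reading the matrix as boolean gives one edge per non-zero entry.
edges_if_boolean = sum(1 for row in matrix for value in row if value > 0)

print(edges_if_counts, edges_if_boolean)
```

Only the boolean reading gives one edge per weight, so it is the one that lines up with a weight list built from the non-zero entries.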

Example of how to do it:

import igraph
import pandas as pd

node_names = ['A', 'B', 'C']
a = pd.DataFrame([[1,2,3],[3,1,1],[4,0,2]], index=node_names, columns=node_names)

# Get the values as np.array, it's more convenient.
A = a.values

# Create graph, A.astype(bool).tolist() can also be used.
# (A / A is not safe here: zero entries give 0/0 = NaN.)
g = igraph.Graph.Adjacency((A > 0).tolist())

# Add edge weights and node labels.
g.es['weight'] = A[A.nonzero()]
g.vs['label'] = node_names  # or a.index/a.columns

You can reconstruct your adjacency DataFrame using get_adjacency as follows:

df_from_g = pd.DataFrame(g.get_adjacency(attribute='weight').data,
                         columns=g.vs['label'], index=g.vs['label'])
(df_from_g == a).all().all()  # --> True
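
The round trip above relies on the edge list, the weights, and the node labels together carrying everything that was in the original matrix. As a rough illustration of that idea without igraph or pandas, this pure-Python sketch rebuilds the matrix from an edge list and its weights and then labels it. All names here are hypothetical and are not part of either library.

```python
node_names = ['A', 'B', 'C']
matrix = [[1, 2, 3],
          [3, 1, 1],
          [4, 0, 2]]
n = len(node_names)

# Edge list and weights in row-major order, one entry per non-zero cell.
edges = [(i, j) for i in range(n) for j in range(n) if matrix[i][j] != 0]
weights = [matrix[i][j] for i, j in edges]

# Rebuild the weighted adjacency matrix; cells without an edge stay 0.
rebuilt = [[0] * n for _ in range(n)]
for (i, j), w in zip(edges, weights):
    rebuilt[i][j] = w

# Attach the labels, giving a row-label -> column-label -> weight mapping.
labelled = {node_names[i]: dict(zip(node_names, rebuilt[i])) for i in range(n)}

print(rebuilt == matrix)
print(labelled['C'])
```

Missing edges come back as 0, which matches the zero entry in the original matrix.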
