Posted 2024-04-19 13:15:27
User
I have a correlation matrix, but it is specified as pairs, like this:
    cm = pd.DataFrame({'name1': ['A', 'A', 'B'],
                       'name2': ['B', 'C', 'C'],
                       'corr': [0.1, 0.2, 0.3]})
    cm

      name1 name2  corr
    0     A     B   0.1
    1     A     C   0.2
    2     B     C   0.3
What is the simplest way to convert this into a correlation matrix stored as a 2D NumPy array?
         A    B    C
    A  1.0  0.1  0.2
    B  0.1  1.0  0.3
    C  0.2  0.3  1.0
Assuming the rows are ordered so that the last column lists the upper triangle row by row, we can use the following code:

    import pandas as pd
    import numpy as np

    # define data frame
    data = pd.DataFrame({
        'name1': ['A', 'A', 'B'],
        'name2': ['B', 'C', 'C'],
        'correlation': [0.1, 0.2, 0.3]})

    # get correlation column
    correlation = data['correlation'].values

    # matrix dimension = number of distinct names, not number of rows
    # (n names give n * (n - 1) / 2 pairs; the two only coincide for n = 3)
    dimension = len(set(data['name1']) | set(data['name2']))

    # define zero matrix to fill
    matrix_upper_triangular = np.zeros((dimension, dimension))

    # fill upper triangular matrix with one half at diagonal
    counter = 0
    for (row, column), element in np.ndenumerate(matrix_upper_triangular):
        # half of diagonal terms
        if row == column:
            matrix_upper_triangular[row, column] = 0.5
        # upper triangular values
        elif row < column:
            matrix_upper_triangular[row, column] = correlation[counter]
            counter = counter + 1

    # add upper triangular + lower triangular matrix
    correlation_matrix = matrix_upper_triangular + matrix_upper_triangular.transpose()
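The loop above can probably be replaced by index arrays. The following is a minimal sketch, not part of the original answer, that uses `np.triu_indices` and relies on the same ordering assumption (pairs listed row by row across the upper triangle). The variable names `n`, `rows`, and `cols` are illustrative.

```python
import numpy as np
import pandas as pd

data = pd.DataFrame({
    'name1': ['A', 'A', 'B'],
    'name2': ['B', 'C', 'C'],
    'correlation': [0.1, 0.2, 0.3]})

correlation = data['correlation'].to_numpy()

# Number of distinct names n; n names produce n * (n - 1) / 2 pairs.
n = len(set(data['name1']) | set(data['name2']))
assert correlation.size == n * (n - 1) // 2

# Indices of the strict upper triangle in row-major order:
# (0, 1), (0, 2), ..., (1, 2), ... which is the ordering assumed here.
rows, cols = np.triu_indices(n, k=1)

correlation_matrix = np.eye(n)                 # ones on the diagonal
correlation_matrix[rows, cols] = correlation   # upper triangle
correlation_matrix[cols, rows] = correlation   # mirrored lower triangle
```

Starting from the identity matrix avoids the trick of storing 0.5 on the diagonal and then adding the transpose.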
I'm not sure about pure NumPy, since you are working with a DataFrame. Here is a pure pandas solution:
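    # keyword arguments: recent pandas versions reject positional ones in pivot
    s = cm.pivot(index='name1', columns='name2', values='corr')
    ret = s.add(s.T, fill_value=0).fillna(1)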
Output: the symmetric 3×3 matrix asked for in the question, with ones on the diagonal.
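If a plain NumPy array is needed, as the question asks, one option that does not depend on the row order is to map each name to a position and assign by index. This is a minimal sketch rather than part of the original answer; `labels`, `position`, and `matrix` are illustrative names, and it assumes each pair appears only once in `cm`.

```python
import numpy as np
import pandas as pd

cm = pd.DataFrame({'name1': ['A', 'A', 'B'],
                   'name2': ['B', 'C', 'C'],
                   'corr': [0.1, 0.2, 0.3]})

# Sorted list of every name that occurs in either column.
labels = sorted(set(cm['name1']) | set(cm['name2']))
position = {name: i for i, name in enumerate(labels)}

# Row and column positions for each pair.
i = cm['name1'].map(position).to_numpy()
j = cm['name2'].map(position).to_numpy()

matrix = np.eye(len(labels))          # ones on the diagonal
matrix[i, j] = cm['corr'].to_numpy()  # one triangle
matrix[j, i] = cm['corr'].to_numpy()  # mirrored entries
```

Row and column `k` of `matrix` correspond to `labels[k]`.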
Extra: for the reverse direction (with ret as defined above):
    (ret.where(np.triu(np.ones(ret.shape, dtype=bool), 1))
        .stack()
        .reset_index(name='corr')
    )

Output:

      level_0 level_1  corr
    0       A       B   0.1
    1       A       C   0.2
    2       B       C   0.3
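The same reverse step can be sketched starting from a NumPy array. This is an illustrative example, not part of the original answer: it assumes the matrix is held in a hypothetical array `matrix` with its row and column names in `labels`, and it reads the strict upper triangle with `np.triu_indices`.

```python
import numpy as np
import pandas as pd

# Assumed inputs: the matrix from the question and its labels.
labels = np.array(['A', 'B', 'C'])
matrix = np.array([[1.0, 0.1, 0.2],
                   [0.1, 1.0, 0.3],
                   [0.2, 0.3, 1.0]])

# Positions above the diagonal, in row-major order.
rows, cols = np.triu_indices(len(labels), k=1)

pairs = pd.DataFrame({'name1': labels[rows],
                      'name2': labels[cols],
                      'corr': matrix[rows, cols]})
```

Here the columns are named directly, so no `level_0`/`level_1` renaming is needed afterwards.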
One approach is to build a graph with networkX, set the corr column as the edge weight, and use nx.to_pandas_adjacency to get the adjacency matrix:
    import networkx as nx

    G = nx.from_pandas_edgelist(cm.rename(columns={'corr': 'weight'}),
                                source='name1', target='name2',
                                edge_attr='weight')
    G.edges(data=True)
    # EdgeDataView([('A', 'B', {'weight': 0.1}), ('A', 'C', {'weight': 0.2}),
    #               ('B', 'C', {'weight': 0.3})])

    adj = nx.to_pandas_adjacency(G)
    # sets the diagonal to 1 (node can't be connected to itself)
    adj[:] = adj.values + np.eye(adj.shape[0])

    print(adj)
         A    B    C
    A  1.0  0.1  0.2
    B  0.1  1.0  0.3
    C  0.2  0.3  1.0