查找连通分量 Networkx
我需要找出一个无向加权图中的连接节点。我在这里查了一些建议,但没有人回答和我问题相关的内容。这些节点对之间也会和邻居连接,每对连接时会花费一些秒数。我想找出连接的组件,以及相同的评论连接了多少次,以及它们连接了多长时间(时间)。
例如:
Node Node time
A B 34
A B 56
A C 09
A D 5464
A C 456
C B 36
C A 345
B C 346
所以总的来说,A B C
连接了两次。
Nodes connected time
[A B C] 1 34+09+36 = 79
[A B C] 1 56+345+346 = 747
期望的输出是:
Nodes connected time
[A B C] 2 826
And
Node connected time
[A B] 2 90
[B C] 2 382
[A C] 2 354
代码:
import networkx as nx
import numpy as np
from collections import defaultdict
count = defaultdict(int)
time = defaultdict(float)
data = np.loadtxt('USC_Test.txt')
for line in data:
edge_list = [(line[0], line[1])]
G= nx.Graph()
G.add_edges_from(edge_list)
components = nx.connected_components(G)
count['components'] += 1
time['components'] += float(line[2])
print components, count['components'], time['components']
输入:
5454 5070 2755.0
5070 4391 2935.0
1158 305 1.0
5045 3140 48767.0
4921 3140 58405.0
5372 2684 460.0
1885 1158 351.0
1349 1174 6375.0
1980 1174 650.0
1980 1349 650.0
4821 2684 469.0
4821 937 459.0
2684 937 318.0
1980 606 390.0
1349 606 750.0
1174 606 750.0
5045 3545 8133.0
4921 3545 8133.0
3545 3140 8133.0
5045 4243 14863.0
4921 4243 14863.0
4243 3545 8013.0
4243 3140 14863.0
4821 4376 5471.0
4376 937 136.0
2613 968 435.0
5372 937 83.0
错误的输出
我得到的输出是错误的。
Last_node_pair total_count_of_line total_time of Entire input data
而我应该得到:
[5045 3140 4921] [number_of_times_same_components_connected] [total_time_components_connected]
1 个回答
4
这里有几个问题:
- 你在每次循环时都在重新创建图形,这样你图里就只会有一条边。
- 你用的是字面意思的字符串“components”,而不是用变量components作为索引,所以你只是在结果字典里保存和显示了那个单一的值。
- 你只在最后打印一次结果。在那里,components变量恰好是图中的最后一个组件(因为它是最后一个被赋值的循环变量),你打印出的总组件数量和时间,其实是所有组件的总数量和时间,这也是因为第二个问题。
这里有个应该能工作的方案。出于懒惰,我对数据进行了两次遍历。
import networkx as nx
import numpy as np
from collections import defaultdict
count = defaultdict(int)
time = defaultdict(float)
data = np.loadtxt('USC_Test.txt')
G = nx.Graph()
for line in data:
a,b,time = line
G.add_edge(a, b)
results = defaultdict(lambda: list([0, 0.0]))
components = nx.connected_components(G)
component_map = { }
component_stats = defaultdict(lambda: list([0,0.0]))
edge_stats = defaultdict(lambda: list([0,0.0]))
for nodes in components:
for node in nodes:
component_map[int(node)] = tuple(nodes)
for a,b,time in data:
component_stats[component_map[a]][0] += 1
component_stats[component_map[a]][1] += time
if len(component_map[a]) > 2:
edge_stats[(a,b)][0] += 1
edge_stats[(a,b)][1] += time
for nodes,(count,time) in component_stats.iteritems():
print sorted([ int(n) for n in nodes ]), count, time
print
for nodes,(count,time) in edge_stats.iteritems():
print sorted([ int(n) for n in nodes ]), count, time
用你的输入,这样会产生以下输出:
[606, 1174, 1349, 1980] 6 9565.0
[305, 1158, 1885] 2 352.0
[968, 2613] 1 435.0
[937, 2684, 4376, 4821, 5372] 7 7396.0
[4391, 5070, 5454] 2 5690.0
[3140, 3545, 4243, 4921, 5045] 9 184173.0
[1349, 1980] 1 650.0
[937, 4376] 1 136.0
[606, 1980] 1 390.0
[3140, 4921] 1 58405.0
[937, 5372] 1 83.0
[606, 1349] 1 750.0
[4391, 5070] 1 2935.0
[3545, 4921] 1 8133.0
[1158, 1885] 1 351.0
[3140, 3545] 1 8133.0
[2684, 4821] 1 469.0
[2684, 5372] 1 460.0
[937, 2684] 1 318.0
[1174, 1980] 1 650.0
[3140, 5045] 1 48767.0
[5070, 5454] 1 2755.0
[4376, 4821] 1 5471.0
[606, 1174] 1 750.0
[3545, 5045] 1 8133.0
[4243, 4921] 1 14863.0
[3140, 4243] 1 14863.0
[4243, 5045] 1 14863.0
[937, 4821] 1 459.0
[3545, 4243] 1 8013.0
[1174, 1349] 1 6375.0
[305, 1158] 1 1.0
希望这能帮到你!