如何使用networkx计算有向图的最低根的总距离

2024-04-26 09:59:12 发布

您现在位置:Python中文网/ 问答频道 /正文

如果你看这个DAG(有向无环图):

我想创建一个dict,该dict将从最低节点到所有其他节点的距离映射到渲染图中每个节点距离底部的x位置(高度)。 对于给定的图形,它将是:

distance_nodes_map: {
  0: {'base-zero', 'base-one'}, 
  1: {'low-b', 'low-a', 'low-c'}, 
  3: {'high-x', 'high-z', 'high-y'}, 
  2: {'mid-r', 'mid-q', 'mid-p'}, 
  4: {'super'}
}

我写了一个算法,对上面的图有效,但后来我测试了另一个图,它不再有效了。我尝试了一些算法和函数,比如最短路径或距离的子体,但我认为它们对计算距离没有帮助

我的算法不适用于此图:

https://gist.github.com/timaschew/3b08a07243fa6f43773014ef5e705c96

以下是要点,包括:

  • 一个python脚本,它读取YAML文件、依赖项/图形结构,并生成一个带有呈现的mermaid图形的HTML(我删除了以错误方式计算距离的算法)
  • 此处显示的两个图形都是YAML文件

Tags: 文件算法图形距离yamlbase高度节点
2条回答

您正在寻找一种绘制分层图的算法。有许多不同的算法,您应该选择最适合您需要的算法(例如,看看下面的论文A Technique for Drawing Directed Graphs,作者是Gansner等人。

其中许多算法已经在Graphviz(一个非常著名和强大的图形可视化软件)中实现。一旦安装了它,就可以非常简单地计算您要查找的结果(G是使用networkx.DiGraph构建的有向无环图):

from networkx.drawing.nx_agraph import graphviz_layout

def get_distance_nodes_map(G):
    pos = graphviz_layout(G, prog='dot')
    coor = sorted({y for k, (x, y) in pos.items()})
    kmap = dict(zip(coor, range(len(coor))))
    distance_nodes_map = {level: set() for level in kmap.values()}
    for k, (x, y) in pos.items():
        distance_nodes_map[kmap[y]].add(k)
    return distance_nodes_map

下面是几个使用您提供的数据的示例:

>>> from networkx import DiGraph
>>> from pprint import PrettyPrinter
>>> pp = PrettyPrinter()
>>> G1 = DiGraph()
>>> G1.add_edges_from([('super', 'high-x'), ('high-x', 'mid-p'),
...                    ('mid-p', 'low-b'), ('mid-p', 'low-c'),
...                    ('low-c', 'base-zero'), ('low-c', 'base-one'),
...                    ('high-y', 'mid-p'), ('high-y', 'base-zero'),
...                    ('high-z', 'base-one'), ('high-z', 'mid-r'),
...                    ('high-z', 'mid-q'), ('mid-q', 'low-a'),
...                    ('low-a', 'base-one')])
>>> pp.pprint(get_distance_nodes_map(G1))
{0: {'base-one', 'base-zero'},
 1: {'low-a', 'low-b', 'low-c'},
 2: {'mid-p', 'mid-r', 'mid-q'},
 3: {'high-y', 'high-x', 'high-z'},
 4: {'super'}}
>>> G2 = DiGraph()
>>> G2.add_edges_from([('n10', 'n11'), ('n11', 'n12'), ('n12', 'n13'),
...                    ('n13', 'n14'), ('n20', 'n14'), ('n20', 'n21'),
...                    ('n21', 'n22'), ('n22', 'n23'), ('n30', 'n23'),
...                    ('n30', 'n31'), ('n31', 'n32')])
>>> pp.pprint(get_distance_nodes_map(G2))
{0: {'n32'},
 1: {'n31', 'n23'},
 2: {'n30', 'n22'},
 3: {'n21', 'n14'},
 4: {'n13', 'n20'},
 5: {'n12'},
 6: {'n11'},
 7: {'n10'}}

未经测试的伪代码,因为我的午休时间快结束了:

您有一个多根树,其中有一个选定的主根

  1. 对于每个根,创建一个子图,其中包含该根的所有可访问节点

  2. 从主根(根a)开始,计算对应子图a中所有节点到根的距离/最短路径长度

  3. 查找与主子图至少共享一个节点的所有子图,并选择与主根距离最小的节点(节点x)所在的子图(子图B)

  4. 计算子图b中所有节点到根b的距离。添加距离d(节点x,根a)。减去距离d(节点x,根b)

  5. 创建子图A和B的并集。重复步骤3-5,直到没有根为止

  6. 减去最大距离&;反转符号,使主根具有最大的距离/顺序值

编辑:

我的伪代码有效(*)。我责备用户的错误。;-)

#!/usr/bin/env python
"""
https://stackoverflow.com/q/66584661/
"""
import numpy as np
import matplotlib.pyplot as plt
import networkx as nx


def hierarchical_layout(graph):

    longest_path = nx.algorithms.dag.dag_longest_path(graph)
    principal_root = longest_path[0]

    roots = [node for node, degree in list(graph.in_degree) if degree==0]
    subgraphs = {root : create_subgraph(graph, root) for root in roots}

    # Starting with the principal root (root a), compute the
    # longest path length to the root for all nodes in the
    # corresponding subgraph A.
    node_to_level = single_source_longest_dag_path_length(subgraphs[principal_root], principal_root)

    explored = subgraphs[principal_root]
    del subgraphs[principal_root]

    while len(explored) < len(graph):

        # Find all subgraphs that share at least one node with the
        # principal subgraph, and select the subgraph (subgraph B) that
        # has the node (node x) with the smallest distance to the
        # principal root.
        minimum_cost = np.inf
        minimum_cost_node = None
        minimum_cost_root = None
        for root, subgraph in subgraphs.items():
            for node in subgraph.nodes:
                if node in node_to_level:
                    if node_to_level[node] < minimum_cost:
                        minimum_cost = node_to_level[node]
                        minimum_cost_node = node
                        minimum_cost_root = root

        assert minimum_cost_node, "Could not find a connected subgraph."

        # Compute the distance to the root b for all nodes in subgraph
        # B. Add the distance d(node x, root a). Subtract the distance
        # d(node x, root b).
        path_lengths = [len(path) for path in nx.all_simple_paths(subgraphs[minimum_cost_root], minimum_cost_root, minimum_cost_node)]
        offset = np.max(path_lengths) - 1
        for node, distance in single_source_longest_dag_path_length(subgraphs[minimum_cost_root], minimum_cost_root).items():
            if not node in node_to_level:
                node_to_level[node] = distance + minimum_cost - offset

        # Create the union of subgraph A and B.
        explored = nx.compose(explored, subgraphs[minimum_cost_root])
        del subgraphs[minimum_cost_root]

    return node_to_level


def create_subgraph(G, node):
    # https://stackoverflow.com/a/45678930/2912349
    nodes = nx.single_source_shortest_path(G,node).keys()
    return G.subgraph(nodes)


def single_source_longest_dag_path_length(graph, s):
    # from AlaskaJoslin's comment to https://stackoverflow.com/a/60978007/2912349
    dist = dict.fromkeys(graph.nodes, -float('inf'))
    dist[s] = 0
    topo_order = nx.topological_sort(graph)
    for n in topo_order:
        for s in graph.successors(n):
            if dist[s] < dist[n] + 1:
                dist[s] = dist[n] + 1
    return dist


if __name__ == '__main__':

    # edge_list = [
    #     ("n10", "n11"),
    #     ("n11", "n12"),
    #     ("n12", "n13"),
    #     ("n13", "n14"),
    #     ("n20", "n21"),
    #     ("n20", "n14"),
    #     ("n21", "n22"),
    #     ("n22", "n23"),
    #     ("n30", "n23"),
    #     ("n30", "n31"),
    #     ("n31", "n32"),
    # ]

    edge_list = [
        ("low-a", "base-one"),
        ("low-c", "base-zero"),
        ("low-c", "base-one"),
        ("mid-p", "low-b"),
        ("mid-p", "low-c"),
        ("mid-q", "low-a"),
        ("high-x", "mid-p"),
        ("high-y", "mid-p"),
        ("high-y", "base-zero"),
        ("high-z", "mid-q"),
        ("high-z", "mid-r"),
        ("high-z", "base-one"),
        ("super", "high-x"),
    ]

    graph = nx.DiGraph()
    graph.add_edges_from(edge_list)

    node_to_level = hierarchical_layout(graph)

    # reverse output format
    distance_nodes_map = dict()
    max_distance = np.max(list(node_to_level.values()))
    for node, distance in node_to_level.items():
        reversed_distance = max_distance - distance
        if reversed_distance in distance_nodes_map:
            distance_nodes_map[reversed_distance].add(node)
        else:
            distance_nodes_map[reversed_distance] = set([node])

    # print(distance_nodes_map)
    for ii, nodes in sorted(distance_nodes_map.items())[::-1]:
        print(f"{ii} : {nodes}")

收益率:

    # 4 : {'super'}
    # 3 : {'high-x', 'high-y', 'high-z'}
    # 2 : {'mid-p', 'mid-r', 'mid-q'}
    # 1 : {'low-a', 'low-b', 'low-c'}
    # 0 : {'base-one', 'base-zero'}

(*)“减去距离d(节点x,根b)”自然意味着节点x和根b之间的最长路径长度。很明显

相关问题 更多 >