有向图:连接所有路径的最近节点

2024-06-01 02:22:34 发布

您现在位置:Python中文网/ 问答频道 /正文

一些背景

我正在分析一个函数的控制流图,它基本上是将传入数据映射到传出数据。大多数街区都是这样的:

if (input_variable == SPECIFIC_CONSTANT) {
    output_variable = TRUE;
}
else {
    output_variable = FALSE;
}

这类代码的典型控制流图如下所示

^{pr2}$

Picture of the graphe

其中,34的执行受input_variable的值限制,但是5是独立的。在

问题

给定一个有向图和一个起始节点,如何找到最近的节点,使起始节点的任何路径都通过这个节点?在

示例:给定this graph如何找到从2开始的6或从{}开始的{}?

我可以反转一个最低的公共祖先算法吗?它是否有效?像

for each node in graph:
    ancestors = node.get_all_ancestors()
    lca = find_lowest_common_ancestor(ancestors)
    junction_node[lca] = node

get_junction_point(node):
    return junction_node[node]

我的编程语言是Python,我刚刚发现了NetworkX,但是任何算法都会很受欢迎。我不习惯图论,我想我错过了基本的词汇表来找到我要找的东西。在

谢谢你的帮助!在


Tags: 数据函数算法nodeinputoutputget节点
3条回答

虽然不是最有效的解决方案,但以下是一些应该让您开始的方法:

执行DFS,然后计算所有路径(每个路径中存在的节点)的交集。然后,在这些节点中,找到每个路径中最靠近开始的节点:

>>> paths
[]
>>> def dfs(G, s, path):
...     if s not in G or not G[s]:
...             paths.append(path)
...     else:
...             for t in G[s]:
...                     dfs({k:[_t for _t in v if _t!=t] for k,v in G.items()}, t, path+[t])
... 
>>> dfs(G, 2, [])
>>> paths
[[3, 4, 6, 7, 12], [3, 4, 6, 8, 9, 10, 12], [3, 4, 6, 8, 9, 12], [3, 4, 6, 8, 11, 12], [3, 5, 6, 7, 12], [3, 5, 6, 8, 9, 10, 12], [3, 5, 6, 8, 9, 12], [3, 5, 6, 8, 11, 12], [4, 6, 7, 12], [4, 6, 8, 9, 10, 12], [4, 6, 8, 9, 12], [4, 6, 8, 11, 12]]
>>> for path in paths: print(path)
... 
[3, 4, 6, 7, 12]
[3, 4, 6, 8, 9, 10, 12]
[3, 4, 6, 8, 9, 12]
[3, 4, 6, 8, 11, 12]
[3, 5, 6, 7, 12]
[3, 5, 6, 8, 9, 10, 12]
[3, 5, 6, 8, 9, 12]
[3, 5, 6, 8, 11, 12]
[4, 6, 7, 12]
[4, 6, 8, 9, 10, 12]
[4, 6, 8, 9, 12]
[4, 6, 8, 11, 12]
>>> nodes = [set(L) for L in paths]
>>> commons = functools.reduce(set.intersection, nodes)
>>> commons
{12, 6}
>>> min(commons, key=lambda v: max(L.index(v) for L in paths))
6

现在,请注意6在某些路径中是如何在索引2处显示的,在其他一些路径中是如何在索引1处显示的。如果有一个节点(比如x)出现在索引1,路径6出现在索引2,索引2出现在索引1,那么,这将是一个平局,这个算法会任意破坏它。根据您的需要,您可能需要定义如何更好地处理这种情况

你可以这样做:

对于每个节点,查找其所有祖先和后代的列表。如果大小(祖先)+大小(后代)+1等于网络大小,则它是候选。现在,找到一个至少有一个祖先和最大数量后代的节点。在

祖先和后代的名单很容易计算出来。如果你不确定,请告诉我,我会扩展我的答案。在

读了你提出的所有解决方案,我想出了一个主意。我给我的第一个节点赋值为1。递归地,所有的子代都会得到相等的分数。反过来,他们把这笔钱向下分配。如果一个孩子总共得到1(起始量),那么它就是“连接点”。这是我的实现(公开征求意见!!)。在

我希望BFS结构可以限制访问的节点数量。在

class Node(object):
  """
    Object representing a node in a graph where we search the junction node that
    concentrates all paths starting from a start node.

    ``reference``: Attaches the real object that contains the relationships.
    ``initial_value``: Initial amount for the node. Typically 1 for the start
      node, 0 for the others.
    ``path``: Set of already traversed nodes that reached the node. Used to
      prune circular dependencies.
  """
  def __init__(self, reference, initial_value, path=set()):
    self.reference = reference
    # See dispatch() for explaination
    self.value_can_dispatch = self.value_has_received = initial_value
    self.path = path

  def receive(self, value):
    """
      Give ``value`` to the node. If the node received 1 (or more for security)
      in total, it will return True. Else it returns False.
    """
    self.value_has_received += value
    self.value_can_dispatch += value
    if self.value_has_received >= 1.:
      return True
    return False

  def dispatch(self, children):
    """
      Dispatch the value received to the children.
      Returns a filtered list of ``children`` where children involved in a
      circular dependency are removed.
      If one child signals that it has received a total of 1, the function will
      stop and return this one child.
    """
    # Filter successors that are in the path used to access this node so to cut
    # out cycles
    true_successors = [child for child in children if child not in self.path]
    # Cut the received value into equal pieces
    amount = self.value_can_dispatch/len(true_successors)
    # We transmit only the value received after the last time it was dispatched
    # because paths may lead to the same node at different iterations (path
    # through one node may be longer than through another) and thus the same
    # node can be asked to dispatch to its children more than once.
    # The total amount of received value is kept in value_has_received because
    # the node may receive the complete amount in several steps. Thus, part of
    # its value may have been dispatched before we notice that the node received
    # the total amount of 1.
    self.value_can_dispatch = Fraction(0)
    for child in true_successors:
      # If child signaled that he received 1, then raise the winner
      if child.receive(amount):
        return child
    return set(true_successors)

  def touch(self, other_node):
    """
      "Touches" a node with another, notifying that the node is reachable
      through another path than the known ones.
      It adds the elements of the new path as ancestors of the node.
    """
    self.path |= other_node.path | {other_node}

  def make_child(self, reference):
    """
      Creates a child of the node, pointing to reference. The child receives
      its path from the current node.
    """
    # This is were the algorithm can go mad. If child is accessed through two
    # paths, the algorithm will only protect recursion into the first
    # path. If the successors recurse into the second path, we will not detect
    # it. => We should update the child's path when a second path reaches it.
    return self.__class__(reference, Fraction(0), self.path | {self})

  def __repr__(self):
    return "<{} {}>".format(self.__class__.__name__, self.reference)

def find_junction_node(first_reference, get_uid, get_successors, max_iterations=100):
  """
    Find the junction node of all paths starting from ``first_reference`` in
    a directed graph. ``get_uid`` is a function accepting a reference to a node
    in your graph and returning a unique identifier for this reference.
    ``get_successors`` is a function accepting a reference to a node in your
    graph. It should return a list of references to all its the children nodes.
    It may return None if the node has no child.
    ``max_iterations`` limits the number of pass the algorithm use to find the
    junction node. If reached, the funciton raises a RuntimeError.

    Returns ``(jp, ln)`` where ``jp`` is a reference to a node in your graph
    which is the junction node and ``ln`` the list of nodes in the subgraph
    between the start node and the junction node.
  """
  # Mapping to already created nodes
  nodes = {}
  # Initialise first node with an amount of 1
  node = Node(first_reference, Fraction(1, 1))
  nodes[get_uid(first_reference)] = node
  # Initialise first iteration of DFS
  successors = set()
  successors.add(node)
  # Max iteration provides security as I'm not sure the algorithm cannot loop
  for i in range(max_iterations):
    next_successors = set()
    # Process one level of nodes
    for node in successors:
      # Find successors in data graph
      sub_references = get_successors(node.reference)
      # This happens when we reach the end of the graph, node has no children
      if sub_references is None:
        continue
      # Make a list of Node that are children of node
      children = set()
      for reference in sub_references:
        uid = get_uid(reference)
        # Does it exist?
        child = nodes.get(uid, None)
        if not child:
          child = node.make_child(reference)
          nodes[uid] = child
        else:
          child.touch(node)
        children.add(child)
      # Dispatch the value of node equally between its children
      result = node.dispatch(children)
      #print("Children of {}: {!r}".format(node, result)) # DEBUG
      # If one child received a total of 1 from its parents, it is common to
      # all paths
      if isinstance(result, Node):
        return result.reference,  [node.reference for node in result.path]
      # Else, add the filtered list of children to the set of node to process
      # in the next level
      else:
        next_successors |= result
    successors = next_successors
    # Reached end of graph by all paths without finding a junction point
    if len(successors) == 0:
      return None
  raise RuntimeError("Max iteration reached")

相关问题 更多 >