Kosaraju算法通过迭代DFS查找完成时间

11 投票

5 回答

7093 浏览

提问于 2025-04-18 08:40

这是我为Kosaraju算法写的代码的第一部分。

###### reading the data #####
with open('data.txt') as req_file:
        ori_data = []
        for line in req_file:
            line = line.split()
            if line:
                line = [int(i) for i in line]
                ori_data.append(line)

###### forming the Grev ####
revscc_dic = {}
for temp in ori_data:
    if temp[1] not in revscc_dic:
        revscc_dic[temp[1]] = [temp[0]]
    else:
        revscc_dic[temp[1]].append(temp[0])

print revscc_dic        

######## finding the G#####
scc_dic = {}
for temp in ori_data:
    if temp[0] not in scc_dic:
        scc_dic[temp[0]] = [temp[1]]
    else:
        scc_dic[temp[0]].append(temp[1])

print scc_dic        

##### iterative dfs ####
path = []
for i in range(max(max(ori_data)),0,-1):
    start = i
    q=[start]
    while q:
        v=q.pop(0)
        if v not in path:
          path.append(v)
          q=revscc_dic[v]+q
print path

这段代码读取数据，并正确地形成了Grev和G。我写了一个用于迭代深度优先搜索（dfs）的代码。请问我该如何加入代码来找到结束时间呢？我知道用纸和笔来找结束时间的方式，但我不太明白如何在代码中实现结束时间的部分？我该怎么做呢？只有在这之后，我才能继续写我的下一部分代码。请帮帮我。提前谢谢你们。

data.txt文件的内容是：

请将其保存为data.txt。

数据结构深度优先搜索算法分析图论迭代算法 kosaraju算法结束时间

5 个回答

-1

首先，你需要明白什么是完成时间。在递归深度优先搜索（dfs）中，完成时间是指一个节点v的所有相邻节点[V]都处理完毕的时刻。为了记录这些信息，你需要一个额外的数据结构来存储所有相关的信息。

adj[][]  //graph
visited[]=NULL //array of visited node
finished[]=NULL //array of finished node
Stack st=new Stack  //normal stack 
Stack backtrack=new Stack //additional stack
function getFinishedTime(){
for(node i in adj){
     if (!vistied.contains[i]){
         st.push(i);
         visited.add(i)
         while(!st.isEmpty){
              int j=st.pop();
              int[] unvisitedChild= getUnvistedChild(j);
              if(unvisitedChild!=null){
                   for(int c in unvisitedChild){
                        st.push(c);
                        visited.add(c);
                    }
                    backtrack.push([j,unvisitedChild]); //you can store each entry as array with the first index as the parent node j, followed by all the unvisited child node.
              }
              else{ 
                   finished.add(j);
                   while(!backtrack.isEmpty&&finished.containsALL(backtrack.peek())) //all of the child node is finished, then we can set the parent node visited
                   {
                   parent=backtrack.pop()[0];
                   finished.add(parent);
                   }
              }
        }
    }
}

 function getUnvistedChild(int i){
     unvisitedChild[]=null
     for(int child in adj[i]){
        if(!visited.contains(child))
            unvisitedChild.add(child);
     }
     return unvisitedChild;
 }

而完成时间的顺序应该是 [5, 2, 8, 3, 6, 9, 1, 4, 7]

回答于 2025-04-18 由 Python大师

分享举报

下面是用Java写的递归和迭代的实现方式：

int time = 0;
public void dfsRecursive(Vertex vertex) {
        time += 1;
        vertex.setVisited(true);
        vertex.setDiscovered(time);
        for (String neighbour : vertex.getNeighbours()) {
            if (!vertices.get(neighbour).getVisited()) {
                dfsRecursive(vertices.get(neighbour));
            }
        }
        time += 1;
        vertex.setFinished(time);
    }

    public void dfsIterative(Vertex vertex) {
        Stack<Vertex> stack = new Stack<>();
        stack.push(vertex);
        while (!stack.isEmpty()) {
            Vertex current = stack.pop();
            if (!current.getVisited()) {
                time += 1;
                current.setVisited(true);
                current.setDiscovered(time);
                stack.push(current);
                List<String> currentsNeigbours = current.getNeighbours();
                for (int i = currentsNeigbours.size() - 1; i >= 0; i--) {
                    String currentNeigbour = currentsNeigbours.get(i);
                    Vertex neighBour = vertices.get(currentNeigbour);
                    if (!neighBour.getVisited())
                        stack.push(neighBour);
                }
            } else {
                if (current.getFinished() < 1) {
                    time += 1;
                    current.setFinished(time);
                }
            }
        }
    }

回答于 2025-04-18 由 Python大师

分享举报

我在使用Lawson版本的迭代深度优先搜索（DFS）时遇到了一些问题。这里是我自己版本的代码，它和递归版本的DFS有一一对应的关系。

n = len(graph)
time = 0
finish_times = [0] * (n + 1)
explored = [False] * (n + 1)

# Determine if every vertex connected to v
# has already been explored
def all_explored(G, v):
    if v in G:
        for w in G[v]:
            if not explored[w]:
                return False
    return True

# Loop through vertices in reverse order
for v in xrange(n, 0, -1):
    if not explored[v]:
        stack = [v]
        while stack:
            print(stack)
            v = stack[-1]
            explored[v] = True

            # If v still has outgoing edges to explore
            if not all_explored(graph_reversed, v):
                for w in graph_reversed[v]:

                    # Explore w before others attached to v
                    if not explored[w]:
                        stack.append(w)
                        break

            # We have explored vertices findable from v
            else:
                stack.pop()
                time += 1
                finish_times[v] = time

回答于 2025-04-18 由 Python大师

分享举报

迭代深度优先搜索（DFS）本身并不复杂，大家可以在维基百科上看到。不过，要计算每个节点的完成时间，就需要对算法做一些调整。我们只有在第二次遇到这个节点时，才会把它从栈中弹出。

下面是我的实现，我觉得这样能更清楚地展示发生了什么：

step = 0  # time counter

def dfs_visit(g, v):
    """Run iterative DFS from node V"""
    global step
    total = 0
    stack = [v]  # create stack with starting vertex
    while stack:  # while stack is not empty
        step += 1
        v = stack[-1]  # peek top of stack
        if v.color:  # if already seen
            v = stack.pop()  # done with this node, pop it from stack
            if v.color == 1:  # if GRAY, finish this node
                v.time_finish = step
                v.color = 2  # BLACK, done
        else:  # seen for first time
            v.color = 1  # GRAY: discovered
            v.time_discover = step
            total += 1
            for w in v.child:  # for all neighbor (v, w)
                if not w.color:  # if not seen
                    stack.append(w)
    return total

def dfs(g):
    """Run DFS on graph"""
    global step
    step = 0  # reset step counter
    for k, v in g.nodes.items():
        if not v.color:
            dfs_visit(g, v)

我遵循了CLR算法书的约定，使用节点的颜色来表示它在DFS搜索中的状态。我觉得这样比用一个单独的列表来跟踪节点状态更容易理解。

所有节点一开始都是白色。当在搜索中发现它时，就把它标记为灰色。当我们处理完这个节点后，就把它标记为黑色。

在while循环中，如果一个节点是白色，我们就把它保留在栈中，并把它的颜色改为灰色。如果它是灰色，我们就把它的颜色改为黑色，并设置它的完成时间。如果它是黑色，我们就直接忽略它。

栈中的一个节点可能是黑色的（即使在把它加到栈之前我们已经检查了颜色）。一个白色节点可以通过两个不同的邻居被添加到栈中两次。最终其中一个会变成黑色。当我们在栈中遇到第二次这个节点时，我们需要确保不去改变它已经设置好的完成时间。

这里还有一些额外的支持代码：

class Node(object):
    def __init__(self, name=None):
        self.name = name
        self.child = []  # children | adjacency list
        self.color = 0  # 0: white [unvisited], 1: gray [found], 2: black [finished]
        self.time_discover = None  # DFS
        self.time_finish = None  # DFS

class Graph(object):
    def __init__(self):
        self.nodes = defaultdict(Node)  # list of Nodes
        self.max_heap = []  # nodes in decreasing finish time for SCC

    def build_max_heap(self):
        """Build list of nodes in max heap using DFS finish time"""
        for k, v in self.nodes.items():
            self.max_heap.append((0-v.time_finish, v))  # invert finish time for max heap
        heapq.heapify(self.max_heap)

要在反向图上运行DFS，你可以在处理边文件时，为每个节点构建一个类似于子节点列表的父节点列表，并在dfs_visit()中使用父节点列表，而不是子节点列表。

为了在计算强连通分量（SCC）的最后部分按完成时间降序处理节点，你可以构建一个节点的最大堆，并在dfs_visit()中使用这个最大堆，而不是简单地使用子节点列表。

    while g.max_heap:
        v = heapq.heappop(g.max_heap)[1]
        if not v.color:
           size = dfs_visit(g, v)
           scc_size.append(size)

回答于 2025-04-18 由 Python大师

分享举报

在使用递归深度优先搜索（DFS）时，我们很容易知道一个节点什么时候“完成”了（也就是我们已经访问了它所有的子节点）。完成的时间可以在递归调用返回后计算出来。
但是在迭代深度优先搜索中，这就没那么简单了。因为我们现在是通过一个循环来处理队列，这样就失去了与函数调用相关的一些嵌套结构。更准确地说，我们不知道什么时候会回溯。 不幸的是，如果不在我们的节点栈中添加一些额外的信息，就无法知道何时回溯。

给你的深度优先搜索实现添加完成时间的最快方法如下：

##### iterative dfs (with finish times) ####
path = []
time = 0
finish_time_dic = {}
for i in range(max(max(ori_data)),0,-1):
    start = i
    q = [start]
    while q:
        v = q.pop(0)
        if v not in path:
            path.append(v)
            q = [v] + q
            for w in revscc_dic[v]:
                if w not in path: q = [w] + q
        else:
            if v not in finish_time_dic:
                finish_time_dic[v] = time
                time += 1
print path  
print finish_time_dic

这里的技巧是，当我们从栈中弹出v时，如果这是我们第一次看到它，那么我们就把它再放回栈中。这可以通过：q = [v] + q来实现。我们必须在推入它的邻居之前，先将v推入栈中（我们在推入v的邻居的for循环之前写代码来推入v）——否则这个技巧就不管用了。最终，我们会再次从栈中弹出v。在这一刻，v已经完成了！我们之前见过v，所以我们进入else分支，计算一个新的完成时间。

对于提供的图，finish_time_dic给出了正确的完成时间：

{1: 6, 2: 1, 3: 3, 4: 7, 5: 0, 6: 4, 7: 8, 8: 2, 9: 5}

注意，这个深度优先搜索算法（带有完成时间的修改）仍然具有 O(V+E) 复杂度，尽管我们将每个节点在图中推入栈两次。不过，还有更优雅的解决方案。我建议你阅读Magnus Lie Hetland的Python Algorithms: Mastering Basic Algorithms in the Python Language的第五章（ISBN: 1430232374, 9781430232377）。第5-6和5-7题（在第122页）正好描述了你的问题。作者回答了这些问题，并给出了问题的另一种解决方案。

问题：

5-6 在递归深度优先搜索中，回溯发生在你从某个递归调用返回时。但在迭代版本中，回溯去哪儿了？

5-7. 写一个非递归版本的深度优先搜索，能够确定完成时间。

答案：

5-6 在迭代版本中，回溯并没有被明确表示。它只是在你从栈中弹出所有“遍历后代”后隐式发生。

5-7 正如在练习5-6中解释的那样，迭代深度优先搜索中并没有回溯发生的特定点，所以我们不能像在递归版本中那样在某个特定位置设置完成时间。相反，我们需要在栈中添加一个标记。例如，我们可以添加形式为(u, v)的边，而不是将的邻居添加到栈中，在它们之前，我们会推入(u, None)，表示的回溯点。

回答于 2025-04-18 由 Python大师

分享举报

Kosaraju算法通过迭代DFS查找完成时间

5 个回答

撰写回答