Web crawler loop

Posted 2024-05-16 05:04:27


I wrote the following loop in my web crawler.

After a few seconds it stops making progress, and I don't understand why.

def crawlweb(seed):
    crawled = []
    tocrawl = [seed]
    page = tocrawl[0]
    while tocrawl:
        if page not in crawled:
            tocrawl = tocrawl[1:] + (get_links(get_page(page)))
            crawled.append(page)
    return crawled, tocrawl

1 Answer

Answered by a forum user on 2024-05-16 05:04:27
Your loop never reassigns `page`, so after the first iteration the `if page not in crawled` test is always false, `tocrawl` never shrinks, and the `while` spins forever. Pop the next page from `tocrawl` inside the loop instead:

def crawl_web(seed):
    tocrawl = [seed]
    crawled = []
    while tocrawl:
        page = tocrawl.pop()  # advance to a new page on every iteration
        if page not in crawled:
            union(tocrawl, get_all_links(get_page(page)))
            crawled.append(page)
    return crawled
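Here is a minimal runnable sketch of that fix. The `union` helper (add elements not already present) and the link source are assumptions standing in for the course's `get_page`/`get_all_links`; a tiny in-memory dict plays the role of the web:

```python
def union(a, b):
    # Assumed helper: append each element of b to a unless already present.
    for item in b:
        if item not in a:
            a.append(item)

def crawl_web(seed, get_links):
    # get_links is passed in here so the sketch needs no real HTTP fetching.
    tocrawl = [seed]
    crawled = []
    while tocrawl:
        page = tocrawl.pop()  # take a new page each iteration, so the loop terminates
        if page not in crawled:
            union(tocrawl, get_links(page))
            crawled.append(page)
    return crawled

# Hypothetical in-memory "web": page -> list of outgoing links.
fake_web = {
    "a": ["b", "c"],
    "b": ["c"],
    "c": [],
}

result = crawl_web("a", lambda page: fake_web.get(page, []))
print(sorted(result))  # every page reachable from "a": ['a', 'b', 'c']
```

Because `union` skips duplicates and crawled pages are never revisited, the loop is guaranteed to drain `tocrawl` on a finite link graph.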
