Tornado: AsyncHTTPClient.fetch from an iterator?

1 answer

User

#1 · Posted 2024-04-19 04:48:26

I would use a Queue and multiple workers for this, in a variation on https://github.com/tornadoweb/tornado/blob/master/demos/webspider/webspider.py:

import tornado.queues
from tornado import gen
from tornado.httpclient import AsyncHTTPClient
from tornado.ioloop import IOLoop

NUM_WORKERS = 10
QUEUE_SIZE = 100
q = tornado.queues.Queue(QUEUE_SIZE)
AsyncHTTPClient.configure(None, max_clients=NUM_WORKERS)
http_client = AsyncHTTPClient()

@gen.coroutine
def worker():
    while True:
        url = yield q.get()
        try:
            response = yield http_client.fetch(url)
            print('got response from', url)
        except Exception:
            print('failed to fetch', url)
        finally:
            q.task_done()

@gen.coroutine
def main():
    for i in range(NUM_WORKERS):
        IOLoop.current().spawn_callback(worker)
    with open("urls.txt") as f:
        for line in f:
            url = line.strip()
            # When the queue fills up, stop here to wait instead
            # of reading more from the file.
            yield q.put(url)
    yield q.join()

if __name__ == '__main__':
    IOLoop.current().run_sync(main)
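
To try the back-pressure behaviour without a network connection or a urls.txt file, here is a minimal sketch of the same bounded-queue pattern. It swaps in the standard library's asyncio.Queue for tornado.queues.Queue (the two expose the same get/put/task_done/join methods) and uses native async/await instead of @gen.coroutine. The names fake_fetch and process_urls are hypothetical, and fake_fetch is only a stand-in for http_client.fetch, so treat this as an illustration rather than a drop-in replacement.

```python
import asyncio

NUM_WORKERS = 3
QUEUE_SIZE = 2


async def fake_fetch(url):
    # Hypothetical stand-in for http_client.fetch(url); no network is used.
    await asyncio.sleep(0)
    if "bad" in url:
        raise ValueError("simulated failure for " + url)
    return "body of " + url


async def worker(queue, results):
    # Pull URLs forever; the caller cancels this task once the queue is drained.
    while True:
        url = await queue.get()
        try:
            results[url] = await fake_fetch(url)
        except Exception:
            # Record the failure instead of letting one bad URL kill the worker.
            results[url] = None
        finally:
            queue.task_done()


async def process_urls(urls):
    # urls can be any iterable or iterator, e.g. the lines of an open file.
    queue = asyncio.Queue(maxsize=QUEUE_SIZE)
    results = {}
    workers = [asyncio.create_task(worker(queue, results))
               for _ in range(NUM_WORKERS)]
    for url in urls:
        # When the queue is full, put() waits here, so the iterator is
        # consumed only as fast as the workers can keep up.
        await queue.put(url)
    # Wait until every queued URL has been marked done.
    await queue.join()
    # The workers loop forever, so stop them explicitly.
    for task in workers:
        task.cancel()
    await asyncio.gather(*workers, return_exceptions=True)
    return results


if __name__ == "__main__":
    urls = iter(["http://a.example", "http://bad.example", "http://b.example"])
    print(asyncio.run(process_urls(urls)))
```

Because put() waits whenever the queue is full, at most QUEUE_SIZE URLs are waiting in the queue at any time, with at most one more held by each worker, no matter how long the iterator is. That is the property the answer above relies on when it reads urls.txt line by line.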
