<p>Python releases the GIL during I/O. If most of the time is spent making REST requests, you can use threads to speed up the processing:</p>
<pre><code>try:
    from gevent.pool import Pool  # $ pip install gevent
    import gevent.monkey; gevent.monkey.patch_all()  # patch stdlib
except ImportError:  # fall back on using threads
    from multiprocessing.dummy import Pool

import urllib2  # Python 2 module; imported after the optional gevent patching

def process_line(url):
    # Return (body, None) on success or (None, error) on failure.
    try:
        # strip() drops the trailing newline that each line read from the file carries
        return urllib2.urlopen(url.strip()).read(), None
    except EnvironmentError as e:
        return None, e

with open('input.csv', 'rb') as file, open('output.txt', 'wb') as outfile:
    pool = Pool(20)  # use 20 concurrent connections
    for result, error in pool.imap_unordered(process_line, file):
        if error is None:
            outfile.write(result)
</code></pre>
<p>If the output order should match the input order, use <code>imap</code> instead of <code>imap_unordered</code>.</p>
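<p>To illustrate the ordering difference, here is a minimal sketch that replaces the network call with a sleep. The <code>slow_echo</code> function and the delay values are made up for the demonstration:</p>

```python
import time
from multiprocessing.dummy import Pool  # thread-based Pool with the multiprocessing.Pool API

def slow_echo(delay):
    """Hypothetical stand-in for an I/O-bound request: sleep, then return the input."""
    time.sleep(delay)
    return delay

delays = [0.3, 0.1, 0.2]

pool = Pool(3)
try:
    ordered = list(pool.imap(slow_echo, delays))              # yields in input order
    unordered = list(pool.imap_unordered(slow_echo, delays))  # yields as tasks finish
finally:
    pool.close()
    pool.join()

print(ordered)
print(unordered)
```

<p><code>imap</code> always yields results in input order. <code>imap_unordered</code> yields each result as soon as it is ready, so here the shorter delays will typically come out first.</p>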
<p>If your program is CPU-bound, you can use <code>multiprocessing.Pool()</code>, which creates multiple processes instead.</p>
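<p>A rough sketch of the process-based variant, assuming a CPU-heavy task. The functions <code>count_primes</code> and <code>parallel_prime_counts</code> are hypothetical examples, not part of the original problem:</p>

```python
from multiprocessing import Pool

def count_primes(limit):
    """Hypothetical CPU-bound task: count the primes below `limit` by trial division."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count

def parallel_prime_counts(limits, processes=4):
    """Run count_primes over `limits` in separate worker processes."""
    pool = Pool(processes)
    try:
        return pool.map(count_primes, limits)  # results come back in input order
    finally:
        pool.close()
        pool.join()
```

<p>Call <code>parallel_prime_counts([10000, 20000])</code> from under an <code>if __name__ == "__main__":</code> guard. Platforms that start workers by spawning a fresh interpreter re-import the main module, and the guard keeps them from creating pools of their own.</p>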
<p>See also <a href="https://stackoverflow.com/q/1212716/4279">Python Interpreter blocks Multithreaded DNS requests?</a></p>
<p><a href="https://stackoverflow.com/a/9874484/4279">This answer shows how to create a thread pool manually using threading + Queue modules</a>.</p>
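<p>For orientation, a manual thread pool built on those two modules might look roughly like the sketch below. It is not the code from the linked answer, and the helper name <code>run_in_threads</code> and its signature are assumptions made for illustration:</p>

```python
import threading
try:
    import queue            # Python 3
except ImportError:
    import Queue as queue   # Python 2

_STOP = object()  # unique sentinel that tells a worker to exit

def run_in_threads(func, items, num_threads=4):
    """Apply func to each item in worker threads.

    Returns a list of (item, result, error) tuples in completion order.
    """
    tasks = queue.Queue()
    results = queue.Queue()

    def worker():
        while True:
            item = tasks.get()
            if item is _STOP:
                break
            try:
                results.put((item, func(item), None))
            except Exception as e:  # report the failure instead of killing the thread
                results.put((item, None, e))

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()

    count = 0
    for item in items:
        tasks.put(item)
        count += 1
    for _ in threads:
        tasks.put(_STOP)  # one sentinel per worker
    for t in threads:
        t.join()

    return [results.get() for _ in range(count)]
```

<p>Each worker pulls items from the task queue until it sees the sentinel, so the number of concurrent calls never exceeds <code>num_threads</code>. Errors are returned alongside results, mirroring the <code>(result, error)</code> convention of <code>process_line</code>.</p>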