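How can I see how many map_async tasks have completed or are still pending?
I'm using IPython's parallel processing to run a large map operation. While waiting for it to finish, I'd like to show the user how many tasks have completed, how many are running, and how many have not started yet. How can I get this information?
What I did was create a profile that uses local engines and start two workers. On the command line: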
$ ipython profile create --parallel --profile=local
$ ipcluster start --n=2 --profile=local
This is my client Python script:
#!/usr/bin/env python
def meat(i):
    # Imports live inside the function so they are available on the engines.
    import numpy as np
    import time
    import sys
    seconds = np.random.randint(2, 15)
    time.sleep(seconds)
    return seconds

import time
from IPython.parallel import Client

c = Client(profile='local')
dview = c[:]
ar = dview.map_async(meat, range(4))

elapsed = 0
while True:
    print('After %d s: %d running' % (elapsed, len(c.outstanding)))
    if ar.ready():
        break
    time.sleep(1)
    elapsed += 1
print(ar.get())
Here is sample output from the script:
After 0 s: 2 running
After 1 s: 2 running
After 2 s: 2 running
After 3 s: 2 running
After 4 s: 2 running
After 5 s: 2 running
After 6 s: 2 running
After 7 s: 2 running
After 8 s: 2 running
After 9 s: 2 running
After 10 s: 2 running
After 11 s: 2 running
After 12 s: 2 running
After 13 s: 2 running
After 14 s: 1 running
After 15 s: 1 running
After 16 s: 1 running
After 17 s: 1 running
After 18 s: 1 running
After 19 s: 1 running
After 20 s: 1 running
After 21 s: 1 running
After 22 s: 1 running
After 23 s: 1 running
[9, 14, 10, 3]
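As the output shows, I can get the number of tasks currently running, but not the number that have completed (or how many remain). How can I find out how many of the map_async tasks have finished?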
1 Answer
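AsyncResult has a msg_ids attribute. The tasks still outstanding are the intersection of that attribute with rc.outstanding, and the completed tasks are the difference (rc here is the Client, named c in the question's script):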
msgset = set(ar.msg_ids)
completed = msgset.difference(rc.outstanding)
pending = msgset.intersection(rc.outstanding)
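To plug this into a progress loop, the set arithmetic above can be wrapped in a small helper. This is only a sketch: count_tasks is a name I made up, and the message ids below are stand-in strings so that it runs without a cluster. With a live client you would pass ar.msg_ids and rc.outstanding instead.

```python
def count_tasks(msg_ids, outstanding):
    """Return (completed, pending) counts for one map_async call.

    msg_ids     -- the AsyncResult's message ids (ar.msg_ids)
    outstanding -- the client's set of unfinished ids (rc.outstanding)
    """
    msgset = set(msg_ids)
    completed = msgset.difference(outstanding)
    pending = msgset.intersection(outstanding)
    return len(completed), len(pending)


# Stand-in values (hypothetical ids), used here in place of a real cluster.
example_msg_ids = ['a1', 'b2', 'c3', 'd4']
# 'zz' plays the role of a message from some other call; it must not be counted.
example_outstanding = {'c3', 'd4', 'zz'}

done, left = count_tasks(example_msg_ids, example_outstanding)
print('%d done, %d pending' % (done, left))
```

Note that "pending" here lumps together tasks that are running and tasks that have not started; the sets alone don't tell them apart. Also, if I remember correctly, map_async on a DirectView sends one message per engine rather than one per item, so with two engines the question's script would only ever report progress in two steps.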