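I have a process that loops through a list of IP addresses and returns some information about each of them. A simple for loop works fine; my problem is running it at scale, because of Python's Global Interpreter Lock (GIL).
My goal is to run this function in parallel and make full use of my 4 cores, so that when I run 100K of these it doesn't take 24 hours as it would with a plain for loop.
After reading other answers on here, particularly this one, How do I parallelize a simple Python loop?, I decided to use joblib. When I ran 10 records through it (the 10-address example in this post), it took 10 minutes to run. That doesn't sound right. I know there is something I'm doing wrong or not understanding. Any help is greatly appreciated!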
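On the GIL point: a whois lookup presumably spends most of its time waiting on the network, and CPython releases the GIL while a thread is blocked on I/O or in time.sleep. Here is a minimal sketch of that effect, using only the standard library. It is not the real lookup: fake_lookup, the 0.1 second delay and the 192.0.2.x addresses are made up for illustration.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_lookup(ip):
    """Hypothetical stand-in for a network call; time.sleep releases the GIL."""
    time.sleep(0.1)
    return {"ipaddress": ip, "ok": True}


# Addresses from the 192.0.2.0/24 documentation range, for illustration only.
ips = ["192.0.2.{}".format(n) for n in range(1, 9)]

# Serial version: the waits add up one after another.
start = time.perf_counter()
serial = [fake_lookup(ip) for ip in ips]
serial_seconds = time.perf_counter() - start

# Threaded version: the waits overlap, even though the GIL is still there.
start = time.perf_counter()
with ThreadPoolExecutor(max_workers=8) as pool:
    threaded = list(pool.map(fake_lookup, ips))
threaded_seconds = time.perf_counter() - start

print("serial:   {:.2f}s".format(serial_seconds))
print("threaded: {:.2f}s".format(threaded_seconds))
```

If the real lookups behave like this stand-in, the number of cores matters less than the number of lookups in flight at once.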
import pandas as pd
import numpy as np
import os as os
from ipwhois import IPWhois
from joblib import Parallel, delayed
import multiprocessing
num_core = multiprocessing.cpu_count()
iplookup = ['174.192.22.197',
            '70.197.71.201',
            '174.195.146.248',
            '70.197.15.130',
            '174.208.14.133',
            '174.238.132.139',
            '174.204.16.10',
            '104.132.11.82',
            '24.1.202.86',
            '216.4.58.18']
Normal for loop, works fine!
[code block not preserved in the source]
Function passed to joblib to run on all cores!
def run_ip_process(iplookuparray):
    asn=[]
    asnid=[]
    asncountry=[]
    asndesc=[]
    asnemail = []
    asnaddress = []
    asncity = []
    asnstate = []
    asnzip = []
    asndesc2 = []
    ipaddr=[]
    b=1
    totstolookup=len(iplookuparray)
    for i in iplookuparray:
        i = str(i)
        print("Running #{} out of {}".format(b,totstolookup))
        try:
            obj=IPWhois(i,timeout=15)
            result=obj.lookup_whois()
            asn.append(result['asn'])
            asnid.append(result['asn_cidr'])
            asncountry.append(result['asn_country_code'])
            asndesc.append(result['asn_description'])
            try:
                asnemail.append(result['nets'][0]['emails'])
                asnaddress.append(result['nets'][0]['address'])
                asncity.append(result['nets'][0]['city'])
                asnstate.append(result['nets'][0]['state'])
                asnzip.append(result['nets'][0]['postal_code'])
                asndesc2.append(result['nets'][0]['description'])
                ipaddr.append(i)
            except:
                asnemail.append(0)
                asnaddress.append(0)
                asncity.append(0)
                asnstate.append(0)
                asnzip.append(0)
                asndesc2.append(0)
                ipaddr.append(i)
        except:
            pass
        b+=1
    ipdataframe = pd.DataFrame({'ipaddress':ipaddr,
                                'asn': asn,
                                'asnid':asnid,
                                'asncountry':asncountry,
                                'asndesc': asndesc,
                                'emailcontact': asnemail,
                                'address':asnaddress,
                                'city':asncity,
                                'state': asnstate,
                                'zip': asnzip,
                                'ipdescrip':asndesc2})
    return ipdataframe
Running the process through joblib using all cores
Parallel(n_jobs=num_core)(delayed(run_ip_process)(iplookuparray) for i in iplookup)
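One thing that stands out in the call above: the generator loops `for i in iplookup`, but each task is handed `iplookuparray` rather than `i`, so every task receives the same argument instead of one address. That may or may not fully explain the 10 minutes. For comparison, here is a minimal sketch of one task per address. `lookup_one` is a hypothetical stand-in that does no network access (a real version would presumably wrap `IPWhois(ip, timeout=15).lookup_whois()`), and `prefer="threads"` is an assumption that the work is I/O-bound.

```python
from joblib import Parallel, delayed


def lookup_one(ip):
    """Hypothetical per-address worker: takes ONE address, returns ONE record.

    A real version would call the whois library here; this stand-in only
    inspects the string so the sketch runs without network access.
    """
    ip = str(ip)
    return {"ipaddress": ip, "octets": len(ip.split("."))}


iplookup = ["174.192.22.197", "70.197.71.201", "174.195.146.248"]

# The loop variable itself is passed to the worker, so each task handles
# exactly one address. Results come back in the same order as the input.
records = Parallel(n_jobs=4, prefer="threads")(
    delayed(lookup_one)(ip) for ip in iplookup
)

for record in records:
    print(record)
```

Because each worker returns a plain dict, the combined result can be turned into a single table afterwards with `pd.DataFrame(records)`, instead of building a DataFrame inside every task.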