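I wrote an email/phone extraction script. It works well on its own, but I run into a problem when I feed it a CSV of websites to extract from.
I tried using pandas, but that crashes even more readily than this: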
with open('input.csv', encoding='utf-8') as csv_file:
    for row in csv_file:
        elements = row.split(',')
        website = elements[3]
        emails = phones = ''
        if website == 'N/A':
            emails = phones = 'N/A'
        else:
            email_list = []
            phone_list = []
            email_list = extractUrl(website)
            phone_list = phoneNumberExtract(website)
            print('Emails found -> '+str(email_list))
            print('Phones found -> '+str(phone_list))
            print("")
            if len(email_list) == 0:
                emails = 'NO EMAILS'
            if len(phone_list) == 0:
                phones = 'NO PHONES'
            for email in email_list:
                emails += email + '/'
            for phone in phone_list:
                phones += phone + '/'
        output_row = elements[0] + ',' + elements[1] + ',' + elements[2] + ',' + elements[3] + ',' + elements[4] + ',' + elements[5] + ',' + emails + ',' + phones + ',' + '\n'
        with open('output.csv', mode="a", encoding='utf-8') as output_csv:
            output_csv.write(output_row)
After roughly 1000 rows, it crashes with a memory error like this:
Traceback (most recent call last):
  File "script.py", line 299, in <module>
    output_csv.write(output_row)
MemoryError
Can someone explain what I'm doing wrong?
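One thing worth ruling out is the file handling in the loop. The traceback points at the write call, but a MemoryError is raised wherever an allocation happens to fail, so the memory may well be consumed elsewhere, for example inside the extraction functions. As a minimal sketch, the loop could stream rows with the standard csv module and open the output once instead of once per row. Here process_rows, extract_emails and extract_phones are hypothetical names: the two callables stand in for extractUrl and phoneNumberExtract from the question, and the website is assumed to sit in the fourth column, as in the original code.

```python
import csv


def process_rows(input_path, output_path, extract_emails, extract_phones):
    """Stream rows from input_path to output_path, one row at a time.

    extract_emails / extract_phones are hypothetical stand-ins for the
    question's extractUrl / phoneNumberExtract; each takes a website
    string and returns a list of strings.
    """
    with open(input_path, newline='', encoding='utf-8') as src, \
         open(output_path, mode='w', newline='', encoding='utf-8') as dst:
        reader = csv.reader(src)   # handles quoted commas, unlike str.split(',')
        writer = csv.writer(dst)   # output file is opened once, not per row
        for row in reader:
            website = row[3]       # assumption: website is the fourth column
            if website == 'N/A':
                emails = phones = 'N/A'
            else:
                # Join results with '/'; use a marker when nothing is found.
                emails = '/'.join(extract_emails(website)) or 'NO EMAILS'
                phones = '/'.join(extract_phones(website)) or 'NO PHONES'
            # Keep the first six input columns and append the two new ones.
            writer.writerow(row[:6] + [emails, phones])
```

This differs from the original in two small ways: the output file is overwritten (mode 'w') rather than appended to, and the joined values carry no trailing '/' or trailing comma. If the crash persists with this loop, the file handling is probably not the cause.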
Have you tried limiting the number of rows you process?
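One way to act on that suggestion is to process the rows in fixed-size batches and check memory after each batch. The sketch below uses itertools.islice for the batching and the standard tracemalloc module for the measurement. The names batches, run_with_memory_report and handle_row are hypothetical, and tracemalloc only sees memory allocated through Python, so a leak inside a C extension would not show up here.

```python
import itertools
import tracemalloc


def batches(iterable, size):
    """Yield lists of at most `size` items without loading everything."""
    iterator = iter(iterable)
    while True:
        chunk = list(itertools.islice(iterator, size))
        if not chunk:
            return
        yield chunk


def run_with_memory_report(rows, handle_row, batch_size=100):
    """Call handle_row on every row; return traced bytes after each batch.

    handle_row is a hypothetical per-row callback, e.g. the body of the
    loop in the question.
    """
    tracemalloc.start()
    report = []
    try:
        for chunk in batches(rows, batch_size):
            for row in chunk:
                handle_row(row)
            current, _peak = tracemalloc.get_traced_memory()
            report.append(current)
    finally:
        tracemalloc.stop()
    return report
```

If the recorded numbers climb steadily from batch to batch, something is holding on to data between rows, and that is the place to look rather than the write call named in the traceback.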