Reading a large text file line by line still uses all my memory

with open(dumpLocation, "r") as f:
    for line in f:
        # Convert the current line to a dictionary and assign it to 'c'.
        # The for loop has already read the line, so calling f.readline()
        # here would skip every other record.
        c = json.loads(line)
        for n in files:
            if n.lower() in c["title"].lower():
                try:
                    # Collect data
                    timestamp = str(c["retrieved_on"])
                    sr_id = c["subreddit_id"]
                    score = str(c["score"])
                    ups = str(c["ups"])
                    downs = str(c["downs"])
                    title = ('"' + c["title"] + '"')
                    # Append data to file
                    files[n].write(timestamp + "," + sr_id + "," + score + "," + ups + "," + downs + "," + title + "," + "\n")
                    found += 1
                except:
                    numberOfErrors += 1
                    errors[comments] = sys.exc_info()[0]
        comments += 1
        # Updates user
        print("Comments scanned: " + str(comments) + "\nFound: " + str(found) + "\n")

2 answers

User

#1 · edited 2024-05-14 14:02:12

I found the memory leak. This line prints to the console after every input line:

 print("Comments scanned: " + str(comments) + "\nFound: " + str(found) + "\n")

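Print 200 million times and your computer is bound to run out of memory trying to keep all of that output in the console at once. After I removed the line, it worked fine :)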
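If you still want some progress feedback, one option is to print only every so often instead of on every line. This is a minimal sketch, not the code from the question: the function name `scan_with_progress` and the `report_every` and `out` parameters are made up for illustration, and the matching logic is left out.

```python
import json


def scan_with_progress(lines, report_every=1_000_000, out=None):
    """Parse each line and report progress once per `report_every` lines.

    `report_every` and `out` are hypothetical parameters for this sketch.
    With out=None, print() writes to sys.stdout as usual.
    """
    comments = 0
    for line in lines:
        json.loads(line)  # parse the record; the title matching is omitted here
        comments += 1
        # Print one status line per batch rather than one per input line.
        if comments % report_every == 0:
            print("Comments scanned: " + str(comments), file=out)
    return comments
```

With the default interval, 200 million input lines produce about 200 status lines, so the console has very little to hold on to.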

User

#2 · edited 2024-05-14 14:02:12

Your memory leak is here:

except:
    numberOfErrors += 1
    errors[comments] = sys.exc_info()[0]

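With a large number of input lines, the number of errors can also be huge, especially if your algorithm has bugs. Every one of them adds an entry to the errors dictionary, so it keeps growing for as long as the script runs.

A bare except is harmful because it hides every error in your code, including plain programming mistakes such as a misspelled name (NameError). You should only handle the specific exception types you expect to occur with real data, and keep the try/except block as narrow as possible.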
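Here is a rough sketch of what that could look like. It assumes the only failures expected on real data are malformed JSON and missing fields, which may not match your dump. The names `extract_row` and `scan_with_narrow_except` are invented for this example. Errors are tallied per exception type in a `Counter` instead of being stored once per line, so the bookkeeping stays small however many lines fail.

```python
import json
from collections import Counter


def extract_row(raw_line):
    """Build one CSV row from a JSON record (hypothetical helper)."""
    c = json.loads(raw_line)  # raises json.JSONDecodeError, a ValueError subclass
    fields = [
        str(c["retrieved_on"]),  # a missing field raises KeyError
        str(c["subreddit_id"]),
        str(c["score"]),
        str(c["ups"]),
        str(c["downs"]),
        '"' + str(c["title"]) + '"',
    ]
    return ",".join(fields)


def scan_with_narrow_except(lines, out):
    """Write a row per good line; count failures by exception type."""
    found = 0
    error_counts = Counter()
    for raw_line in lines:
        try:
            # Only the parsing step is guarded, and only for expected errors.
            row = extract_row(raw_line)
        except (ValueError, KeyError) as exc:
            error_counts[type(exc).__name__] += 1
            continue
        out.write(row + "\n")
        found += 1
    return found, error_counts
```

Anything unexpected, such as a NameError from a typo, still stops the script with a traceback instead of being silently counted.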
