Optimizing adding a column to a CSV file (~300GB) with Python

import pandas as pd

row = ['times1', 'times2']
for df1 in pd.read_csv('C:/SET/parti_no_diff.CSV', skipinitialspace=True, usecols=row, chunksize=10**7):
    df1['time_difference'] = (df1['times2'].astype('datetime64[s]') - df1['times1'].astype('datetime64[s]')).abs()
    df1.to_csv('E:/SET/parti_with_diff_seconds.csv', mode='a')

1 Answer

User

#1 · Posted 2024-06-16 10:20:50

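Honestly, Python's built-in text file reading and writing is the best fit here.

Read one line at a time into a list, modify it as needed (in this case, add the extra column), then append it to the output file. It will be faster than you might expect. You can use a tool such as tqdm to monitor progress.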

For example:

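import csv
from tqdm import tqdm

# Open both files once; reopening the output file for every row is slow.
with open('myfile.txt', newline='') as f, open('output.csv', 'a', newline='') as outfile:
    reader = csv.reader(f)
    writer = csv.writer(outfile)
    for row in tqdm(reader):
        row.append('new_column')  # placeholder for the value of the new column
        writer.writerow(row)      # write() would fail on a list; writerow() formats it as CSV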
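
Applied to the question's data, the row-by-row approach could look roughly like the sketch below. It is only an illustration under assumptions: the function name `add_time_difference` is made up, the timestamp layout `%Y-%m-%d %H:%M:%S` is a guess (the question does not show the data), and the difference is written as whole seconds because the output file name mentions seconds. Only the column names `times1` and `times2` come from the question.

```python
import csv
from datetime import datetime

# Assumed timestamp layout; adjust to whatever the real file contains.
TIME_FORMAT = "%Y-%m-%d %H:%M:%S"


def add_time_difference(src_path, dst_path, fmt=TIME_FORMAT):
    """Stream src_path row by row and write dst_path with an extra time_difference column."""
    with open(src_path, newline="") as src, open(dst_path, "w", newline="") as dst:
        # skipinitialspace mirrors the option used in the question's read_csv call.
        reader = csv.DictReader(src, skipinitialspace=True)
        fieldnames = list(reader.fieldnames) + ["time_difference"]
        writer = csv.DictWriter(dst, fieldnames=fieldnames)
        writer.writeheader()  # the header is written exactly once
        for row in reader:
            t1 = datetime.strptime(row["times1"], fmt)
            t2 = datetime.strptime(row["times2"], fmt)
            # Absolute difference in whole seconds (hypothetical output unit).
            row["time_difference"] = int(abs((t2 - t1).total_seconds()))
            writer.writerow(row)
```

Only one row is held in memory at a time, so the file size does not matter for memory use. To show progress, the `for row in reader` loop can be wrapped in `tqdm(reader)` as in the answer above.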
