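Unable to read a large CSV file in Python
I'm using Python 2.7 and want to read the contents of a CSV file. I made a reduced version of the original CSV file that keeps only the first 10 rows of data. When I run the code below on it, it does exactly what I want: I can read a specific range of columns from the CSV by changing the indices of Z in the "usecols" argument of genfromtxt.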
import numpy as np
import array

Z = array.array('i', (i for i in range(0, 40)))
with open('data/training_edit.csv', 'r') as f:
    data = np.genfromtxt(f, dtype=float, delimiter=',', names=True, usecols=(Z[0:32]))
    print(data)
But when I run the same code on my original CSV file (250,000 rows and 33 columns), I get the output below, and I don't know why:
Traceback (most recent call last):
  File "/home/user/PycharmProjects/H-B2/Read.py", line 74, in <module>
    data = np.genfromtxt(f, dtype=float, delimiter=',', names=True,usecols=(Z[0:32]))
  File "/usr/lib/python2.7/dist-packages/numpy/lib/npyio.py", line 1667, in genfromtxt
    raise ValueError(errmsg)
ValueError: Some errors were detected !
.
.
.
Line #249991 (got 1 columns instead of 32)
Line #249992 (got 1 columns instead of 32)
Line #249993 (got 1 columns instead of 32)
Line #249994 (got 1 columns instead of 32)
Line #249995 (got 1 columns instead of 32)
Line #249996 (got 1 columns instead of 32)
Line #249997 (got 1 columns instead of 32)
Line #249998 (got 1 columns instead of 32)
Line #249999 (got 1 columns instead of 32)
Line #250000 (got 1 columns instead of 32)
Process finished with exit code 1
(I added the ellipsis only to shorten the real output, but I hope you get the idea.)
1 Answer
Yes, I think you just need to pass range(0,32) as your usecols, like this:
data = np.genfromtxt(f, dtype=float, delimiter=',',
names=True,usecols=range(0,32))
I just figured this out myself.
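If changing usecols does not make the error go away, it may help to look at the lines the error names. "got 1 columns instead of 32" suggests those lines are not being split on the comma at all. One possible cause (an assumption, since the file itself isn't shown) is that they use a different separator or contain a `#`, which genfromtxt treats as the start of a comment by default. Below is a minimal sketch for listing rows whose field count is unexpected. The helper name `find_bad_rows` and the sample data are hypothetical, made up purely for illustration.

```python
import csv
import io


def find_bad_rows(lines, expected_cols, delimiter=","):
    """Return (line_number, field_count) for every row whose
    number of fields differs from expected_cols."""
    bad = []
    reader = csv.reader(lines, delimiter=delimiter)
    for row in reader:
        if len(row) != expected_cols:
            # reader.line_num is the number of source lines read so far
            bad.append((reader.line_num, len(row)))
    return bad


# Hypothetical in-memory stand-in for the real file:
# the third line uses ';' instead of ',' so it parses as a single field.
sample = io.StringIO(u"a,b,c\n1,2,3\n4;5;6\n7,8,9\n")
print(find_bad_rows(sample, expected_cols=3))
```

On the real file you would pass the open file object in place of `sample` and use 33 (the full column count mentioned in the question) for `expected_cols`. Once the offending lines are known, they can be fixed in the file, or genfromtxt can be asked to skip them with `invalid_raise=False`.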