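pandas reads only the first column's header; the other columns come back as "Unnamed"
I have a large CSV file with about 82,836 rows and 83 columns. The first column is called "Epoch"; I need to set it as the index and declare it as a datetime. When I load the CSV file with header=0 to use the first row as the header, only the first column's header is read and every other column shows up as "Unnamed".
Here is a small part of the CSV file.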
Epoch 1.12E+04 1.25E+04 1.41E+04 1.58E+04
2020-312T00:00:00.746 2.4660e-15 2.3305e-15 2.9498e-15 7.4974e-15
2020-312T00:00:01.746 1.4063e-14 1.3197e-14 6.5884e-15 5.2453e-15
2020-312T00:00:02.746 9.3894e-15 7.5422e-15 5.4330e-15 2.3551e-15
2020-312T00:00:03.746 5.2527e-15 4.3381e-15 6.5884e-15 5.2453e-15
2020-312T00:00:04.746 5.2527e-15 9.3456e-15 3.3400e-15 5.2453e-15
2020-312T00:00:05.746 6.9815e-15 9.3456e-15 1.1581e-14 6.3209e-15
2020-312T00:00:06.746 9.3894e-15 2.7787e-15 5.4330e-15 3.0432e-15
2020-312T00:00:07.746 8.1850e-15 5.6205e-15 2.5233e-15 2.6651e-15
2020-312T00:00:08.746 3.7247e-15 4.3381e-15 8.2865e-15 7.4974e-15
2020-312T00:00:09.746 5.2527e-15 1.3197e-14 8.2865e-15 6.3209e-15
2020-312T00:00:10.746 6.1547e-15 5.6205e-15 3.8158e-15 4.3281e-15
2020-312T00:00:11.746 6.9815e-15 7.5422e-15 4.3232e-15 5.2453e-15
2020-312T00:00:12.746 9.3894e-15 7.5422e-15 8.2865e-15 7.4974e-15
^ This is roughly what the CSV file looks like. I tried the following:
df = pd.read_csv('localfilepath/filename.csv', header=0)
df.set_index('Epoch', inplace = True)
df
This is the result. The index has been set to the Epoch column.
Epoch 1.12E+04 Unnamed:2 Unnamed:3 Unnamed:4
2020-312T00:00:00.746 2.4660e-15 2.3305e-15 2.9498e-15 7.4974e-15
2020-312T00:00:01.746 1.4063e-14 1.3197e-14 6.5884e-15 5.2453e-15
2020-312T00:00:02.746 9.3894e-15 7.5422e-15 5.4330e-15 2.3551e-15
2020-312T00:00:03.746 5.2527e-15 4.3381e-15 6.5884e-15 5.2453e-15
2020-312T00:00:04.746 5.2527e-15 9.3456e-15 3.3400e-15 5.2453e-15
2020-312T00:00:05.746 6.9815e-15 9.3456e-15 1.1581e-14 6.3209e-15
2020-312T00:00:06.746 9.3894e-15 2.7787e-15 5.4330e-15 3.0432e-15
2020-312T00:00:07.746 8.1850e-15 5.6205e-15 2.5233e-15 2.6651e-15
2020-312T00:00:08.746 3.7247e-15 4.3381e-15 8.2865e-15 7.4974e-15
2020-312T00:00:09.746 5.2527e-15 1.3197e-14 8.2865e-15 6.3209e-15
2020-312T00:00:10.746 6.1547e-15 5.6205e-15 3.8158e-15 4.3281e-15
2020-312T00:00:11.746 6.9815e-15 7.5422e-15 4.3232e-15 5.2453e-15
2020-312T00:00:12.746 9.3894e-15 7.5422e-15 8.2865e-15 7.4974e-15
What should I do to fix this and set the Epoch column as a datetime index?
Thanks for your help!
1 Answer
I'm using pandas 1.3.5.
I copied your example into a file named example.tsv.
This one line works for me. It splits the columns on runs of whitespace instead of commas:
df = pd.read_table("data/example.tsv", sep=r'\s+').set_index('Epoch')
df
This outputs the following DataFrame:
1.12E+04 1.25E+04 1.41E+04 1.58E+04
Epoch
2020-312T00:00:00.746 2.466000e-15 2.330500e-15 2.949800e-15 7.497400e-15
2020-312T00:00:01.746 1.406300e-14 1.319700e-14 6.588400e-15 5.245300e-15
2020-312T00:00:02.746 9.389400e-15 7.542200e-15 5.433000e-15 2.355100e-15
2020-312T00:00:03.746 5.252700e-15 4.338100e-15 6.588400e-15 5.245300e-15
2020-312T00:00:04.746 5.252700e-15 9.345600e-15 3.340000e-15 5.245300e-15
2020-312T00:00:05.746 6.981500e-15 9.345600e-15 1.158100e-14 6.320900e-15
2020-312T00:00:06.746 9.389400e-15 2.778700e-15 5.433000e-15 3.043200e-15
2020-312T00:00:07.746 8.185000e-15 5.620500e-15 2.523300e-15 2.665100e-15
2020-312T00:00:08.746 3.724700e-15 4.338100e-15 8.286500e-15 7.497400e-15
2020-312T00:00:09.746 5.252700e-15 1.319700e-14 8.286500e-15 6.320900e-15
2020-312T00:00:10.746 6.154700e-15 5.620500e-15 3.815800e-15 4.328100e-15
2020-312T00:00:11.746 6.981500e-15 7.542200e-15 4.323200e-15 5.245300e-15
2020-312T00:00:12.746 9.389400e-15 7.542200e-15 8.286500e-15 7.497400e-15
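The output above still has Epoch as plain strings, and the question also asks for a datetime index. The following is a minimal sketch of one way to do that, not a tested solution for the full file. It assumes that values like 2020-312T00:00:00.746 mean year plus day-of-year (the %j directive), which the sample suggests but the question does not confirm. The inline `sample` string is a stand-in for the real file so the snippet runs on its own.

```python
import io

import pandas as pd

# A few rows copied from the sample in the question (stand-in for the real file)
sample = """Epoch 1.12E+04 1.25E+04 1.41E+04 1.58E+04
2020-312T00:00:00.746 2.4660e-15 2.3305e-15 2.9498e-15 7.4974e-15
2020-312T00:00:01.746 1.4063e-14 1.3197e-14 6.5884e-15 5.2453e-15
2020-312T00:00:02.746 9.3894e-15 7.5422e-15 5.4330e-15 2.3551e-15
"""

# Split columns on runs of whitespace, as in the answer above
df = pd.read_csv(io.StringIO(sample), sep=r"\s+")

# Assumption: "2020-312" is a four-digit year followed by the day of the year (%j)
df["Epoch"] = pd.to_datetime(df["Epoch"], format="%Y-%jT%H:%M:%S.%f")

# Use the parsed timestamps as the index
df = df.set_index("Epoch")

print(df.index)
```

If the real file uses a different timestamp convention, the format string would need to change accordingly. Passing the format explicitly avoids relying on pandas guessing it.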