How do I fix an unbalanced column MultiIndex?

Posted 2024-05-23 22:01:43


I am trying to read in several badly structured CSV files:

[empty], A, A, B, B
time   , X, Y, X, Y
0.0    , 0, 0, 0, 0
1.0    , 2, 5, 7, 0
...    , ., ., ., .

Passing header=[0,1] to pandas.read_csv gives convenient access to the values:

>>> df = pd.read_csv('file.csv', header=[0,1])
>>> df.A.X
0 0
1 2
...

But the empty field above the time header produces an ugly Unnamed: 0_level_0 level:

>>> df.columns
MultiIndex(levels=[['Unnamed: 0_level_0', 'A', 'B'], ...

Is there a way to fix this so that I can access the time data with df.Time again?

Edit

Here is a snippet of the real dataset:

,,Bone,Bone,Bone
,,Skeleton1_Hip,Skeleton1_Hip,Skeleton1_Hip
,,"1","1","1"
,,Rotation,Rotation,Rotation
Frame,Time,X,Y,Z
0,0.000000,0.009332,0.999247,0.021044
1,0.008333,0.009572,0.999217,0.020468
3,0.016667,0.009871,0.999183,0.019797

(See also https://gist.github.com/fhaust/25ba612f99420d366f0597b15dbf43e7 for a more complete example.)

Read with:

pd.read_csv(file, skiprows=2, header=[0,1,3,4], index_col=[1])

I don't really care about the Frame column, since it is given implicitly by the row index.


1 answer
User
#1 · posted 2024-05-23 22:01:43

Add the index_col parameter to convert the first column to the index:

from io import StringIO

import pandas as pd

temp = u""",A,A,B,B
time,X,Y,X,Y
0.0,0,0,0,0
1.0,2,5,7,0"""
# pd.compat.StringIO is not available in newer pandas versions; io.StringIO does the same job
# after testing, replace 'StringIO(temp)' with 'filename.csv'
df = pd.read_csv(StringIO(temp), header=[0,1], index_col=[0])

print (df)
      A     B   
time  X  Y  X  Y
0.0   0  0  0  0
1.0   2  5  7  0
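
As a minimal sketch of how the result can be used, the snippet below reads the same sample data with index_col and then reaches the time values through the row index rather than through a column. The variable names (csv_text, time_values, a_x, row_at_1) are only illustrative and not part of the original answer.

```python
from io import StringIO

import pandas as pd

# Same sample data as in the answer above.
csv_text = """,A,A,B,B
time,X,Y,X,Y
0.0,0,0,0,0
1.0,2,5,7,0"""

df = pd.read_csv(StringIO(csv_text), header=[0, 1], index_col=[0])

time_values = df.index.to_numpy()  # the time column now lives in the row index
a_x = df[("A", "X")]               # one column, addressed by its (level 0, level 1) key
row_at_1 = df.loc[1.0]             # look up a row by its time value
```

With this layout there is no df.Time column any more; the time data is df.index, and rows can be selected by time with df.loc.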

Or read the file without index_col and rename the column:

df = df.rename(columns={'Unnamed: 0_level_0':'val'})
print (df)
   val  A     B   
  time  X  Y  X  Y
0  0.0  0  0  0  0
1  1.0  2  5  7  0
