Pivoting XML data in Pandas, using the tags as the headers of the Pandas DataFrame

                 data             name
0               Aruba  Country or Area
1   Population, total             Item
2                1960             Year
3               54211            Value
4               Aruba  Country or Area
5   Population, total             Item
6                1961             Year
7               55438            Value
8               Aruba  Country or Area
9   Population, total             Item
10               1962             Year
11              56225            Value
12              Aruba  Country or Area
13  Population, total             Item
14               1963             Year
15              56695            Value
16              Aruba  Country or Area
17  Population, total             Item
18               1964             Year
19              57032            Value
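To try the answers below, the frame shown above can be rebuilt by hand. This is only a sketch: it assumes every value is a string (as it would be when read straight from XML text), and the helper names `years`, `values` and `rows` are illustrative, not part of the original question.

```python
import pandas as pd

# Assumption: values are kept as strings, as they would be when parsed from XML text.
years = [1960, 1961, 1962, 1963, 1964]
values = [54211, 55438, 56225, 56695, 57032]

rows = []
for year, value in zip(years, values):
    # Each logical record is spread over four consecutive rows.
    rows += [
        ("Aruba", "Country or Area"),
        ("Population, total", "Item"),
        (str(year), "Year"),
        (str(value), "Value"),
    ]

df = pd.DataFrame(rows, columns=["data", "name"])
print(df.head(8))
```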

2 Answers

User

Answer 1 · edited 2024-05-14 18:12:11

Another way, using pivot:

df['idx'] = df.name.eq('Country or Area').cumsum()
df.pivot(index='idx', columns='name', values='data')

Output:

name Country or Area               Item  Value  Year
idx                                                 
1              Aruba  Population, total  54211  1960
2              Aruba  Population, total  55438  1961
3              Aruba  Population, total  56225  1962
4              Aruba  Population, total  56695  1963
5              Aruba  Population, total  57032  1964
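The pivot above relies on every record starting with a 'Country or Area' row, so a running count of those rows labels each record. The following sketch shows that intermediate step on a two-record sample; the variable names `starts`, `idx` and `wide` are illustrative, and the sample frame is an assumption modeled on the question's data.

```python
import pandas as pd

# Two records in the same long layout as the question (values as strings).
df = pd.DataFrame({
    "data": ["Aruba", "Population, total", "1960", "54211",
             "Aruba", "Population, total", "1961", "55438"],
    "name": ["Country or Area", "Item", "Year", "Value"] * 2,
})

# True on the first row of each record, False elsewhere.
starts = df["name"].eq("Country or Area")

# The cumulative sum of the boolean mask gives 1,1,1,1,2,2,2,2:
# one id per record, which pivot can then use as the row index.
idx = starts.cumsum()

wide = df.assign(idx=idx).pivot(index="idx", columns="name", values="data")
print(wide)
```

Note that this approach assumes 'Country or Area' is always the first tag of a record; if the tag order varied, the record ids would be assigned incorrectly.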

User

Answer 2 · edited 2024-05-14 18:12:11

IIUC, you can try groupby on name with cumcount, followed by unstack:

df.assign(k=df.groupby('name').cumcount()).set_index(['k','name']).unstack()

                data                                
name Country or Area               Item  Value  Year
k                                                   
0              Aruba  Population, total  54211  1960
1              Aruba  Population, total  55438  1961
2              Aruba  Population, total  56225  1962
3              Aruba  Population, total  56695  1963
4              Aruba  Population, total  57032  1964

Details:

df.groupby('name').cumcount()

This groups by name and numbers each item in each group from 0 to the length of that group - 1; assign() then adds the result to the dataframe as a new column k. Next, set_index() makes the k and name columns the index, which gives:

print(df.assign(k=df.groupby('name').cumcount()).set_index(['k','name']))
                                data
k name                              
0 Country or Area              Aruba
  Item             Population, total
  Year                          1960
  Value                        54211
1 Country or Area              Aruba
  Item             Population, total
  Year                          1961
  Value                        55438
2 Country or Area              Aruba
  Item             Population, total
  Year                          1962
  Value                        56225
.......
.....

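On this data we then call unstack(), which will "pivot a level of the (necessarily hierarchical) index labels, returning a DataFrame having a new level of column labels whose inner-most level consists of the pivoted index labels". By default it pivots the last level of the index (here, name) into columns, which is exactly what we need.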
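As a small variation on the steps above (not part of the original answer), selecting the data column before calling unstack() avoids the extra "data" level in the column header. The sketch below uses an assumed two-record sample with string values; `k` and `wide` are illustrative names.

```python
import pandas as pd

# Two records in the same long layout as the question (values as strings).
df = pd.DataFrame({
    "data": ["Aruba", "Population, total", "1960", "54211",
             "Aruba", "Population, total", "1961", "55438"],
    "name": ["Country or Area", "Item", "Year", "Value"] * 2,
})

# cumcount() numbers each occurrence of a tag within its group:
# 0 for rows of the first record, 1 for rows of the second.
k = df.groupby("name").cumcount()

wide = (
    df.assign(k=k)
      .set_index(["k", "name"])["data"]  # a Series, so no extra "data" column level
      .unstack()                          # innermost index level ("name") becomes columns
)
wide.columns.name = None  # drop the leftover "name" label on the column axis
print(wide)
```

Unlike the cumsum-based pivot, this does not depend on which tag comes first in a record, but it does assume every tag appears exactly once per record.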
