Rows lost in a pandas DataFrame join

import pandas as pd

a = pd.DataFrame({
    'year': [1995, 1995, 1995, 1995, 1996, 1996, 1996, 1996],
    'team': ['Panthers', 'Panthers', 'Eagles', 'Eagles', 'Panthers', 'Panthers', 'Eagles', 'Eagles'],
    'name': ['Joe', 'Betty', 'James', 'Sandra', 'Tyrone', 'Betty', 'James', 'Michael'],
    'fans': [100, 200, 244, 277, 800, 900, 122, 300]
})

b = pd.DataFrame({
    'year': [1995, 1995, 1995, 1995, 1996, 1996, 1996, 1996],
    'team': ['Panthers', 'Panthers', 'Eagles', 'Eagles', 'Panthers', 'Panthers', 'Eagles', 'Eagles'],
    'wins': [4, 2, 3, 5, 6, 7, 2, 4]
})

# Aggregate each frame per (year, team); the group keys become a MultiIndex
aa = a.groupby(['year', 'team']).sum()
bb = b.groupby(['year', 'team']).sum()

# Join the two aggregated frames on their shared index
aa.join(bb)

2 Answers

User

#1 · Edited 2024-06-16 10:59:06

1) Call reset_index() only once, after the join:

aa = a.groupby(['year', 'team']).sum()
bb = b.groupby(['year', 'team']).sum()

aa.join(bb).reset_index()

2) Alternatively, avoid creating index levels for aa and bb in the first place by passing as_index=False, then combine them with pd.merge:

aa = a.groupby(['year', 'team'], as_index=False).sum()
bb = b.groupby(['year', 'team'], as_index=False).sum()

pd.merge(aa, bb)

Both approaches give the same output:

    year    team        fans    wins
0   1995    Eagles       521    8
1   1995    Panthers     300    6
2   1996    Eagles       422    6
3   1996    Panthers    1700    13
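
To check that the two approaches above really agree, here is a minimal, self-contained sketch. It makes one assumption that is not in the original post: the 'name' column is left out of `a`, because newer pandas releases may try to sum string columns as well, which would add an extra column to the result. The `keys`, `joined`, `merged` and `same` names are only illustrative.

```python
import pandas as pd

# Same data as in the question, minus the non-numeric 'name' column (assumption).
a = pd.DataFrame({
    'year': [1995, 1995, 1995, 1995, 1996, 1996, 1996, 1996],
    'team': ['Panthers', 'Panthers', 'Eagles', 'Eagles',
             'Panthers', 'Panthers', 'Eagles', 'Eagles'],
    'fans': [100, 200, 244, 277, 800, 900, 122, 300],
})
b = pd.DataFrame({
    'year': [1995, 1995, 1995, 1995, 1996, 1996, 1996, 1996],
    'team': ['Panthers', 'Panthers', 'Eagles', 'Eagles',
             'Panthers', 'Panthers', 'Eagles', 'Eagles'],
    'wins': [4, 2, 3, 5, 6, 7, 2, 4],
})

keys = ['year', 'team']

# Approach 1: join on the (year, team) MultiIndex, then flatten the index once.
joined = a.groupby(keys).sum().join(b.groupby(keys).sum()).reset_index()

# Approach 2: keep the keys as ordinary columns and merge on them.
merged = pd.merge(
    a.groupby(keys, as_index=False).sum(),
    b.groupby(keys, as_index=False).sum(),
    on=keys,
)

# Compare the two results row by row.
same = joined.to_dict('records') == merged.to_dict('records')
print(joined)
print(same)
```

Passing `on=keys` explicitly is optional here, since pd.merge falls back to the columns the two frames have in common, but spelling it out makes the join keys visible.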

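User

#2 · Edited 2024-06-16 10:59:06

The fix is to call reset_index() to "end" the groupby operation, which turns the group keys back into ordinary columns.

So the following produces the correct result: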

aa = a.groupby(['year', 'team']).sum().reset_index()
bb = b.groupby(['year', 'team']).sum().reset_index()

pd.merge(aa, bb)
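
If rows still seem to go missing after a merge like the one above, one way to look for them is an outer merge with an indicator column. This is only a diagnostic sketch, not part of the original answer: the 'name' column is omitted (an assumption, to keep the aggregation purely numeric), and `keys`, `check` and `unmatched` are illustrative names.

```python
import pandas as pd

# Same data as in the question, minus the non-numeric 'name' column (assumption).
a = pd.DataFrame({
    'year': [1995, 1995, 1995, 1995, 1996, 1996, 1996, 1996],
    'team': ['Panthers', 'Panthers', 'Eagles', 'Eagles',
             'Panthers', 'Panthers', 'Eagles', 'Eagles'],
    'fans': [100, 200, 244, 277, 800, 900, 122, 300],
})
b = pd.DataFrame({
    'year': [1995, 1995, 1995, 1995, 1996, 1996, 1996, 1996],
    'team': ['Panthers', 'Panthers', 'Eagles', 'Eagles',
             'Panthers', 'Panthers', 'Eagles', 'Eagles'],
    'wins': [4, 2, 3, 5, 6, 7, 2, 4],
})

keys = ['year', 'team']
aa = a.groupby(keys).sum().reset_index()
bb = b.groupby(keys).sum().reset_index()

# how='outer' keeps keys that appear on only one side;
# indicator=True adds a '_merge' column: 'left_only', 'right_only' or 'both'.
check = pd.merge(aa, bb, on=keys, how='outer', indicator=True)

# Any row not marked 'both' would be dropped by the default inner merge.
unmatched = check[check['_merge'] != 'both']
print(len(check), len(unmatched))
```

With the data from the question every (year, team) pair exists in both frames, so `unmatched` should come back empty.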
