Selecting the top rows of a DataFrame based on groups

Posted 2024-06-01 02:03:42


Related to this question: Reordering pandas dataframe based on multiple column and sum of one column

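Using the sort column, how can I select the rows for the top two countries in this DataFrame? In this case, the top two countries would be Australia and Afghanistan.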

   Country_FAO type   mean_area        sort
5    Australia  car  12141000.0  18910501.0
4    Australia  car   6475695.0  18910501.0
6    Australia  bus    293806.0  18910501.0
0  Afghanistan  car   2029000.0   2141000.0
1  Afghanistan  car    112000.0   2141000.0
2      Algeria  bus    827000.0    829351.0
3      Algeria  bus      2351.0    829351.0

--Edit:

I would also like to keep the type column. In this case, the result should look like this:

   Country_FAO type   mean_area        sort
5    Australia  car  12141000.0  18910501.0
4    Australia  car   6475695.0  18910501.0
6    Australia  bus    293806.0  18910501.0
0  Afghanistan  car   2029000.0   2141000.0
1  Afghanistan  car    112000.0   2141000.0

1 answer
User
#1 · Posted 2024-06-01 02:03:42

Update:

In [166]: df.loc[df.Country_FAO.isin(df.groupby('Country_FAO')['mean_area'].sum().nlargest(2).index)]
Out[166]:
   Country_FAO type   mean_area        sort
5    Australia  car  12141000.0  18910501.0
4    Australia  car   6475695.0  18910501.0
6    Australia  bus    293806.0  18910501.0
0  Afghanistan  car   2029000.0   2141000.0
1  Afghanistan  car    112000.0   2141000.0
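
For anyone who wants to run this end to end, here is a minimal sketch. The DataFrame is rebuilt by hand from the table in the question, so its construction is an assumption rather than the asker's actual code. Note that the `isin` filter keeps the rows in whatever order `df` already has; the final `sort_values` call is an optional extra step to get the ordering shown above.

```python
import pandas as pd

# Rebuilt from the table in the question (assumed construction).
df = pd.DataFrame(
    {
        "Country_FAO": ["Afghanistan", "Afghanistan", "Algeria", "Algeria",
                        "Australia", "Australia", "Australia"],
        "type": ["car", "car", "bus", "bus", "car", "car", "bus"],
        "mean_area": [2029000.0, 112000.0, 827000.0, 2351.0,
                      6475695.0, 12141000.0, 293806.0],
    }
)

# Per-country total of mean_area, broadcast back to every row.
df["sort"] = df.groupby("Country_FAO")["mean_area"].transform("sum")

# Sum only the numeric column, then take the two largest totals.
totals = df.groupby("Country_FAO")["mean_area"].sum()
top_countries = totals.nlargest(2).index

# Keep every row (and every column, including type) for those countries.
result = df.loc[df["Country_FAO"].isin(top_countries)]

# Optional: largest country first, largest area first within a country.
result = result.sort_values(["sort", "mean_area"], ascending=False)
print(result)
```

Selecting `['mean_area']` before `sum()` keeps the aggregation limited to the numeric column, which avoids surprises with the string column `type` in different pandas versions.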

I would do it this way, step by step:

In [153]: df.groupby('Country_FAO')[['mean_area']].sum()
Out[153]:
              mean_area
Country_FAO
Afghanistan   2141000.0
Algeria        829351.0
Australia    18910501.0

In [154]: df.groupby('Country_FAO')[['mean_area']].sum().nlargest(2, 'mean_area')
Out[154]:
              mean_area
Country_FAO
Australia    18910501.0
Afghanistan   2141000.0

In [155]: df.groupby('Country_FAO')[['mean_area']].sum().nlargest(2, 'mean_area').index
Out[155]: Index(['Australia', 'Afghanistan'], dtype='object', name='Country_FAO')

In addition, you may want to reset the index:

In [156]: df.groupby('Country_FAO')[['mean_area']].sum().nlargest(2, 'mean_area').reset_index()
Out[156]:
   Country_FAO   mean_area
0    Australia  18910501.0
1  Afghanistan   2141000.0
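
Since the question asks about using the existing sort column, here is a possible variant that ranks groups by that column directly instead of re-summing mean_area. This is only a sketch: `keep_top_groups` is a hypothetical helper name, not a pandas function, and it assumes the ranking column holds the same value on every row of a group (as sort does here).

```python
import pandas as pd


def keep_top_groups(df, group_col, rank_col, n):
    """Return all rows belonging to the n groups with the largest rank_col.

    Hypothetical helper; assumes rank_col is constant within each group.
    """
    # One value per group (the first row is enough under the assumption).
    per_group = df.groupby(group_col)[rank_col].first()
    top = per_group.nlargest(n).index
    return df[df[group_col].isin(top)]


# Small frame mirroring the table in the question (assumed construction).
df = pd.DataFrame(
    {
        "Country_FAO": ["Australia", "Australia", "Australia",
                        "Afghanistan", "Afghanistan", "Algeria", "Algeria"],
        "type": ["car", "car", "bus", "car", "car", "bus", "bus"],
        "mean_area": [12141000.0, 6475695.0, 293806.0,
                      2029000.0, 112000.0, 827000.0, 2351.0],
        "sort": [18910501.0, 18910501.0, 18910501.0,
                 2141000.0, 2141000.0, 829351.0, 829351.0],
    },
    index=[5, 4, 6, 0, 1, 2, 3],
)

result = keep_top_groups(df, "Country_FAO", "sort", 2)
print(result)
```

Because the input is already ordered as in the question, the filtered rows come back in that same order with the type column intact.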
