Python: reorder DataFrame rows by the last three digits of an integer column

Posted 2024-04-24 15:14:31

I have the following DataFrame:

YearMonth   Total Cost
2015009     $11,209,041 
2015010     $20,581,043 
2015011     $37,079,415 
2015012     $36,831,335 
2016008     $57,428,630 
2016009     $66,754,405 
2016010     $45,021,707 
2016011     $34,783,970 
2016012     $66,215,044 

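YearMonth is an int64 column. A value such as 2015009 represents September 2015. I want to reorder the rows so that rows whose last three digits match sit directly next to each other, ordered by year within each group.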

Here is the output I want:

YearMonth   Total Cost
2015009     $11,209,041 
2016009     $66,754,405     
2015010     $20,581,043 
2016010     $45,021,707    
2015011     $37,079,415 
2016011     $34,783,970   
2015012     $36,831,335 
2016012     $66,215,044
2016008     $57,428,630

I searched Google for a way to do this, but found nothing.


2 Answers

One way is to convert the int column to strings, slice off the last three characters with the .str accessor, and sort on that key.

df.assign(sortkey=df.YearMonth.astype(str).str[-3:])\
  .sort_values('sortkey')\
  .drop('sortkey', axis=1)

Output:

   YearMonth   Total Cost
4    2016008  $57,428,630
0    2015009  $11,209,041
5    2016009  $66,754,405
1    2015010  $20,581,043
6    2016010  $45,021,707
2    2015011  $37,079,415
7    2016011  $34,783,970
3    2015012  $36,831,335
8    2016012  $66,215,044
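
Note that this output places 2016008 first, whereas the desired output in the question lists it last. One possible reading (an assumption, the question does not state it) is that the groups should follow the order in which each three-digit suffix first appears in the data. A minimal sketch under that assumption, using pd.factorize to rank suffixes by first appearance; the helper column name group_rank is hypothetical:

```python
import pandas as pd

# Data copied from the question.
df = pd.DataFrame({
    "YearMonth": [2015009, 2015010, 2015011, 2015012,
                  2016008, 2016009, 2016010, 2016011, 2016012],
    "Total Cost": ["$11,209,041", "$20,581,043", "$37,079,415",
                   "$36,831,335", "$57,428,630", "$66,754,405",
                   "$45,021,707", "$34,783,970", "$66,215,044"],
})

# Last three digits as a string key, e.g. 2015009 -> "009".
suffix = df["YearMonth"].astype(str).str[-3:]

# factorize numbers each distinct suffix by order of first appearance:
# "009" -> 0, "010" -> 1, "011" -> 2, "012" -> 3, "008" -> 4.
group_rank, _ = pd.factorize(suffix)

# Sort by group first, then by the full value so years ascend in a group.
result = (
    df.assign(group_rank=group_rank)
      .sort_values(["group_rank", "YearMonth"])
      .drop(columns="group_rank")
)
print(result)
```

With the question's data this yields the row order shown in the desired output, with 2016008 at the end.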

import pandas as pd

# Parse e.g. 2015009 as year 2015, a literal '0', then month 09.
df['YearMonth'] = pd.to_datetime(df['YearMonth'].astype(str), format='%Y0%m')
df['Year'] = df['YearMonth'].dt.year
df['Month'] = df['YearMonth'].dt.month
df.sort_values(['Month', 'Year'])

    YearMonth        Total  Year  Month
8  2016-08-01  $57,428,630  2016      8
0  2015-09-01  $11,209,041  2015      9
1  2016-09-01  $66,754,405  2016      9
2  2015-10-01  $20,581,043  2015     10
3  2016-10-01  $45,021,707  2016     10
4  2015-11-01  $34,783,970  2015     11
5  2016-11-01  $34,783,970  2016     11
6  2015-12-01  $36,831,335  2015     12
7  2016-12-01  $66,215,044  2016     12

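That's one way to do it. There may be a faster approach with fewer steps that doesn't require converting YearMonth to datetime, but if you are working with dates, it makes more sense to treat them as dates.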
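
As one possible version of the shorter approach this answer alludes to, the month and year can be pulled out of the integer with plain arithmetic, with no string or datetime conversion. This is only a sketch: it assumes YearMonth always has the form YYYY0MM, and the helper column names month and year are hypothetical.

```python
import pandas as pd

# Data copied from the question.
df = pd.DataFrame({
    "YearMonth": [2015009, 2015010, 2015011, 2015012,
                  2016008, 2016009, 2016010, 2016011, 2016012],
    "Total Cost": ["$11,209,041", "$20,581,043", "$37,079,415",
                   "$36,831,335", "$57,428,630", "$66,754,405",
                   "$45,021,707", "$34,783,970", "$66,215,044"],
})

# 2015009 % 1000 == 9 (the month); 2015009 // 1000 == 2015 (the year).
result = (
    df.assign(month=df["YearMonth"] % 1000, year=df["YearMonth"] // 1000)
      .sort_values(["month", "year"])
      .drop(columns=["month", "year"])
)
print(result)
```

Sorting on both helper columns makes the order within each month explicit instead of relying on how ties happen to be resolved.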
