Find and sum multiple columns by month and year in Python

Posted 2024-04-23 07:15:39

I have the following DataFrame df:

       ID  03/01/2021  17/01/2021  20/02/2021  03/07/2022  17/07/2022
  0   ABC           5           6          10           5           5
  1   BCD           6           7           3           6          10
  2   DEF          10           4           5           2           8

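I want to find the columns that share the same month and year, and store the sum of their values in a new DataFrame.

The expected output is: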

date       sum 
01/2021     38     
02/2021     18
07/2022     36
02/2023     19

I would really appreciate your help.


3 Answers

Try transposing the DataFrame, then grouping by year and month:

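import pandas as pd

# Drop the ID column and transpose so the date strings become the index
df = df.drop('ID', axis=1).T
df.index = pd.to_datetime(df.index, format='%d/%m/%Y')
# Sum within each (year, month) group, then across the original rows
df2 = df.groupby([df.index.year, df.index.month]).sum().sum(axis=1)
df2.index = [f"{m}/{y}" for (y, m) in df2.index]
df2 = pd.DataFrame({'Date': df2.index, 'sum': df2.values})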

df2:

Date    sum
1/2021  38
2/2021  18
7/2022  36
2/2023  19

Alternatively, sum the values of each date column first, then use groupby to total them by year and month:

# Start from the original df: sum each date column over all rows
df1 = df.drop('ID', axis=1).sum(axis=0)
df1.index = pd.to_datetime(df1.index, format='%d/%m/%Y')
# Add up the column totals within each (year, month) group
df2 = df1.groupby([df1.index.year, df1.index.month]).sum()
df2.index = [f"{m}/{y}" for (y, m) in df2.index]
df2 = df2.to_frame('sum').rename_axis('Date')
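
For anyone who wants to run the sum-then-group idea end to end, here is a minimal self-contained sketch. It assumes the sample frame exactly as printed in the question (only the five date columns shown there, so no 02/2023 row comes out). The names col_totals, months and monthly are made up for illustration. Instead of grouping on separate year and month values, it groups on monthly periods and only formats the labels at the end, which keeps the rows in chronological order and yields zero-padded MM/YYYY labels like those in the expected output.

```python
import pandas as pd

# Sample data copied from the question (only the five date columns shown there).
df = pd.DataFrame({
    "ID": ["ABC", "BCD", "DEF"],
    "03/01/2021": [5, 6, 10],
    "17/01/2021": [6, 7, 4],
    "20/02/2021": [10, 3, 5],
    "03/07/2022": [5, 6, 2],
    "17/07/2022": [5, 10, 8],
})

# Total of each date column, indexed by the original column labels.
col_totals = df.drop(columns="ID").sum()

# Parse the labels as day/month/year and reduce them to monthly periods.
months = pd.to_datetime(col_totals.index, format="%d/%m/%Y").to_period("M")

# Group on the periods so the result stays in chronological order,
# then format the labels as zero-padded MM/YYYY strings.
monthly = col_totals.groupby(months).sum()
result = pd.DataFrame({
    "date": monthly.index.strftime("%m/%Y"),
    "sum": monthly.to_numpy(),
})
print(result)
```

Formatting after grouping matters here: sorting on the MM/YYYY strings themselves would put 02/2023 before 07/2022.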

Use df.stack together with pd.to_datetime and GroupBy.sum:

In [419]: x = df.stack().reset_index().query('level_1 != "ID"')
In [426]: x.level_1 = pd.to_datetime(x.level_1, format='%d/%m/%Y')

In [431]: y = x.groupby([x.level_1.dt.year, x.level_1.dt.month])[0].sum()
In [445]: d = [str(j) + '/' + str(i) for i,j in y.index]

In [448]: ans = pd.DataFrame({'date': d, 'sum': y.values})

In [449]: ans
Out[449]: 
     date  sum
0  1/2021   38
1  2/2021   18
2  7/2022   36
3  2/2023   19

Try this:

import pandas as pd

# df = your original dataframe

mth_year_sums = dict()

# Skip the ID column; iteritems() was removed in pandas 2.0, so use items()
for date, col in df.drop(columns='ID').items():
    # 'dd/mm/yyyy' -> 'mm/yyyy'
    mth_year = date.split("/", maxsplit=1)[-1]
    count = mth_year_sums.get(mth_year, 0)
    mth_year_sums[mth_year] = count + col.sum()

result = pd.DataFrame({'sum': mth_year_sums})

# rename index name to 'date'
result.index.name = 'date'
print(result)
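
The same string-key idea can also be written without an explicit loop. The following is only a sketch under a few assumptions: the sample frame is the one printed in the question, every non-ID column label is a dd/mm/yyyy string, and the names totals and keys are hypothetical. It cuts the day off each label with the .str accessor and groups the column totals on what is left.

```python
import pandas as pd

# Sample data copied from the question (only the five date columns shown there).
df = pd.DataFrame({
    "ID": ["ABC", "BCD", "DEF"],
    "03/01/2021": [5, 6, 10],
    "17/01/2021": [6, 7, 4],
    "20/02/2021": [10, 3, 5],
    "03/07/2022": [5, 6, 2],
    "17/07/2022": [5, 10, 8],
})

# Total of each date column, indexed by the 'dd/mm/yyyy' labels.
totals = df.drop(columns="ID").sum()

# Split each label once on "/" and keep the 'mm/yyyy' part as the group key.
keys = totals.index.str.split("/", n=1).str[-1]

# sort=False keeps the groups in order of first appearance rather than
# sorting the 'mm/yyyy' strings alphabetically.
result = (
    totals.groupby(keys, sort=False)
    .sum()
    .rename_axis("date")
    .to_frame("sum")
)
print(result)
```

Because no dates are parsed, the output order simply follows the column order of the input frame.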
