How do I find the maximum and the sum for different subsets of a DataFrame and write them to a new DataFrame?

Posted 2024-04-25 09:10:24


I have a DataFrame df that looks like this:

Event           DateTime      MF(cfs)     OF(cfs)     MV(f3)      OV(f3)
Event 01    4/6/2016 20:15   0.8610723   2.105263   258.32169   631.5789
Event 01    4/6/2016 20:30   0.8596087  2.383901    257.88261   715.1703
Event 01    4/6/2016 20:45   0.8749821  2.092879    262.49463   627.8637
Event 02    4/8/2016 13:15   1.088487   1.98452     326.5461    595.356
Event 02    4/8/2016 13:30   1.083027   1.781734    324.9081    534.5202
Event 02    4/8/2016 13:45   1.056032   1.851393    316.8096    555.4179
Event 03    4/10/2016 18:30  0.9576297  2.187306    287.28891   656.1918
Event 03    4/10/2016 18:45  0.9872433  1.756966    296.17299   527.0898
Event 03    4/10/2016 19:00  1.014974   2.419505    304.4922    725.8515
Event 04    4/21/2016 15:30  0.755228   1.958204    226.5684    587.4612
Event 04    4/21/2016 15:45  0.7661297  2.027864    229.83891   608.3592
Event 04    4/21/2016 16:00  0.7862475  2.089783    235.87425   626.9349
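
To reproduce the sample, the first two events could be loaded roughly as follows. This is only a sketch: the names `columns` and `rows` are illustrative, the month/day/year format string is an assumption read off the sample values, and the remaining events would be added the same way.

```python
import pandas as pd

# Column names and values copied from the sample above (first two events only).
columns = ['Event', 'DateTime', 'MF(cfs)', 'OF(cfs)', 'MV(f3)', 'OV(f3)']
rows = [
    ['Event 01', '4/6/2016 20:15', 0.8610723, 2.105263, 258.32169, 631.5789],
    ['Event 01', '4/6/2016 20:30', 0.8596087, 2.383901, 257.88261, 715.1703],
    ['Event 01', '4/6/2016 20:45', 0.8749821, 2.092879, 262.49463, 627.8637],
    ['Event 02', '4/8/2016 13:15', 1.088487, 1.98452, 326.5461, 595.356],
    ['Event 02', '4/8/2016 13:30', 1.083027, 1.781734, 324.9081, 534.5202],
    ['Event 02', '4/8/2016 13:45', 1.056032, 1.851393, 316.8096, 555.4179],
]
df = pd.DataFrame(rows, columns=columns)

# Assumption: the timestamps are month/day/year with a 24-hour clock.
df['DateTime'] = pd.to_datetime(df['DateTime'], format='%m/%d/%Y %H:%M')

print(df.dtypes)
```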

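Q1) For each event, how can I get the maximum of MF(cfs) and OF(cfs), plus the sum of the MV(f3) and OV(f3) columns, into a new DataFrame?

I would also like to have the DateTime at which each maximum occurs. How can I do that?

Desired output: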

        DateTime Peak MF(cfs)   Peak MF(cfs)   DateTime Peak OF(cfs)   Peak OF(cfs)  Total MVol(f3)   Total OV(f3)
Event
Event 1     8/15/2016 15:35           -                 -                -                  -
Event 2     8/15/2016 10:05           -                 -                -                  -
Event 3     8/15/2016 10:00           -                 -                -                  -
Event 4     8/15/2016 9:55            -                 -                -                  -
Event 5     8/15/2016 4:10            -                 -                -                  -     

1 answer
User
Reply #1 · Posted 2024-04-25 09:10:24

You need groupby with apply and a custom function:

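import pandas as pd

# Parse the timestamps, and give the frame a unique index so that
# idxmax() returns an unambiguous row label.
df['DateTime'] = pd.to_datetime(df['DateTime'])
df = df.reset_index(drop=True)

def summarize(x):
    # x holds the rows of a single event
    mf_time = x.loc[x['MF(cfs)'].idxmax(), 'DateTime']   # time of the MF peak
    mf_peak = x['MF(cfs)'].max()
    of_time = x.loc[x['OF(cfs)'].idxmax(), 'DateTime']   # time of the OF peak
    of_peak = x['OF(cfs)'].max()
    mv_total = x['MV(f3)'].sum()
    ov_total = x['OV(f3)'].sum()
    idx = ['DateTime Peak MF(cfs)', 'Peak MF(cfs)',
           'DateTime Peak OF(cfs)', 'Peak OF(cfs)', 'Total MVol(f3)', 'Total OV(f3)']
    return pd.Series([mf_time, mf_peak, of_time, of_peak, mv_total, ov_total], index=idx)

df1 = df.groupby('Event').apply(summarize).reset_index()
print(df1)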
      Event DateTime Peak MF(cfs)  Peak MF(cfs) DateTime Peak OF(cfs)  \
0  Event 01   2016-04-06 20:45:00      0.874982   2016-04-06 20:30:00   
1  Event 02   2016-04-08 13:15:00      1.088487   2016-04-08 13:15:00   
2  Event 03   2016-04-10 19:00:00      1.014974   2016-04-10 19:00:00   
3  Event 04   2016-04-21 16:00:00      0.786247   2016-04-21 16:00:00   

   Peak OF(cfs)  Total MVol(f3)  Total OV(f3)  
0      2.383901       778.69893     1974.6129  
1      1.984520       968.26380     1685.2941  
2      2.419505       887.95410     1909.1331  
3      2.089783       692.28156     1822.7553  
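
As an alternative sketch (not part of the original answer), the same summary can probably be built without a custom function by combining named aggregation with `idxmax`. The inline two-event sample, the format string, and the names `g`, `summary` and `ordered` are illustrative assumptions.

```python
import pandas as pd

# Small sample taken from the question (first two events only).
df = pd.DataFrame({
    'Event': ['Event 01'] * 3 + ['Event 02'] * 3,
    'DateTime': ['4/6/2016 20:15', '4/6/2016 20:30', '4/6/2016 20:45',
                 '4/8/2016 13:15', '4/8/2016 13:30', '4/8/2016 13:45'],
    'MF(cfs)': [0.8610723, 0.8596087, 0.8749821, 1.088487, 1.083027, 1.056032],
    'OF(cfs)': [2.105263, 2.383901, 2.092879, 1.98452, 1.781734, 1.851393],
    'MV(f3)': [258.32169, 257.88261, 262.49463, 326.5461, 324.9081, 316.8096],
    'OV(f3)': [631.5789, 715.1703, 627.8637, 595.356, 534.5202, 555.4179],
})
# Assumption: month/day/year timestamps with a 24-hour clock.
df['DateTime'] = pd.to_datetime(df['DateTime'], format='%m/%d/%Y %H:%M')
df = df.reset_index(drop=True)  # unique row labels for idxmax()

g = df.groupby('Event')

# Named aggregation; the ** form is needed because the output names
# contain spaces and parentheses.
summary = g.agg(**{
    'Peak MF(cfs)': ('MF(cfs)', 'max'),
    'Peak OF(cfs)': ('OF(cfs)', 'max'),
    'Total MVol(f3)': ('MV(f3)', 'sum'),
    'Total OV(f3)': ('OV(f3)', 'sum'),
})

# idxmax() gives the row label of each event's peak; use it to look up
# the matching timestamp. Both results are ordered by event.
summary['DateTime Peak MF(cfs)'] = df.loc[g['MF(cfs)'].idxmax(), 'DateTime'].to_numpy()
summary['DateTime Peak OF(cfs)'] = df.loc[g['OF(cfs)'].idxmax(), 'DateTime'].to_numpy()

# Put the columns in the order requested in the question.
ordered = ['DateTime Peak MF(cfs)', 'Peak MF(cfs)',
           'DateTime Peak OF(cfs)', 'Peak OF(cfs)',
           'Total MVol(f3)', 'Total OV(f3)']
summary = summary[ordered]
print(summary)
```

Here `summary` stays indexed by Event, which matches the layout of the desired output; calling `reset_index()` on it would turn Event back into a regular column.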
