在数据帧中使用Datetime列与多个Datetime列进行最接近的匹配

2024-04-26 22:09:56 发布

您现在位置:Python中文网/ 问答频道 /正文

使用数据帧中的日期时间列与多个日期时间列进行比较以获得最接近的匹配

df1 = pd.DataFrame({'Datetime_1' : ['2/4/2015 2:00:00','3/5/2015 3:00:00','4/3/2015 4:00:00'],
                    'Datetime_2' : ['2/3/2015 2:00:00','3/3/2015 3:00:00','4/3/2015 4:10:00'],
                    'Datetime_3' : ['2/4/2015 2:00:10','3/5/2015 3:20:00','4/3/2015 4:10:10']})

    Datetime_1          Datetime_2          Datetime_3         Amount
0   2/4/2015 2:00:00    2/3/2015 2:00:00    2/4/2015 2:00:10    100
1   3/5/2015 3:00:00    3/3/2015 3:00:00    3/5/2015 3:20:00    700
2   4/3/2015 4:00:00    4/3/2015 4:10:00    4/3/2015 4:10:10    1000

df2 = pd.DataFrame({'Datetime_1' : ['2/4/2015 2:00:05','3/5/2015 3:11:00','4/3/2015 4:00:01']})

       Datetime_1       Values
0   2/4/2015 2:00:05    10
1   3/5/2015 3:11:00    70
2   4/3/2015 4:10:01    100

预期产量

 Datetime_1              Nearest_Match_df1     Values   Amount    MatchColumn 
 2/4/2015 2:00:05        2/4/2015 2:00:00      10       100       Datetime_1
 3/5/2015 3:11:00        3/5/2015 3:20:00      70       700       Datetime_3
 4/3/2015 4:10:01        4/3/2015 4:10:00      100      1000      Datetime_2

Tags: 数据dataframedatetimematch时间amountpddf1
1条回答
网友
1楼 · 发布于 2024-04-26 22:09:56

您可以在修改df1之后使用^{}来获取单个列中的所有日期。为此,需要set_index列的数量,然后stack数据帧,然后reset_indexrename列以匹配预期的输出,并用执行合并的日期复制列。你知道吗

df_res = pd.merge_asof( df2,
                        df1.set_index('Amount').stack().reset_index(name='Nearest_Match_df1')
                           .rename(columns={'level_1': 'MatchColumn'})
                           .assign(Datetime_1=lambda x: x.Nearest_Match_df1)
                           .sort_values(by=['Datetime_1']),
                        on = ['Datetime_1'], direction='nearest')

print (df_res)
           Datetime_1  values  Amount MatchColumn   Nearest_Match_df1
0 2015-02-04 02:00:05      10     100  Datetime_1 2015-02-04 02:00:00
1 2015-03-05 03:11:00      70     700  Datetime_3 2015-03-05 03:20:00
2 2015-04-03 04:00:01     100    1000  Datetime_1 2015-04-03 04:00:00

注意:所有列必须是Datetime类型

相关问题 更多 >