Combining multiple date columns and their column values in Python

Posted 2024-06-16 12:01:24


The actual .csv data file:

Date         MTM_B7_1  Date         MTM_B7_11  Date         MTM_B7_12
03/01/11 AM      2084  04/01/11 AM      -8166  04/01/11 AM      -8332
04/01/11 AM     -9066  05/01/11 AM      28613  05/01/11 AM      -8750
05/01/11 AM    103607  06/01/11 AM      35605  06/01/11 AM     -21307
10/01/11 AM     68538  07/01/11 AM      55608  07/01/11 AM     151609

A reproducible example:

^{pr2}$

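I have a .csv file in the format above. I want to combine all the Date columns into one while keeping each MTM column's values aligned to its own dates, then drop the remaining Date columns so that every MTM column shares a single date index.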


1 answer
Answer #1 · posted 2024-06-16 12:01:24

Consider splitting the dataframe by Date/MTM pair into a list of dataframes, then using reduce() to chain outer-join merges on the date:

from functools import reduce
import pandas as pd

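# Sample data: two Date/MTM pairs whose date ranges only partly overlap.
# pd.date_range() has no `format` argument, so unambiguous ISO dates are used.
df = pd.DataFrame({'Date': pd.date_range('2011-01-03', periods=8),
                   'MTM1': [2, 3, 4, 5, 6, 7, 8, 9],
                   'Date1': pd.date_range('2011-01-07', periods=8),
                   'MTM2': [4, 3, 5, -7, 6, -9, 8, 9]
                  })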

# FOR THE TWO POSTED PAIRS (WRITTEN OUT BY HAND)
dfList = [df[['Date', 'MTM1']],
          df[['Date1','MTM2']].rename(columns={'Date1':'Date'})]

# FOR MANY PAIRS (ASSUMING DATE AND MTM COLUMNS APPEAR IN MATCHING ORDER);
# THIS BUILDS THE SAME LIST AS ABOVE WITHOUT NAMING EACH COLUMN
dfList = [df[[d, m]].rename(columns={d: 'Date'})
          for d, m in zip([c for c in df.columns if 'Date' in c],
                          [c for c in df.columns if 'MTM' in c])]

df = reduce(lambda left, right: pd.merge(left, right, on=['Date'], how='outer'), dfList).set_index('Date')
print(df)

#             MTM1  MTM2
# Date                  
# 2011-01-03   2.0   NaN
# 2011-01-04   3.0   NaN
# 2011-01-05   4.0   NaN
# 2011-01-06   5.0   NaN
# 2011-01-07   6.0   4.0
# 2011-01-08   7.0   3.0
# 2011-01-09   8.0   5.0
# 2011-01-10   9.0  -7.0
# 2011-01-11   NaN   6.0
# 2011-01-12   NaN  -9.0
# 2011-01-13   NaN   8.0
# 2011-01-14   NaN   9.0
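
The same pairing idea can be tried on the rows shown in the question. The sketch below is only an illustration under a few assumptions that the post does not confirm: the repeated `Date` headers are taken to arrive as `Date`, `Date.1`, `Date.2` (pandas' default renaming of duplicate column names when reading a CSV), the dates are taken to be day/month/year, and the trailing " AM" is simply discarded. The helper `to_pair` is a hypothetical name introduced here.

```python
from functools import reduce
import pandas as pd

# Values copied from the table in the question. The column names assume
# pandas' default renaming of duplicate headers (Date, Date.1, Date.2).
raw = pd.DataFrame({
    'Date':      ['03/01/11 AM', '04/01/11 AM', '05/01/11 AM', '10/01/11 AM'],
    'MTM_B7_1':  [2084, -9066, 103607, 68538],
    'Date.1':    ['04/01/11 AM', '05/01/11 AM', '06/01/11 AM', '07/01/11 AM'],
    'MTM_B7_11': [-8166, 28613, 35605, 55608],
    'Date.2':    ['04/01/11 AM', '05/01/11 AM', '06/01/11 AM', '07/01/11 AM'],
    'MTM_B7_12': [-8332, -8750, -21307, 151609],
})

date_cols = [c for c in raw.columns if c.startswith('Date')]
mtm_cols = [c for c in raw.columns if c.startswith('MTM')]


def to_pair(frame, date_col, mtm_col):
    """Hypothetical helper: one Date/MTM pair with a parsed 'Date' column."""
    pair = frame[[date_col, mtm_col]].rename(columns={date_col: 'Date'})
    # Assumption: day/month/year order; keep only the part before " AM".
    parsed = pd.to_datetime(pair['Date'].str.split().str[0], format='%d/%m/%y')
    return pair.assign(Date=parsed)


pairs = [to_pair(raw, d, m) for d, m in zip(date_cols, mtm_cols)]

# Chain outer joins so every date from any pair is kept, then index by date.
merged = (reduce(lambda left, right: pd.merge(left, right, on='Date', how='outer'),
                 pairs)
          .sort_values('Date')
          .set_index('Date'))
print(merged)
```

Dates that appear in only one pair end up with NaN in the other MTM columns, which is why the integer values are shown as floats after the merge.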
