Getting lagged data in Pandas

Posted 2024-03-28 11:11:22

I want to get lagged data from a dataset. The dataset is monthly and looks like this:

           Final Profits
JCCreateDate    
2016-04-30  31163371.59
2016-05-31  27512300.34
...
2019-02-28  16800693.82
2019-03-31  5384227.13

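From the dataset above I selected a window of data (the last 12 months), and I want to subtract 3, 6, 9, and 12 months from its dates.

I created the window dataset like this: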

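import pandas as pd

# Assumes the date column is named JCCreateDate, as in the table above;
# parsing it as the index makes date-based .loc lookups possible.
df_all = pd.read_csv('dataset.csv', index_col='JCCreateDate', parse_dates=True)
df = pd.read_csv('window_dataset.csv', index_col='JCCreateDate', parse_dates=True)
data_start, data_end = pd.to_datetime(df.first_valid_index()), pd.to_datetime(df.last_valid_index())
dr = pd.date_range(data_start, data_end, freq='M')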

Now, for the date range dr, I want to subtract months. Say I subtract 3 months from dr and try to retrieve the matching rows from df_all:

df_all.loc[dr - pd.DateOffset(months=3)]

This gives me the following output:

            Final Profits
2018-01-30  NaN
2018-02-28  9240766.46
2018-03-30  NaN
2018-04-30  13250515.05
2018-05-31  12539224.15
2018-06-30  17778326.04
2018-07-31  19345671.02
2018-08-30  NaN
2018-09-30  14815607.14
2018-10-31  28979099.74
2018-11-28  NaN
2018-12-31  12395273.24

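As you can see, I get some NaN values: months such as January and March have 31 days, so the subtraction looks for the wrong day of the month. How can I handle this?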
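The mismatch described above can be reproduced without the CSV files. The sketch below is only an illustration under assumptions: the date range is invented to match the window in the output, and `pd.offsets.MonthEnd` is offered as one possible way to stay on month-end labels, not necessarily what the asker ended up using.

```python
import pandas as pd

# Month-end dates for a 12-month window (illustrative range, not the asker's file)
dr = pd.date_range("2018-04-30", "2019-03-31", freq=pd.offsets.MonthEnd())

# DateOffset(months=3) keeps the day-of-month where it can, so
# 2018-04-30 becomes 2018-01-30, which is not a month-end label.
naive = dr - pd.DateOffset(months=3)

# MonthEnd(3) steps back three month-ends, so every result is a month-end.
aligned = dr - pd.offsets.MonthEnd(3)

print(naive[0].date(), aligned[0].date())
```

With month-end labels on both sides, a lookup such as `df_all.loc[dr - pd.offsets.MonthEnd(3)]` should find every row, provided df_all is indexed by month-end dates with no missing months.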


1 answer
User
#1 · Posted 2024-03-28 11:11:22

I'm not 100% sure what you're looking for, but I suspect you want shift:

import numpy as np
import pandas as pd

# set up a dataframe of random monthly values (36 month-end dates)
index = pd.date_range(start='2016-04-30', end='2019-03-31', freq='M')
df = pd.DataFrame(np.random.randint(5000000, 50000000, 36), index=index, columns=['Final Profits'])

# create three columns, each subtracting a shifted copy of 'Final Profits'
df['3mos'] = df['Final Profits'] - df['Final Profits'].shift(3)
df['6mos'] = df['Final Profits'] - df['Final Profits'].shift(6)
df['9mos'] = df['Final Profits'] - df['Final Profits'].shift(9)

print(df.head(12))

         Final Profits        3mos        6mos        9mos
2016-04-30       45197972         NaN         NaN         NaN
2016-05-31        5029292         NaN         NaN         NaN
2016-06-30       20310120         NaN         NaN         NaN
2016-07-31       10514197 -34683775.0         NaN         NaN
2016-08-31       31219405  26190113.0         NaN         NaN
2016-09-30       21504727   1194607.0         NaN         NaN
2016-10-31       19234437   8720240.0 -25963535.0         NaN
2016-11-30       18881711 -12337694.0  13852419.0         NaN
2016-12-31       27237712   5732985.0   6927592.0         NaN
2017-01-31       21692788   2458351.0  11178591.0 -23505184.0
2017-02-28        7869701 -11012010.0 -23349704.0   2840409.0
2017-03-31       20943248  -6294464.0   -561479.0    633128.0
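To connect this back to the question's 12-month window, one option is to compute the differences on the full history first and slice out the window afterwards, so the window rows still have earlier months to draw on. The sketch below is a variation on the answer, not the answerer's code: the data is synthetic, the 12-month lag is added because the question asks for it, and it assumes the monthly index has no gaps (shift moves by row position, not by calendar date).

```python
import numpy as np
import pandas as pd

# Synthetic monthly data standing in for dataset.csv (values are made up)
index = pd.date_range("2016-04-30", "2019-03-31", freq=pd.offsets.MonthEnd())
rng = np.random.default_rng(0)
df_all = pd.DataFrame(
    {"Final Profits": rng.integers(5_000_000, 50_000_000, len(index))},
    index=index,
)

# Compute each difference on the full history, one column per lag
profits = df_all["Final Profits"]
diffs = pd.DataFrame({f"{n}mos": profits - profits.shift(n) for n in (3, 6, 9, 12)})

# Keep only the most recent 12 months
window = df_all.join(diffs).iloc[-12:]
print(window)
```

Because the full series has 36 months, every lag column in the final 12 rows is populated; slicing the window first and shifting afterwards would leave NaN in the early rows.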
