Summing the last 6 months counted from a reference date

Posted 2024-04-20 14:30:34


I need to sum the "qtd" column over the last 6 months counted back from a reference date.

prod    date       qtd  sum
proda   2018-01-01  2    2
proda   2018-02-01  2    4
proda   2018-04-01  1    5
proda   2018-05-01  4    9
proda   2018-06-01  2    11
proda   2018-07-01  1    11

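I need to figure out how to compute the "sum" column.

Note that my DataFrame does not always contain every month; for example, March is missing. Given a reference date (the "date" column), I need to look back 6 months and sum the "qtd" column over that period.

Thanks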


1 answer
Anonymous user
#1 · Posted 2024-04-20 14:30:34

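The cumsum() method gives you the cumulative sum of a given column. It is available on pandas Series and mirrors NumPy's cumsum: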

df['sum'] = df['qtd'].cumsum()
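As a minimal, self-contained sketch of the line above, the snippet below rebuilds the sample data from the question and applies cumsum(). The per-product variant and the column name `sum_by_prod` are illustrative assumptions, not part of the original answer. Note that a plain cumulative sum never drops old rows, so on its own it is not a 6-month window.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "prod": ["proda"] * 6,
    "date": pd.to_datetime([
        "2018-01-01", "2018-02-01", "2018-04-01",
        "2018-05-01", "2018-06-01", "2018-07-01",
    ]),
    "qtd": [2, 2, 1, 4, 2, 1],
})

# Running total over the whole column
df["sum"] = df["qtd"].cumsum()

# Hypothetical extension: a separate running total per product
# (assumes rows are already sorted by date within each product)
df["sum_by_prod"] = df.groupby("prod")["qtd"].cumsum()

print(df)
```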

If you only want to extract a slice and compute cumsum() on it, you can use:

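start_date = '2018-01-01'
end_date = '2018-05-01'
# Boolean mask selecting rows inside the inclusive date range
between = (df['date'] >= start_date) & (df['date'] <= end_date)
# .copy() avoids SettingWithCopyWarning when adding a column to the slice
df2 = df[between].copy()
df2['sum'] = df2['qtd'].cumsum()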

df2

    prod    date        qtd sum
0   proda   2018-01-01  2   2
1   proda   2018-02-01  2   4
2   proda   2018-04-01  1   5
3   proda   2018-05-01  4   9
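The slice above only behaves chronologically if the dates compare correctly. A hedged sketch of the same idea, assuming the "date" column arrives as strings and is parsed first, and using `Series.between` for the mask (a swapped-in alternative to the two comparisons, not the original answer's code):

```python
import pandas as pd

# Sample data from the question, with dates stored as strings (an assumption)
df = pd.DataFrame({
    "prod": ["proda"] * 6,
    "date": ["2018-01-01", "2018-02-01", "2018-04-01",
             "2018-05-01", "2018-06-01", "2018-07-01"],
    "qtd": [2, 2, 1, 4, 2, 1],
})

# Parse to datetime64 so comparisons are chronological
df["date"] = pd.to_datetime(df["date"])

# Series.between is inclusive on both ends by default
mask = df["date"].between(pd.Timestamp("2018-01-01"), pd.Timestamp("2018-05-01"))

# Copy the slice so adding a column does not touch the original frame
df2 = df.loc[mask].copy()
df2["sum"] = df2["qtd"].cumsum()

print(df2)
```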

Alternatively, if you only want to accumulate between specific dates and add the result to the original DataFrame, you can use:

import numpy as np
import pandas as pd

start_date = '2/1/18'
end_date = '6/1/18'

def total(start, end, df):
    # Parse everything to datetimes so comparisons are chronological,
    # not lexicographic string comparisons
    start, end = pd.to_datetime(start), pd.to_datetime(end)
    dates = pd.to_datetime(df['date'])
    sum_col = []
    running = 0
    for i in range(df.shape[0]):  # loop over all rows
        if start <= dates.iloc[i] <= end:
            # inside the window: accumulate the running total
            running += df['qtd'].iloc[i]
            sum_col.append(running)
        else:
            # before start or after end: NaN
            # (you could use 0, or repeat the last total, instead)
            sum_col.append(np.nan)
    return sum_col

df['sum'] = total(start_date, end_date, df)

df

Output:

    prod    date    qtd sum
0   proda   1/1/18  2   NaN
1   proda   2/1/18  2   2.0
2   proda   4/1/18  1   3.0
3   proda   5/1/18  4   7.0
4   proda   6/1/18  2   9.0
5   proda   7/1/18  1   NaN
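Neither approach above moves the window with each row, which is what the question asks for. The following is only a sketch of one way to do that, not the original answerer's method: for every row it sums "qtd" over the half-open interval (date minus 6 months, date], separately per product. The helper name `trailing_sum`, the `months` parameter, and the choice to exclude the date exactly 6 months back are all assumptions. Under this convention the 2018-07-01 row evaluates to 10, which differs from the 11 shown in the question, so the boundary rule should be checked against your own definition.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "prod": ["proda"] * 6,
    "date": pd.to_datetime([
        "2018-01-01", "2018-02-01", "2018-04-01",
        "2018-05-01", "2018-06-01", "2018-07-01",
    ]),
    "qtd": [2, 2, 1, 4, 2, 1],
})

def trailing_sum(group, months=6):
    """Hypothetical helper: sum qtd over (date - months, date] for each row."""
    totals = []
    for ref in group["date"]:
        start = ref - pd.DateOffset(months=months)
        # Window excludes `start` itself and includes the reference date
        in_window = (group["date"] > start) & (group["date"] <= ref)
        totals.append(group.loc[in_window, "qtd"].sum())
    # Keep the original index so the result aligns back onto df
    return pd.Series(totals, index=group.index)

# Compute per product, then align the pieces by index
df["sum"] = pd.concat([trailing_sum(g) for _, g in df.groupby("prod")])

print(df)
```

Missing months need no special handling here, because the window is defined by dates rather than by a fixed number of rows. The loop is quadratic in the number of rows per product, which is fine for small frames but may need a different approach on large data.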

Hope this helps.
