Calculating a two-week average count using resample

Posted 2024-03-29 07:12:20


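I am trying to use df.resample to compute a 2-week average of the incoming volume from a particular CSV file, so that for each 2-week span the plot is a flat line. So far the daily count works correctly. My idea was to take the DatetimeIndex and resample it at 2-week intervals, from the most recent date back to the end of the dataset. When I try: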

open_dt = pd.to_datetime(dsort['Date Opened']).dt.date
open_dt = open_dt.reset_index().sort_values('Date Opened').set_index('Date Opened').groupby('Date Opened').nunique()
roll_avg = open_dt.resample('2W').mean()

I get the following error:

Only valid with DatetimeIndex, TimeDeltaIndex or PeriodIndex, but got instance of 'Index' 

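I thought that resetting the index and setting it to the datetime field would solve the problem, but that does not seem to be the case. I also tried initializing another variable that just pulls from the original file, but ran into the same problem. Here is a working copy of my script, with the broken rolling average included: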

def data_process():  # sorts by domain and team
    data_merge = data_extract()
    domains = data_merge.groupby('PWx Domain')
    for domain in domains.groups.items():
        dsort = (data_merge.loc[domain[1]])
        dsort.to_csv('output\\'+str(domain[0])+'.csv')
        open_dt = pd.to_datetime(dsort['Date Opened']).dt.date
        open_dt = open_dt.reset_index().sort_values('Date Opened').set_index('Date Opened').groupby('Date Opened').nunique()
        d_avg = open_dt.mean().round(0).item()
        roll_avg = open_dt.resample('2W').mean()
        print(roll_avg)
        fig = plt.figure()
        fig.suptitle(domain[0]+' Avg='+str(d_avg), fontsize=14)
        ax = plt.plot(open_dt, color='b', marker='o', linestyle='-')
        ax = plt.plot(roll_avg, color='r', linestyle='--')
        fig.savefig('output\\'+domain[0]+'_Overall.png')
        plt.close()
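A likely source of the error is the `.dt.date` step: it turns the timestamps into plain Python `datetime.date` objects, so the index built from that column ends up as a generic object `Index` rather than a `DatetimeIndex`. The sketch below illustrates this on a few made-up rows (the values and the `n` column are placeholders, not the real data) and compares it with `.dt.normalize()`, which drops the time of day while keeping the datetime dtype.

```python
import pandas as pd

# Hypothetical rows in the same 'Date Opened' format as the question's data.
raw = pd.Series(['6/20/2017 19:27', '6/20/2017 18:33', '6/19/2017 15:09'],
                name='Date Opened')
parsed = pd.to_datetime(raw)

# .dt.date yields Python date objects, so the column dtype becomes object.
by_date = pd.DataFrame({'Date Opened': parsed.dt.date, 'n': 1}).set_index('Date Opened')
print(type(by_date.index).__name__, by_date.index.dtype)

# .dt.normalize() sets the time to midnight but keeps datetime64.
by_day = pd.DataFrame({'Date Opened': parsed.dt.normalize(), 'n': 1}).set_index('Date Opened')
print(type(by_day.index).__name__, by_day.index.dtype)
```

Only the second frame has an index that `resample` accepts.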

This is the head of the file being read (data_merge):

       Client #                       Solution     Solution Family  \
0     81983  Ambulatory EHR ASP  Physician Practice
1     17235  Ambulatory EHR ASP  Physician Practice
2     17235  Ambulatory EHR ASP  Physician Practice
3     17235     Practice Management  Physician Practice
4     17235     Practice Management  Physician Practice

                      Team       SR #      Date Opened PWx Domain
0    PWx Mill Response ASP  416700000  6/20/2017 19:27   CPHYB_PR
1              Core T1 PWx  416700000  6/20/2017 18:33        NaN
2              Core T1 PWx  416700000  6/20/2017 18:33   CPHYB_PR
3  Claim Generation T3 PWx  416680000  6/19/2017 15:09        NaN
4  Claim Generation T3 PWx  416680000  6/19/2017 15:09   CPHYB_PR
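For data shaped like this, one way to get a per-day count and a 2-week average is sketched below. It is a minimal sketch, not the script's actual fix: the rows are invented, the daily count is assumed to be the number of rows opened per day, and `daily`, `two_week_avg` and `flat` are illustrative names. The key point is that the grouping key stays a datetime (via `.dt.normalize()`), so the result carries a `DatetimeIndex`.

```python
import pandas as pd

# Hypothetical rows mirroring the 'SR #' and 'Date Opened' columns above.
df = pd.DataFrame({
    'SR #': [1, 2, 3, 4, 5],
    'Date Opened': ['6/5/2017 09:00', '6/5/2017 10:30', '6/6/2017 11:00',
                    '6/19/2017 15:09', '6/20/2017 18:33'],
})

opened = pd.to_datetime(df['Date Opened'])

# Rows opened per calendar day; the index is a DatetimeIndex.
daily = df.groupby(opened.dt.normalize()).size()

# One averaged value per 2-week bin.
two_week_avg = daily.resample('2W').mean()

# The same average broadcast back onto the daily index, so every day
# inside a 2-week bin carries that bin's mean (a flat segment when plotted).
flat = daily.groupby(pd.Grouper(freq='2W')).transform('mean')

print(two_week_avg)
print(flat)
```

Days with no records are simply absent from `daily` here. If they should count as zero in the average, something like `daily.asfreq('D', fill_value=0)` before resampling would be one option.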
