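I'm trying to use df.resample to compute a 2-week average of the incoming volume from a particular CSV file, so that for each 2-week span the plot is a flat line. The daily counts already work correctly. I thought I had a DatetimeIndex, and I tried to resample it at 2-week intervals, going from the most recent date back to the end of the dataset. When I try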
open_dt = pd.to_datetime(dsort['Date Opened']).dt.date
open_dt = open_dt.reset_index().sort_values('Date Opened').set_index('Date Opened').groupby('Date Opened').nunique()
roll_avg = open_dt.resample('2W').mean()
I get the following error:
Only valid with DatetimeIndex, TimeDeltaIndex or PeriodIndex, but got instance of 'Index'
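I thought that resetting the index and setting it to the datetime field would fix this, but apparently it does not. I also tried initializing another variable that only pulls in the original file, but ran into the same problem. Here is a working copy of my script, with the broken rolling average included: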
def data_process():  # sorts by domain and team
    data_merge = data_extract()
    domains = data_merge.groupby('PWx Domain')
    for domain in domains.groups.items():
        dsort = (data_merge.loc[domain[1]])
        dsort.to_csv('output\\'+str(domain[0])+'.csv')
        open_dt = pd.to_datetime(dsort['Date Opened']).dt.date
        open_dt = open_dt.reset_index().sort_values('Date Opened').set_index('Date Opened').groupby('Date Opened').nunique()
        d_avg = open_dt.mean().round(0).item()
        roll_avg = open_dt.resample('2W').mean()
        print(roll_avg)
        fig = plt.figure()
        fig.suptitle(domain[0]+' Avg='+str(d_avg), fontsize=14)
        ax = plt.plot(open_dt, color='b', marker='o', linestyle='-')
        ax = plt.plot(roll_avg, color='r', linestyle='--')
        fig.savefig('output\\'+domain[0]+'_Overall.png')
        plt.close()
Here is the head of the file being read (data_merge):
   Client #             Solution     Solution Family  \
0     81983   Ambulatory EHR ASP  Physician Practice
1     17235   Ambulatory EHR ASP  Physician Practice
2     17235   Ambulatory EHR ASP  Physician Practice
3     17235  Practice Management  Physician Practice
4     17235  Practice Management  Physician Practice

                      Team       SR #      Date Opened PWx Domain
0    PWx Mill Response ASP  416700000  6/20/2017 19:27   CPHYB_PR
1              Core T1 PWx  416700000  6/20/2017 18:33        NaN
2              Core T1 PWx  416700000  6/20/2017 18:33   CPHYB_PR
3  Claim Generation T3 PWx  416680000  6/19/2017 15:09        NaN
4  Claim Generation T3 PWx  416680000  6/19/2017 15:09   CPHYB_PR
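Because of dt.date, the index of the object is not recognized as a datetime index type. It has dtype('O'). If you remove .dt.date, the values stay datetime64 and the index built from them is a DatetimeIndex, which resample accepts.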
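To make the dtype point concrete, here is a minimal sketch, not the asker's actual pipeline. The sample values in `raw` are made up, the explicit `format` string is an assumption based on the dates shown in the question, and counting rows per day with `groupby(...).size()` stands in for the original `nunique()` step. It uses `.dt.normalize()` instead of `.dt.date` so the time of day is dropped while the dtype stays datetime64.

```python
import pandas as pd

# Hypothetical sample standing in for dsort['Date Opened'] (values are made up).
raw = pd.Series(
    ["6/19/2017 15:09", "6/19/2017 15:09", "6/20/2017 18:33",
     "6/20/2017 19:27", "7/5/2017 09:00"],
    name="Date Opened",
)
# Assumed format, inferred from the head of the file shown in the question.
parsed = pd.to_datetime(raw, format="%m/%d/%Y %H:%M")

# With .dt.date the values become plain Python date objects (dtype object),
# so the resulting index is a generic Index and resample() rejects it.
bad_index = parsed.dt.date.reset_index().set_index("Date Opened").index
print(bad_index.dtype)  # object

# normalize() sets the time to midnight but keeps the datetime64 dtype.
days = parsed.dt.normalize()

# Rows opened per calendar day; the result is indexed by a DatetimeIndex.
daily = days.groupby(days).size()

# Mean of the daily counts within each 2-week bin.
two_week = daily.resample("2W").mean()

# Bins are labelled by their right edge by default, so back-filling onto the
# daily dates gives every day the mean of its own bin (a flat line per span).
flat = two_week.reindex(daily.index, method="bfill")
print(flat)
```

Plotting `flat` against `daily.index` should then draw one horizontal segment per 2-week span. Note that `resample('2W')` bins forward from the earliest date, so anchoring the bins on the most recent date would need extra handling that this sketch does not attempt.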