I have a dataframe like the one shown below:
B_ID no_show Session slot_num walkin ns_w c_ns_w c_walkin
1 0.4 S1 1 0.2 0.2 0.2 0.2
2 0.3 S1 2 0.5 -0.2 0.2 0.7
3 0.8 S1 3 0.5 0.3 0.5 1.2
4 0.3 S1 4 0.8 -0.5 0.0 2.0
5 0.6 S1 5 0.4 0.2 0.2 2.4
6 0.8 S1 6 0.2 0.6 0.8 2.6
7 0.9 S1 7 0.1 0.8 1.4 2.7
8 0.4 S1 8 0.5 -0.1 1.3 3.2
9 0.6 S1 9 0.1 0.5 1.8 3.3
12 0.9 S2 1 0.9 0.0 0.0 0.9
13 0.5 S2 2 0.4 0.1 0.1 1.3
14 0.3 S2 3 0.1 0.2 0.3 1.4
15 0.7 S2 4 0.4 0.3 0.6 1.8
20 0.7 S2 5 0.1 0.6 1.2 1.9
16 0.6 S2 6 0.3 0.3 1.5 2.2
17 0.8 S2 7 0.5 0.3 1.8 2.7
19 0.3 S2 8 0.8 -0.5 1.3 3.5
where:
df['ns_w'] = df['no_show'] - df['walkin']
c_ns_w = cumulative sum of ns_w within each Session
df['c_ns_w'] = df.groupby(['Session'])['ns_w'].cumsum()
c_walkin = cumulative sum of walkin within each Session
df['c_walkin'] = df.groupby(['Session'])['walkin'].cumsum()
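To make the definitions above concrete, here is a minimal sketch that rebuilds the three derived columns for the first three rows of each session. The small dataframe is just an excerpt of the table; nothing else is assumed.

```python
import pandas as pd

# First three rows of each session, copied from the table above.
df = pd.DataFrame({
    "B_ID": [1, 2, 3, 12, 13, 14],
    "no_show": [0.4, 0.3, 0.8, 0.9, 0.5, 0.3],
    "Session": ["S1", "S1", "S1", "S2", "S2", "S2"],
    "slot_num": [1, 2, 3, 1, 2, 3],
    "walkin": [0.2, 0.5, 0.5, 0.9, 0.4, 0.1],
})

df["ns_w"] = df["no_show"] - df["walkin"]
# cumsum restarts at the first row of every Session
df["c_ns_w"] = df.groupby("Session")["ns_w"].cumsum()
df["c_walkin"] = df.groupby("Session")["walkin"].cumsum()
print(df.round(2))
```

Note that a plain grouped cumsum gives c_ns_w = 0.0 for B_ID 2, whereas the table lists 0.2, so some of the S1 c_ns_w values in the table may have been produced differently. The c_walkin values and the S2 c_ns_w values agree with the formulas.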
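From the above, I want to calculate two columns, u_c_ns_w and u_c_walkin.
When u_c_walkin > 0.9, create a new row with no_show = 0 and walkin = 0; all other values stay the same as in the row above. The new row gets B_ID = walkin1, walkin2, etc., and 1 is subtracted from the u_c_walkin above.
Similarly, when u_c_ns_w > 0.8, add a new row with B_ID = overbook1, overbook2, etc. and all other values the same as in the row above, and subtract 0.5 from the u_c_ns_w above.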
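The rule is easier to follow on a single column. Below is a rough sketch of the overbook rule applied to the c_ns_w values of session S1 as listed in the table. `apply_threshold` is a hypothetical helper name, and reading the rule as "check each row once, then lower every later value" is my interpretation of the description.

```python
import numpy as np

def apply_threshold(cumulative, threshold, step):
    """Hypothetical helper: adjusted values plus a 1-based id where a row would be added."""
    adjusted = np.array(cumulative, dtype=float)  # np.array copies the input
    ids = np.zeros(len(adjusted), dtype=int)      # 0 means "no new row here"
    next_id = 1
    for i in range(len(adjusted)):
        if adjusted[i] > threshold:
            # every later row carries the reduced running total
            adjusted[i + 1:] -= step
            ids[i] = next_id
            next_id += 1
    return adjusted, ids

# c_ns_w for session S1, copied from the table above
c_ns_w_s1 = [0.2, 0.2, 0.5, 0.0, 0.2, 0.8, 1.4, 1.3, 1.8]
u_c_ns_w, overbook_id = apply_threshold(c_ns_w_s1, threshold=0.8, step=0.5)
print(np.round(u_c_ns_w, 2))
print(overbook_id)
```

The non-zero ids land on slots 7 and 9, which is where the expected output places overbook1 and overbook2. Because the comparison is strict and the values are floats, rows sitting exactly on the threshold (0.8 here) can be fragile; rounding before comparing may be worth considering.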
Expected output:
B_ID no_show Session slot_num walkin ns_w c_ns_w c_walkin u_c_walkin u_c_ns_w
1 0.4 S1 1 0.2 0.2 0.2 0.2 0.2 0.2
2 0.3 S1 2 0.5 -0.2 0.2 0.7 0.7 0.2
3 0.8 S1 3 0.5 0.3 0.5 1.2 1.2 0.5
walkin1 0.0 S1 3 0.0 0.3 0.5 1.2 0.2 0.5
4 0.3 S1 4 0.8 -0.5 0.0 2.0 1.0 0.0
walkin2 0.0 S1 4 0.0 -0.5 0.0 2.0 0.0 0.0
5 0.6 S1 5 0.4 0.2 0.2 2.4 0.4 0.2
6 0.8 S1 6 0.2 0.6 0.8 2.6 0.6 0.8
7 0.9 S1 7 0.1 0.8 1.4 2.7 0.7 1.4
overbook1 0.5 S1 7 0.0 0.5 1.4 2.7 0.7 0.9
8 0.4 S1 8 0.5 -0.1 1.3 3.2 1.2 0.8
walkin3 0.0 S1 8 0.0 -0.1 1.3 3.2 0.2 0.8
9 0.6 S1 9 0.1 0.5 1.8 3.3 0.1 1.3
overbook2 0.5 S1 9 0.0 0.5 1.8 3.3 0.1 0.8
12 0.9 S2 1 0.9 0.0 0.0 0.9 0.9 0.0
13 0.5 S2 2 0.4 0.1 0.1 1.3 1.3 0.1
walkin1 0.0 S2 2 0.0 0.1 0.1 1.3 0.3 0.1
14 0.3 S2 3 0.1 0.2 0.3 1.4 0.4 0.3
15 0.7 S2 4 0.4 0.3 0.6 1.8 0.8 0.6
20 0.7 S2 5 0.1 0.6 1.2 1.9 0.9 1.2
overbook1 0.5 S2 5 0.0 0.5 1.2 1.9 0.9 0.7
16 0.6 S2 6 0.3 0.3 1.5 2.2 1.2 1.0
walkin2 0.0 S2 6 0.3 0.3 1.5 2.2 0.2 1.0
overbook2 0.5 S2 6 0.0 0.5 1.5 2.2 0.2 0.5
17 0.8 S2 7 0.5 0.3 1.8 2.7 0.7 0.8
19 0.3 S2 8 0.8 -0.5 1.3 3.5 1.5 0.3
walkin3 0.0 S2 8 0.8 -0.5 1.3 3.5 0.5 0.3
I tried the code below, which creates the walkin rows, but I could not create the overbook rows:
import numpy as np
import pandas as pd

def create_u_columns(ser):
    # copy so the original c_walkin values are not modified in place
    arr_ns = ser.to_numpy(dtype=float, copy=True)
    # array of walkin ids, used later to select the rows to insert (0 = none)
    arr_idx = np.zeros(len(ser), dtype=int)
    walkin_id = 1
    # check every row, including the last one of the session
    for i in range(len(arr_ns)):
        # threshold from the description above (u_c_walkin > 0.9)
        if arr_ns[i] > 0.9:
            # subtract 1 from all later cumulative values
            arr_ns[i+1:] -= 1
            # record the walkin id for the row to add after position i
            arr_idx[i] = walkin_id
            walkin_id += 1
    # return a dataframe with both columns
    return pd.DataFrame({'u_cumulative': arr_ns, 'mask_idx': arr_idx}, index=ser.index)

# group_keys=False keeps the original index so the result aligns with df
df[['u_c_walkin', 'mask_idx']] = df.groupby(['Session'], group_keys=False)['c_walkin'].apply(create_u_columns)
# select the rows
df_toAdd = df.loc[df['mask_idx'].astype(bool), :].copy()
# replace the values as wanted
df_toAdd['no_show'] = 0
df_toAdd['walkin'] = 0
df_toAdd['B_ID'] = 'walkin' + df_toAdd['mask_idx'].astype(str)
df_toAdd['u_c_walkin'] -= 1
# add 0.5 to index for later sort
df_toAdd.index += 0.5
new_df = pd.concat([df, df_toAdd]).sort_index()\
           .reset_index(drop=True).drop('mask_idx', axis=1)
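Here, you can modify the function in this way so that it performs both checks at once. Please check that it applies exactly the conditions you want for the walkin and overbook rows.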
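One possible way to perform both checks in a single pass is sketched below. It is only an illustration under stated assumptions: `expand_session` is a hypothetical helper, the rows are rebuilt from records rather than inserted with the half-index trick, and the values given to overbook rows (no_show = 0.5, walkin = 0, ns_w = 0.5) are read off the expected output rather than the written rules. The demo uses session S2 with c_ns_w and c_walkin copied from the table.

```python
import pandas as pd

def expand_session(grp, walkin_thr=0.9, overbook_thr=0.8):
    """Hypothetical helper: return one session's rows plus walkin/overbook rows."""
    rows = []
    walkin_off = overbook_off = 0.0   # amounts already subtracted so far
    n_walkin = n_overbook = 0         # counters used to build the new B_IDs
    for rec in grp.to_dict("records"):
        rec["u_c_walkin"] = rec["c_walkin"] - walkin_off
        rec["u_c_ns_w"] = rec["c_ns_w"] - overbook_off
        rows.append(rec)
        if rec["u_c_walkin"] > walkin_thr:
            n_walkin += 1
            walkin_off += 1.0
            # the walkin row copies the row above, then overrides a few fields
            rec = {**rec, "B_ID": f"walkin{n_walkin}", "no_show": 0.0,
                   "walkin": 0.0, "u_c_walkin": rec["u_c_walkin"] - 1.0}
            rows.append(rec)
        if rec["u_c_ns_w"] > overbook_thr:
            n_overbook += 1
            overbook_off += 0.5
            # assumption: overbook values taken from the expected output
            rec = {**rec, "B_ID": f"overbook{n_overbook}", "no_show": 0.5,
                   "walkin": 0.0, "ns_w": 0.5, "u_c_ns_w": rec["u_c_ns_w"] - 0.5}
            rows.append(rec)
    return rows

# Session S2 from the table above.
df = pd.DataFrame({
    "B_ID": [12, 13, 14, 15, 20, 16, 17, 19],
    "no_show": [0.9, 0.5, 0.3, 0.7, 0.7, 0.6, 0.8, 0.3],
    "Session": ["S2"] * 8,
    "slot_num": [1, 2, 3, 4, 5, 6, 7, 8],
    "walkin": [0.9, 0.4, 0.1, 0.4, 0.1, 0.3, 0.5, 0.8],
    "ns_w": [0.0, 0.1, 0.2, 0.3, 0.6, 0.3, 0.3, -0.5],
    "c_ns_w": [0.0, 0.1, 0.3, 0.6, 1.2, 1.5, 1.8, 1.3],
    "c_walkin": [0.9, 1.3, 1.4, 1.8, 1.9, 2.2, 2.7, 3.5],
})

all_rows = []
for _, grp in df.groupby("Session", sort=False):
    all_rows.extend(expand_session(grp))
new_df = pd.DataFrame(all_rows)
print(new_df.round(2))
```

When a row triggers both conditions, the overbook row is built from the walkin row, so it carries the already reduced u_c_walkin. On S2 the B_ID order comes out as 12, 13, walkin1, 14, 15, 20, overbook1, 16, walkin2, overbook2, 17, 19, walkin3, which is the order shown in the expected output.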
You will get: