I'm new to Python and pandas, so apologies if this is a silly question.
Problem:
I have two DataFrames, say df1 and df2, where df1 looks like
treatment1 treatment2 value comparision test adjustment statsig p_value
0 Treatment Control 0.795953 Treatment:Control t-test Benjamini-Hochberg False 0.795953
1 Treatment2 Control 0.795953 Treatment2:Control t-test Benjamini-Hochberg False 0.795953
2 Treatment2 Treatment 0.795953 Treatment2:Treatment t-test Benjamini-Hochberg False 0.795953
and df2 looks like
group_type metric
0 Treatment 31.0
1 Treatment2 83.0
2 Treatment 51.0
3 Treatment 20.0
4 Control 41.0
.. ... ...
336 Treatment3 35.0
337 Treatment3 9.0
338 Treatment3 35.0
339 Treatment3 9.0
340 Treatment3 35.0
I want to add a column mean_percentage_lift to df1, where
lift_mean_percentage = (mean(treatment1)/mean(treatment2) -1) * 100
where `treatment1` and `treatment2` can be anything in `[Treatment, Control, Treatment2]`
My approach:
I'm using the DataFrame's assign function
df1.assign(mean_percentage_lift = lambda dataframe: lift_mean_percentage(df2, dataframe['treatment1'], dataframe['treatment2']))
where
def lift_mean_percentage(df, treatment1, treatment2):
    treatment1_data = df[df[group_type_col] == treatment1]
    treatment2_data = df[df[group_type_col] == treatment2]
    mean1 = treatment1_data['metric'].mean()
    mean2 = treatment2_data['metric'].mean()
    return (mean1/mean2 - 1) * 100
But I get the error "Can only compare identically-labeled Series objects" on the line
treatment1_data = df[df[group_type_col] == treatment1]
Am I doing something wrong? Is there another way to do this?
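The error most likely comes from how assign calls the lambda: it passes the whole DataFrame once, so dataframe['treatment1'] is an entire column (a Series), not a single label. The comparison inside the function then puts the group column of df2 against a Series of a different length and index, which pandas refuses. The sketch below reproduces this with small made-up data; the values are illustrative, and the column name group_type is assumed to be what group_type_col holds.

```python
import pandas as pd

# Made-up stand-in for df2 (values are illustrative only).
df2 = pd.DataFrame({
    "group_type": ["Treatment", "Control", "Treatment2", "Control"],
    "metric": [31.0, 41.0, 83.0, 20.0],
})

# What the function actually receives from assign: a whole column, not one label.
treatment1 = pd.Series(["Treatment", "Treatment2", "Treatment2"])

try:
    # 4-row Series compared with a 3-row Series: the indexes do not match.
    df2["group_type"] == treatment1
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)

# Comparing against a single label works and gives a boolean mask.
mask = df2["group_type"] == "Treatment"
print(mask.tolist())
```

So the function itself is fine when it is called with one pair of labels at a time; the problem is that it is being called once with two whole columns.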
For the DataFrame df2:
You can try:
Run:
The result is:
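One possible approach, sketched here and not necessarily the code the original answer used, is to compute the mean of each group once with groupby and then look those means up for every row of df1 with map. The column names group_type, metric, treatment1 and treatment2 come from the question; the data values below are made up for illustration.

```python
import pandas as pd

# Small made-up stand-ins for df1 and df2 (values are illustrative only).
df1 = pd.DataFrame({
    "treatment1": ["Treatment", "Treatment2", "Treatment2"],
    "treatment2": ["Control", "Control", "Treatment"],
})
df2 = pd.DataFrame({
    "group_type": ["Treatment", "Treatment2", "Treatment", "Control", "Control"],
    "metric": [30.0, 80.0, 50.0, 40.0, 60.0],
})

# Mean metric per group, computed once: a Series indexed by group name.
group_means = df2.groupby("group_type")["metric"].mean()

# Look up the mean for each row's treatment1 and treatment2 label.
mean1 = df1["treatment1"].map(group_means)
mean2 = df1["treatment2"].map(group_means)

# Apply the lift formula column-wise; assign returns a new DataFrame.
result = df1.assign(mean_percentage_lift=(mean1 / mean2 - 1) * 100)
print(result)
```

With these made-up numbers the group means are 40 for Treatment, 80 for Treatment2 and 50 for Control, so the lifts come out at roughly -20, 60 and 100 percent. Note that assign does not modify df1 in place, so the result has to be kept (for example df1 = df1.assign(...)). A label in df1 that never appears in df2 would map to NaN.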