I have a DataFrame and want to aggregate the data by 3 columns, then add some calculated columns at the end.
DataFrame columns:
cols = ["region_2",
"trade_flag",
"trade_target",
"broker",
"trade_shares",
"total_value",
"commission_in_gbp",
"IS/Order Start PTA - Realized Cost/Sh",
"IS/Order Start PTA - Realized Net Cost/Sh",
"IS/Order Start PTA - Base Bench Price",
"IS/Order Start PTA - P/L"]
Sample input:
region_2 trade_flag trade_target broker trade_shares total_value commission_in_gbp IS/Order Start PTA - Realized Cost/Sh IS/Order Start PTA - Realized Net Cost/Sh IS/Order Start PTA - Base Bench Price IS/Order Start PTA - P/L count
0 EMEA flag1 target1 broker1 3900 39532 0.00406 -0.067 -0.067 10.2037 -261.91 1
1 APAC flag2 target2 broker2 1700 17232 0.00406 -0.067 -0.067 10.2037 -114.17 1
2 AMER flag1 target1 broker3 1400 14191 0.00406 -0.067 -0.067 10.2037 -94.02 1
3 EMEA flag2 target2 broker2 2000 20273 0.00406 -0.067 -0.067 10.2037 -134.31 1
Expected output:
region_2 | trade_flag | broker | count | total_value | perf | net perf
The perf columns at the end are weighted-average calculations.
The code below is an attempt that does not work (it raises a KeyError):
df['count'] = 1
df['perf'] = ""
df['net perf'] = ""
wm = lambda x: x['IS/Order Start PTA - Realized Cost/Sh'] * x['trade_shares'] * 10000 / x['IS/Order Start PTA - Base Bench Price'] * x['trade_shares']
wm2 = lambda x: x['IS/Order Start PTA - Realized Net Cost/Sh'] * x['trade_shares'] * 10000 / x['IS/Order Start PTA - Base Bench Price'] * x['trade_shares']
f = {'trade_shares': ['sum'],
'total_value': ['sum'],
'count': ['sum'],
'perf': {'weighted mean' : wm},
'net perf': {'weighted mean' : wm2}}
df = df.groupby(['region_2', 'trade_flag', 'broker']).agg(f)
df = df[['region_2', 'trade_flag', 'broker', 'count', 'total_value', 'actual', 'net']]
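You could use a pivot table instead of groupby.
That said, it would help to see the actual error message and sample input to determine whether this is the real problem.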
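As a rough illustration of the pivot-table suggestion, here is a minimal sketch using `pandas.pivot_table` on the rows from the sample input. It covers only the plain sums (`count`, `total_value`, `trade_shares`); the weighted perf columns are not handled here, and the variable name `summary` is just a placeholder.

```python
import pandas as pd

# Rows taken from the sample input in the question (only the columns needed here).
df = pd.DataFrame({
    "region_2": ["EMEA", "APAC", "AMER", "EMEA"],
    "trade_flag": ["flag1", "flag2", "flag1", "flag2"],
    "broker": ["broker1", "broker2", "broker3", "broker2"],
    "trade_shares": [3900, 1700, 1400, 2000],
    "total_value": [39532, 17232, 14191, 20273],
})
df["count"] = 1

# One output row per (region_2, trade_flag, broker) combination, summing each value column.
summary = pd.pivot_table(
    df,
    index=["region_2", "trade_flag", "broker"],
    values=["count", "total_value", "trade_shares"],
    aggfunc="sum",
).reset_index()

print(summary)
```

For simple sums this gives the same result as `groupby(...).sum()`; which one to use is largely a matter of taste.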
You need ^{}, because ^{} processes each column separately, hence the KeyError.
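The method names are missing from the answer above, so the following is only a sketch of one approach consistent with its reasoning: `GroupBy.apply` passes each group to the function as a whole DataFrame, so several columns can be combined in one calculation. The weighting formula (share-weighted cost divided by share-weighted benchmark price, times 10000) is my reading of the lambdas in the question and may not be what was intended; `summarize`, `COST`, `NET` and `BENCH` are hypothetical names introduced for the example.

```python
import pandas as pd

# Short aliases for the long column names from the question.
COST = "IS/Order Start PTA - Realized Cost/Sh"
NET = "IS/Order Start PTA - Realized Net Cost/Sh"
BENCH = "IS/Order Start PTA - Base Bench Price"

# Rows taken from the sample input in the question.
df = pd.DataFrame({
    "region_2": ["EMEA", "APAC", "AMER", "EMEA"],
    "trade_flag": ["flag1", "flag2", "flag1", "flag2"],
    "broker": ["broker1", "broker2", "broker3", "broker2"],
    "trade_shares": [3900, 1700, 1400, 2000],
    "total_value": [39532, 17232, 14191, 20273],
    COST: [-0.067] * 4,
    NET: [-0.067] * 4,
    BENCH: [10.2037] * 4,
})

def summarize(g):
    # g is the sub-DataFrame for one group, so every selected column is available.
    # Assumed formula: sum(cost * shares) / sum(bench * shares) * 10000.
    weight = (g[BENCH] * g["trade_shares"]).sum()
    return pd.Series({
        "count": len(g),
        "total_value": g["total_value"].sum(),
        "perf": (g[COST] * g["trade_shares"]).sum() * 10000 / weight,
        "net perf": (g[NET] * g["trade_shares"]).sum() * 10000 / weight,
    })

keys = ["region_2", "trade_flag", "broker"]
value_cols = ["trade_shares", "total_value", COST, NET, BENCH]

# Selecting the value columns explicitly keeps the grouping keys out of g.
result = df.groupby(keys)[value_cols].apply(summarize).reset_index()

print(result)
```

Because `summarize` returns a `pd.Series`, each group becomes one row and the Series labels become the output columns, which matches the layout requested in the question.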