I can solve this problem, but not in a Pythonic way. Given the following DataFrame:
time rssi key1 key2 CMA
0 0.021 -71 P A NaN
1 0.022 -60 Q A NaN
2 0.025 -56 P B NaN
3 0.12 -70 Q B NaN
4 0.167 -65 P A NaN
5 0.210 -55 P B NaN
6 0.211 -74 Q A NaN
7 0.213 -62 Q B NaN
...
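I want to compute the cumulative moving average (CMA) of rssi row by row, iterating in time order but grouped by key1 and key2. That amounts to computing four separate CMAs, one each for (P, A), (P, B), (Q, A) and (Q, B). The computed values should end up in the CMA column.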
Note 1: I know that RSSI values should not really be averaged with this formula; I don't care about that here.
Note 2: The CMA formula is avg(n) = (avg(n-1) * (n-1) + value(n)) / n
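To make the recurrence in Note 2 concrete, here is a minimal sketch in plain Python. The function name cumulative_moving_average is hypothetical and used only for illustration; the input is the pair of rssi values of the (P, A) group from the table above.

```python
def cumulative_moving_average(values):
    """Return the CMA after each value, using
    avg(n) = (avg(n-1) * (n-1) + value(n)) / n."""
    averages = []
    avg = 0.0  # avg(0) is never used as a result; it is multiplied by 0 at n = 1
    for n, value in enumerate(values, start=1):
        avg = (avg * (n - 1) + value) / n
        averages.append(avg)
    return averages


# rssi values of the (P, A) group, in time order
pa_cma = cumulative_moving_average([-71, -65])
print(pa_cma)  # [-71.0, -68.0]
```

Each entry is simply the mean of all values seen so far, which is why an expanding (running) mean gives the same numbers.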
Example 1: the rows of a single groupby() group, here (P, A):
time rssi key1 key2 CMA
0 0.021 -71 P A NaN <<-- first value can stay NaN or default to rssi (i.e. -71)
4 0.167 -65 P A -68
...
Example 2: the expected output for the whole DataFrame:
time rssi key1 key2 CMA
0 0.021 -71 P A NaN
1 0.022 -60 Q A NaN
2 0.025 -56 P B NaN
3 0.12 -70 Q B NaN
4 0.167 -65 P A -68
5 0.210 -55 P B -55.5
6 0.211 -74 Q A -67
7 0.213 -62 Q B -66
...
This is what I have come up with so far:
import pandas as pd
import numpy as np

df = pd.DataFrame()
df['time'] = [0.021, 0.022, 0.025, 0.12, 0.167, 0.210, 0.211, 0.213]
df['rssi'] = [-71, -60, -56, -70, -65, -55, -74, -62]
df['key1'] = ['P', 'Q', 'P', 'Q', 'P', 'P', 'Q', 'Q']
df['key2'] = ['A', 'A', 'B', 'B', 'A', 'B', 'A', 'B']
df['CMA'] = np.nan

# One pass per (key1, key2) group, updating the CMA cell by cell
for key, grp in df.groupby(['key1', 'key2']):
    i = 0          # number of rows of this group already processed
    old_index = 0  # index label of the previous row in this group
    for index, row in grp.iterrows():
        if i == 0:
            # allowed alternative: seed the first CMA with rssi instead of NaN
            df.at[index, 'CMA'] = grp.at[index, 'rssi']
        else:
            # avg(n) = (avg(n-1) * (n-1) + value(n)) / n, with n = i + 1
            df.at[index, 'CMA'] = ((df.at[old_index, 'CMA'] * i) + df.at[index, 'rssi']) / (i + 1)
        old_index = index
        i += 1

print(df)
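It works, but it is ugly. There must be a less painful, more Pythonic way to get the same result. How can I improve this without explicitly setting every cell of the column?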
You can compute this with groupby().expanding().mean(), then use reset_index to drop the group levels so the result aligns with the original rows.
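The answer's original snippet did not survive in this copy, so the following is only a sketch of how groupby().expanding().mean() and reset_index could be combined, not necessarily the answerer's exact code. The CMA_nan_first column is a hypothetical extra, added to show the variant from the question where the first row of each group stays NaN.

```python
import pandas as pd

df = pd.DataFrame({
    "time": [0.021, 0.022, 0.025, 0.12, 0.167, 0.210, 0.211, 0.213],
    "rssi": [-71, -60, -56, -70, -65, -55, -74, -62],
    "key1": ["P", "Q", "P", "Q", "P", "P", "Q", "Q"],
    "key2": ["A", "A", "B", "B", "A", "B", "A", "B"],
})

groups = df.groupby(["key1", "key2"])

# Expanding (running) mean of rssi within each group.
# The result is indexed by (key1, key2, original row label).
cma = groups["rssi"].expanding().mean()

# Drop the two group levels so the Series lines up with df's own index.
df["CMA"] = cma.reset_index(level=[0, 1], drop=True)

# Hypothetical variant: leave the first row of every group as NaN,
# matching the expected output shown in the question.
df["CMA_nan_first"] = df["CMA"].where(groups.cumcount() > 0)

print(df)
```

With the sample data the CMA column comes out as -71, -60, -56, -70, -68, -55.5, -67, -66 for rows 0 to 7, which agrees with the loop in the question when the first value of each group defaults to rssi.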