Consider the following example:
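import numpy as np
import pandas as pd

df = pd.DataFrame(
    {'a': ['one', 'one', 'one', 'one', 'two', 'two', 'two', 'three', 'four'],
     'b': ['x', 'y', 'x', 'y', 'x', 'y', 'x', 'x', 'x'],
     'c': np.random.randn(9)}
)
# Placeholder value, to be overwritten with the computed sums
df['sum_c_3'] = 99.99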
Output:
>>> df
       a  b          c  sum_c_3
0    one  x   1.296379    99.99
1    one  y   0.201266    99.99
2    one  x   0.953963    99.99
3    one  y   0.322922    99.99
4    two  x   0.887728    99.99
5    two  y  -0.154389    99.99
6    two  x  -2.390790    99.99
7  three  x  -1.218706    99.99
8   four  x  -0.043964    99.99
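I need to perform many operations on this data. As one example, I will compute the sum of the next 3 records within each group (the current row plus the two that follow it) and save the result in a new column, as follows: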
for w in ['one', 'two', 'three', 'four']:
    # Select the rows of the current group
    x = df.loc[df['a'] == w]
    size = x.iloc[:]['a'].count()
    print("Records %s: %s" % (w, size))
    target_column = x.columns.get_loc('c')
    for i in range(0, size):
        idx = x.index
        # Sum 'c' over the current row and the next two rows of the group
        acum = x.iloc[i:i+3, target_column].sum()
        x.loc[x.loc[idx, 'sum_c_3'].index[i], 'sum_c_3'] = acum
    print(x)
Output:
Records one: 4
     a  b         c   sum_c_3
0  one  x  1.296379  2.451607
1  one  y  0.201266  1.478151
2  one  x  0.953963  1.276885
3  one  y  0.322922  0.322922
Records two: 3
     a  b          c    sum_c_3
4  two  x   0.887728  -1.657452
5  two  y  -0.154389  -2.545180
6  two  x  -2.390790  -2.390790
Records three: 1
       a  b          c    sum_c_3
7  three  x  -1.218706  -1.218706
Records four: 1
      a  b          c    sum_c_3
8  four  x  -0.043964  -0.043964
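Finally, my question is: how do I update the original DataFrame?
Can I assign the sums to a slice directly? Or should I update by index using a Series (slice)?
The original remains unchanged, with no updates applied. See: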
>>> df
       a  b          c  sum_c_3
0    one  x   1.296379    99.99
1    one  y   0.201266    99.99
2    one  x   0.953963    99.99
3    one  y   0.322922    99.99
4    two  x   0.887728    99.99
5    two  y  -0.154389    99.99
6    two  x  -2.390790    99.99
7  three  x  -1.218706    99.99
8   four  x  -0.043964    99.99
>>>
Add an update at the end of the for loop.
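The answer does not spell out the call, so the following is only a sketch of one way to read it. It assumes that "update" refers to pandas' DataFrame.update, which writes non-NA values from another frame back into the caller, aligned on index and columns. The explicit .copy(), the names c_pos, out_pos and forward_sum, and the groupby cross-check at the end are illustrative additions, not part of the original post.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'a': ['one', 'one', 'one', 'one', 'two', 'two', 'two', 'three', 'four'],
    'b': ['x', 'y', 'x', 'y', 'x', 'y', 'x', 'x', 'x'],
    'c': np.random.randn(9),
})
df['sum_c_3'] = 99.99  # placeholder, as in the question

for w in df['a'].unique():
    # Work on an explicit copy of the group so the assignment below is
    # unambiguous; changes to x do not reach df on their own.
    x = df.loc[df['a'] == w].copy()
    c_pos = x.columns.get_loc('c')
    out_pos = x.columns.get_loc('sum_c_3')
    for i in range(len(x)):
        # Sum 'c' over the current row and up to two following rows.
        x.iloc[i, out_pos] = x.iloc[i:i + 3, c_pos].sum()
    # Assumed reading of the answer: push the computed column back into
    # the original frame, aligned on the row index.
    df.update(x[['sum_c_3']])

# Possible loop-free alternative (an assumption, not from the answer):
# reverse each group, take a rolling sum over 3 rows, reverse back.
forward_sum = df.groupby('a')['c'].transform(
    lambda s: s[::-1].rolling(3, min_periods=1).sum()[::-1]
)

print(df)
```

Passing only the sum_c_3 column to update keeps the write-back limited to the column that changed. If the loop is not otherwise needed, assigning forward_sum to df['sum_c_3'] should give the same values in a single step.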