import pandas as pd
df1 = pd.DataFrame({'animal': {0: 'bird', 1: 'bird', 2: 'bird', 3: 'bird', 4: 'bird', 5: 'bird', 6: 'dog', 7: 'dog',
8: 'dog', 9: 'dog',
10: 'dog', 11: 'dog'},
'cat1': {0: 'a', 1: 'a', 2: 'a', 3: 'b', 4: 'b', 5: 'b', 6: 'a', 7: 'a', 8: 'a', 9: 'b', 10: 'b',
11: 'b'},
'cat2': {0: 'x', 1: 'y', 2: 'z', 3: 'x', 4: 'y', 5: 'z', 6: 'x', 7: 'y', 8: 'z', 9: 'x', 10: 'y',
11: 'z'},
'val1': {0: 1, 1: 2, 2: 3, 3: 4, 4: 5, 5: 6, 6: 7, 7: 8, 8: 9, 9: 10, 10: 11, 11: 12}})
df2 = pd.DataFrame({'cat3': {0: 'i', 1: 'i', 2: 'j', 3: 'j', 4: 'i', 5: 'i', 6: 'j', 7: 'j'},
'animal': {0: 'bird', 1: 'bird', 2: 'bird', 3: 'bird', 4: 'dog', 5: 'dog', 6: 'dog', 7: 'dog'},
'cat1': {0: 'a', 1: 'b', 2: 'a', 3: 'b', 4: 'a', 5: 'b', 6: 'a', 7: 'b'},
'val2': {0: 13, 1: 14, 2: 15, 3: 16, 4: 17, 5: 18, 6: 19, 7: 20}})
m = pd.merge(df1, df2[df2['cat3']=='i'], how='left', on=['animal', 'cat1'])
print('df1, a few records\n', df1.head(3))
print('df2, a few records\n', df2.head())
print('merged and grouped\n', m.groupby(['animal', 'cat1', 'cat2']).sum())
--------
**OUTPUT:**
df1, a few records
animal cat1 cat2 val1
0 bird a x 1
1 bird a y 2
2 bird a z 3
df2, a few records
cat3 animal cat1 val2
0 i bird a 13
1 i bird b 14
2 j bird a 15
3 j bird b 16
4 i dog a 17
merged and grouped
val1 val2
animal cat1 cat2
bird a x 1 13
y 2 13
z 3 13
b x 4 14
y 5 14
z 6 14
dog a x 7 17
y 8 17
z 9 17
b x 10 18
y 11 18
z 12 18
我想创建一个包含val2扩展的新列val3 在所有叶记录中,以val1加权
val3列的前4个值为:
2.166667 = 13 * (1/(1+2+3))
4.333333 = 13 * (2/(1+2+3))
6.5 = 13 * (3/(1+2+3))
3.7333 = 14 * (4/(4+5+6)))...
我该怎么做?我不知道从哪里开始
虽然在本例中我是针对单个cat3元素进行此操作的,但实际上cat3对于dataframes
和每个元素都是通用的,并且每个元素都会像本例中一样扩展
在这种情况下,可以使用
groupby().transform()
:输出:
更新:如果您也希望为所有
cat3
执行此操作,则只需将cat3
放入groupby的:输出:
相关问题 更多 >
编程相关推荐