Why does pandas groupby().transform() require a unique index?

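import numpy as np
import pandas as pd

# With the default (unique) integer index, transform works as expected
df = pd.DataFrame([[1,1], [1,2], [2,3], [3,4], [3,5]], columns='a b'.split())
df['partials'] = df.groupby('a')['b'].transform(np.cumsum)
df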

df = df.set_index('a')
df['partials'] = df.groupby(level=0)['b'].transform(np.cumsum)
df
---------------------------------------------------------------------------
Exception                                 Traceback (most recent call last)
<ipython-input-146-d0c35a4ba053> in <module>()
      3
      4 df = df.set_index('a')
----> 5 df.groupby(level=0)['b'].transform(np.cumsum)

/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pandas/core/groupby.pyc in transform(self, func, *args, **kwargs)
   1542             res = wrapper(group)
   1543             # result[group.index] = res
-> 1544             indexer = self.obj.index.get_indexer(group.index)
   1545             np.put(result, indexer, res)
   1546

/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pandas/core/index.pyc in get_indexer(self, target, method, limit)
    847
    848         if not self.is_unique:
--> 849             raise Exception('Reindexing only valid with uniquely valued Index '
    850                             'objects')
    851

Exception: Reindexing only valid with uniquely valued Index objects

1 answer

Anonymous user

Answer 1 · posted 2024-04-19 08:14:45

This was a bug, and it has since been fixed in pandas (certainly by 0.15.2; IIRC the fix landed in 0.14), so you should no longer see this exception.
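To check that the fix applies to your installation, a minimal sketch along these lines should run without raising. It assumes a reasonably recent pandas and uses the string alias "cumsum" rather than np.cumsum, which is my substitution and not part of the original question:

```python
import pandas as pd

# Same data as in the question, with the non-unique column 'a' as the index
df = pd.DataFrame([[1, 1], [1, 2], [2, 3], [3, 4], [3, 5]], columns=["a", "b"])
df = df.set_index("a")

# Cumulative sum of 'b' within each index value; the result keeps df's index
partials = df.groupby(level=0)["b"].transform("cumsum")

# The result's index is identical to df's, so the assignment needs no reindexing
df["partials"] = partials
print(df)
```

If this still raises "Reindexing only valid with uniquely valued Index objects", the installed pandas most likely predates the fix.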

As a workaround, in earlier versions of pandas you can use apply:

In [10]: g = df.groupby(level=0)['b']

In [11]: g.apply(np.cumsum)
Out[11]:
a
1    1
1    3
2    3
3    4
3    9
dtype: int64

You can then assign this to a column in df:

In [12]: df['partial'] = g.apply(np.cumsum)
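If index alignment on the duplicated labels is a concern when assigning, one alternative is to assign the values by position. This is only a sketch under my own assumptions: it uses the built-in groupby cumsum() instead of apply(np.cumsum), and it relies on that result coming back in the original row order.

```python
import pandas as pd

# Same data as in the question, indexed by the non-unique column 'a'
df = pd.DataFrame(
    [[1, 1], [1, 2], [2, 3], [3, 4], [3, 5]], columns=["a", "b"]
).set_index("a")

# Per-group running total, returned in the same row order as df
running = df.groupby(level=0)["b"].cumsum()

# Assigning the raw array bypasses index alignment entirely
df["partial"] = running.to_numpy()
print(df)
```

Because the values are written by position, this does not depend on how pandas aligns duplicate labels.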
