Difference between pandas apply() and aggregate() functions

1 answer

User

#1 · Posted on 2024-05-29 09:32:43

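agg (short for aggregate) and apply each come in two versions: one is defined on groupby objects and the other on DataFrames.

For groupby.agg and groupby.apply, the main difference is that apply is flexible (docs):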

Some operations on the grouped data might not fit into either the aggregate or transform categories. Or, you may simply want GroupBy to infer how to combine the results. For these, use the apply function, which can be substituted for both aggregate and transform in many standard use cases.
Note: apply can act as a reducer, transformer, or filter function, depending on exactly what is passed to apply. So depending on the path taken, and exactly what you are grouping. Thus the grouped columns(s) may be included in the output as well as set the indices.

For an illustration of how the return type is changed automatically, see Python Pandas : How to return grouped lists in a column as a dict.
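The flexibility described in the quote can be sketched as follows. This is a minimal illustration on made-up data (the frame and the names key and val are assumptions); it uses a single grouped column so that the three roles of apply are easy to compare. The exact index of the transformer and filter results can differ between pandas versions, so only the values are discussed here.

```python
import pandas as pd

# Hypothetical toy data, not taken from the original answer.
df = pd.DataFrame({"key": ["a", "a", "b", "b"], "val": [1, 2, 3, 4]})
grouped = df.groupby("key")["val"]

# Reducer: the function returns one scalar per group,
# so the result has one entry per group key.
reduced = grouped.apply(lambda s: s.sum())

# Transformer: the function returns a Series as long as the group,
# so the result has one entry per original row.
centered = grouped.apply(lambda s: s - s.mean())

# Filter: the function returns a subset of each group's rows.
filtered = grouped.apply(lambda s: s[s > 1])

print(reduced)
print(centered)
print(filtered)
```

The same method call yields three differently shaped results, which is what "apply infers how to combine the results" means in practice.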

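groupby.agg, on the other hand, is well suited to applying cython-optimized functions (i.e., it can compute 'sum', 'mean', 'std', etc. very quickly). It also lets you compute multiple (different) functions on different columns. For example,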

df.groupby('some_column').agg({'first_column': ['mean', 'std'],
                               'second_column': ['sum', 'sem']})

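computes the mean and standard deviation of the first column, and the sum and standard error of the mean of the second column. For more examples, see dplyr summarize equivalent in pandas.

These differences are also summarized in What is the difference between pandas agg and apply function?, but that question focuses on the difference between groupby.agg and groupby.apply.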
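A runnable version of the call above, as a minimal sketch: the column names follow the snippet, but the data values are made up for illustration.

```python
import pandas as pd

# Hypothetical data; only the column names come from the snippet above.
df = pd.DataFrame({
    "some_column": ["x", "x", "y", "y"],
    "first_column": [1.0, 3.0, 5.0, 9.0],
    "second_column": [10.0, 20.0, 30.0, 50.0],
})

# One list of function names per column; each name maps to a
# built-in groupby reduction.
summary = df.groupby("some_column").agg({
    "first_column": ["mean", "std"],
    "second_column": ["sum", "sem"],
})

# The result has one row per group and two-level column labels
# of the form (column, function).
print(summary)
print(summary.loc["x", ("second_column", "sem")])
```

Individual cells are addressed with a (column, function) tuple, as in the last line.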

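DataFrame.agg is new in version 0.20. Previously, it was not possible to apply multiple different functions to different columns of a DataFrame, because only groupby objects supported that. Now you can summarize a DataFrame by computing multiple different functions on its columns. An example can be found in Is there a pandas equivalent of dplyr::summarise?.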
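A minimal sketch of DataFrame.agg with a per-column mapping, on made-up data. The frame and the column names a and b are assumptions for illustration and are not the example from the linked question.

```python
import pandas as pd

# Hypothetical toy frame.
df = pd.DataFrame({"a": [1, 2, 3], "b": [4.0, 5.0, 6.0]})

# One function per column: the result is a Series indexed by column name.
per_column = df.agg({"a": "sum", "b": "mean"})

# Lists of functions per column: the result is a DataFrame with one row
# per function name; cells for functions not requested on a column are NaN.
table = df.agg({"a": ["sum", "min"], "b": ["mean"]})

print(per_column)
print(table)
```

No groupby is involved here; the whole frame is summarized in one call.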

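This is not possible with DataFrame.apply. It works either column by column or row by row, and applies the same function to every column/row. For a single function such as lambda x: x**2, the two produce the same result, but their intended uses are very different.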
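The single-function case can be checked with a short sketch. The data and column names are made up; the row-wise call at the end is an added illustration of the axis argument of DataFrame.apply.

```python
import pandas as pd

# Hypothetical toy frame.
df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})

# The same single function passed to both methods.
via_apply = df.apply(lambda x: x ** 2)
via_agg = df.agg(lambda x: x ** 2)

# Both return the element-wise squares of the frame.
print(via_apply.equals(via_agg))

# apply can also walk the frame row by row (axis=1),
# still using one function for every row.
row_sums = df.apply(lambda row: row["a"] + row["b"], axis=1)
print(row_sums)
```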
