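<p>This is an internal detail, and I don't think it is documented anywhere.</p>
<p>The pandas developers handle strings such as <code>'sum'</code> and <code>'mean'</code> this way: they keep a mapping from a function object to the name of that function's internal cythonized implementation.</p>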
<p>From <a href="https://github.com/pandas-dev/pandas/blob/master/pandas/core/base.py#L161" rel="nofollow noreferrer"><code>pandas/core/base.py</code></a>:</p>
<pre><code>_cython_table = {
    builtins.sum: "sum",
    builtins.max: "max",
    builtins.min: "min",
    np.all: "all",
    np.any: "any",
    np.sum: "sum",
    np.nansum: "sum",
    np.mean: "mean",
    np.nanmean: "mean",
    np.prod: "prod",
    np.nanprod: "prod",
    np.std: "std",
    np.nanstd: "std",
    np.var: "var",
    np.nanvar: "var",
    np.median: "median",
    np.nanmedian: "median",
    np.max: "max",
    np.nanmax: "max",
    np.min: "min",
    np.nanmin: "min",
    np.cumprod: "cumprod",
    np.nancumprod: "cumprod",
    np.cumsum: "cumsum",
    np.nancumsum: "cumsum",
}
</code></pre>
<p>So <code>Series.agg(sum)</code>, <code>Series.agg('sum')</code>, <code>Series.agg(np.sum)</code>, and <code>Series.agg(np.nansum)</code> all call the same internal cythonized function.</p>
<p>From <a href="https://github.com/pandas-dev/pandas/blob/master/pandas/core/base.py#L331" rel="nofollow noreferrer"><code>pandas/core/base.py</code></a>:</p>
<pre><code>    def _get_cython_func(self, arg: Callable) -> Optional[str]:
        """
        if we define an internal function for this argument, return it
        """
        return self._cython_table.get(arg)
</code></pre>
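<p>Because the table is keyed by function objects, the lookup only succeeds for the exact objects registered in it. A minimal stand-alone sketch of that behaviour, using a hypothetical <code>TOY_TABLE</code> and <code>lookup</code> helper rather than pandas' real table:</p>

```python
import builtins

# Hypothetical stand-in for the internal table: function objects are the keys,
# method names are the values.
TOY_TABLE = {builtins.sum: "sum", builtins.max: "max", builtins.min: "min"}


def lookup(func):
    """Return the registered method name for func, or None if it is unknown."""
    return TOY_TABLE.get(func)


print(lookup(sum))                 # the builtin itself is a key -> "sum"
print(lookup(lambda xs: sum(xs)))  # a wrapper is a different object -> None
```

<p>This suggests why wrapping a function in a <code>lambda</code> would miss the table even though it computes the same result.</p>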
<p>You can see how these are handled in <a href="https://github.com/pandas-dev/pandas/blob/master/pandas/core/aggregation.py" rel="nofollow noreferrer"><code>pandas/core/aggregation.py</code></a>: it uses <code>getattr</code> there, so the cythonized functions appear to be defined as attributes on the class. I couldn't find a good starting point, but the best place to look is probably <a href="https://github.com/pandas-dev/pandas/blob/v1.1.4/pandas/core/generic.py" rel="nofollow noreferrer"><code>pandas/core/generic.py</code></a>, around <a href="https://github.com/pandas-dev/pandas/blob/v1.1.4/pandas/core/generic.py#L11455" rel="nofollow noreferrer">line 11455</a>.</p>
<pre><code>def aggregate(
    obj: AggObjType,
    arg: AggFuncType,
    *args,
    **kwargs,
):
    ...
    ...
    if callable(arg):
        f = obj._get_cython_func(arg)
        if f and not args and not kwargs:
            return getattr(obj, f)(), None
    ...
    ...
</code></pre>
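<p>The dispatch pattern above can be sketched without pandas. Everything here (<code>ToySeries</code>, <code>_toy_table</code>, the <code>"fast path"</code> / <code>"generic path"</code> labels) is hypothetical and only illustrates the table lookup followed by <code>getattr</code>; it is not how pandas is actually implemented.</p>

```python
import builtins


class ToySeries:
    """Hypothetical container used for illustration; not a pandas class."""

    # Maps function objects to the name of a method defined on this class.
    _toy_table = {builtins.sum: "sum", builtins.max: "max"}

    def __init__(self, values):
        self.values = list(values)

    def sum(self):
        # Stands in for an optimized internal implementation.
        return ("fast path", builtins.sum(self.values))

    def max(self):
        return ("fast path", builtins.max(self.values))

    def agg(self, func, *args, **kwargs):
        # A string is treated as a method name directly.
        if isinstance(func, str):
            return getattr(self, func)()
        # A known callable with no extra arguments is redirected to the method.
        name = self._toy_table.get(func)
        if name and not args and not kwargs:
            return getattr(self, name)()
        # Anything else is simply called on the raw values.
        return ("generic path", func(self.values, *args, **kwargs))


s = ToySeries([1, 2, 3])
print(s.agg(sum))                # ('fast path', 6)
print(s.agg("sum"))              # ('fast path', 6)
print(s.agg(lambda v: sum(v)))   # ('generic path', 6)
print(s.agg(sum, 10))            # ('generic path', 16): extra args skip the table
```

<p>As in the quoted <code>aggregate</code> snippet, the redirect only happens when no extra positional or keyword arguments are passed.</p>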