Applying a function to Pandas DataFrame columns conditional on data type

User

Answer 1 · edited 2024-04-19 17:45:48

You want to use apply on a DataFrame, but you have overlooked basic type conversion (a common pitfall in most OOP languages). A quick workaround:

def selectiveapply(row):
    # After the transpose, each row holds the values of one original column
    return type(row.iloc[0])

toydf = toydf.T
toydf["type"] = toydf.apply(selectiveapply, axis=1)

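Setting axis=0 applies the function column by column instead. Experiment with the function a little and you should get what you need.

User

Answer 2 · edited 2024-04-19 17:45:48

This comment is correct: the behavior is intentional. Pandas "applies" the highest type in the type hierarchy across all the data types it is given.

Consider applying the function to "A" only: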

df[['A']].apply(dtype_fn)
int64

A    int64
dtype: object

Similarly, with only "A" and "B":

df[['A', 'B']].apply(dtype_fn)
float64
float64

A    float64
B    float64
dtype: object

Because your original DataFrame contains several types, including strings, the common type for all of them is object.
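One way to see this common-type behavior directly is to convert column subsets to a single NumPy array, which forces one dtype for the whole block. This is a minimal sketch: the frame `df` below is a hypothetical stand-in for the question's data, which is not shown in this thread.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the question's DataFrame (assumed column names A-D).
df = pd.DataFrame({
    "A": [1, 2],
    "B": [1.5, 2.5],
    "C": pd.Series(["x", "y"], dtype=object),
    "D": [True, False],
})

# A single NumPy array can only have one dtype, so mixed columns get upcast.
only_a = df[["A"]].to_numpy().dtype        # integers only
a_and_b = df[["A", "B"]].to_numpy().dtype  # int + float -> float64
everything = df.to_numpy().dtype           # strings present -> object

print(only_a, a_and_b, everything)
```

The integer and float columns share float64 as their smallest common type, while adding a string column leaves object as the only type that can hold every value.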

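That explains the behavior, but the problem still needs a fix. Pandas provides a useful method, Series.infer_objects, which infers the data type and performs a "soft conversion".

If you really need the type inside the function, you can perform the soft conversion before reading dtype. This produces the expected result: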

def dtype_fn(the_col):
    the_col = the_col.infer_objects()
    print(the_col.dtype)

    return the_col.dtype

df.apply(dtype_fn)
int64
float64
object
bool

A      int64
B    float64
C     object
D       bool
dtype: object
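If the end goal is to run a function only on columns of a certain type, one alternative that avoids inspecting dtypes inside apply is to pick the columns first with `DataFrame.select_dtypes`. The sketch below is not taken from the answers here; the frame `df` and the doubling operation are assumptions chosen purely for illustration.

```python
import pandas as pd

# Hypothetical stand-in for the question's DataFrame (assumed column names A-D).
df = pd.DataFrame({
    "A": [1, 2],
    "B": [1.5, 2.5],
    "C": pd.Series(["x", "y"], dtype=object),
    "D": [True, False],
})

# "number" matches integer and float columns; bool and object are left out.
numeric_cols = df.select_dtypes(include="number").columns

# Apply the (illustrative) operation to the numeric columns only.
result = df.copy()
result[numeric_cols] = result[numeric_cols] * 2

print(list(numeric_cols))
print(result)
```

Selecting by dtype up front keeps each column's own type, so the function never sees the upcast object values.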

User

Answer 3 · edited 2024-04-19 17:45:48

The actual input to your dtype_fn is a Pandas Series object. You can access the underlying type by modifying the function slightly.

def dtype_fn(the_col):
    print(the_col.values.dtype)
    return(the_col.values.dtype)
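To check what the underlying arrays report without going through apply at all, the columns can be iterated with `DataFrame.items`, which yields each column as its own Series. This is a small sketch under the assumption of a hypothetical frame `df`; the question's real data is not shown here.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the question's DataFrame (assumed column names A-D).
df = pd.DataFrame({
    "A": [1, 2],
    "B": [1.5, 2.5],
    "C": pd.Series(["x", "y"], dtype=object),
    "D": [True, False],
})

# items() yields (column name, Series) pairs; .values is the backing array.
col_dtypes = {name: col.values.dtype for name, col in df.items()}

for name, dtype in col_dtypes.items():
    print(name, dtype)
```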

For more on why this happens, take a look at this answer. It says:

This is not an error but is due to the numpy dtype representation: https://docs.scipy.org/doc/numpy/reference/arrays.scalars.html.