Why does pd.Series([np.nan]) | pd.Series([True]) evaluate to False?

2 Answers

User

Answer 1 · edited 2024-04-20 15:25:09

Compare your case (with the dtype written out explicitly to highlight what was inferred):

In[11]: pd.Series([np.nan], dtype=float) | pd.Series([True])

Out[11]: 
0    False
dtype: bool

with a similar one (the only change is that the dtype is now bool):

In[12]: pd.Series([np.nan], dtype=bool) | pd.Series([True])

Out[12]: 
0    True
dtype: bool

Do you see the difference?

Explanation:
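A hedged sketch of what the dtype=bool variant is probably doing, assuming the Series constructor casts its input the same way NumPy's astype does: NaN is a non-zero float, so the cast stores True before | ever runs. The names raw, as_bool and result are made up for illustration.

```python
import numpy as np

# Hypothetical stand-in for the data behind pd.Series([np.nan], dtype=bool).
raw = np.array([np.nan])       # float64 array holding a single NaN
as_bool = raw.astype(bool)     # NaN is non-zero, so it casts to True

# Once the NaN is stored as True, OR-ing with True is plain boolean logic.
result = as_bool | np.array([True])
print(as_bool, result)
```

In the float case no such cast happens up front, so the NaN is still a missing value when the | operator sees it.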

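User

Answer 2 · edited 2024-04-20 15:25:09

I think this is because np.nan is an ordinary Python float, and float's __bool__ returns True for any non-zero value, NaN included: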

np.nan.__bool__() == True

Similarly:

>>> np.nan or None
nan
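The truthiness claim can be checked in plain Python. This is a minimal sketch, not pandas' own logic; the variable names are chosen only for illustration.

```python
import math
import numpy as np

# np.nan is a plain Python float, not a special NumPy type.
nan_is_float = type(np.nan) is float

# Any non-zero float is truthy, and NaN counts as non-zero.
nan_is_truthy = bool(np.nan)

# `or` returns its first truthy operand, so the NaN itself comes back.
picked = np.nan or None

print(nan_is_float, nan_is_truthy, picked)
```

So in scalar Python code NaN behaves as True, which is the opposite of what the Series operation in the question returns.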

The pandas way to handle this is:

pd.Series([np.nan]).fillna(False) | pd.Series([True])

Edit:

For clarity: in pandas 0.24.1, the method _bool_method_SERIES (line 1816 of .../pandas/core/ops.py) contains this assignment:

    fill_bool = lambda x: x.fillna(False).astype(bool)

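This is where the behavior you describe comes from. In other words, it is by design that np.nan is treated as False whenever a logical OR is performed.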
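To see what that expression does on its own, here is a small sketch that applies the same lambda to a sample Series. The sample data and the names left, right, filled and combined are assumptions for illustration, and newer pandas versions may implement the internal step differently.

```python
import numpy as np
import pandas as pd

# Same expression as the one quoted from pandas 0.24.1 above.
fill_bool = lambda x: x.fillna(False).astype(bool)

# Hypothetical sample data: one missing value, one non-zero, one zero.
left = pd.Series([np.nan, 1.0, 0.0])
right = pd.Series([True, False, False])

filled = fill_bool(left)     # NaN becomes False, then everything is cast to bool
combined = filled | right    # ordinary element-wise boolean OR

print(filled.tolist())
print(combined.tolist())
```

Applying the fill explicitly before the OR makes the handling of missing values visible in your own code instead of relying on the operator's internal behavior.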