Merging two DataFrames with pd.NA in the merge column raises "TypeError: boolean value of NA is ambiguous"

Posted 2024-04-25 14:52:20


In pandas 1.0.1, the call

df = df.merge(df2, on=some_column)

yields

File /home/torstein/code/fintechdb/Sheets/sheets/gild.py, line 42, in gild
    df = df.merge(df2, on=some_column)
File /home/torstein/anaconda3/lib/python3.7/site-packages/pandas/core/frame.py, line 7297, in merge
    validate=validate,
File /home/torstein/anaconda3/lib/python3.7/site-packages/pandas/core/reshape/merge.py, line 88, in merge
    return op.get_result()
File /home/torstein/anaconda3/lib/python3.7/site-packages/pandas/core/reshape/merge.py, line 643, in get_result
    join_index, left_indexer, right_indexer = self._get_join_info()
File /home/torstein/anaconda3/lib/python3.7/site-packages/pandas/core/reshape/merge.py, line 862, in _get_join_info
    (left_indexer, right_indexer) = self._get_join_indexers()
File /home/torstein/anaconda3/lib/python3.7/site-packages/pandas/core/reshape/merge.py, line 841, in _get_join_indexers
    self.left_join_keys, self.right_join_keys, sort=self.sort, how=self.how
File /home/torstein/anaconda3/lib/python3.7/site-packages/pandas/core/reshape/merge.py, line 1311, in _get_join_indexers
    zipped = zip(*mapped)
File /home/torstein/anaconda3/lib/python3.7/site-packages/pandas/core/reshape/merge.py, line 1309, in <genexpr>
    for n in range(len(left_keys))
File /home/torstein/anaconda3/lib/python3.7/site-packages/pandas/core/reshape/merge.py, line 1918, in _factorize_keys
    rlab = rizer.factorize(rk)
File pandas/_libs/hashtable.pyx, line 77, in pandas._libs.hashtable.Factorizer.factorize
File pandas/_libs/hashtable_class_helper.pxi, line 1817, in pandas._libs.hashtable.PyObjectHashTable.get_labels
File pandas/_libs/hashtable_class_helper.pxi, line 1732, in pandas._libs.hashtable.PyObjectHashTable._unique
File pandas/_libs/missing.pyx, line 360, in pandas._libs.missing.NAType.__bool__

TypeError: boolean value of NA is ambiguous

whereas this works:

df[some_column].fillna(np.nan, inplace=True)
df2[some_column].fillna(np.nan, inplace=True)
df = df.merge(df2, on=some_column)
# Works

If I instead do

df[some_column].fillna(pd.NA, inplace=True)

then the error comes back.


2 Answers

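I consider the pd.NA instances in my data to be valid, so I need to handle them rather than fill them in the way fillna() would. If you are in the same position, use pd.isna(val) to turn pd.NA into a plain True or False. Only you can decide whether a null should count as True or False, but here is a simple example: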

import pandas as pd

val = pd.NA
if pd.isna(val):  # pd.isna returns a plain bool, so it is safe inside `if`
    print('it is null')
else:
    print('it is not null')

Prints: it is null

And then

val = 7
if pd.isna(val):
    print('it is null')
else:
    print('it is not null')

Prints: it is not null

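I hope this helps others who are looking for a clear course of action. (Celius's answer is accurate, but I wanted to provide actionable code for anyone struggling with this.)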
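The same check extends from a single value to a whole column. The following is only a minimal sketch: the Series `values` and the constant `NULL_MEANS` are hypothetical names chosen for illustration, and treating a null as False is an assumption you should revisit for your own data.

```python
import pandas as pd

# Hypothetical object-dtype column that contains a pd.NA entry.
values = pd.Series([True, pd.NA, False], dtype="object")

# pd.isna works elementwise and never evaluates pd.NA as a boolean.
null_mask = pd.isna(values)

# Assumption: a missing value should count as False. Change to True if needed.
NULL_MEANS = False
as_bool = [NULL_MEANS if pd.isna(v) else bool(v) for v in values]

print(null_mask.tolist())
print(as_bool)
```

Because every element is tested with pd.isna before it is passed to bool(), the ambiguous conversion is never attempted.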

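This has to do with how pd.NA is implemented in pandas 1.0.0 and with how the pandas team decided it should behave in a boolean context. Also keep in mind that it is an experimental feature, so it should not be used for anything beyond experimentation: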

Warning Experimental: the behaviour of pd.NA can still change without warning.

Another page of the pandas documentation, the one on working with missing values, is where I believe you will find both the reason and the answer you are looking for:

NA in a boolean context: Since the actual value of an NA is unknown, it is ambiguous to convert NA to a boolean value. The following raises an error: TypeError: boolean value of NA is ambiguous
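The quoted behaviour can be checked directly. This is a small sketch rather than anything taken from the question; the variable names `outcome` and `comparison` are illustrative only.

```python
import pandas as pd

# Converting pd.NA to a boolean raises instead of guessing a value.
try:
    bool(pd.NA)
    outcome = "no error"
except TypeError as exc:
    outcome = str(exc)

# Comparisons propagate the missing value: the result is pd.NA itself,
# so using it in an `if` would raise the same TypeError.
comparison = pd.NA == 1

print(outcome)
print(comparison is pd.NA)
```

This is the same conversion that the traceback in the question ends in (NAType.__bool__), reached there from inside the merge's key factorization rather than from an explicit `if`.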

The same page also offers a valuable piece of advice:

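"This also means that pd.NA cannot be used in a context where it is evaluated to a boolean, such as if condition: ... where condition can potentially be pd.NA. In such cases, isna() can be used to check for pd.NA or condition being pd.NA can be avoided, for example by filling missing values beforehand."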
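Applied to the merge in the question, "filling missing values beforehand" could look like the sketch below. The frames `left` and `right`, the helper `fill_key`, and the sentinel `MISSING` are all hypothetical; the question itself fills with np.nan instead, and which fill value is appropriate depends on your data.

```python
import pandas as pd

# Hypothetical sentinel; pick a value that cannot occur as a real key.
MISSING = "<missing>"

left = pd.DataFrame(
    {"key": pd.Series(["a", pd.NA, "b"], dtype="object"), "x": [1, 2, 3]}
)
right = pd.DataFrame(
    {"key": pd.Series(["a", "b", pd.NA], dtype="object"), "y": [10, 20, 30]}
)


def fill_key(frame, column, fill_value=MISSING):
    """Return a copy of `frame` with nulls in `column` replaced by `fill_value`."""
    out = frame.copy()
    # Assign the result back instead of relying on inplace=True on a selection.
    out[column] = out[column].fillna(fill_value)
    return out


# After filling, the key columns hold no pd.NA, so no boolean
# conversion of NA can happen while the keys are factorized.
merged = fill_key(left, "key").merge(fill_key(right, "key"), on="key")
print(merged)
```

Note that rows whose key is missing on both sides now match each other through the sentinel. If that is not wanted, dropping those rows before the merge (for example with dropna(subset=["key"])) may be the better choice.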
