In pandas 1.0.1, the call
df = df.merge(df2, on=some_column)
fails with
File /home/torstein/code/fintechdb/Sheets/sheets/gild.py, line 42, in gild
df = df.merge(df2, on=some_column)
File /home/torstein/anaconda3/lib/python3.7/site-packages/pandas/core/frame.py, line 7297, in merge
validate=validate,
File /home/torstein/anaconda3/lib/python3.7/site-packages/pandas/core/reshape/merge.py, line 88, in merge
return op.get_result()
File /home/torstein/anaconda3/lib/python3.7/site-packages/pandas/core/reshape/merge.py, line 643, in get_result
join_index, left_indexer, right_indexer = self._get_join_info()
File /home/torstein/anaconda3/lib/python3.7/site-packages/pandas/core/reshape/merge.py, line 862, in _get_join_info
(left_indexer, right_indexer) = self._get_join_indexers()
File /home/torstein/anaconda3/lib/python3.7/site-packages/pandas/core/reshape/merge.py, line 841, in _get_join_indexers
self.left_join_keys, self.right_join_keys, sort=self.sort, how=self.how
File /home/torstein/anaconda3/lib/python3.7/site-packages/pandas/core/reshape/merge.py, line 1311, in _get_join_indexers
zipped = zip(*mapped)
File /home/torstein/anaconda3/lib/python3.7/site-packages/pandas/core/reshape/merge.py, line 1309, in <genexpr>
for n in range(len(left_keys))
File /home/torstein/anaconda3/lib/python3.7/site-packages/pandas/core/reshape/merge.py, line 1918, in _factorize_keys
rlab = rizer.factorize(rk)
File pandas/_libs/hashtable.pyx, line 77, in pandas._libs.hashtable.Factorizer.factorize
File pandas/_libs/hashtable_class_helper.pxi, line 1817, in pandas._libs.hashtable.PyObjectHashTable.get_labels
File pandas/_libs/hashtable_class_helper.pxi, line 1732, in pandas._libs.hashtable.PyObjectHashTable._unique
File pandas/_libs/missing.pyx, line 360, in pandas._libs.missing.NAType.__bool__
TypeError: boolean value of NA is ambiguous
whereas this works:
df[some_column].fillna(np.nan, inplace=True)
df2[some_column].fillna(np.nan, inplace=True)
df = df.merge(df2, on=some_column)
# Works
If I instead do
df[some_column].fillna(pd.NA, inplace=True)
the error comes back.
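A minimal sketch of the np.nan workaround written with plain assignment rather than a chained inplace fillna. The frames `left` and `right`, the column name `key`, and the helper `na_to_nan` are hypothetical names chosen for illustration; they are not from the original code.

```python
import numpy as np
import pandas as pd

# Hypothetical example frames; object-dtype key columns containing pd.NA
left = pd.DataFrame({"key": pd.Series(["a", "b", pd.NA], dtype=object), "x": [1, 2, 3]})
right = pd.DataFrame({"key": pd.Series(["a", pd.NA, "c"], dtype=object), "y": [10, 20, 30]})


def na_to_nan(column):
    """Replace every pd.NA in a column with np.nan, leaving other values alone."""
    return column.map(lambda v: np.nan if v is pd.NA else v)


# Assign the result back instead of calling fillna(..., inplace=True) on a selection
left["key"] = na_to_nan(left["key"])
right["key"] = na_to_nan(right["key"])

merged = left.merge(right, on="key")
print(merged)
```

Assigning the converted column back makes it explicit that the frame itself is updated before the merge.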
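I believe the pd.NA instances in my data are valid, so I need to handle them rather than fill them the way fillna() does. If you are in the same position, use pd.isna(val) to turn a possible pd.NA into True or False. Only you can decide whether a null should count as True or False, but here is a simple example. Checking a pd.NA value with pd.isna() returns: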
it is null
Then, for a value that is not null, it returns:
it is not null
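A minimal sketch of such a check, producing the two outputs shown above. The helper name `describe` and the sample value 7 are assumptions made for illustration.

```python
import pandas as pd


def describe(val):
    """Label a scalar as null or not, without truth-testing the value itself."""
    # pd.isna returns a plain bool for a scalar, so it is safe inside an if statement
    if pd.isna(val):
        return "it is null"
    return "it is not null"


print(describe(pd.NA))  # it is null
print(describe(7))      # it is not null
```

The point is that the if statement only ever sees the result of pd.isna(), never pd.NA itself.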
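I hope this helps others who are looking for a clear course of action. (Celius's answer is accurate, but I wanted to provide actionable code for anyone struggling with this.)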
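This has to do with how pd.NA was implemented in pandas 1.0.0 and how the pandas team decided it should behave in a boolean context. Also keep in mind that it is an experimental feature, so it should not be used for anything beyond experimentation. I believe the reason and the answer you are looking for can be found in another part of the pandas documentation, the section on working with missing values.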
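It also offers a valuable piece of advice:
"This also means that pd.NA cannot be used in a context where it is evaluated to a boolean, such as if condition: ... where condition can potentially be pd.NA. In such cases, isna() can be used to check for pd.NA or condition being pd.NA can be avoided, for example by filling missing values beforehand."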
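A short sketch of what that advice looks like in practice, assuming a hypothetical nullable boolean Series named `flags`. It first shows that truth-testing pd.NA raises the TypeError from the traceback, then fills the missing values before using them as conditions.

```python
import pandas as pd

# Evaluating pd.NA as a boolean raises TypeError
try:
    bool(pd.NA)
    raised = False
except TypeError:
    raised = True

# Hypothetical nullable boolean data containing a missing value
flags = pd.Series([True, pd.NA, False], dtype="boolean")

# Fill missing values beforehand; treating NA as False is a choice, not a rule
filled = flags.fillna(False)

# Every element is now a real boolean and can be used in a condition
selected = [i for i, flag in enumerate(filled) if flag]
print(raised, selected)
```

Whether a missing value should become False or True depends on the data, so the fill value here is only one possible choice.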