<h2>Update</h2>
<p>I was able to reproduce your error after grouping and aggregating the DataFrame.</p>
<pre><code>>>> import pandas as pd
>>> data = pd.DataFrame({
... "temp_playlist": [0] * 15,
... "objId": ['o1'] * 2 + ['o2'] * 2 + ['o3'] * 2 + ['o4'] * 3 + ['o5'] * 2 + ['o6'] * 2 + [pd.NA] * 2,
... "vals": [0, 6, 1, 4, 2, 5, 8, 9, 12, 10, 13, 11, 14, 3, 7]
... })
>>> df = data.groupby(["temp_playlist", "objId"], dropna=False).agg(list)
>>> df.loc[(0, pd.NA)]
Traceback (most recent call last):
File "/home/ec2-user/miniconda3/envs/so-pandas-nan-index/lib/python3.8/site-packages/pandas/core/indexes/base.py", line 3361, in get_loc
return self._engine.get_loc(casted_key)
File "pandas/_libs/index.pyx", line 76, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/index.pyx", line 108, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/hashtable_class_helper.pxi", line 5198, in pandas._libs.hashtable.PyObjectHashTable.get_item
File "pandas/_libs/hashtable_class_helper.pxi", line 5206, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: <NA>
</code></pre>
<p>However, passing an explicit <code>MultiIndex</code> works.</p>
<pre><code>>>> df.loc[pd.MultiIndex.from_tuples([(0, pd.NA)], names=["temp_playlist", "objId"])]
vals
temp_playlist objId
0 NaN [3, 7]
>>> df.loc[pd.MultiIndex.from_tuples([(0, pd.NA)])]
vals
0 NaN [3, 7]
</code></pre>
<p>Passing a single tuple inside a list works too. Note that the double brackets <code>[[]]</code> return a DataFrame rather than a Series.</p>
<pre><code>>>> df.loc[[(0, pd.NA)]]
vals
temp_playlist objId
0 NaN [3, 7]
</code></pre>
<p>So does <a href="https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.reindex.html" rel="nofollow noreferrer"><code>DataFrame.reindex</code></a> (see also the <a href="https://pandas.pydata.org/docs/user_guide/basics.html#reindexing-and-altering-labels" rel="nofollow noreferrer">user guide on reindexing</a>).</p>
<pre><code>>>> df.reindex([(0, pd.NA)])
vals
temp_playlist objId
0 NaN [3, 7]
</code></pre>
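<p>If you would rather not depend on how <code>.loc</code> resolves a missing key, a boolean mask on the index level is another option. The following is only a sketch: it rebuilds the grouped frame from the reproduction above, and <code>rows_with_missing_level</code> is a hypothetical helper name, not a pandas function.</p>

```python
import pandas as pd

# Same data as in the reproduction above.
data = pd.DataFrame({
    "temp_playlist": [0] * 15,
    "objId": ['o1'] * 2 + ['o2'] * 2 + ['o3'] * 2 + ['o4'] * 3
             + ['o5'] * 2 + ['o6'] * 2 + [pd.NA] * 2,
    "vals": [0, 6, 1, 4, 2, 5, 8, 9, 12, 10, 13, 11, 14, 3, 7],
})
df = data.groupby(["temp_playlist", "objId"], dropna=False).agg(list)


def rows_with_missing_level(frame, level):
    """Hypothetical helper: keep the rows whose index level `level` is missing."""
    # Index.isna() flags missing labels, so no key lookup is involved.
    mask = frame.index.get_level_values(level).isna()
    return frame[mask]


missing_rows = rows_with_missing_level(df, "objId")
print(missing_rows)
```

<p>Because the mask is built with <code>isna()</code>, the selection does not care which missing-value sentinel ended up in the index.</p>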
<h2>Initial attempt to reproduce the error</h2>
<p>I could not reproduce your error. As you can see below, using <code>df.loc[(0, np.nan)]</code> works.</p>
<pre><code>Python 3.8.5 (default, Sep 4 2020, 07:30:14)
[GCC 7.3.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy as np
>>> import pandas as pd
>>> nan_index = pd.MultiIndex.from_tuples([(0, 'o1'),
...     (0, 'o2'),
...     (0, 'o3'),
...     (0, 'o4'),
...     (0, 'o5'),
...     (0, 'o6'),
...     (0, np.nan)])
>>> print(nan_index)
MultiIndex([(0, 'o1'),
(0, 'o2'),
(0, 'o3'),
(0, 'o4'),
(0, 'o5'),
(0, 'o6'),
(0, nan)],
)
>>> rng = np.random.default_rng(42)
>>> vals = [rng.choice(20, 2) for i in range(nan_index.shape[0])]
>>> print(vals)
[array([ 1, 15]), array([13, 8]), array([ 8, 17]), array([ 1, 13]), array([4, 1]), array([10, 19]), array([14, 15])]
>>> df = pd.DataFrame({"vals": vals}, index=nan_index)
>>> print(df)
vals
0 o1 [1, 15]
o2 [13, 8]
o3 [8, 17]
o4 [1, 13]
o5 [4, 1]
o6 [10, 19]
NaN [14, 15]
>>> print(df.loc[(0, 'o1')])
vals [1, 15]
Name: (0, o1), dtype: object
>>> print(df.loc[(0, np.nan)])
vals [14, 15]
Name: (0, nan), dtype: object
>>> print(pd.__version__)
1.3.1
</code></pre>
<p>Then I noticed that your index was printed as <code>(0, nan)</code>, whereas mine was <code>(0, np.nan)</code>. The difference is that I used <code>np.nan</code>, and I suspect yours is <code>pd.NA</code>.</p>
<pre><code>>>> nan_index = pd.MultiIndex.from_tuples([(0, 'o1'),
...     (0, 'o2'),
...     (0, 'o3'),
...     (0, 'o4'),
...     (0, 'o5'),
...     (0, 'o6'),
...     (0, pd.NA)])
>>> nan_index
MultiIndex([(0, 'o1'),
(0, 'o2'),
(0, 'o3'),
(0, 'o4'),
(0, 'o5'),
(0, 'o6'),
(0, nan)],
)
>>> df = pd.DataFrame({"vals": vals}, index=nan_index)
>>> df
vals
0 o1 [1, 15]
o2 [13, 8]
o3 [8, 17]
o4 [1, 13]
o5 [4, 1]
o6 [10, 19]
NaN [14, 15]
</code></pre>
<p>However, that did not resolve the discrepancy: I was still able to use <code>df.loc[(0, np.nan)]</code>.</p>
<pre><code>>>> df.loc[(0, pd.NA)]
vals [14, 15]
Name: (0, nan), dtype: object
>>> df.loc[(0, np.nan)]
vals [14, 15]
Name: (0, nan), dtype: object
</code></pre>
<p>In addition, I was also able to use <code>df.loc[(0, None)]</code>.</p>
<pre><code>>>> df.loc[(0, None)]
vals [14, 15]
Name: (0, nan), dtype: object
</code></pre>
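<p>One way to see why all three keys resolve to the same row is to inspect the integer codes of an index built with <code>MultiIndex.from_tuples</code>. This is a small sketch, and <code>objid_codes</code> is a hypothetical helper name. As far as I can tell, a missing label in a directly constructed <code>MultiIndex</code> is stored as the code <code>-1</code>, whichever sentinel was passed in.</p>

```python
import numpy as np
import pandas as pd


def objid_codes(missing):
    """Hypothetical helper: codes of the second level for a given sentinel."""
    index = pd.MultiIndex.from_tuples([(0, 'o1'), (0, 'o2'), (0, missing)])
    # codes[1] holds one integer per row for the second index level.
    return index.codes[1].tolist()


codes_by_sentinel = {
    "np.nan": objid_codes(np.nan),
    "pd.NA": objid_codes(pd.NA),
    "None": objid_codes(None),
}
for name, codes in codes_by_sentinel.items():
    print(name, codes)
```

<p>Running the same inspection (<code>df.index.codes</code> and <code>df.index.levels</code>) on the grouped frame from the update may help narrow down where the two indexes differ. I have not checked that across pandas versions.</p>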
<p>Just to confirm, <code>np.nan</code>, <code>pd.NA</code> and <code>None</code> are all distinct objects, so pandas must be treating them equivalently when they are used with <code>DataFrame.loc</code>.</p>
<pre><code>>>> pd.NA is np.nan
False
>>> pd.NA is None
False
>>> np.nan is None
False
>>> type(pd.NA)
<class 'pandas._libs.missing.NAType'>
>>> type(np.nan)
<class 'float'>
</code></pre>
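<p>Although the identity checks above show three distinct objects, <code>pd.isna</code> classifies each of them as missing. The short sketch below illustrates that. The dictionary names are only for illustration.</p>

```python
import numpy as np
import pandas as pd

# The three sentinels compared by identity above.
sentinels = {"np.nan": np.nan, "pd.NA": pd.NA, "None": None}

# pd.isna() on a scalar returns a boolean telling whether it counts as missing.
is_missing = {name: bool(pd.isna(value)) for name, value in sentinels.items()}
print(is_missing)
```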