How do I pop rows from a DataFrame?

Posted 2024-05-23 21:40:29


I found the documentation for pandas.DataFrame.pop, but after trying it and reading the source code, it does not seem to do what I want.

Suppose I generate a DataFrame like this:

import pandas as pd
import numpy as np

df = pd.DataFrame(np.random.randn(10,6))
# Make a few areas have NaN values
df.iloc[1:3,1] = np.nan
df.iloc[5,3] = np.nan
df.iloc[7:9,5] = np.nan


>>> df
          0         1         2         3         4         5
0  0.772762 -0.442657  1.245988  1.102018 -0.740836  1.685598
1 -0.387922       NaN -1.215723 -0.106875  0.499110  0.338759
2  0.567631       NaN -0.353032 -0.099011 -0.698925 -1.348966
3  1.320849  1.084405 -1.296177  0.681111 -1.941855 -0.950346
4 -0.026818 -1.933629 -0.693964  1.116673  0.392217  1.280808
5 -1.249192 -0.035932 -1.330916       NaN -0.135720 -0.506016
6  0.406344  1.416579  0.122019  0.648851 -0.305359 -1.253580
7 -0.092440 -0.243593  0.468463 -1.689485  0.667804       NaN
8 -0.110819 -0.627777 -0.302116  0.630068  2.567923       NaN
9  1.884069 -0.393420 -0.950275  0.151182 -1.122764  0.502117

If I want to remove selected rows and assign them to a separate object in a single step, I need pop-like behavior for rows, something like this:

# rows in column 5 which have NaN values
>>> df[df[5].isnull()].index
Int64Index([7, 8], dtype='int64')

# remove them from the dataframe, assign them to a separate object
>>> nan_rows = df.pop(df[df[5].isnull()].index)

However, this does not seem to be supported. Instead, I appear to be forced to do it in two steps, which feels a bit inelegant.

# get the NaN rows
>>> nan_rows = df[df[5].isnull()]

>>> nan_rows
          0         1         2         3         4   5
7 -0.092440 -0.243593  0.468463 -1.689485  0.667804 NaN
8 -0.110819 -0.627777 -0.302116  0.630068  2.567923 NaN

# remove from original df
>>> df = df.drop(nan_rows.index)

>>> df
          0         1         2         3         4         5
0  0.772762 -0.442657  1.245988  1.102018 -0.740836  1.685598
1 -0.387922       NaN -1.215723 -0.106875  0.499110  0.338759
2  0.567631       NaN -0.353032 -0.099011 -0.698925 -1.348966
3  1.320849  1.084405 -1.296177  0.681111 -1.941855 -0.950346
4 -0.026818 -1.933629 -0.693964  1.116673  0.392217  1.280808
5 -1.249192 -0.035932 -1.330916       NaN -0.135720 -0.506016
6  0.406344  1.416579  0.122019  0.648851 -0.305359 -1.253580
9  1.884069 -0.393420 -0.950275  0.151182 -1.122764  0.502117

Is there a built-in one-step method, or is the two-step approach the way this is supposed to be done?


1 answer
User
#1 · Posted 2024-05-23 21:40:29

Here is the source code of pop:

    def pop(self, item):
        """
        Return item and drop from frame. Raise KeyError if not found.
        """
        result = self[item]
        del self[item]
        try:
            result._reset_cacher()
        except AttributeError:
            pass

        return result
File:      c:\python\lib\site-packages\pandas\core\generic.py
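
As the source shows, pop reads self[item] and then runs del self[item], and both of these are column operations on a DataFrame. For contrast with the row case in the question, here is a minimal sketch of the supported use, assuming a frame with the same integer column labels 0 to 5 as in the question:

```python
import numpy as np
import pandas as pd

# Same shape as the frame in the question: 10 rows, columns labeled 0..5
df = pd.DataFrame(np.random.randn(10, 6))

# 5 is a column label, so both self[item] and del self[item] succeed
col = df.pop(5)

print(type(col).__name__)  # the popped column comes back as a Series
print(df.shape)            # the frame now has one column fewer
```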

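If item is not a simple column name, the del will certainly not work, because both self[item] and del self[item] operate on columns. Either pass a simple column name, or do it in two steps.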
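
If the two steps feel clumsy, they can be wrapped in a small function. The following is only a sketch: pop_rows is a hypothetical helper name, not part of pandas, and it assumes the index labels are unique (drop removes every row that carries a dropped label).

```python
import numpy as np
import pandas as pd


def pop_rows(df, mask):
    """Remove the rows where `mask` is True from `df` in place and return them.

    Hypothetical helper, not a pandas API. Assumes unique index labels.
    """
    popped = df[mask]                     # boolean indexing returns a copy of the matching rows
    df.drop(popped.index, inplace=True)   # drop those same labels from the original frame
    return popped


df = pd.DataFrame(np.random.randn(10, 6))
df.iloc[7:9, 5] = np.nan                  # rows 7 and 8 get NaN in column 5

nan_rows = pop_rows(df, df[5].isnull())
print(nan_rows.index.tolist())            # [7, 8]
print(len(df))                            # 8
```

Mutating the frame in place mirrors what pop does for columns. If you would rather avoid in-place changes, the same two lines can return both the remaining frame and the popped rows instead.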
