How do I select the rows before and after a NaN in pandas?

Posted 2024-05-19 20:54:04


I have a DataFrame that looks like this:

    Name      Age       Job         
0   Alex      20        Student
1   Sara      21        Doctor
2   john      23        NaN
3   kevin     22        Teacher
4   Rosa      20        senior manager
5   johanes   25        Dentist
6   lina      23        Student
7   yaser     25        Pilot
8   jason     20        Manager
9   Ali       23        NaN
10  Ahmad     21        Professor
11  Joe       24        NaN
12  Donald    29        Waiter
.
.
.
.

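I want to select the rows immediately before and after every row that has a NaN value in the Job column, as well as the NaN row itself. For this I have the following code: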

Rows = df[df.shift(1, fill_value="dummy").Job.isna() | df.Job.isna() | df.shift(-1, fill_value="dummy").Job.isna()]
print(Rows)

The result is:

1   Sara      21        Doctor
2   john      23        NaN
3   kevin     22        Teacher
8   jason     20        Manager
9   Ali       23        NaN
10  Ahmad     21        Professor
11  Joe       24        NaN
12  Donald    29        Waiter

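The only problem is row 10. It should appear twice in the result, because it is both the row after the NaN in row 9 and the row before the NaN in row 11 (it sits between two rows with NaN values). So what I want in the end is: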

1   Sara      21        Doctor
2   john      23        NaN
3   kevin     22        Teacher
8   jason     20        Manager
9   Ali       23        NaN
10  Ahmad     21        Professor
10  Ahmad     21        Professor
11  Joe       24        NaN
12  Donald    29        Waiter

In other words, every row that lies between two NaN rows should appear twice in the result. Is there a way to do this? Any help would be greatly appreciated.


1 answer
User
#1 · posted 2024-05-19 20:54:04

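Use pd.concat to combine the rows after a NaN, the rows before a NaN, and the rows that match the condition themselves: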

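# Boolean mask: True where Job is NaN
m = df.Job.isna()

# Concatenate the rows after a NaN, the rows before a NaN, and the NaN rows;
# a row that matches more than one mask is kept once per match
df = pd.concat([df[m.shift(fill_value=False)],
                df[m.shift(-1, fill_value=False)],
                df[m]]).sort_index()
print(df)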
      Name  Age        Job
1     Sara   21     Doctor
2     john   23        NaN
3    kevin   22    Teacher
8    jason   20    Manager
9      Ali   23        NaN
10   Ahmad   21  Professor
10   Ahmad   21  Professor
11     Joe   24        NaN
12  Donald   29     Waiter
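
For anyone who wants to reproduce this, here is a minimal self-contained sketch. It rebuilds only the 13 rows shown in the question (the elided rows are not guessed), and the names after, before, once, result, counts and repeated are made up for illustration. It also shows why the original attempt cannot duplicate row 10: OR-ing boolean masks keeps each row at most once, while concatenating the three selections keeps a row once per mask it matches. The last two lines are an alternative I would assume to be equivalent, counting the matching masks per row and repeating the index labels.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question (only the 13 rows that are shown).
df = pd.DataFrame({
    "Name": ["Alex", "Sara", "john", "kevin", "Rosa", "johanes", "lina",
             "yaser", "jason", "Ali", "Ahmad", "Joe", "Donald"],
    "Age": [20, 21, 23, 22, 20, 25, 23, 25, 20, 23, 21, 24, 29],
    "Job": ["Student", "Doctor", np.nan, "Teacher", "senior manager",
            "Dentist", "Student", "Pilot", "Manager", np.nan,
            "Professor", np.nan, "Waiter"],
})

m = df["Job"].isna()                    # True where Job is NaN
after = m.shift(fill_value=False)       # True on the row right after a NaN
before = m.shift(-1, fill_value=False)  # True on the row right before a NaN

# OR-ing the masks selects each row at most once, so row 10 is not doubled.
once = df[m | after | before]

# Concatenating the three selections keeps a row once per mask it matches.
result = pd.concat([df[after], df[before], df[m]]).sort_index()

# Alternative (illustrative): count matching masks per row, then repeat labels.
counts = m.astype(int) + after.astype(int) + before.astype(int)
repeated = df.loc[df.index.repeat(counts.to_numpy())]

print(result)
```

Note that with either variant a NaN row that is directly next to another NaN row would also be repeated, since it matches more than one mask. Whether that is wanted depends on the data.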
