I found the DataFrame.pop method.
If I generate a DataFrame like this:
import pandas as pd
import numpy as np
df = pd.DataFrame(np.random.randn(10,6))
# Make a few areas have NaN values
df.iloc[1:3,1] = np.nan
df.iloc[5,3] = np.nan
df.iloc[7:9,5] = np.nan
>>> df
          0         1         2         3         4         5
0  0.772762 -0.442657  1.245988  1.102018 -0.740836  1.685598
1 -0.387922       NaN -1.215723 -0.106875  0.499110  0.338759
2  0.567631       NaN -0.353032 -0.099011 -0.698925 -1.348966
3  1.320849  1.084405 -1.296177  0.681111 -1.941855 -0.950346
4 -0.026818 -1.933629 -0.693964  1.116673  0.392217  1.280808
5 -1.249192 -0.035932 -1.330916       NaN -0.135720 -0.506016
6  0.406344  1.416579  0.122019  0.648851 -0.305359 -1.253580
7 -0.092440 -0.243593  0.468463 -1.689485  0.667804       NaN
8 -0.110819 -0.627777 -0.302116  0.630068  2.567923       NaN
9  1.884069 -0.393420 -0.950275  0.151182 -1.122764  0.502117
If I want to remove selected rows and assign them to a separate object in a single step, I need pop-like behavior, such as the following:
# rows in column 5 which have NaN values
>>> df[df[5].isnull()].index
Int64Index([7, 8], dtype='int64')
# remove them from the dataframe, assign them to a separate object
>>> nan_rows = df.pop(df[df[5].isnull()].index)
However, this does not appear to be supported. Instead, I seem to be forced to do it in two steps, which feels a bit inelegant.
# get the NaN rows
>>> nan_rows = df[df[5].isnull()]
>>> nan_rows
          0         1         2         3         4         5
7 -0.092440 -0.243593  0.468463 -1.689485  0.667804       NaN
8 -0.110819 -0.627777 -0.302116  0.630068  2.567923       NaN
# remove from original df
>>> df = df.drop(nan_rows.index)
>>> df
          0         1         2         3         4         5
0  0.772762 -0.442657  1.245988  1.102018 -0.740836  1.685598
1 -0.387922       NaN -1.215723 -0.106875  0.499110  0.338759
2  0.567631       NaN -0.353032 -0.099011 -0.698925 -1.348966
3  1.320849  1.084405 -1.296177  0.681111 -1.941855 -0.950346
4 -0.026818 -1.933629 -0.693964  1.116673  0.392217  1.280808
5 -1.249192 -0.035932 -1.330916       NaN -0.135720 -0.506016
6  0.406344  1.416579  0.122019  0.648851 -0.305359 -1.253580
9  1.884069 -0.393420 -0.950275  0.151182 -1.122764  0.502117
Is there a built-in one-step method? Or is this the way you are supposed to do it?
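Looking at the source code of pop: if item is not a simple column name, then the del it performs will definitely not work. Pass a simple column name, or do it in two steps.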
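Both options from the answer can be put side by side in a minimal sketch. Note that `pop_rows` is a hypothetical helper written for this illustration, not part of pandas; it simply wraps the two steps from the question and assumes the index labels are unique. The exact exception raised by the failing `pop` call may differ between pandas versions, so the sketch catches both likely types.

```python
import numpy as np
import pandas as pd


def pop_rows(df, mask):
    """Hypothetical helper (not a pandas API).

    Removes the rows selected by a boolean mask from df in place and
    returns them as a separate DataFrame. Assumes unique index labels.
    """
    popped = df[mask]                    # step 1: copy out the selected rows
    df.drop(popped.index, inplace=True)  # step 2: drop them from the original
    return popped


df = pd.DataFrame(np.random.randn(10, 6))
df.iloc[7:9, 5] = np.nan

# pop works with a single column label: the column is returned and removed.
col_4 = df.pop(4)

# Passing row labels fails, because pop looks the labels up among the columns.
try:
    df.pop(df[df[5].isnull()].index)
except (KeyError, TypeError) as exc:
    print("pop failed:", type(exc).__name__)

# The two-step approach, wrapped in a single call.
nan_rows = pop_rows(df, df[5].isnull())
```

After this runs, `nan_rows` holds rows 7 and 8, and `df` no longer contains them. The helper mutates `df` in place to mimic what `pop` does for columns; returning a new frame with `df.drop(popped.index)` instead would be the non-mutating variant.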