Pandas: select all columns without NaN

User

Reply #1 · edited 2024-04-19 09:22:21

You can rebuild the DataFrame from the columns that are not entirely NaN:

df = df[df.columns[~df.isnull().all()]]

Or drop the all-NaN columns in place:

null_cols = df.columns[df.isnull().all()]
df.drop(null_cols, axis = 1, inplace = True)

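If you want to drop columns based on a given percentage of NaN, for example treating columns in which more than 90% of the values are missing as empty: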

cols_to_delete = df.columns[df.isnull().sum()/len(df) > .90]
df.drop(cols_to_delete, axis = 1, inplace = True)
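Note that the snippets above keep columns that are not entirely NaN, while the question title asks for columns with no NaN at all. A minimal sketch of the difference, using a made-up sample frame (the data and variable names here are illustrative assumptions, not from the thread):

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, not from the original thread
df = pd.DataFrame({
    "a": [1.0, 2.0, 3.0],
    "b": [np.nan, 5.0, 6.0],
    "c": [np.nan, np.nan, np.nan],
})

# Columns that are not entirely NaN (same rule as ~df.isnull().all())
not_all_nan = df.loc[:, ~df.isnull().all()]

# Columns that contain no NaN at all
no_nan = df.loc[:, df.notnull().all()]

# Fraction of NaN per column; the mean of a boolean frame
# is the same as df.isnull().sum() / len(df)
nan_fraction = df.isnull().mean()
```

Here `not_all_nan` keeps `a` and `b`, whereas `no_nan` keeps only `a`. Both return a new DataFrame and leave `df` unchanged, unlike `drop(..., inplace=True)`.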

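User

Reply #2 · edited 2024-04-19 09:22:21

You should try df_notnull = df.dropna(how='all'). This keeps only the rows that are not entirely NaN; pass axis=1 to apply the same rule to columns instead of rows.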

https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.dropna.html
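As a rough illustration of how `dropna` behaves along each axis, here is a small sketch on a made-up frame (the sample data and variable names are assumptions for illustration only):

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: the last row and column "c" are entirely NaN
df = pd.DataFrame({
    "a": [1.0, 2.0, np.nan],
    "b": [np.nan, 5.0, np.nan],
    "c": [np.nan, np.nan, np.nan],
})

# Default axis=0: drop rows in which every value is NaN
rows_kept = df.dropna(how="all")

# axis=1: drop columns in which every value is NaN
cols_not_all_nan = df.dropna(axis=1, how="all")

# axis=1 with how="any": drop columns containing at least one NaN,
# applied here after the all-NaN row has been removed
cols_no_nan = rows_kept.dropna(axis=1, how="any")
```

With this data, `rows_kept` retains rows 0 and 1, `cols_not_all_nan` retains `a` and `b`, and `cols_no_nan` retains only `a`.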

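User

Reply #3 · edited 2024-04-19 09:22:21

I think you want to get all the columns that do not contain any NaN. If so, you can first use ~col.isnull().any() to find the names of the columns without any NaN, and then select those columns.

I can think of the following code: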

import numpy as np
import pandas as pd

# pd.np was removed in pandas 2.0, so NaN comes from numpy directly
df = pd.DataFrame({
    'col1': [23, 54, np.nan, 87],
    'col2': [45, 39, 45, 32],
    'col3': [np.nan, np.nan, 76, np.nan],
})

# Returns True if the column holds more than `threshold` null values
# (with the default threshold=0: True if it holds any null value)
def has_nan(col, threshold=0):
    return col.isnull().sum() > threshold

# Then you apply the "complement" of function to get the column with
# no NaN.

df.loc[:, ~df.apply(has_nan)]

# ... or pass the threshold as parameter, if needed
df.loc[:, ~df.apply(has_nan, args=(2,))]
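The same selection can also be written without `apply`, by counting NaN per column in one vectorized call. This is only a sketch: `select_columns` is a hypothetical helper name introduced here, not something from the thread or from pandas.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "col1": [23, 54, np.nan, 87],
    "col2": [45, 39, 45, 32],
    "col3": [np.nan, np.nan, 76, np.nan],
})

def select_columns(frame, threshold=0):
    """Hypothetical helper: keep columns with at most `threshold` NaN values."""
    nan_counts = frame.isnull().sum()  # number of NaN per column
    # `nan_counts <= threshold` is the complement of `has_nan` above
    return frame.loc[:, nan_counts <= threshold]

no_nan = select_columns(df)                  # columns with zero NaN
up_to_two = select_columns(df, threshold=2)  # columns with at most two NaN
```

With the sample frame, `no_nan` contains only `col2`, and `up_to_two` contains `col1` and `col2`, since `col3` has three NaN values.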
