Deleting rows from a pandas DataFrame based on a conditional expression involving len(string) gives KeyError

User

#1 · edited 2024-04-19 03:40:24

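When you call len(df['column name']), you get a single number: the number of rows in the DataFrame (that is, the length of the column itself). To apply len to each element of the column, use df['column name'].map(len). So try: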

df[df['column name'].map(len) < 2]
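The difference between the two calls is easier to see on a tiny frame. The following is a minimal sketch; the sample values are made up for illustration, and the `.str.len()` line is an alternative not mentioned in the answer above.

```python
import pandas as pd

# Hypothetical sample data; "column name" mirrors the placeholder used above.
df = pd.DataFrame({"column name": ["a", "bb", "ccc", "d"]})

# len() on the column returns one integer: the number of rows.
n_rows = len(df["column name"])

# map(len) applies len to every element and returns a Series of lengths.
lengths = df["column name"].map(len)

# Comparing that Series with a scalar gives a boolean mask for row selection.
short_rows = df[lengths < 2]

# For string columns, the .str accessor computes the same lengths.
same_lengths = df["column name"].str.len()

print(n_rows)
print(short_rows)
```

Because the mask is a Series with one boolean per row, pandas selects rows with it instead of looking up a single key.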

User

#2 · edited 2024-04-19 03:40:24

You can assign the DataFrame to a filtered version of itself:

df = df[df.score > 50]

This is faster than drop:

%%timeit
test = pd.DataFrame({'x': np.random.randn(int(1e6))})
test = test[test.x < 0]
# 54.5 ms ± 2.02 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

%%timeit
test = pd.DataFrame({'x': np.random.randn(int(1e6))})
test.drop(test[test.x > 0].index, inplace=True)
# 201 ms ± 17.9 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

%%timeit
test = pd.DataFrame({'x': np.random.randn(int(1e6))})
test = test.drop(test[test.x > 0].index)
# 194 ms ± 7.03 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
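The cells above rebuild the DataFrame inside every timed run, so construction time is included in each figure. As a rough sketch for repeating the comparison outside IPython, the standard `timeit` module can time only the filtering step. The helper names `by_mask` and `by_drop`, the seed, and the frame size are assumptions for illustration; timings will vary by machine and pandas version.

```python
import timeit

import numpy as np
import pandas as pd

# Build the test frame once so only the filtering step is timed.
rng = np.random.default_rng(0)
base = pd.DataFrame({"x": rng.standard_normal(100_000)})


def by_mask(frame):
    # Keep the rows where x is negative using a boolean mask.
    return frame[frame.x < 0]


def by_drop(frame):
    # Drop the complementary rows by their index labels.
    return frame.drop(frame[frame.x >= 0].index)


# Both approaches should yield the same rows (the index here is unique).
results_match = by_mask(base).equals(by_drop(base))

mask_time = timeit.timeit(lambda: by_mask(base), number=10)
drop_time = timeit.timeit(lambda: by_drop(base), number=10)
print(f"mask: {mask_time:.4f}s  drop: {drop_time:.4f}s  same result: {results_match}")
```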

User

#3 · edited 2024-04-19 03:40:24

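To directly answer this question's original title, "How to delete rows from a pandas DataFrame based on a conditional expression" (which I understand is not necessarily the OP's problem, but may help other users who come across this question), one way to do this is to use the drop method: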

df = df.drop(some labels)

df = df.drop(df[<some boolean condition>].index)

Example

To remove all rows where the column 'score' is < 50:

df = df.drop(df[df.score < 50].index)

In-place version (as pointed out in the comments)

df.drop(df[df.score < 50].index, inplace=True)
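One caveat worth knowing: drop removes rows by index label, so it only matches a boolean filter when the index labels are unique. The sketch below uses a made-up frame with a duplicated label to show the difference; it also shows that the in-place call returns None.

```python
import pandas as pd

# Hypothetical data with a duplicated index label (0 appears twice).
df = pd.DataFrame({"score": [10, 90, 95]}, index=[0, 0, 1])

# drop() removes every row carrying a matching label, so the row with
# score 90 is removed too, because it shares label 0 with the score-10 row.
dropped = df.drop(df[df.score < 50].index)

# A boolean mask works per row and keeps the score-90 row.
filtered = df[df.score >= 50]

# With inplace=True the frame is modified and the call returns None.
work = df.copy()
returned = work.drop(work[work.score < 50].index, inplace=True)

print(dropped)
print(filtered)
```

If the index may contain duplicates, either filter with the mask directly or call reset_index(drop=True) before dropping.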

Multiple conditions

(see Boolean Indexing)

The operators are: | for or, & for and, and ~ for not. These must be grouped by using parentheses.

To remove all rows where the column 'score' is < 50 and > 20:

df = df.drop(df[(df.score < 50) & (df.score > 20)].index)
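As a small sketch of the combined condition, the example below builds the mask once, removes the matching rows both with drop and by negating the mask with ~, and shows what happens when the parentheses are left out. The sample scores are invented for illustration.

```python
import pandas as pd

# Hypothetical scores for illustration.
df = pd.DataFrame({"score": [10, 25, 40, 50, 75]})

# Each comparison is parenthesised because & binds tighter than < and >.
in_range = (df.score < 50) & (df.score > 20)

# Remove the matching rows via drop ...
via_drop = df.drop(df[in_range].index)

# ... or keep the complement of the mask using ~ (not).
via_mask = df[~in_range]

# Without parentheses Python parses this as
# df.score < (50 & df.score) > 20, a chained comparison that needs the
# truth value of a Series and therefore raises ValueError.
try:
    df[df.score < 50 & df.score > 20]
    missing_parens_failed = False
except ValueError:
    missing_parens_failed = True

print(via_mask)
```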
