Is there a way to exclude specific substrings when using str.contains?

2 Answers

User
#1 · edited 2024-04-26 06:23:49

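You can also add a condition that excludes specific values. An implementation looks like this, though it is somewhat costly, since two boolean masks are built and then combined: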
import pandas as pd

raw_data = {
    'name': ['Willard Morris', 'Al Jennings', 'Chris Cook'],
    'age': [20, 19, 18],
    'favorite_food': ['Cake', 'Pancake', 'Ice Cream'],
}
df = pd.DataFrame(raw_data)

# Keep rows whose favorite_food contains 'cake' (case-insensitive),
# then drop rows whose value is exactly 'Pancake'.
new_df = df[df['favorite_food'].str.contains('cake', na=False, case=False)
            & ~df['favorite_food'].isin(['Pancake'])]

print('raw-data df')
print(df)
print('\nfiltered df for cake')
print(new_df)
Its output will be:
raw-data df
             name  age favorite_food
0  Willard Morris   20          Cake
1     Al Jennings   19       Pancake
2      Chris Cook   18     Ice Cream

filtered df for cake
             name  age favorite_food
0  Willard Morris   20          Cake
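
Note that isin only drops values that are exactly 'Pancake'. If the excluded term can appear inside a longer string, one option is to negate a second str.contains mask instead. The following is a minimal sketch of that idea; the extra row ('Dana Lee', 'Blueberry pancakes') is made up for illustration and is not part of the original data.

```python
import pandas as pd

# Sample data; the last row is a hypothetical addition for illustration.
df = pd.DataFrame({
    'name': ['Willard Morris', 'Al Jennings', 'Chris Cook', 'Dana Lee'],
    'favorite_food': ['Cake', 'Pancake', 'Ice Cream', 'Blueberry pancakes'],
})

food = df['favorite_food']

# Rows that mention 'cake' anywhere, ignoring case; NaN counts as no match.
wanted = food.str.contains('cake', case=False, na=False)
# Rows that mention the excluded substring 'pancake', ignoring case.
excluded = food.str.contains('pancake', case=False, na=False)

# Keep rows that match 'cake' but do not match 'pancake'.
new_df = df[wanted & ~excluded]
print(new_df)
```

Here 'Blueberry pancakes' is dropped as well, which an exact isin(['Pancake']) check would not do.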

User
#2 · edited 2024-04-26 06:23:49

One thing I can think of is to replace that specific string with '' before searching:
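exclude_words = ['pancake', 'cakefake']
# Strip the excluded words first, then look for 'cake' in what remains.
# Note: this replace() is case-sensitive, while contains() below is not.
df[df['title'].replace(exclude_words, '', regex=True)
              .str.contains('cake', case=False)]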
This approach works better if you have a list of words to exclude (as shown above), because you don't need to account for where cake sits within the word.
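If the excluded words may differ in case from the data, the removal step can be made case-insensitive too. This is only a sketch under assumptions: it swaps in Series.str.replace with case=False, escapes the words with re.escape, and uses made-up sample titles.

```python
import re
import pandas as pd

# Hypothetical sample titles, including a capitalized excluded word.
df = pd.DataFrame({'title': ['cheesecake', 'Pancake', 'no cake', 'cakefake']})
exclude_words = ['pancake', 'cakefake']

# Build one alternation pattern; re.escape guards against regex metacharacters.
exclude_pattern = '|'.join(re.escape(word) for word in exclude_words)

# Remove every excluded word regardless of case, then search what is left.
cleaned = df['title'].str.replace(exclude_pattern, '', case=False, regex=True)
result = df[cleaned.str.contains('cake', case=False, na=False)]
print(result)
```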
Alternatively, if 'pancake' is the only word to exclude, a negative lookbehind gives simpler syntax:
df[df['title'].str.contains(r'(?<!pan)cake')]
Test data:
df = pd.DataFrame({'title': ['cheesecake', 'pancake', 'no cake']})
Output:
        title
0  cheesecake
2     no cake
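
The lookbehind idea can also be chained when a small number of fixed prefixes should be excluded. The sketch below is an assumption-laden illustration rather than part of the answer: the 'cup' prefix and the extra titles are invented, and case=False is added so that 'Pancake' is excluded as well.

```python
import pandas as pd

# Hypothetical test data extending the original three titles.
df = pd.DataFrame({
    'title': ['cheesecake', 'pancake', 'no cake', 'Pancake mix', 'cupcake'],
})

# Match 'cake' only when it is not directly preceded by 'pan' or 'cup'.
# Each lookbehind is fixed-width, as Python's re module requires.
pattern = r'(?<!pan)(?<!cup)cake'

result = df[df['title'].str.contains(pattern, case=False, na=False)]
print(result)
```

Because a lookbehind only inspects the text immediately before the match, this handles excluded prefixes; words such as 'cakefake', where the unwanted part follows 'cake', would need a different pattern.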
