Using any() to identify the matching string from a list of strings?

import pandas as pd

data = [[123131, "Bear Cat Apple Dog"], ['123131', "Cat Ap.ple Mouse"], ['231321', "Ap ple Bear"], ['231321', "Mouse Ap ple Dog"]]
df = pd.DataFrame(data, columns = ['id', 'data'])

def matching_function(m):
    matching_strings = ['Apple', 'Ap.ple', 'Ap ple']
    if any(x in m for x in matching_strings):
        # do something to print the matched string
        return True

df["matched"] = df['data'].apply(matching_function)

| id     | data                | matched |
|--------|---------------------|---------|
| 123131 | Bear Cat Apple Dog  | TRUE    |
| 123131 | Cat Ap.ple Mouse    | TRUE    |
| 231321 | Ap ple Bear         | TRUE    |
| 231321 | Mouse Ap ple Dog    | FALSE   |

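2 answers

User

#1 · edited 2024-05-31 23:58:29

You can use the following pattern to check whether Cat or Bear appears before or after the word of interest, in this case Apple, Ap.ple, or Ap ple: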

^(?:Cat|Bear).*Ap[. ]*ple|Ap[. ]*ple.*(?:Cat|Bear)
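
To see how the two alternatives of this pattern behave, here is a minimal sketch that runs it over the four sample strings from the question using only the standard `re` module. It assumes matching from the start of the string (`re.match`), which is also how `Series.str.match` applies a pattern; the names `pattern`, `samples`, and `results` are only illustrative.

```python
import re

# Pattern from the answer: Cat/Bear at the start followed later by the key,
# or the key followed later by Cat/Bear.
pattern = re.compile(r'^(?:Cat|Bear).*Ap[. ]*ple|Ap[. ]*ple.*(?:Cat|Bear)')

samples = [
    "Bear Cat Apple Dog",
    "Cat Ap.ple Mouse",
    "Ap ple Bear",
    "Mouse Ap ple Dog",
]

# re.match only tries the pattern at the beginning of each string.
results = [bool(pattern.match(s)) for s in samples]
print(results)  # [True, True, True, False]
```

Because the match is anchored at the start, a string such as "Mouse Ap ple Bear" would not match either, even though Bear follows the key. That is worth keeping in mind if the real data has other words in front.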

To create a new dataframe column that satisfies the condition, you can combine map with df['data'].str.match:

>>> df['matched'] = list(map(lambda m: "True" if m else "False", df['data'].str.match('^(?:Cat|Bear).*Ap[. ]*ple|Ap[. ]*ple.*(?:Cat|Bear)')))

Or use numpy.where:

>>> import numpy
>>> df['matched'] = numpy.where(df['data'].str.match('^(?:Cat|Bear).*Ap[. ]*ple|Ap[. ]*ple.*(?:Cat|Bear)'),'True','False')

This results in:

>>> df
       id                data matched
0  123131  Bear Cat Apple Dog    True
1  123131    Cat Ap.ple Mouse    True
2  231321         Ap ple Bear    True
3  231321    Mouse Ap ple Dog   False
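
The flag alone does not say which spelling was found, which is what the comment in the question's function asks for. As a rough sketch that is not part of the original answer: `any()` reduces the generator to True/False, whereas `next()` with a default returns the first item that satisfies the test. The helper name `first_match` is hypothetical.

```python
matching_strings = ['Apple', 'Ap.ple', 'Ap ple']

def first_match(text, candidates=matching_strings):
    """Return the first candidate that occurs in text, or None if none does."""
    # any() would collapse this generator to a bool; next() keeps the item itself.
    return next((c for c in candidates if c in text), None)

print(first_match("Cat Ap.ple Mouse"))  # Ap.ple
print(first_match("Mouse Dog"))         # None
```

On the dataframe this could be applied as `df['data'].apply(first_match)`. It only reports the matched key and does not check for Cat or Bear around it.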

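User

#2 · edited 2024-05-31 23:58:29

Use str.extract to pull three new columns out of the df['data'] column, namely before, key, and after, then call str.findall on the before and after columns to find every marker word that appears before or after the key: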

import re

keys = ['Apple', 'Ap.ple', 'Ap ple']
markers = ['Cat', 'Bear']

p =  r'(?P<before>.*?)' + r'(?P<key>' +'|'.join(rf'\b{re.escape(k)}\b' for k in keys) + r')' + r'(?P<after>.*)'
m = '|'.join(markers)

df[['before', 'key', 'after']] = df['data'].str.extract(p)
df['before'] = df['before'].str.findall(m)
df['after'] = df['after'].str.findall(m)

df['matched'] = df['before'].str.len().gt(0) | df['after'].str.len().gt(0)

# print(df)

       id                data       before     key   after  matched
0  123131  Bear Cat Apple Dog  [Bear, Cat]   Apple      []     True
1  123131    Cat Ap.ple Mouse        [Cat]  Ap.ple      []     True
2  231321         Ap ple Bear           []  Ap ple  [Bear]     True
3  231321    Mouse Ap ple Dog           []  Ap ple      []    False
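
To make the role of the three named groups easier to follow, the same idea can be sketched for a single string with the standard `re` module, without pandas. This is an illustration under the assumption that `str.extract` takes the first match of the pattern, as `re.search` does; the helper `split_and_find` and the names `key_alt`, `key_pattern`, and `marker_pattern` are hypothetical.

```python
import re

keys = ['Apple', 'Ap.ple', 'Ap ple']
markers = ['Cat', 'Bear']

# Same pattern as above: lazy text before the key, the key itself, the rest.
key_alt = '|'.join(rf'\b{re.escape(k)}\b' for k in keys)
key_pattern = re.compile(rf'(?P<before>.*?)(?P<key>{key_alt})(?P<after>.*)')
marker_pattern = re.compile('|'.join(markers))

def split_and_find(text):
    """Split text around the first key and list the markers on each side."""
    hit = key_pattern.search(text)
    if hit is None:
        return None  # no key in this string
    return {
        'before': marker_pattern.findall(hit.group('before')),
        'key': hit.group('key'),
        'after': marker_pattern.findall(hit.group('after')),
    }

print(split_and_find("Bear Cat Apple Dog"))
# {'before': ['Bear', 'Cat'], 'key': 'Apple', 'after': []}
```

A row counts as matched when either list is non-empty, which mirrors the `df['matched']` line above.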
