Searching a Python pandas DataFrame column for a matching string pattern

2 answers

User

#1 · edited 2024-06-01 01:25:40

Possibly something along these lines:

    # df is your DataFrame; the boolean mask from str.contains selects the matching rows
    df[df['columnName'].str.contains(r'regex_pattern')]

User

#2 · edited 2024-06-01 01:25:40

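I think you can add \ to the regex to escape the |, because a | without a preceding \ is interpreted as OR. From the Python re documentation: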

'|'
A|B, where A and B can be arbitrary REs, creates a regular expression that will match either A or B. An arbitrary number of REs can be separated by the '|' in this way. This can be used inside groups (see below) as well. As the target string is scanned, REs separated by '|' are tried from left to right. When one pattern completely matches, that branch is accepted. This means that once A matches, B will not be tested further, even if it would produce a longer overall match. In other words, the '|' operator is never greedy. To match a literal '|', use \|, or enclose it inside a character class, as in [|].
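
To make the quoted rule concrete, here is a minimal sketch using only the standard `re` module. The sample strings are made up for illustration. It shows that the unescaped pattern `|IC|` is read as three alternatives (empty, `IC`, empty), so it matches any string, while the escaped and character-class forms match only the literal text.

```python
import re

# Unescaped: '|' is alternation, so this means "empty OR IC OR empty".
# The empty first branch matches at position 0 of any string.
unescaped = re.search('|IC|', '|DRAMA|')

# Escaped: only the literal text "|IC|" matches.
escaped_miss = re.search(r'\|IC\|', '|DRAMA|')
escaped_hit = re.search(r'\|IC\|', '|ACTION|DRAMA|IC|')

# A character class is an equivalent way to write a literal '|'.
char_class_hit = re.search(r'[|]IC[|]', '|ACTION|DRAMA|IC|')

print(repr(unescaped.group()))   # empty match, not a real hit
print(escaped_miss)              # None
print(escaped_hit.group())
print(char_class_hit.group())
```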

print(df['genre'].str.contains(r'\|IC\|'))
0     True
1    False
2    False
3    False
4     True
5     True
Name: genre, dtype: bool

print(df[df['genre'].str.contains(r'\|IC\|')])
    name                        genre
0  satya            |ACTION|DRAMA|IC|
4    def  |DISCOVERY|SPORT|COMEDY|IC|
5    ghj                         |IC|
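
If the search text comes from a variable, hand-escaping is easy to get wrong. The sketch below shows two alternatives that should behave the same way: `re.escape` to build the escaped pattern, or `regex=False` to make `str.contains` do a plain substring test. The DataFrame here is hypothetical sample data, not the original poster's, and the name `needle` is an assumption made for illustration.

```python
import re
import pandas as pd

# Hypothetical sample data; not the DataFrame from the question.
df = pd.DataFrame({
    'name': ['a', 'b', 'c'],
    'genre': ['|ACTION|DRAMA|IC|', '|COMEDY|', '|IC|'],
})

needle = '|IC|'  # literal text to look for

# Option 1: escape the literal so '|' is not treated as alternation.
mask_escaped = df['genre'].str.contains(re.escape(needle))

# Option 2: bypass the regex engine and match the substring as-is.
mask_literal = df['genre'].str.contains(needle, regex=False)

# Either mask can be used to select the matching rows.
matches = df[mask_literal]
print(matches)
```

If the column may contain missing values, passing `na=False` to `str.contains` keeps the mask strictly boolean so it can be used for row selection.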
