How to use Python regular expressions to count occurrences of a word that is immediately followed by a special character

User

Reply 1 · edited 2024-04-20 16:10:15

people[?.!]

This lets you match only people?, people., and/or people!

So if you wrap it in Counter(re.findall(...)), you can do this:

import re
from collections import Counter

# Matches "people" followed by a whitespace character
count[j] = Counter(re.findall(r'people\s', text))

# Matches only "people?"
count[j] = Counter(re.findall(r'people\?', text))

# Matches only "people."
count[j] = Counter(re.findall(r'people\.', text))

# Matches only "people!"
count[j] = Counter(re.findall(r'people\!', text))

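You need to use \ to escape special characters such as ? and . (the ! has no special meaning, so escaping it is optional).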
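The four separate calls above can probably be folded into one pass. The following is only a minimal sketch: the sample text is made up for illustration, and `suffix` is a hypothetical variable showing how `re.escape` can do the escaping when the character is not known in advance.

```python
import re
from collections import Counter

# Hypothetical sample text, not taken from the original question.
text = "Some people? Yes, people. Many people! And people again."

# One pass: capture the word together with the punctuation mark that follows it.
counts = Counter(re.findall(r"\bpeople[?.!]", text))

# re.escape adds the backslash for you when the suffix comes from a variable.
suffix = "?"
question_count = len(re.findall(r"\bpeople" + re.escape(suffix), text))

print(counts)
print(question_count)
```

Each key of `counts` is one matched form, so `counts["people?"]` gives the number of times the word was followed by a question mark.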

Also, when you are experimenting with Python regular expressions, this is a good resource: https://pythex.org/ The site also has a regular expression cheat sheet.

User

Reply 2 · edited 2024-04-20 16:10:15

You can put a quantifier at the end of the "people" part of the regex pattern. Try the following:

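import re
from collections import Counter

for j in range(len(paragraphs)):
    text = paragraphs[j].text
    # The r prefix goes outside the quotes so that \b means "word boundary"
    count[j] = Counter(re.findall(r'\bpeople[.?!]?\b', text))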

The ? is a quantifier meaning zero or one occurrence. The pattern above seems to work on regex101.com, but I have not tried it in a Python shell.
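One thing worth checking in a Python shell: when a pattern ends in \b right after the optional punctuation class, the punctuation is usually dropped from the match, because there is no word boundary between a punctuation mark and a following space or the end of the string. A possible workaround, sketched below under the assumption that `paragraphs` is a plain list of strings (a hypothetical stand-in for the objects with a `.text` attribute in the loop above), is to move the boundary in front of the class.

```python
import re
from collections import Counter

# Hypothetical stand-in for the paragraph texts used in the loop above.
paragraphs = ["Do people care? Some people do. Most people!", "people people."]

# Boundary after the class: the engine backtracks and keeps only the bare word.
trailing_boundary = re.compile(r"\bpeople[.?!]?\b")

# Boundary before the class: the word must end here, then punctuation is optional.
boundary_first = re.compile(r"\bpeople\b[.?!]?")

count = {}
for j, text in enumerate(paragraphs):
    count[j] = Counter(boundary_first.findall(text))

print(trailing_boundary.findall("Most people!"))
print(count)
```

With the boundary placed first, a longer word such as "peoples" is still rejected, while "people!" is kept together with its punctuation.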

User

Reply 3 · edited 2024-04-20 16:10:15

You can use an optional character set in the regular expression:

r'\bpeople[.,!?]?\b'

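The ? specifies that the set may occur 0 or 1 times, and [] specifies the allowed characters. There is no need to escape . (or, for example, ( ) * + ?) inside [], even though they have a special meaning elsewhere in a regex. If you want to use - inside [], you need to escape it (or place it first or last), because it denotes a range within the set: [1-5] is the same as [12345].

See: https://docs.python.org/3/library/re.html#regular-expression-syntax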

[] Used to indicate a set of characters. In a set:
Characters can be listed individually, e.g. [amk] will match 'a', 'm', or 'k'. Ranges of characters can be indicated by giving two characters and separating them by a '-', for example [a-z] will match any lowercase ASCII letter, [0-5][0-9] will match all the two-digits numbers from 00 to 59, and [0-9A-Fa-f] will match any hexadecimal digit. [...]
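The character-set rules described above can be checked with a few short calls. This is just an illustrative sketch; the input strings are made up for the purpose.

```python
import re

# Inside [], "." is a literal dot and does not match arbitrary characters.
dots = re.findall(r"[.]", "a.b")

# An unescaped "-" between two characters forms a range.
in_range = re.findall(r"[1-5]", "0123456789")

# Escaping "-" turns it into a literal hyphen instead of a range operator.
literal = re.findall(r"[1\-5]", "1-5 and 3")

print(dots)
print(in_range)
print(literal)
```

The last call does not pick up the "3", since the escaped hyphen no longer creates a range from 1 to 5.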
