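What regex returns single-word names from lines that contain the | (pipe) special character?

3 answers

User

#1 · edited 2024-06-09 08:49:15

A simple modification of your existing regex pattern is enough: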

>>> import re
>>> name = """
|| John Deere
|| Stephen king
|| Steve
|| Barack Hussein Obama
|| Donald Trump 
|| Alan
|| Stewart"""
>>> re.findall(r'\| ([^\s]*)(?:\n|$)', name)
['Steve', 'Alan', 'Stewart']

You can use re.findall to find all matches in the input string.
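
As a small illustration of how `re.findall` behaves here: when the pattern contains exactly one capturing group, it returns the captured text rather than the whole match. The shortened `text` sample below is made up for this sketch and is not the asker's input.

```python
import re

# Hypothetical, shortened sample (not the original input).
text = "|| John Deere\n|| Steve\n|| Alan"

# One capturing group: findall returns only what the group captured.
captured = re.findall(r'\| (\S*)(?:\n|$)', text)

# No capturing group: findall returns the full matched text instead.
whole = re.findall(r'\| \S*(?:\n|$)', text)

print(captured)
print(whole)
```

This is why the answer wraps the name in parentheses and uses `(?:...)` for the line ending: the non-capturing group keeps the newline out of the result.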

Edit: for the edited input, which contains | between the names, this works:

>>> name = """| John | Gilbert | alan
| Stephen | king | harris
| | Steve
| Barack | | Obama
|| Donald | | Trump 
| | Alan
| | Stewart"""
>>> re.findall(r'^[|\W]*([^\s]+)(?:\n|$)', name, re.MULTILINE)
['Steve', 'Alan', 'Stewart']

User

#2 · edited 2024-06-09 08:49:15

Use

(?m)^(?:\|[^\S\n]*)*(\S+)[^\S\n]*$

See proof.

Explanation

--------------------------------------------------------------------------------
  (?m)                     multiline mode (= re.M / re.MULTILINE)
--------------------------------------------------------------------------------
  ^                        the beginning of a line (multiline mode)
--------------------------------------------------------------------------------
  (?:                      group, but do not capture (0 or more times
                           (matching the most amount possible)):
--------------------------------------------------------------------------------
    \|                       '|'
--------------------------------------------------------------------------------
    [^\S\n]*                 any whitespace character except '\n'
                             (newline), e.g. a space or a tab
                             (0 or more times (matching the most
                             amount possible))
--------------------------------------------------------------------------------
  )*                       end of grouping
--------------------------------------------------------------------------------
  (                        group and capture to \1:
--------------------------------------------------------------------------------
    \S+                      non-whitespace (all but \n, \r, \t, \f,
                             and " ") (1 or more times (matching the
                             most amount possible))
--------------------------------------------------------------------------------
  )                        end of \1
--------------------------------------------------------------------------------
  [^\S\n]*                 any whitespace character except '\n'
                           (newline), e.g. a space or a tab
                           (0 or more times (matching the most
                           amount possible))
--------------------------------------------------------------------------------
  $                        the end of a line (before a '\n') or the
                           end of the string (multiline mode)
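
To see what `[^\S\n]` matches in practice, the following minimal sketch compares it with `\s` on a made-up string (the `sample` name and its contents are assumptions for illustration only). The double negation reads as "not non-whitespace and not a newline", which leaves horizontal whitespace such as spaces and tabs.

```python
import re

# Hypothetical sample: a space, a tab, then a newline between two letters.
sample = "a \t\nb"

# \s+ matches every whitespace character, including the newline.
any_ws = re.findall(r'\s+', sample)

# [^\S\n]+ matches whitespace but stops before the newline.
horiz_ws = re.findall(r'[^\S\n]+', sample)

print(any_ws)
print(horiz_ws)
```

Keeping the newline out of the class is what stops the pattern from running on into the next line.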

Python code:

import re
string = """| John | Gilbert | alan
| Stephen | king | harris
| | Steve
| Barack | | Obama
|| Donald | | Trump 
| | Alan
| | Stewart"""
pattern = r"^(?:\|[^\S\n]*)*(\S+)[^\S\n]*$"
print(re.findall(pattern, string, re.M))

Result: ['Steve', 'Alan', 'Stewart']
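
The same pattern can also be written with `re.VERBOSE` so that the explanation sits next to each piece. This is only a sketch of an equivalent spelling; the variable names and the shortened input are assumptions, not part of the original answer.

```python
import re

# Verbose spelling of the pattern; whitespace and comments are ignored.
verbose_pattern = re.compile(r"""
    ^                    # start of a line (re.MULTILINE)
    (?: \| [^\S\n]* )*   # any number of '|', each followed by spaces/tabs
    ( \S+ )              # the name: one run of non-whitespace
    [^\S\n]*             # optional trailing spaces/tabs
    $                    # end of the line
""", re.MULTILINE | re.VERBOSE)

# Hypothetical, shortened input built from a list so the trailing
# space after "Trump" stays visible.
rows = [
    "| John | Gilbert | alan",
    "| | Steve",
    "|| Donald | | Trump ",
    "| | Alan",
]
names = verbose_pattern.findall("\n".join(rows))
print(names)
```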

User

#3 · edited 2024-06-09 08:49:15

You can try re.findall with the pattern (?:(?<=\n)|(?<=^))\|\s*\|\s*(\S+)(?:\n|$), which only finds single-word names:

import re

inp = """| John | Gilbert | alan
| Stephen | king | harris
| | Steve
| Barack | | Obama
|| Donald | | Trump 
| | Alan
| | Stewart"""

single_names = re.findall(r'(?:(?<=\n)|(?<=^))\|\s*\|\s*(\S+)(?:\n|$)', inp)
print(single_names)

This prints:

['Steve', 'Alan', 'Stewart']
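
The alternation `(?:(?<=\n)|(?<=^))` anchors the match at the start of a line. As far as I can tell, `^` combined with `re.MULTILINE` expresses the same anchor more directly; the sketch below checks that both spellings agree on the sample input. The names `lookbehind_pat` and `multiline_pat` are made up for this comparison.

```python
import re

# Same sample rows as above, joined so the trailing space is explicit.
rows = [
    "| John | Gilbert | alan",
    "| Stephen | king | harris",
    "| | Steve",
    "| Barack | | Obama",
    "|| Donald | | Trump ",
    "| | Alan",
    "| | Stewart",
]
inp = "\n".join(rows)

# Original spelling: lookbehinds for "after a newline" or "at the start".
lookbehind_pat = r'(?:(?<=\n)|(?<=^))\|\s*\|\s*(\S+)(?:\n|$)'

# Assumed equivalent: ^ matches at every line start under re.MULTILINE.
multiline_pat = r'^\|\s*\|\s*(\S+)(?:\n|$)'

from_lookbehind = re.findall(lookbehind_pat, inp)
from_multiline = re.findall(multiline_pat, inp, re.MULTILINE)

print(from_lookbehind)
print(from_multiline)
```

Note that `\s*` can also match newlines, so either spelling may behave differently on input with blank lines between the pipes.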
