Python regex engine: "look-behind requires fixed-width pattern" error

Edit:

Based on Tim's answer, I got the following output for a multi-line string:

>>> str = """ "It "does "not "make "sense", Well, "Does "it"
... "It "does "not "make "sense", Well, "Does "it"
... "It "does "not "make "sense", Well, "Does "it"
... "It "does "not "make "sense", Well, "Does "it" """
>>> re.sub(r'\b\s*"(?!,|$)', '" "', str)
' "It" "does" "not" "make" "sense", Well, "Does" "it" "\n"It" "does" "not" "make" "sense", Well, "Does" "it" "\n"It" "does" "not" "make" "sense", Well, "Does" "it" "\n"It" "does" "not" "make" "sense", Well, "Does" "it" " '

At the end of each line, two double quotes end up next to "it".

So I made a small change to the regex to handle newlines.

re.sub(r'\b\s*"(?!,|$)', '" "', str,flags=re.MULTILINE)

But this gave the following output:

>>> re.sub(r'\b\s*"(?!,|$)', '" "', str,flags=re.MULTILINE)
' "It" "does" "not" "make" "sense", Well, "Does" "it"\n... "It" "does" "not" "make" "sense", Well, "Does" "it"\n... "It" "does" "not" "make" "sense", Well, "Does" "it"\n... "It" "does" "not" "make" "sense", Well, "Does" "it" " '

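Now only the last "it" has two double quotes.

But I would like to know why the '$' end-of-line anchor does not recognize the end of that last line.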


The final answer is:

re.sub(r'\b\s*"(?!,|[ \t]*$)', '" "', str,flags=re.MULTILINE)
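The difference between the three patterns above can be reproduced on a shorter input. This is only a sketch: the two-line `text` below is a made-up stand-in for the asker's string, with a trailing space on the last line because the triple-quoted literal ends in `"it" """`. Without `re.MULTILINE`, `$` matches only at the very end of the string (or just before a final newline). With it, `$` also matches before every newline, but the trailing space still keeps the last quote from being directly at a line end, which is what `[ \t]*$` allows for.

```python
import re

# Hypothetical shortened input: two lines, the last one ending in a space.
text = '"Does "it"\n"Does "it" '

# 1) No flag: "$" only matches at the end of the whole string,
#    so the quote before "\n" is still rewritten.
plain = re.sub(r'\b\s*"(?!,|$)', '" "', text)

# 2) MULTILINE: "$" also matches before "\n", so the first line is fine,
#    but the last quote is followed by a space, not by the line end.
multi = re.sub(r'\b\s*"(?!,|$)', '" "', text, flags=re.MULTILINE)

# 3) MULTILINE plus optional trailing blanks before "$".
final = re.sub(r'\b\s*"(?!,|[ \t]*$)', '" "', text, flags=re.MULTILINE)

print(repr(plain))  # '"Does" "it" "\n"Does" "it" " '
print(repr(multi))  # '"Does" "it"\n"Does" "it" " '
print(repr(final))  # '"Does" "it"\n"Does" "it" '
```

So `$` does recognize line ends under `re.MULTILINE`; in this sketch the last line is the odd one out only because a space sits between the closing quote and the end of the string.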

1 answer

User

#1 · Posted 2024-05-29 11:46:42

Python look-behind assertions must have a fixed width, but you can try the following instead:

>>> import re
>>> s = '"It "does "not "make "sense", Well, "Does "it"'
>>> re.sub(r'\b\s*"(?!,|$)', '" "', s)
'"It" "does" "not" "make" "sense", Well, "Does" "it"'

Explanation:

\b      # Start the match at the end of a "word"
\s*     # Match optional whitespace
"       # Match a quote
(?!,|$) # unless it's followed by a comma or end of string
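For context, here is a minimal sketch of the error named in the title next to the workaround. The look-behind pattern below is hypothetical (the asker's original pattern is not shown here); it is just one example of a look-behind whose width is not fixed, because `\s*` can match any number of characters. The pattern from the answer avoids the problem by matching `\b\s*` as ordinary pattern parts and putting the condition into a lookahead, which Python does not restrict to a fixed width.

```python
import re

# Hypothetical variable-width look-behind, used only to trigger the error.
try:
    re.compile(r'(?<=\w\s*)"')
    error_message = None
except re.error as exc:
    # CPython rejects the pattern at compile time.
    error_message = str(exc)

print(error_message)

# The answer's pattern: no look-behind, only a negative lookahead.
s = '"It "does "not "make "sense", Well, "Does "it"'
fixed = re.sub(r'\b\s*"(?!,|$)', '" "', s)
print(fixed)  # "It" "does" "not" "make" "sense", Well, "Does" "it"
```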

Edit:
