Python regex engine: "look-behind requires fixed-width pattern" error

Posted 2024-05-29 11:46:42


I am trying to handle unmatched double quotes within a string in CSV format.

To be precise,

"It "does "not "make "sense", Well, "Does "it"

should be corrected to

"It" "does" "not" "make" "sense", Well, "Does" "it"

So basically what I am trying to do is

replace all the ' " '

  1. Not preceded by a beginning of line or a comma (and)
  2. Not followed by a comma or an end of line

with ' " " '

For this I am using the following regex:

(?<!^|,)"(?!,|$)

The problem is that while Ruby regex engines (http://www.rubular.com/) can parse this regex, Python regex engines (https://pythex.org/, http://www.pyregex.com/) throw the following error:

Invalid regular expression: look-behind requires fixed-width pattern

With Python 2.7.3 it raises

sre_constants.error: look-behind requires fixed-width pattern

Can anyone tell me what bothers Python here?
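
For reference, the error can be reproduced in a few lines. The likely cause is that the two alternatives inside the lookbehind have different widths: `^` consumes zero characters and `,` consumes one, and Python's `re` module wants a single fixed width. The split pattern below (two separate lookbehinds) is only an illustrative assumption to show a form that compiles; it is not the regex from the answer, and because it does not consume the preceding space it would not by itself produce the corrected string above.

```python
import re

sample = '"It "does "not "make "sense", Well, "Does "it"'

# "^" is zero characters wide and "," is one character wide,
# so the alternation inside the lookbehind has no single fixed width.
try:
    re.compile(r'(?<!^|,)"(?!,|$)')
    compiled_ok = True
except re.error as exc:
    compiled_ok = False
    print("rejected:", exc)

# Assumed workaround: one lookbehind per alternative, each of fixed width.
split = re.compile(r'(?<!^)(?<!,)"(?!,|$)')
positions = [m.start() for m in split.finditer(sample)]
print(positions)  # indices of the quotes selected by the two rules
```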


Edit:

Following Tim's response, I got the following output for a multi-line string:

>>> import re
>>> str = """ "It "does "not "make "sense", Well, "Does "it"
... "It "does "not "make "sense", Well, "Does "it"
... "It "does "not "make "sense", Well, "Does "it"
... "It "does "not "make "sense", Well, "Does "it" """
>>> re.sub(r'\b\s*"(?!,|$)', '" "', str)
' "It" "does" "not" "make" "sense", Well, "Does" "it" "\n"It" "does" "not" "make" "sense", Well, "Does" "it" "\n"It" "does" "not" "make" "sense", Well, "Does" "it" "\n"It" "does" "not" "make" "sense", Well, "Does" "it" " '

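At the end of each line, an extra double quote was added next to "it", leaving two double quotes there.

So I made a very small change to the call to handle newlines.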

re.sub(r'\b\s*"(?!,|$)', '" "', str,flags=re.MULTILINE)

But this gives the output

>>> re.sub(r'\b\s*"(?!,|$)', '" "', str,flags=re.MULTILINE)
' "It" "does" "not" "make" "sense", Well, "Does" "it"\n... "It" "does" "not" "make" "sense", Well, "Does" "it"\n... "It" "does" "not" "make" "sense", Well, "Does" "it"\n... "It" "does" "not" "make" "sense", Well, "Does" "it" " '

Now only the last "it" has two double quotes.

But I wonder why the '$' end-of-line anchor does not recognize that the line has ended.


The final answer is

re.sub(r'\b\s*"(?!,|[ \t]*$)', '" "', str,flags=re.MULTILINE)
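
A small sketch of what seems to be going on. Without `re.MULTILINE`, `$` matches only at the very end of the string (or just before a final newline), so the closing quote on every inner line is treated as unmatched. With `re.MULTILINE`, `$` does match at each line end, but the last line of the triple-quoted string has a space between the final `"it"` and the end, so `$` is not directly after that quote. Allowing `[ \t]*` before `$` covers that case. The two-line `text` below is a shortened stand-in of my own, not the original data.

```python
import re

# Shortened stand-in for the multi-line input; note the trailing space at the end.
text = '"It "does "it"\n"It "does "it" '

# 1) No flags: "$" only matches at the end of the whole string.
plain = re.sub(r'\b\s*"(?!,|$)', '" "', text)

# 2) MULTILINE: "$" matches at each line end, but not before the trailing space.
multi = re.sub(r'\b\s*"(?!,|$)', '" "', text, flags=re.MULTILINE)

# 3) MULTILINE plus optional spaces/tabs before the line end.
final = re.sub(r'\b\s*"(?!,|[ \t]*$)', '" "', text, flags=re.MULTILINE)

print(repr(plain))  # '"It" "does" "it" "\n"It" "does" "it" " '
print(repr(multi))  # '"It" "does" "it"\n"It" "does" "it" " '
print(repr(final))  # '"It" "does" "it"\n"It" "does" "it" '
```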

1 answer
User
Answer 1 · posted 2024-05-29 11:46:42

Python lookbehind assertions must be fixed-width, but you can try the following:

>>> import re
>>> s = '"It "does "not "make "sense", Well, "Does "it"'
>>> re.sub(r'\b\s*"(?!,|$)', '" "', s)
'"It" "does" "not" "make" "sense", Well, "Does" "it"'

Explanation:

\b      # Start the match at the end of a "word"
\s*     # Match optional whitespace
"       # Match a quote
(?!,|$) # unless it's followed by a comma or end of string
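
If it helps, the same breakdown can be kept inside the pattern itself with `re.VERBOSE`, which ignores whitespace in the pattern and treats `#` as a comment marker. This is just a sketch of that option; the name `unmatched_quote` is made up for illustration.

```python
import re

# The pattern from this answer, written in verbose mode with its explanation inline.
unmatched_quote = re.compile(r'''
    \b        # start the match at the end of a "word"
    \s*       # match optional whitespace
    "         # match a quote
    (?!,|$)   # unless it is followed by a comma or the end of the string
''', re.VERBOSE)

s = '"It "does "not "make "sense", Well, "Does "it"'
fixed = unmatched_quote.sub('" "', s)
print(fixed)  # "It" "does" "not" "make" "sense", Well, "Does" "it"
```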
