Regex to remove parentheses that wrap an entire entry

2 votes

4 answers

790 views

Asked 2025-04-18 03:06

I have a tab-delimited text file.

1    (hi7 there)    my
2    (hi7)there    he3

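I want to remove parentheses only when they wrap an entire entry (I'm not sure "entry" is the right word, but that's what I mean).

So the output should look like this: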

1    hi7 there    my
2    (hi7)there    he3

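I know I can easily find ways to remove all parentheses, but I can't find a way to remove them only when they wrap an entire entry.

Can I do this simply with Notepad++ or Python? Which is faster?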

regex text-processing string-manipulation data-cleaning notepad++ tab-delimited

4 Answers

If the entries really are separated by tabs, you can replace

\t\(([^\t]*)\)\t

\t           # a tab
\(           # an opening parenthesis
(            # open the capturing group
    [^\t]*   # anything but a tab
)
\)
\t

with

\t\1\t

The idea is to capture the text inside the relevant parentheses and reuse it in the replacement through the backreference \1.

See this demo.
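For readers working in Python rather than Notepad++, the same substitution can be sketched with re.sub. This is only an illustration under the assumption that fields are separated by real tab characters; the names PATTERN and strip_middle_parens are made up for this sketch. Because the pattern requires a tab on both sides, it only touches fields that are neither first nor last on the line.

```python
import re

# Pattern from the answer: a parenthesized field with a tab on each side.
PATTERN = re.compile(r'\t\(([^\t]*)\)\t')

def strip_middle_parens(line):
    # \1 refers to the text captured between the parentheses.
    return PATTERN.sub(r'\t\1\t', line)

# The two sample rows from the question, with explicit tabs.
print(repr(strip_middle_parens("1\t(hi7 there)\tmy")))
print(repr(strip_middle_parens("2\t(hi7)there\the3")))
```

The first row loses its parentheses; the second is left alone because ")" is not immediately followed by a tab.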

Answered 2025-04-18 by Python大师


Python regular expressions understand the tab escape \t, so you can match like this:

>>> import re
>>> re.match(r'^\([^\t]+\)\t.*$', '(hi7 there)\tmy')
<_sre.SRE_Match object at 0x02573950>
>>> re.match(r'^\([^\t]+\)\t.*$', '(hi7)there\tmy')
>>>

Once you know how to match your string, removing the parentheses from a line that matches is straightforward.
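One possible way to do that removal, as a rough sketch: add capturing groups to the pattern above and rebuild the line from them. The names FIRST_FIELD and strip_first_field are hypothetical, and the sketch assumes each line has already had its trailing newline removed. Since the pattern is anchored at ^, it only handles a wrapped first field.

```python
import re

# Same pattern as above, with groups for the inner text and the rest of the line.
FIRST_FIELD = re.compile(r'^\(([^\t]+)\)(\t.*)$')

def strip_first_field(line):
    """Drop the parentheses if they wrap the whole first field."""
    m = FIRST_FIELD.match(line)
    if m:
        # group(1) is the text inside the parentheses, group(2) the remainder.
        return m.group(1) + m.group(2)
    return line

print(repr(strip_first_field('(hi7 there)\tmy')))
print(repr(strip_first_field('(hi7)there\tmy')))
```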

Answered 2025-04-18 by Python大师


I think this should work:

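with open("file.txt") as f:
    for line in f:
        # Drop only the newline so empty leading/trailing fields survive.
        fields = line.rstrip("\n").split("\t")
        cleaned = []
        for field in fields:
            # Strip the parentheses only when they wrap the whole field.
            if len(field) >= 2 and field.startswith("(") and field.endswith(")"):
                cleaned.append(field[1:-1])
            else:
                cleaned.append(field)
        print("\t".join(cleaned))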

To overwrite the file in place:

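import fileinput

# With inplace=True, everything printed is written back into file.txt.
for line in fileinput.input("file.txt", inplace=True):
    fields = line.rstrip("\n").split("\t")
    cleaned = []
    for field in fields:
        # Strip the parentheses only when they wrap the whole field.
        if len(field) >= 2 and field.startswith("(") and field.endswith(")"):
            cleaned.append(field[1:-1])
        else:
            cleaned.append(field)
    print("\t".join(cleaned))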

Answered 2025-04-18 by Python大师


This expression seems to handle all cases correctly:

(?m)     # multiline mode
(^|\t)   # start of line or field
\(       # (
   ([^\t]+?) # anything but a tab
\)       # )
(?=      # followed by...
   $|\t  # end of line or field
)

Replace with \1\2.

For example:

import re

rx = r'(?m)(^|\t)\(([^\t]+?)\)(?=$|\t)'

# Fields are separated by real tab characters, written here as \t.
txt = (
    "1\t(hi7 (the)re)\t(my)\n"
    "2\t(hi7)there\the3\n"
    "(22)\t(hi7)there\the3\n"
    "(22)\t(hi7there)\t(he3)\n"
)

print(re.sub(rx, r'\1\2', txt))

Result:

1   hi7 (the)re my
2   (hi7)there  he3
22  (hi7)there  he3
22  hi7there    he3
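Applied to a whole file, the idea could look roughly like the sketch below. The names WRAPPED_FIELD, strip_wrapping_parens and clean_file, as well as the UTF-8 encoding, are assumptions made for illustration. One small deviation from the expression above: the character class is [^\t\n] instead of [^\t], so a match can never run across a line break.

```python
import re
from pathlib import Path

# Start of line or a tab, then "(", the field text, ")" and finally a
# lookahead for end of line or the next tab. [^\t\n] keeps a match on one line.
WRAPPED_FIELD = re.compile(r'(^|\t)\(([^\t\n]+?)\)(?=$|\t)', re.MULTILINE)

def strip_wrapping_parens(text):
    """Remove parentheses that enclose an entire tab-separated field."""
    return WRAPPED_FIELD.sub(r'\1\2', text)

def clean_file(src, dst):
    """Read src, strip the wrapping parentheses, write the result to dst."""
    text = Path(src).read_text(encoding="utf-8")
    Path(dst).write_text(strip_wrapping_parens(text), encoding="utf-8")

# The two rows from the question, with explicit tabs.
sample = "1\t(hi7 there)\tmy\n2\t(hi7)there\the3\n"
print(strip_wrapping_parens(sample), end="")
```

Calling clean_file("file.txt", "file_clean.txt") would write the cleaned copy next to the original; both paths are placeholders.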

Answered 2025-04-18 by Python大师

