Remove characters from the beginning and end, or only the end, of a line

8 votes

5 answers

22027 views

Asked 2025-04-16 06:40

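I want to use a regular expression to remove certain symbols from a string, for example:

== (appearing at both the beginning and the end of a line),

* (appearing only at the beginning of a line).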

def some_func():
    clean = re.sub(r'= {2,}', '', clean) #Removes 2 or more occurrences of = at the beg and at the end of a line.
    clean = re.sub(r'^\* {1,}', '', clean) #Removes 1 or more occurrences of * at the beginning of a line.

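What is wrong with my code? The expressions seem to be incorrect. If a character or symbol appears at the beginning or end of a line (once or several times), how do I remove it?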

regex, string processing, pattern matching, text cleaning, line start and end

5 Answers

So, rather than replacing, how about capturing the part you want to keep?

import re

tu = ('======constellation==', '==constant=====',
      '=flower===', '===bingo=',
      '***seashore***', '*winter*',
      '====***conditions=**', '=***trees====***',
      '***=information***=', '*=informative***==')

# Group 1 consumes a leading run of two or more '=' or a leading run of '*'.
# Group 3 keeps everything up to, but not including, a trailing run of
# two or more '='.
RE = r'((===*)|\**)?(([^=]|=(?!=+\Z))+)'
pat = re.compile(RE)

for ch in tu:
    print(ch, '  ', pat.match(ch).group(3))

Result:

======constellation==    constellation
==constant=====    constant
=flower===    =flower
===bingo=    bingo=
***seashore***    seashore***
*winter*    winter*
====***conditions=**    ***conditions=**
=***trees====***    =***trees====***
***=information***=    =information***=
*=informative***==    =informative***

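What exactly do you want?

Should ====***conditions=** give conditions=**?

Should ***====hundred====*** give hundred====***?

That is, what should happen at the beginning of the line?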

Answered 2025-04-16 by Python大师


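Your regular expressions contain stray spaces. Even a single space counts as a character to match, so `= {2,}` means one = followed by two or more spaces, not two or more = signs.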

r'^(?:\*|==)|==$'
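
To see why the space matters, here is a minimal sketch that compares the pattern from the question with the same pattern minus the space, and with the anchored pattern above. The sample string is made up for illustration and does not come from the question.

```python
import re

sample = "==heading=="  # hypothetical input

# '= {2,}' means: one '=' followed by two or more SPACES, so nothing matches.
with_space = re.sub(r'= {2,}', '', sample)

# '={2,}' means: two or more '=' characters in a row.
without_space = re.sub(r'={2,}', '', sample)

# The anchored pattern: one '*' or '==' at the start, or '==' at the end.
anchored = re.sub(r'^(?:\*|==)|==$', '', sample)

print(repr(with_space))     # input comes back unchanged
print(repr(without_space))
print(repr(anchored))
```

Note that the anchored pattern removes exactly one * or one == at the start; a quantifier is still needed if the symbol can repeat.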

Answered 2025-04-16 by Python大师


If you only want to remove characters from the beginning and end of a string, you can use the str.strip() method. The code would look something like this:

>>> s1 = '== foo bar =='
>>> s1.strip('=')
' foo bar '
>>> s2 = '* foo bar'
>>> s2.lstrip('*')
' foo bar'

strip removes the characters you specify from both ends of the string, lstrip removes them only from the beginning, and rstrip removes them only from the end.

If you really want to use regular expressions, they would look something like this:

clean = re.sub(r'(^={2,})|(={2,}$)', '', clean)
clean = re.sub(r'^\*+', '', clean)
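
If the text holds several lines and each line should be cleaned, ^ and $ only match at line boundaries when re.MULTILINE is set. The following is a minimal sketch under that assumption; the function name clean_lines and the sample text are hypothetical, not part of the question.

```python
import re

# re.MULTILINE makes ^ and $ match at every line boundary, not only at the
# start and end of the whole string.
EQUALS = re.compile(r'^={2,}|={2,}$', re.MULTILINE)
STARS = re.compile(r'^\*+', re.MULTILINE)

def clean_lines(text):
    """Remove runs of two or more '=' at line edges, then leading '*' runs."""
    text = EQUALS.sub('', text)
    return STARS.sub('', text)

# Hypothetical multi-line input.
sample = "==Title==\n* item one\n** item two\nplain = text"
print(clean_lines(sample))
```

The single = in the last line is left alone because it is neither at a line edge nor part of a run of two or more.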

That said, I think strip/lstrip/rstrip is the most appropriate choice for what you need.

Addendum: following Nick's suggestion, here is a solution that does everything in one line:

clean = clean.lstrip('*').strip('= ')

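A common misconception is that these methods remove the given characters in the order they are written. In fact, the argument is just a set of characters to remove, and their order is irrelevant. So .strip('= ') removes every '=' and ' ' from both ends, not just the literal string '= '.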
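
A small sketch of that behaviour, using a made-up string: the two strip calls give the same result regardless of argument order, while removing the literal prefix '= ' exactly once needs a different approach, here a startswith check plus slicing.

```python
s = "= = title = ="  # hypothetical input

# Argument order is irrelevant: both calls strip any mix of '=' and ' '
# from both ends.
a = s.strip('= ')
b = s.strip(' =')

# Removing the literal prefix "= " exactly once is a different operation.
prefix = "= "
c = s[len(prefix):] if s.startswith(prefix) else s

print(repr(a))
print(repr(b))
print(repr(c))
```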

Answered 2025-04-16 by Python大师
