Regex: How do I strip the keyword and the space in front of a domain or IP address?

from mail2.oknotify2.com (mail2.oknotify2.com. [208.83.243.70]) by mx.google.com with ESMTP id dp5si2596299pdb.170.2015.06.03.14.12.03
by 10.66.156.198 with SMTP id wg6mr62843415pab.126.1433365924352;
from localhost (localhost [127.0.0.1])

3 Answers

User
Answer #1 · edited 2024-04-18 06:49:48

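You can pass the re.MULTILINE flag to enable multiline mode, so that ^ matches at the start of every line rather than only at the start of the string. To get just the text you need, use a capturing group.
Unfortunately, Python's built-in re library supports neither \K nor variable-width look-behind. Variable-width look-behind is, however, available in the third-party regex module.
Here is sample code you can use: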
import re

# MULTILINE makes ^ match at the start of each line
p = re.compile(r'^(?:by|from) (\S+)', re.MULTILINE)
test_str = (
    "from mail2.oknotify2.com (mail2.oknotify2.com. [208.83.243.70]) by mx.google.com with ESMTP id dp5si2596299pdb.170.2015.06.03.14.12.03\n\n"
    "by 10.66.156.198 with SMTP id wg6mr62843415pab.126.1433365924352;\n\n"
    "from localhost (localhost [127.0.0.1])"
)
# group(1) is the host captured by (\S+); the keyword is not captured
print([m.group(1) for m in p.finditer(test_str)])

Output of the demo program (Python 3):
['mail2.oknotify2.com', '10.66.156.198', 'localhost']
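As a rough illustration of the look-behind limitation mentioned above, the sketch below (Python 3, standard library only) shows that re refuses a look-behind whose alternatives differ in width, and that splitting it into one fixed-width look-behind per keyword is one possible workaround. The names sample, variable_width_ok and hosts are made up for this example, and the behaviour described is that of current CPython releases as far as I know.

```python
import re

sample = (
    "from mail2.oknotify2.com (mail2.oknotify2.com. [208.83.243.70])\n"
    "by 10.66.156.198 with SMTP id wg6mr62843415pab.126.1433365924352;\n"
    "from localhost (localhost [127.0.0.1])"
)

# "by " is 3 characters and "from " is 5, so this look-behind is variable-width
try:
    re.compile(r'(?<=^(?:by|from) )\S+', re.MULTILINE)
    variable_width_ok = True
except re.error:
    variable_width_ok = False

# Workaround: one fixed-width look-behind per keyword, joined by alternation
pattern = re.compile(r'(?:(?<=^by )|(?<=^from ))\S+', re.MULTILINE)
# No capturing group here, so findall returns the whole matches (the hosts)
hosts = pattern.findall(sample)

print(variable_width_ok)
print(hosts)
```

With this approach the match itself is the host, so no group index is needed. The capturing-group version is usually easier to read, though.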

User
Answer #2 · edited 2024-04-18 06:49:48

You can use \K to discard everything matched up to that point, so that only the text after it is returned:
(?:X-)?Received: (?:by|from) \K([\S]+)
See the demo.
Edit:
As @James Newton pointed out, not every regex flavor supports \K. You can check this post to see whether your engine does:
https://stackoverflow.com/a/13543042/3393095
Edit 2:
Since you specified Python, simply use a capturing group in the regex together with re.findall, like this:
>>> import re
>>> text = ("Received: from mail2.oknotify2.com (mail2.oknotify2.com. [208.83.243.70]) by mx.google.com with ESMTP id dp5si2596299pdb.170.2015.06.03.14.12.03\n"
...         "Received: by 10.66.156.198 with SMTP id wg6mr62843415pab.126.1433365924352;\n"
...         "Received: from localhost (localhost [127.0.0.1])")
>>> re.findall(r'(?:X-)?Received: (?:by|from) ([\S]+)', text)
['mail2.oknotify2.com', '10.66.156.198', 'localhost']
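To make the difference between the two variants concrete, here is a small sketch (Python 3, standard library only). It assumes a made-up headers string in which one line uses the X-Received prefix, and checks that the built-in re module rejects \K at compile time while the capturing-group form returns the hosts. The names headers, supports_k and hosts are illustrative only.

```python
import re

# Hypothetical sample: the second line uses the optional "X-" prefix
headers = (
    "Received: from mail2.oknotify2.com (mail2.oknotify2.com. [208.83.243.70])\n"
    "X-Received: by 10.66.156.198 with SMTP id wg6mr62843415pab.126.1433365924352;\n"
    "Received: from localhost (localhost [127.0.0.1])"
)

# The built-in re module treats \K as an unknown escape and raises re.error
try:
    re.compile(r'(?:X-)?Received: (?:by|from) \K\S+')
    supports_k = True
except re.error:
    supports_k = False

# Capturing-group equivalent: findall returns only what (\S+) captured
hosts = re.findall(r'(?:X-)?Received: (?:by|from) (\S+)', headers)

print(supports_k)
print(hosts)
```

Because the pattern contains exactly one capturing group, re.findall returns a flat list of strings rather than a list of tuples.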

User
Answer #3 · edited 2024-04-18 06:49:48

I am posting this as an answer only because comments do not allow formatting; the correct answer was given by @stribizhev.

@stribizhev proposed this regular expression:

^(?:by|from) (\S+)

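The ?: at the start of (?:by|from) makes it a non-capturing group, while (\S+) is a capturing group. If you call result = string.match(regex) and there is a match, result will be an array such as ["from mail2.oknotify2.com", "mail2.oknotify2.com"]. result[0] is the full match, and result[1] is the captured group.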
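The string.match call above follows JavaScript conventions. A rough Python counterpart, sketched here with a made-up line variable, shows the same split between the full match and the captured group, and how the group numbering shifts if the keyword group is made capturing as well.

```python
import re

line = "from mail2.oknotify2.com (mail2.oknotify2.com. [208.83.243.70])"

# re.match already anchors at the start of the string, so ^ is not needed here
m = re.match(r'(?:by|from) (\S+)', line)
whole = m.group(0)  # entire match, keyword included
host = m.group(1)   # first (and only) capturing group

# If the keyword group were capturing too, the host would move to group 2
m2 = re.match(r'(by|from) (\S+)', line)
keyword, host2 = m2.group(1), m2.group(2)

print(whole)
print(host)
print(keyword, host2)
```

Keeping the keyword group non-capturing means the host is always group 1, which is what the list comprehension and re.findall examples in the other answers rely on.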
