When substituting with a regular expression, I only want to keep the digits

Posted 2024-04-26 06:09:31


Here is the sample text:

initiated to address the deviation to SOP-020583v11.0 Section SOP-016248v2.0 john doe, john doe SOP-020583 fake text, this is all fake

Ideally, the text should look like this:

initiated to address the deviation to 020583 Section 016248 john doe, john doe 020583 fake text, this is all fake

Here is the code I have so far:

def dashrepl(matchobj):
    print (type(matchobj))
    return re.findall('[0-9]',matchobj)

re.sub(SOP, dashrepl, long_desc_text[22])

But I get the following error:

TypeError: expected string or buffer

Edit, adding the actual content of long_desc_text[22]:

long_desc_text[22]

SOP-020583v11.0 Section 8.4.On 17Jan2016 at ATO Site, SOP-016248v2.0 was due for periodic review but the periodic SOP-016248 revision is not tied to any change control records. SOP-020583 tied to a change control record" and notified ID63718 notifiedID22359 of the event. SOP-020583v11.0, fake text fake text


1 answer

Answer 1 · posted 2024-04-26 06:09:31

So, here is my code:

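import re

test = "initiated to address the deviation to SOP-020583v11.0 Section SOP-016248v2.0 john doe, john doe SOP-020583 fake text, this is all fake"

# Match "SOP-", capture the digits, and optionally consume a version suffix such as "v11.0".
regexp = r"SOP-(\d+)(?:v\d+\.\d+)?"

# re.subn returns a tuple: (new_string, number_of_substitutions).
result, count = re.subn(regexp, r"\1", test)

# Print the rewritten text (the first element of the tuple), not the count.
print(result)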

It produces:
initiated to address the deviation to 020583 Section 016248 john doe, john doe 020583 fake text, this is all fake

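This uses Python's re.subn function, which finds every occurrence of the pattern and replaces it with the given string, in this case the first capture group (\1). It returns a tuple containing the new string and the number of replacements made. The "r" prefix on the strings marks them as raw string literals, so the backslashes reach the regex engine unchanged; it does not create a regex object.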
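If you would rather keep the callback style from the question, here is a minimal sketch. The likely cause of the TypeError is that re.findall expects a string but dashrepl hands it the match object, and a replacement function has to return a string rather than a list anyway. The sketch assumes SOP is a compiled pattern modeled on the regexp in this answer (the question never shows how SOP is defined), and the text variable is a shortened, made-up stand-in for long_desc_text[22].

```python
import re

# Assumption: SOP is a compiled pattern like the one in this answer.
# It captures the digits after "SOP-" and optionally consumes a
# version suffix such as "v11.0".
SOP = re.compile(r"SOP-(\d+)(?:v\d+\.\d+)?")

def dashrepl(matchobj):
    # matchobj is a match object, not a string, so read the captured
    # digits with group(1) and return them as the replacement string.
    return matchobj.group(1)

# Hypothetical, shortened stand-in for long_desc_text[22].
text = "SOP-020583v11.0 Section 8.4. SOP-016248v2.0 was due for review; SOP-016248 is not tied to ID63718."

cleaned = SOP.sub(dashrepl, text)
print(cleaned)
```

Other numbers such as "8.4" and "ID63718" are left alone because the pattern only matches text that starts with "SOP-". For a replacement this simple, passing r"\1" as in the code above does the same job; a function is mainly useful when the replacement needs extra logic.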

For reference, I also found this link.

Hope this helps.
