在python中用序列号字符串替换pattern

import re text = "<h> This is a string. </H><p> This is another part. </P>" num_mat = re.findall(r"(?:<(\/*)[a-zA-Z0-9]+>)",text) print(str(len(num_mat))) reg = re.compile(r"(?:<(\/*)[a-zA-Z0-9]+>)",re.VERBOSE) phctr = 0 #for phctr in num_mat: # phtxt = "{" + str(phctr) + "}" phtxt = "{" + str(phctr) + "}" newtext = re.sub(reg,phtxt,text) print(newtext)

1条回答

网友

1楼 · 发布于 2024-06-09 23:17:39

import re
import itertools as it

text = "<h> This is a string. </H><p> This is another part. </P>"

cnt = it.count()
print re.sub(r"</?\w+>", lambda x: '{{{}}}'.format(next(cnt)), text)

印刷品

^{pr2}$

仅适用于简单标记（标记中没有属性/空格）。对于扩展标记，您必须调整regexp。在

另外，不重新初始化cnt = it.count()将继续编号。在

更新获取映射dict：

import re
import itertools as it

text = "<h> This is a string. </H><p> This is another part. </P>"

cnt = it.count()
d = {}
def replace(tag, d, cnt):
    if tag not in d:
        d[tag] = '{{{}}}'.format(next(cnt))
    return d[tag]
print re.sub(r"(</?\w+>)", lambda x: replace(x.group(1), d, cnt), text)
print d

印刷品：

{0} This is a string. {1}{2} This is another part. {3}
{'</P>': '{3}', '<h>': '{0}', '<p>': '{2}', '</H>': '{1}'}

相关问题更多 >

编程相关推荐

热门问题

热门文章

在python中用序列号字符串替换pattern

相关问题 更多 >

编程相关推荐

热门问题

热门文章

相关问题更多 >