Python/Regex: Matching a Pattern in a String

Test1.0,0.csv -> ('Test1', '0,0', 'csv') (Basic Example)
Test2.wma -> ('Test2', 'wma') (No Match)
Test3.1100,456.jpg -> ('Test3', '1100,456', 'jpg') (Basic with Large Number)
T.E.S.T.4.5,6.png -> ('T.E.S.T.4', '5,6', 'png') (Doesn't strip all periods)
Test5,7,8.sss -> ('Test5,7,8', 'sss') (No Match)
Test6.2,3,4.png -> ('Test6.2,3,4', 'png') (No Match, too many commas)
Test7.5,6.7,8.test -> ('Test7', '5,6', '7,8', 'test') (Double Match?)

3 Answers

User

#1 · edited 2024-06-16 09:15:05

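You can use the regex \.\d+,\d+\. to find all matches of the pattern, but you will need to do some extra work to get the expected output, especially since you want .5,6.7,8. to be treated as two matches.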
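To illustrate why the plain pattern is not enough on its own, here is a minimal sketch using the last filename from the question. `re.findall` returns non-overlapping matches, and the period that closes the first pair is consumed by that match, so it cannot also open the second pair.

```python
import re

# Each match consumes its trailing period, so the period that would
# open the second pair ('.7,8.') has already been used up.
matches = re.findall(r'\.\d+,\d+\.', 'Test7.5,6.7,8.test')
print(matches)  # ['.5,6.']
```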

Here is one potential solution:

import re

def transform(s):
    # Turn every period in a run of ".N,N" pairs into a newline, then split on newlines.
    s = re.sub(r'(\.\d+,\d+)+\.', lambda m: m.group(0).replace('.', '\n'), s)
    return tuple(s.split('\n'))

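As an illustration, the sketch below repeats `transform` from this answer and runs it on a few of the filenames from the question. Note that a name without a matching pair comes back as a single-element tuple, because nothing in it is replaced.

```python
import re

def transform(s):
    # Same function as in the answer above.
    s = re.sub(r'(\.\d+,\d+)+\.', lambda m: m.group(0).replace('.', '\n'), s)
    return tuple(s.split('\n'))

print(transform('Test1.0,0.csv'))       # ('Test1', '0,0', 'csv')
print(transform('T.E.S.T.4.5,6.png'))   # ('T.E.S.T.4', '5,6', 'png')
print(transform('Test7.5,6.7,8.test'))  # ('Test7', '5,6', '7,8', 'test')
print(transform('Test2.wma'))           # ('Test2.wma',)
```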

To also split off the file extension when there is no match, you can use the following:

import re

def transform(s):
    s = re.sub(r'(\.\d+,\d+)+\.', lambda m: m.group(0).replace('.', '\n'), s)
    groups = s.split('\n')
    # Replace the last piece with its (name, extension) split, if it has a period.
    groups[-1:] = groups[-1].rsplit('.', 1)
    return tuple(groups)

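The results are the same as with the first version, except that names with no match, such as 'Test2.wma' and 'Test5,7,8.sss', now also have their extension split off.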
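A short sketch of that difference, run on the three "No Match" filenames from the question. The name `transform_ext` and the precompiled `PAIRS` pattern are only used here for illustration; the logic is the second version of the function from this answer.

```python
import re

PAIRS = re.compile(r'(\.\d+,\d+)+\.')

def transform_ext(s):
    # Second version from the answer, renamed here for illustration.
    s = PAIRS.sub(lambda m: m.group(0).replace('.', '\n'), s)
    groups = s.split('\n')
    groups[-1:] = groups[-1].rsplit('.', 1)  # split off the extension
    return tuple(groups)

for name in ('Test2.wma', 'Test5,7,8.sss', 'Test6.2,3,4.png'):
    print(name, '->', transform_ext(name))
# Test2.wma -> ('Test2', 'wma')
# Test5,7,8.sss -> ('Test5,7,8', 'sss')
# Test6.2,3,4.png -> ('Test6.2,3,4', 'png')
```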

User

#2 · edited 2024-06-16 09:15:05

'/^(.+)\.((\d+,\d+)\.)?(.+)$/'

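The third capture group should contain the pair of numbers. If you have several such pairs, you should get multiple matches. The third capture always contains the pair.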
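For anyone trying this pattern in Python, the surrounding `/` characters are delimiters from other regex dialects and have to be dropped. A quick check with Python's `re` module (a sketch, not necessarily the engine this answer had in mind) suggests the pattern needs adjusting before it behaves as described: the first `(.+)` is greedy, so it backtracks only as far as the last period, the optional group then matches nothing, and group 3 comes back as `None`.

```python
import re

# The pattern from the answer with the surrounding '/' delimiters dropped.
pattern = re.compile(r'^(.+)\.((\d+,\d+)\.)?(.+)$')

m = pattern.match('Test1.0,0.csv')
print(m.groups())  # ('Test1.0,0', None, None, 'csv')
print(m.group(3))  # None
```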

User

#3 · edited 2024-06-16 09:15:05

To allow multiple consecutive matches, use lookahead/lookbehind:

r'(?<=\.)\d+,\d+(?=\.)'

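A small sketch of the lookaround pattern in use, on filenames taken from the question (the name `PAIR` is just for illustration). Because lookarounds assert the surrounding periods without consuming them, the period between two adjacent pairs can serve both matches.

```python
import re

PAIR = re.compile(r'(?<=\.)\d+,\d+(?=\.)')

# The periods are checked but not consumed, so adjacent pairs both match.
print(PAIR.findall('Test1.0,0.csv'))       # ['0,0']
print(PAIR.findall('Test7.5,6.7,8.test'))  # ['5,6', '7,8']
print(PAIR.findall('Test6.2,3,4.png'))     # []
```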

We can also use a lookahead to perform the split:

import re
def split_it(s):
    pieces = re.split(r'\.(?=\d+,\d+\.)', s)
    pieces[-1:] = pieces[-1].rsplit('.', 1) # split off extension
    return pieces

Tests:

>>> print(split_it('Test1.0,0.csv'))
['Test1', '0,0', 'csv']
>>> print(split_it('Test2.wma'))
['Test2', 'wma']
>>> print(split_it('Test3.1100,456.jpg'))
['Test3', '1100,456', 'jpg']
>>> print(split_it('T.E.S.T.4.5,6.png'))
['T.E.S.T.4', '5,6', 'png']
>>> print(split_it('Test5,7,8.sss'))
['Test5,7,8', 'sss']
>>> print(split_it('Test6.2,3,4.png'))
['Test6.2,3,4', 'png']
>>> print(split_it('Test7.5,6.7,8.test'))
['Test7', '5,6', '7,8', 'test']
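The same checks can be written as a table-driven test. This is only a sketch: `EXPECTED` is a hypothetical name, its contents are copied from the expected results in the question, and `split_it` is repeated so the snippet runs on its own. The question asks for tuples while `split_it` returns a list, so the result is converted before comparing.

```python
import re

def split_it(s):
    pieces = re.split(r'\.(?=\d+,\d+\.)', s)
    pieces[-1:] = pieces[-1].rsplit('.', 1)  # split off extension
    return pieces

# Expected results copied from the question.
EXPECTED = {
    'Test1.0,0.csv': ('Test1', '0,0', 'csv'),
    'Test2.wma': ('Test2', 'wma'),
    'Test3.1100,456.jpg': ('Test3', '1100,456', 'jpg'),
    'T.E.S.T.4.5,6.png': ('T.E.S.T.4', '5,6', 'png'),
    'Test5,7,8.sss': ('Test5,7,8', 'sss'),
    'Test6.2,3,4.png': ('Test6.2,3,4', 'png'),
    'Test7.5,6.7,8.test': ('Test7', '5,6', '7,8', 'test'),
}

for name, want in EXPECTED.items():
    # Fail loudly, naming the filename, if any case disagrees.
    assert tuple(split_it(name)) == want, name
print('all', len(EXPECTED), 'cases match')
```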
