Find a string index, then search backwards for a regex match and delete

Posted 2024-05-19 01:08:32


My question is similar to an earlier one, Python Reverse Find in String.

Here is an example of my very long string:

t1 = '''1281674 the crescent annandale 02/10/2019 16/10/2019 - 16/11/2019 pending 1281640 city west link rd lilyfield 02/10/2019 16/10/2019 - 16/11/2019 pending 1276160 victoria rd rozelle 25/09/2019 14/10/2019 - 15/10/2019 pending 1331626 31/12/2019 - 31/01/2020 incomplete n/a 1281674 the crescent annandale 02/10/2019 16/10/2019 - 16/11/2019'''

Updated: 1 February 2020

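Before loading the data into a DataFrame, I group it into lists. I don't want any of the data associated with 'incomplete n/a'. Do I need to delete part of the string, or is there a regex function that can locate 'incomplete n/a' and the group it belongs to?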

I want two outputs:

ONE: this list, t1L = ['1281674 ', '1281640 ', '1276160 ']. Note that it does not include 1331626.

TWO: the string split or redefined so that it no longer contains 1331626, for example:

t1 = '''1281674 the crescent annandale 02/10/2019 16/10/2019 - 16/11/2019 pending 1281640 city west link rd lilyfield 02/10/2019 16/10/2019 - 16/11/2019 pending 1276160 victoria rd rozelle 25/09/2019 14/10/2019 - 15/10/2019 pending'''

Thanks for your help.


3 Answers

You need 2 regular expressions to get both outputs:

import re

t1 = '''1281674 the crescent annandale 02/10/2019 16/10/2019 - 16/11/2019 pending 1281640 city west link rd lilyfield 02/10/2019 16/10/2019 - 16/11/2019 pending 1276160 victoria rd rozelle 25/09/2019 14/10/2019 - 15/10/2019 pending 1331626 31/12/2019 - 31/01/2020 incomplete n/a 1281674 the crescent annandale 02/10/2019 16/10/2019 - 16/11/2019'''
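# Remove everything from the 7-digit ID that precedes 'incomplete n/a' to the end of the string
clean = re.sub(r'\b\d{7}\b(?=(?:(?!\b\d{7}\b).)*incomplete n/a).*?$', '', t1)
print(clean)
# Collect the 7-digit IDs that remain
res = re.findall(r'(\b\d{7}\b)', clean)
print(res)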

Output:

1281674 the crescent annandale 02/10/2019 16/10/2019 - 16/11/2019 pending 1281640 city west link rd lilyfield 02/10/2019 16/10/2019 - 16/11/2019 pending 1276160 victoria rd rozelle 25/09/2019 14/10/2019 - 15/10/2019 pending 
['1281674', '1281640', '1276160']
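Since the pattern above is dense, here is the same regex written out with `re.VERBOSE` so each part carries a comment. This is only a sketch: the names `PATTERN` and `sample` and the shortened sample string are made up for illustration and are not part of the original answer.

```python
import re

# Same pattern as in the answer, spelled out in verbose mode.
# In verbose mode unescaped whitespace is ignored, so the literal space
# in "incomplete n/a" has to be written as "\ ".
PATTERN = re.compile(r'''
    \b\d{7}\b                 # a 7-digit ID ...
    (?=                       # ... that is followed (lookahead only) by
        (?:(?!\b\d{7}\b).)*   #     characters that never start another 7-digit ID
        incomplete\ n/a       #     and then the marker text
    )
    .*?$                      # then consume the rest of the string
''', re.VERBOSE)

# Hypothetical, shortened sample in the same shape as t1
sample = '1111111 a rd 01/01/2019 pending 2222222 31/12/2019 incomplete n/a 3333333 b st'

clean = PATTERN.sub('', sample)          # drops "2222222 ..." through the end
ids = re.findall(r'\b\d{7}\b', clean)    # IDs that survive the cut

print(repr(clean))
print(ids)
```

Note that the match runs to the end of the string, so any records after 'incomplete n/a' are removed as well, which is what output TWO in the question asks for.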


You can try the following code, which uses a loop and a condition:

    import re
    t1 = '1281674 the crescent annandale 02/10/2019 16/10/2019 - 16/11/2019 pending 1281640 city west link rd lilyfield 02/10/2019 16/10/2019 - 16/11/2019 pending 1276160 victoria rd rozelle 25/09/2019 14/10/2019 - 15/10/2019 pending 1331626 31/12/2019 - 31/01/2020 incomplete n/a 1314832 '

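    result = None
    for t in t1.split(" "):
        # Remember the most recent 7-digit ID seen so far
        if re.match(r"\d{7}", t):
            result = t
        # Stop at the marker; result now holds the ID of the incomplete record
        if 'incomplete' in t:
            break

    print(result)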
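The loop above only identifies the ID attached to the incomplete record. A possible extension of the same token-by-token idea that returns both outputs the question asks for might look like the sketch below. The function name `split_before_incomplete` and the sample string are hypothetical, and it assumes records are whitespace-separated and every record starts with a 7-digit ID.

```python
import re

def split_before_incomplete(text, marker='incomplete'):
    """Return (ids, kept_text) for the records that come before the marked record."""
    kept = []   # tokens of the records seen so far
    ids = []    # 7-digit IDs of the records seen so far
    for token in text.split():
        if marker in token:
            # The marker belongs to the most recent record, so drop that record.
            if ids:
                last_id = ids.pop()
                # Position of the last occurrence of last_id among the kept tokens
                cut = len(kept) - 1 - kept[::-1].index(last_id)
                kept = kept[:cut]
            break
        if re.fullmatch(r'\d{7}', token):
            ids.append(token)
        kept.append(token)
    return ids, ' '.join(kept)

# Hypothetical, shortened sample in the same shape as t1
sample = '1111111 a rd 01/01/2019 pending 2222222 31/12/2019 incomplete n/a 3333333 b st'
ids, kept_text = split_before_incomplete(sample)
print(ids)        # ['1111111']
print(kept_text)  # 1111111 a rd 01/01/2019 pending
```

If the marker never appears, the function returns every ID and the whole text, rejoined with single spaces.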

I think this code works for your problem:

    new_str = t1[:t1.find(re.findall(r'\d{7}', t1[:t1.find('incomplete n/a')])[-1])]
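The one-liner above relies on `t1.find`, which returns the first occurrence of the ID, and it does not handle a string without the marker. A more defensive variant of the same idea could be sketched as follows. The helper name `trim_before_incomplete` is made up for illustration, and it only handles the first occurrence of the marker.

```python
import re

def trim_before_incomplete(text, marker='incomplete n/a'):
    """Cut text just before the last 7-digit ID that precedes the marker."""
    pos = text.find(marker)
    if pos == -1:
        return text                      # marker absent: nothing to remove
    head = text[:pos]
    # finditer gives match positions directly, so a repeated ID is not a problem
    matches = list(re.finditer(r'\b\d{7}\b', head))
    if not matches:
        return head                      # no ID before the marker
    return head[:matches[-1].start()]

# Hypothetical, shortened samples
print(repr(trim_before_incomplete('1111111 a 2222222 b incomplete n/a 3333333')))
print(repr(trim_before_incomplete('1111111 a 1111111 incomplete n/a')))
```

In the second sample the incomplete record reuses an earlier ID. Slicing at the match position keeps the first record, whereas looking the ID up again with `str.find` would cut at its first occurrence.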
