Parsing strings from .tx in Python

User

#1 · edited 2024-04-25 12:17:36

(?<=[\[,])\s*(\d+ HMDB0+\d+)

Use re.findall instead. See the demo:

https://regex101.com/r/eS7gD7/19#python

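import re

# The lookbehind requires '[' or ',' right before an entry; the group captures "N HMDB0...".
p = re.compile(r'(?<=[\[,])\s*(\d+ HMDB0+\d+)', re.IGNORECASE | re.MULTILINE)
test_str = "}# => 2[1 HMDB00001 ,2 HMDB00002]\n}# => 5[1 HMDB00001 ,2 HMDB00002, 3 HMDB00003 ,4 HMDB00004,5 HMDB00005]\n}# => 1[1 HMDB00001]"

# findall returns only the captured group for each match
print(p.findall(test_str))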
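Since findall on the whole string returns one flat list, here is a minimal sketch of applying the same pattern line by line so that each row's entries stay together. `find_entries` is a hypothetical helper name used only for illustration, and the flags are left out on the assumption that the input is always upper-case "HMDB".

```python
import re

# Same pattern as above: an entry must follow '[' or ',' (lookbehind),
# optional whitespace is skipped, and only the "N HMDB0..." part is captured.
PATTERN = re.compile(r'(?<=[\[,])\s*(\d+ HMDB0+\d+)')

def find_entries(text):
    """Return one list of matches per line of `text` (hypothetical helper)."""
    return [PATTERN.findall(line) for line in text.splitlines()]

test_str = (
    "}# => 2[1 HMDB00001 ,2 HMDB00002]\n"
    "}# => 5[1 HMDB00001 ,2 HMDB00002, 3 HMDB00003 ,4 HMDB00004,5 HMDB00005]\n"
    "}# => 1[1 HMDB00001]"
)

per_line = find_entries(test_str)
for entries in per_line:
    print(entries)
```

Each printed list corresponds to one input line, which may be easier to work with than the flat result.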

User

#2 · edited 2024-04-25 12:17:36

This assumes your pattern is exactly: one digit, one space, HMDB, then 5 digits, in that order.

The results are stored in a dict keyed by line number.

import re

matches = {}
with open('my_text_file.txt', 'r') as f:
    # enumerate gives zero-based line numbers, used as the dict keys
    for num, line in enumerate(f):
        matches[num] = re.findall(r'\d\sHMDB\d{5}', line)

print(matches)

If the HMDB prefix can vary, you can use r'\d\s[a-zA-Z]{4}\d{5}' instead.
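One thing to watch: a single \d matches only one digit, so an entry such as "10 ABCD00010" would be returned as "0 ABCD00010". A rough sketch of a variant that allows multi-digit indices follows. The \d+ and \b parts are my assumption about what the data needs, `collect_matches` is a hypothetical helper, and io.StringIO with made-up sample lines stands in for the real file.

```python
import io
import re

# Assumed variant of the pattern: \d+ allows multi-digit indices,
# [A-Za-z]{4} allows any four-letter prefix, \b keeps matches on word boundaries.
ENTRY = re.compile(r'\b\d+\s[A-Za-z]{4}\d{5}\b')

def collect_matches(lines):
    """Map each zero-based line number to the entries found on that line."""
    return {num: ENTRY.findall(line) for num, line in enumerate(lines)}

# io.StringIO stands in for an open text file; the sample data is made up.
sample = io.StringIO(
    "}# => 2[1 HMDB00001 ,2 HMDB00002]\n"
    "}# => 1[10 ABCD00010]\n"
)
matches = collect_matches(sample)
print(matches)
```

With a real file, passing the open file object to `collect_matches` inside a `with open(...)` block should behave the same way, since both are iterated line by line.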

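User

#3 · edited 2024-04-25 12:17:36

This seems to work, but it is hard to be sure given how the question is worded. You may be able to piece together a solution from the answers you have received.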

import re

strings = [
    '}# => 2[1 HMDB00001 ,2 HMDB00002]',
    '}# => 5[1 HMDB00001 ,2 HMDB00002, 3 HMDB00003 ,4 HMDB00004,5 HMDB00005]',
    '}# => 1[1 HMDB00001]',
]

for s in strings:
    # Grab everything between the brackets, then split on commas
    mat = re.search(r'\[(.*)\]', s)
    elements = [e.strip() for e in mat.group(1).split(',')]
    print(elements)

Output:

['1 HMDB00001', '2 HMDB00002']
['1 HMDB00001', '2 HMDB00002', '3 HMDB00003', '4 HMDB00004', '5 HMDB00005']
['1 HMDB00001']
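If the index and the identifier are needed separately, the split-based approach above can be taken one step further. This is only a sketch: `parse_line` is a hypothetical helper, and it assumes every well-formed element is an integer and an identifier separated by whitespace.

```python
import re

def parse_line(s):
    """Return (index, identifier) pairs from the bracketed part of `s`."""
    mat = re.search(r'\[(.*)\]', s)
    if mat is None:
        # No brackets on this line
        return []
    pairs = []
    for element in mat.group(1).split(','):
        parts = element.split()  # splitting on whitespace also drops padding
        if len(parts) != 2:
            # Skip empty or malformed elements
            continue
        number, identifier = parts
        pairs.append((int(number), identifier))
    return pairs

print(parse_line('}# => 2[1 HMDB00001 ,2 HMDB00002]'))
```

Lines without brackets, or with empty brackets, give an empty list rather than raising an error.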
