How do I find the next 9 characters after a string while ignoring special characters?

3 answers

User

Answer 1 · edited 2024-04-20 11:00:01

Use this regex to identify the pattern. It may help:

import re

str_test = 'This is a sample text NRC234456789 and this is another case AZN.1.2.3.4.5.6.7.8.9 and this another case BSA 123 456 789 and final case SSR/789456123'
# Grab runs of two or more uppercase letters, digits, dots, whitespace, or slashes.
regex = re.findall(r"([A-Z0-9.\s/]{2,})", str_test)
result = []

If the only non-digit separators are dots, spaces, and slashes, one solution is:

for r in regex:
    # Remove each known separator explicitly.
    result.append(r.replace(".", "").replace(" ", "").replace("/", ""))
print(result)

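If the separators can be any character, use this loop instead:

for r in regex:
    # Remove everything that is not a digit or a word character.
    result.append(re.sub(r"[^\d\w]", "", r))
print(result)

Output: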

['NRC234456789', 'AZN123456789', 'BSA123456789', 'SSR789456123']
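The two loops differ in what they treat as a separator. Here is a minimal sketch of that difference, assuming two hypothetical helpers, `strip_known` and `strip_any`, and a made-up input containing a hyphen (none of these come from the original answer):

```python
import re

def strip_known(token):
    # Remove only the three separators handled by the chained replace() calls.
    return token.replace(".", "").replace(" ", "").replace("/", "")

def strip_any(token):
    # Remove every character that is not an ASCII letter or digit.
    return re.sub(r"[^A-Za-z0-9]", "", token)

sample = "AZN-1.2/3 4"  # hypothetical input containing a hyphen
print(strip_known(sample))  # the hyphen survives
print(strip_any(sample))    # the hyphen is removed too
```

Note that `[^\d\w]` as used in the answer keeps underscores, because `\w` matches `_`. The sketch uses `[^A-Za-z0-9]` to drop them as well, which may or may not be what you want.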

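Updated

import re

str_test = 'This is a sample text NRC234456789 and this is another case AZN.1.Z.3.4.S.6.7.8.9 and this another case BSA 123 456 789 and final case SSR/789456123'
# Capture the three-letter prefix and the tail that follows it separately.
regex = re.findall(r"([A-Z]{3})([A-Z0-9.\s/]{2,})", str_test)
result = []
for prefix, rest in regex:
    # Strip separators from the tail, then fix misread digits (Z -> 2, S -> 5) in the tail only.
    cleaned = re.sub(r"[^\d\w]", "", rest).replace("Z", "2").replace("S", "5")
    result.append(prefix + cleaned)

print(result)

Output: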

['NRC234456789', 'AZN123456789', 'BSA123456789', 'SSR789456123']
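The prefix is captured separately because applying the Z/S correction to the whole match would corrupt a prefix such as SSR. A small sketch of the same idea using a translation table, assuming a hypothetical helper `normalize` and a made-up misread input:

```python
import re

# Map characters that were misread back to digits: Z -> 2, S -> 5.
FIXES = str.maketrans({"Z": "2", "S": "5"})

def normalize(prefix, rest):
    # Drop separators from the tail, then correct misread digits in the tail only.
    digits = re.sub(r"[^A-Za-z0-9]", "", rest).translate(FIXES)
    return prefix + digits

print(normalize("SSR", "/7894S61Z3"))          # hypothetical misread input
print(normalize("AZN", ".1.Z.3.4.S.6.7.8.9"))
```

A translation table keeps all corrections in one place, so adding another misread pair is a one-line change.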

User
Answer 2 · edited 2024-04-20 11:00:01

Here is one approach.
For example:
import re

str_test = 'This is a sample text NRC234456789 and this is another case AZN.1.2.3.4.5.6.7.8.9 and this another case BSA 123 456 789 and final case SSR/789456123'
to_check = ['NRC', 'AZN', 'BSA', 'SSR']
# A known prefix followed by a run of digits and separators.
pattern = re.compile("(" + "|".join(to_check) + r")([\d.\s/]+)")
for k, v in pattern.findall(str_test):
    # Keep the prefix and only the digits of the tail.
    print(k + re.sub(r"[^\d]", "", v))
Output:
NRC234456789
AZN123456789
BSA123456789
SSR789456123
Edited based on the comments.
import re

str_test = 'This is a sample text NRC234456789 and this is another case AZN.1.Z.3.4.S.6.7.8.9 and this another case BSA 123 456 789 and final case SSR/789456123'
to_check = ['NRC', 'AZN', 'BSA', 'SSR']
# Also allow Z and S in the tail so misread digits are captured.
pattern = re.compile("(" + "|".join(to_check) + r")([\d.\s/ZS]+)")
for k, v in pattern.findall(str_test):
    # Correct Z -> 2 and S -> 5 in the tail, then keep digits only.
    new_val = k + re.sub(r"[^\d]", "", v.replace("Z", "2").replace("S", "5"))
    print(new_val)
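This approach keeps every digit that follows the prefix, so a longer run yields more than nine digits. A minimal sketch of trimming to the next nine, assuming a made-up input string and a hypothetical `codes` list:

```python
import re

# Hypothetical input: the BSA run has more than nine digits.
text = "case BSA 123 456 789 123 456 and final case SSR/789456123"
prefixes = ["NRC", "AZN", "BSA", "SSR"]
pattern = re.compile("(" + "|".join(prefixes) + r")([\d.\s/]+)")

codes = []
for prefix, tail in pattern.findall(text):
    digits = re.sub(r"\D", "", tail)  # keep digits only
    if len(digits) >= 9:
        codes.append(prefix + digits[:9])  # keep just the next nine digits

print(codes)
```

Runs with fewer than nine digits are skipped here. Whether to skip or keep them is a design choice the question does not settle.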

User
Answer 3 · edited 2024-04-20 11:00:01

Here is a simple approach. First, find the wanted text with this regular expression:

\b(?:NRC|AZN|BSA|SSR)(?:.?\d)+

Generate it dynamically from the provided list, then remove any non-alphanumeric characters from each match.
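As a rough sketch of those two steps, the pattern can be built from the prefix list and each match cleaned afterwards. The `re.escape` call and the variable names `prefixes`, `text`, and `cleaned` are assumptions added for illustration, not part of the answer:

```python
import re

prefixes = ["NRC", "AZN", "BSA", "SSR"]
# Build the alternation from the list; re.escape guards against metacharacters in a prefix.
regex = r"\b(?:{})(?:.?\d)+".format("|".join(map(re.escape, prefixes)))

text = "sample NRC234456789 then AZN.1.2.3.4.5.6.7.8.9 then BSA 123 456 789 and SSR/789456123"
# Find every match, then strip anything that is not a letter or digit.
cleaned = [re.sub(r"[^A-Za-z0-9]+", "", m) for m in re.findall(regex, text)]
print(regex)
print(cleaned)
```

Each `.?\d` step allows at most one arbitrary character before a digit, so the match stops as soon as two non-digits appear in a row.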

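Edit: For faulty strings where 2 was mistakenly written as Z and 5 as S, you can replace them in the second part of the string, leaving the initial three characters untouched. The code is also updated so that it selects only the next 9 digits and no more. Here is my updated Python code: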

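import re

s = 'This is a sample text NRC234456789 and this is another case AZN.1.Z.3.4.S.6.7.8.9 and this another case BSA 123 456 789 and BSA 123 456 789 123 456 final case SSR/789456123'

list_comb = ['NRC', 'AZN', 'BSA', 'SSR']
# A known prefix, then digits or uppercase letters, each optionally preceded by one separator.
regex = r'\b(?:{})(?:.?[\dA-Z])+'.format('|'.join(list_comb))
print(regex)

for m in re.findall(regex, s):
    # Drop every non-alphanumeric character from the match.
    m = re.sub(r'[^a-zA-Z0-9]+', '', m)
    # Split into the 3-character prefix and the next 9 characters.
    mat = re.search(r'^(.{3})(.{9})', m)
    if mat:
        s1 = mat.group(1)
        # Correct misread digits in the tail only.
        s2 = mat.group(2).replace('S', '5').replace('Z', '2')
        print(s1 + s2)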

This prints the corrected values, with S replaced by 5 and Z replaced by 2:

NRC234456789
AZN123456789
BSA123456789
BSA123456789
SSR789456123
