How to scrape mobile phone numbers using BeautifulSoup

email = soup(text=re.compile(r'[A-Za-z0-9\.\+_-]+@[A-Za-z0-9\._-]+\.[a-zA-Z]*'))
_emailtokens = str(email).replace("\\t", "").replace("\\n", "").split(' ')
if len(_emailtokens):
    print([match.group(0)
           for token in _emailtokens
           for match in [re.search(r"([a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+)", str(token.strip()))]
           if match])

1 answer

User

#1 · Posted on 2024-04-23 10:55:29

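Assuming you have already written a scraper that stores your number strings (mobile and non-mobile) in a list (judging by your code, you have most likely already split the numbers into a list), the following snippet, which uses a regular expression, may help.

Code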

import re

# Format: NXX-NXX-XXXX, where N is a digit 2-9 and X is a digit 0-9
# Only numbers whose first block is 986 or 965 should match

# The lookahead checks the whole NXX-NXX-XXXX shape up to the end of the string;
# the named groups then record which prefix (986 or 965) matched
pattern = r'(?=[2-9]{1}[0-9]{2}-[2-9]{1}[0-9]{2}-[0-9]{4}$)((?P<hello>986.+)|(?P<world>965.+))'

# Note: give your groups (986 and 965) sensible names; hello and world are used here for demonstration

sent = ['986-233-8901', '965-345-8745', '123-456-7890', '986-134-5987', '1234', '$5@67^73']
# Matched, Matched, None, None, None, None

regexp = re.compile(pattern)

# The match results, one per input string
result = [regexp.match(item) for item in sent]
# Change to regexp.search() if needed

# One way to retrieve the numbers with prefix 986 (group hello);
# a match that came from the other group (world) yields None here
hello_group = [item.group('hello') for item in result if item is not None]

Output

print(result)
#[<re.Match object; span=(0, 12), match='986-233-8901'>, <re.Match object; span=(0, 12), match='965-345-8745'>, None, None, None, None]

print(hello_group)
#['986-233-8901', None]
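
The answer assumes the numbers are already in a list. As a rough sketch of how the scraping step might feed into the regex, the example below pulls the visible text out of a page with BeautifulSoup and searches it for numbers. The HTML, the names SAMPLE_HTML, PHONE_RE and extract_numbers, and the pattern variant are all assumptions for illustration, not part of the original post. The pattern keeps the same NXX-NXX-XXXX rule and the 986/965 prefixes, but uses digit lookarounds instead of `$` so that numbers embedded in longer text can be found.

```python
import re
from bs4 import BeautifulSoup

# Hypothetical HTML standing in for a scraped page (not from the original post).
SAMPLE_HTML = """
<ul>
  <li>Office: 986-233-8901</li>
  <li>Mobile: 965-345-8745</li>
  <li>Fax: 123-456-7890</li>
  <li>Old line: 986-134-5987</li>
</ul>
"""

# Assumed variant of the answer's rule: prefix 986 or 965, then NXX-XXXX.
# (?<!\d) and (?!\d) keep the match from starting or ending inside a longer digit run.
PHONE_RE = re.compile(r"(?<!\d)(?:986|965)-[2-9][0-9]{2}-[0-9]{4}(?!\d)")


def extract_numbers(html):
    """Return every number in the page text that fits PHONE_RE."""
    soup = BeautifulSoup(html, "html.parser")
    # Join text nodes with a space so numbers in adjacent tags do not run together.
    text = soup.get_text(" ")
    # The pattern has no capturing groups, so findall returns the full matches.
    return PHONE_RE.findall(text)


numbers = extract_numbers(SAMPLE_HTML)

# Group the matches by their three-digit prefix.
by_prefix = {}
for number in numbers:
    by_prefix.setdefault(number[:3], []).append(number)

print(numbers)
print(by_prefix)
```

Here "html.parser" is the parser bundled with Python; another parser can be passed if one is installed. Grouping by the first three characters avoids the None entries that appear when reading a named group from a match produced by the other alternative.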
