I am scraping a website for my name, which is part of the URL, but the result is incorrect.

from bs4 import BeautifulSoup
import re
import requests

thesite = "http://www.peoplefinder.com/people-search/MT-Fname-Lname/"
response = requests.get(thesite)
soup = BeautifulSoup(response.text, 'html.parser')
test = soup.findAll(text=re.compile('Fname Lname'))

r = requests.get('http://www.peoplefinder.com/people-search/MT- Fname Lname')
if 'Fname Lname' in r.text:
    print('Yes')
else:
    print('No')

2 Answers

User

Answer 1 · edited 2024-06-02 17:23:43

This is not entirely straightforward. I can take you part of the way, though.

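I looked for a name that does not exist in Montana, and Millicent Harcourt filled the bill. I did this because the site's results page always claims to have found some matches. I needed to see what the results page looks like when a search fails, so that I could analyze a failed page.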

In this code I load the results for Millicent and look for the names offered as "matches".

>>> import requests
>>> import bs4
>>> page = requests.get('https://www.ussearch.com/search/people/Millicent/~/Harcourt/MT').content
>>> soup = bs4.BeautifulSoup(page, 'lxml')
>>> links = soup.select('.memberTeaserName a')
>>> for link in links:
...     link.text.strip()
... 
'Michael Frank Harcourt'
'Michael C Harcourt'
'Maryjean  Harcourt'
'Mary L Harcourt'
'Mandy  Harcourt'

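To a human reader, none of these resembles the name searched for, apart from the surname. If you do not know how to decide, to your own satisfaction, whether any of these names is the same as Millicent Harcourt, that would make a good topic for another question.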
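One possible way to make that decision mechanical is sketched below. It is only a strict rule chosen for illustration, not something the site or the answer prescribes: a candidate counts as a match when its first and last tokens equal the searched first and last names, ignoring case, extra whitespace, and any middle names. The helper names `normalize_name` and `same_person` are hypothetical.

```python
def normalize_name(name):
    """Collapse runs of whitespace and fold case for comparison."""
    return " ".join(name.split()).casefold()


def same_person(candidate, first, last):
    """Strict rule (an assumption): first and last tokens must both agree.

    Middle names or initials in the candidate are ignored.
    """
    parts = normalize_name(candidate).split()
    if len(parts) < 2:
        return False
    return parts[0] == first.casefold() and parts[-1] == last.casefold()


# The names the results page offered as "matches" in the session above.
matches = [
    'Michael Frank Harcourt',
    'Michael C Harcourt',
    'Maryjean  Harcourt',
    'Mary L Harcourt',
    'Mandy  Harcourt',
]

hits = [m for m in matches if same_person(m, 'Millicent', 'Harcourt')]
print(hits)  # an empty list means none of the offered names is a real match
```

A looser rule (nicknames, initials, fuzzy spelling) would need more thought, which is why it deserves its own question.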

User

Answer 2 · edited 2024-06-02 17:23:43

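What is happening is that the name you search for in r.text will always be present, because the site echoes it back in the title of the results page: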

<title>Mt Fname Lname on PeopleFinder.com | Free People Search with Addresses and Phone Numbers</title>

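This is true even if the person does not exist. You need to search for something more specific, such as a more distinctive HTML element that only appears for real results.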
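The difference between the two checks can be sketched on a small inline page. The markup below is made up for illustration: the `result-name` class and the helper `name_in_results` are assumptions, not the real structure of PeopleFinder.com, so the selector would need to be replaced with whatever element the live page actually uses for result names.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the <title> echoes the query, the result list does not.
SAMPLE_HTML = """
<html>
  <head><title>Mt Fname Lname on PeopleFinder.com | Free People Search</title></head>
  <body>
    <ul>
      <li class="result-name">Frank Lname</li>
      <li class="result-name">Fiona Lname</li>
    </ul>
  </body>
</html>
"""


def name_in_results(html, full_name, selector=".result-name"):
    """Return True only if full_name appears inside a result element."""
    soup = BeautifulSoup(html, "html.parser")
    wanted = full_name.casefold()
    return any(
        wanted in tag.get_text(" ", strip=True).casefold()
        for tag in soup.select(selector)
    )


# Whole-page check: fooled by the echoed title.
naive = "Fname Lname".casefold() in SAMPLE_HTML.casefold()
# Scoped check: looks only at the result elements.
scoped = name_in_results(SAMPLE_HTML, "Fname Lname")

print(naive, scoped)
```

With real pages, `SAMPLE_HTML` would be replaced by `requests.get(...).text`, and the scoped check would report a hit only when the name shows up in an actual result entry.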
