Extracting a link's name from a web page with BeautifulSoup (bsObj) in Python

Published 2024-06-16 18:11:29


I want to get the name of a URL's target from a web page. This is what I have so far:

check ='https://www.zap.co.il/search.aspx?keyword='+'N3580-5092'
r = requests.get(check)
html = requests.get(r.url)
bsObj = BeautifulSoup(html.content,'xml')
storeName = bsObj.select_one('div.StoresLines div.BuyButtonsTxt')

The result is:

<div class="BuyButtonsTxt">
                ב-<a aria-label="לקנייה ב-פיסי אונליין Dell Inspiron 15 3580 
N3580-5092" href="/fs.aspx?pid=666473435&amp;sog=c-pclaptop" id="" 
target="_blank">פיסי אונליין</a>
</div>

I only need the text of the link (the content of the <a> element that carries the href): "פיסי אונליין". How can I get it?


1 answer
User
#1 · Posted 2024-06-16 18:11:29

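I had to change bsObj = BeautifulSoup(html.content,'xml') to bsObj = BeautifulSoup(html.content,'html.parser'), because the 'xml' parser could not find the tags for me. After that, the store name is the text of the <a> element inside the selected div: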

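from bs4 import BeautifulSoup
import requests

check = 'https://www.zap.co.il/search.aspx?keyword=' + 'N3580-5092'
r = requests.get(check)
# Fetch the final URL again, as in the question (requests already follows
# redirects by default, so r.content would likely work as well)
html = requests.get(r.url)
# 'html.parser' finds the tags here, whereas 'xml' did not
bsObj = BeautifulSoup(html.content, 'html.parser')
storeName = bsObj.select_one('div.StoresLines div.BuyButtonsTxt')

# The store name is the text of the <a> element inside the selected div
text = storeName.find('a').text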

Output:

'פיסי אונליין'
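
To try the same extraction without hitting the site, here is a minimal offline sketch that parses the HTML fragment from the question. The outer StoresLines wrapper div and the helper name extract_store_link are assumptions added for illustration; they are not part of the original page or answer. The helper also returns None when no link is found, since select_one returns None in that case, and it reads the href attribute alongside the link text.

```python
from bs4 import BeautifulSoup

# HTML fragment taken from the question's output. The outer StoresLines
# div is an assumed wrapper so that the question's selector matches.
SNIPPET = '''
<div class="StoresLines">
  <div class="BuyButtonsTxt">
    ב-<a aria-label="לקנייה ב-פיסי אונליין Dell Inspiron 15 3580 N3580-5092" href="/fs.aspx?pid=666473435&amp;sog=c-pclaptop" id="" target="_blank">פיסי אונליין</a>
  </div>
</div>
'''


def extract_store_link(html):
    """Return (link text, href) for the first store link, or None if absent.

    Hypothetical helper for illustration only.
    """
    soup = BeautifulSoup(html, 'html.parser')
    link = soup.select_one('div.StoresLines div.BuyButtonsTxt a')
    if link is None:
        # select_one returns None when nothing matches the selector
        return None
    # get_text(strip=True) trims surrounding whitespace from the link text
    return link.get_text(strip=True), link.get('href')


result = extract_store_link(SNIPPET)
if result is not None:
    name, href = result
    print(name)
    print(href)
```

Selecting the <a> element directly in the CSS selector avoids the separate find('a') call, and the None check keeps the script from raising AttributeError if the page layout changes.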
