Dynamically finding href tags

url = 'https://eresearch.fidelity.com/eresearch/goto/evaluate/snapshot.jhtml?symbols=AAPL'
html = requests.get(url).text
detail_tags_sector = BeautifulSoup(html, 'lxml')
detail_tags_sector.find_all('a')

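3 answers

User

Answer 1 · edited 2024-04-24 21:02:46

To get the text from the anchor elements, access the .text attribute on each anchor element.
Your code would therefore change to: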

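import requests
from bs4 import BeautifulSoup

url = 'https://eresearch.fidelity.com/eresearch/goto/evaluate/snapshot.jhtml?symbols=AAPL'
contents = []

html = requests.get(url).text
# 'html.parser' is the parser bundled with Python's standard library
detail_tags_sector = BeautifulSoup(html, 'html.parser')
# Collect the visible text of every <a> element on the page
for anchor in detail_tags_sector.find_all('a'):
    contents.append(anchor.text)
print(contents)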
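
The same loop can be tried offline before pointing it at the live site. The following is only a minimal sketch: `SAMPLE_HTML` is hypothetical markup standing in for the real page (the actual Fidelity HTML may look different), and `anchor_texts` is a helper name made up for this illustration. It also keeps each link's href next to its text and skips anchors that have no href at all.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the real page; the actual HTML may differ.
SAMPLE_HTML = """
<div>
  <a href="/sectors/sectors_in_market.jhtml?sector=45">Information Technology</a>
  <a href="/news">News</a>
  <a name="top"></a>
</div>
"""

def anchor_texts(html):
    """Return (text, href) pairs for every <a> element that has an href."""
    soup = BeautifulSoup(html, "html.parser")
    pairs = []
    # href=True filters out anchors without an href attribute
    for anchor in soup.find_all("a", href=True):
        pairs.append((anchor.get_text(strip=True), anchor["href"]))
    return pairs

print(anchor_texts(SAMPLE_HTML))
```

Keeping the href alongside the text makes it easier to see which attribute value could be used to narrow the search to a single link.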

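User

Answer 2 · edited 2024-04-24 21:02:46

The problem with these answers is that they collect the text of every link on the page, and there are quite a few links. If you only want the "Information Technology" string, just add: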

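# soup is the BeautifulSoup object for the page (detail_tags_sector in the question)
# [href*="sectors_in"] matches the first element whose href contains "sectors_in"
info = soup.select_one('[href*="sectors_in"]')
print(info.text)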

Output:

Information Technology
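
One thing to watch with `select_one` is that it returns `None` when nothing matches, so `info.text` would raise an `AttributeError` if the page layout changes. Below is a hedged sketch of a guarded version, run against a hypothetical `SAMPLE_HTML` snippet rather than the real page; the `find_sector` helper and its `fragment` parameter are names invented for this example.

```python
from bs4 import BeautifulSoup

# Hypothetical markup; the href on the real page is assumed to contain "sectors_in".
SAMPLE_HTML = """
<ul>
  <li><a href="/home">Home</a></li>
  <li><a href="/sectors/sectors_in_market.jhtml?sector=45">Information Technology</a></li>
</ul>
"""

def find_sector(html, fragment="sectors_in"):
    """Return the text of the first <a> whose href contains `fragment`, or None."""
    soup = BeautifulSoup(html, "html.parser")
    # CSS attribute selector: *= means "attribute value contains this substring"
    link = soup.select_one(f'a[href*="{fragment}"]')
    if link is None:
        return None  # no matching link found
    return link.get_text(strip=True)

print(find_sector(SAMPLE_HTML))
```

Restricting the selector to `a` elements is a small design choice here: it avoids matching other tags that happen to carry an href, such as `link`.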

User

Answer 3 · edited 2024-04-24 21:02:46

You can use any of the following options.

import requests
from lxml.html.soupparser import fromstring
url = 'https://eresearch.fidelity.com/eresearch/goto/evaluate/snapshot.jhtml?symbols=AAPL'
html = requests.get(url).text
soup=fromstring(html)
findSearch = soup.xpath('//a[contains(text(), "Information Technology")]/text()')
print(findSearch[0])

Or

from bs4 import BeautifulSoup
import requests
url = 'https://eresearch.fidelity.com/eresearch/goto/evaluate/snapshot.jhtml?symbols=AAPL'

html = requests.get(url).text
detail_tags_sector = BeautifulSoup(html, 'lxml')
for link in detail_tags_sector.find_all('a'):
    print(link.text)

Or

from bs4 import BeautifulSoup
import requests
url = 'https://eresearch.fidelity.com/eresearch/goto/evaluate/snapshot.jhtml?symbols=AAPL'
html = requests.get(url).text
soup = BeautifulSoup(html, 'html.parser')
for link in soup.find_all('a'):
    print(link.text)

Let me know if this helps.
