How do I scrape a tag by its data-name attribute?

Posted 2024-04-29 14:01:15

Hi, I'm trying to scrape a website.

I need to get some data from the tag <div data-name='dashboard-champ-content'>:

r = requests.get(url, headers=headers)
soup = BeautifulSoup(r.text, "lxml")

for i in soup.find_all('div')["dashboard-champ-content"]:
    print(i)  # it's not working

So how should I do this?


1 answer
User
#1 · Posted 2024-04-29 14:01:15

If you have only one tag, pass the attribute as a dictionary to find():

from bs4 import BeautifulSoup

html_doc = """
    <div data-name='dashboard-champ-content'>
        This I want
    </div>"""

soup = BeautifulSoup(html_doc, "html.parser")
print(
    soup.find("div", {"data-name": "dashboard-champ-content"}).get_text(
        strip=True
    )
)

Prints:

This I want
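
One thing to watch for on a real page: find() returns None when nothing matches, so chaining .get_text() directly would raise an AttributeError. A minimal sketch of a guard, assuming a made-up document in which the target div is missing:

```python
from bs4 import BeautifulSoup

# Sample document with no matching div (an assumption for illustration).
html_doc = "<div data-name='other'>Not this</div>"

soup = BeautifulSoup(html_doc, "html.parser")
tag = soup.find("div", {"data-name": "dashboard-champ-content"})

# find() returns None when there is no match, so check before calling get_text().
text = tag.get_text(strip=True) if tag is not None else None
print(text)  # None
```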

If there are multiple tags, use find_all() with the same filter:

from bs4 import BeautifulSoup

html_doc = """
    <div data-name='dashboard-champ-content'>
        This I want 1
    </div>
    <div data-name='dashboard-champ-content'>
        This I want 2
    </div>"""

soup = BeautifulSoup(html_doc, "html.parser")
for div in soup.find_all("div", {"data-name": "dashboard-champ-content"}):
    print(div.get_text(strip=True))

Prints:

This I want 1
This I want 2
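
As for why the loop in the question fails: soup.find_all('div') returns a list-like ResultSet, and indexing it with a string raises a TypeError. The attribute filter has to go into the find_all() call itself, as above, or into a CSS attribute selector passed to select(). Here is a rough sketch of the selector variant. The helper name texts_by_data_name and the sample HTML are my own, not from the question:

```python
from bs4 import BeautifulSoup


def texts_by_data_name(html, name):
    """Return the stripped text of every div whose data-name equals `name`.

    Hypothetical helper for illustration; not part of BeautifulSoup.
    """
    soup = BeautifulSoup(html, "html.parser")
    # CSS attribute selector, equivalent to find_all("div", {"data-name": name}).
    return [div.get_text(strip=True) for div in soup.select(f'div[data-name="{name}"]')]


# Made-up sample document with one non-matching div in between.
sample_html = """
    <div data-name='dashboard-champ-content'>This I want 1</div>
    <div data-name='other'>Not this</div>
    <div data-name='dashboard-champ-content'>This I want 2</div>"""

texts = texts_by_data_name(sample_html, "dashboard-champ-content")
print(texts)  # ['This I want 1', 'This I want 2']
```

With the code from the question you would pass r.text as the first argument. Note that a keyword filter such as find_all("div", data-name=...) is not valid Python because of the hyphen, which is why the dictionary or selector form is needed.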
