How do I convert a string to a BeautifulSoup object?

Posted 2024-03-29 06:08:08


I'm trying to scrape a news site and need to change one attribute value in the page. I do the replacement with the following code:

while i < len(links):
    conn = urllib.urlopen(links[i])
    html = conn.read()
    soup = BeautifulSoup(html)
    t = html.replace('class="row bigbox container mi-df-local locked-single"', 'class="row bigbox container mi-df-local single-local"')
    n = str(t.find("div", attrs={'class':'entry cuerpo-noticias'}))
    print(p)

The problem is that "t" is a string, and find with attributes only works on objects of type <class 'BeautifulSoup.BeautifulSoup'>. Do you know how I can convert "t" to that type?


1 answer
User
#1 · Posted 2024-03-29 06:08:08

Just do the replacement before parsing:

html = html.replace('class="row bigbox container mi-df-local locked-single"', 'class="row bigbox container mi-df-local single-local"')
soup = BeautifulSoup(html, "html.parser")
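As a rough, self-contained sketch of the same idea applied to the question's search, the replacement can be done on the raw string and the result parsed afterwards. The helper name extract_entry and the sample markup are made up for illustration, and the sketch assumes the page content is already a str (in Python 3, the bytes returned by a download would need decoding first):

```python
from bs4 import BeautifulSoup

OLD_CLASS = 'class="row bigbox container mi-df-local locked-single"'
NEW_CLASS = 'class="row bigbox container mi-df-local single-local"'


def extract_entry(html):
    """Hypothetical helper: patch the raw markup, then parse and search it."""
    # str.replace operates on the plain string, so it runs before parsing.
    patched = html.replace(OLD_CLASS, NEW_CLASS)
    soup = BeautifulSoup(patched, "html.parser")
    # find() exists on the parsed soup object, not on the string.
    return soup.find("div", attrs={"class": "entry cuerpo-noticias"})


# Made-up sample markup standing in for one downloaded page.
sample = (
    '<div class="row bigbox container mi-df-local locked-single">'
    '<div class="entry cuerpo-noticias">body text</div>'
    '</div>'
)

entry = extract_entry(sample)
print(entry.get_text())
```

The search runs on the soup built from the patched string, so no conversion of "t" is needed afterwards.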

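Note that you can also (I would even say this is the preferred approach) parse the HTML, locate the elements, and modify the attributes of the Tag instances, for example: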

soup = BeautifulSoup(html, "html.parser")
for elm in soup.select(".row.bigbox.container.mi-df-local.locked-single"):
    elm["class"] = ["row", "bigbox", "container", "mi-df-local", "single-local"]

Note that class is a special multi-valued attribute, which is why we set the value to a list of individual classes.
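A small sketch of what multi-valued means in practice: reading class gives back a list, while an ordinary attribute such as id gives back a string. The markup here is a made-up example, and swapping a single entry in the list is just one possible variation on overwriting the whole value:

```python
from bs4 import BeautifulSoup

# Made-up markup for illustration only.
html = '<div id="main" class="row bigbox locked-single">test</div>'
soup = BeautifulSoup(html, "html.parser")
div = soup.find("div")

# class is multi-valued, so it is exposed as a list of class names.
print(div["class"])
# id is single-valued, so it is exposed as a plain string.
print(div["id"])

# Swap one class and leave the others alone, instead of retyping the full list.
classes = [c for c in div["class"] if c != "locked-single"]
classes.append("single-local")
div["class"] = classes
print(div["class"])
```

Editing the list this way keeps any other classes the element happens to carry, which may matter if the markup differs between pages.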

Demo:

from bs4 import BeautifulSoup

html = """
<div class="row bigbox container mi-df-local locked-single">test</div>
"""

soup = BeautifulSoup(html, "html.parser")
for elm in soup.select(".row.bigbox.container.mi-df-local.locked-single"):
    elm["class"] = ["row", "bigbox", "container", "mi-df-local", "single-local"]

print(soup.prettify())

Now see how the class of the div element has been updated:

<div class="row bigbox container mi-df-local single-local">
 test
</div>
