Python Beautiful Soup: selecting text


Asked 2025-04-17 20:36

Here is a sample of the HTML I want to parse:

<html>
<body>
<td style="PADDING-LEFT:  5px"bgcolor="ffffff" class="style8"> Example BLAB BLAB BLAB </td>
<td style="PADDING-LEFT:  5px"bgcolor="ffffff" class="style8"> BLAB BLAB BLAB </td>
<td style="PADDING-LEFT:  5px"bgcolor="ffffff" class="style8"> BLAB BLAB BLAB </td>
<td style="PADDING-LEFT:  5px"bgcolor="ffffff" class="style8"> BLAB BLAB BLAB </td>
</body>
</html>

I'm using Beautiful Soup to parse the HTML and select style8, as follows (html is the result of my HTTP request):

html = result.read()
soup = BeautifulSoup(html)

content = soup.select('.style8')

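In this example, content is a list of 4 tags. I want to check content.text, i.e. the text inside each style8 element. I need to go through each item in the list and see whether it contains Example; if it does, add it to a variable. If Example has not appeared after the whole list has been checked, add Not Present to that variable instead.

So far I have the following: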

foo = []

for i, tag in enumerate(content):
    if content[i].text == 'Example':
        foo.append('Example')
        break
    else:
        continue

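This adds Example to foo only when it appears, but if Example never appears anywhere in the list, Not Present is not added.

I'd appreciate any way to do this, or a better way to search the whole result and check whether a string is present.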


2 Answers

You can use find_all() to find all td elements with class='style8', then build the list foo with a list comprehension:

from bs4 import BeautifulSoup


html = """<html>
<body>
<td style="PADDING-LEFT:  5px"bgcolor="ffffff" class="style8"> Example BLAB BLAB BLAB </td>
<td style="PADDING-LEFT:  5px"bgcolor="ffffff" class="style8"> BLAB BLAB BLAB </td>
<td style="PADDING-LEFT:  5px"bgcolor="ffffff" class="style8"> BLAB BLAB BLAB </td>
<td style="PADDING-LEFT:  5px"bgcolor="ffffff" class="style8"> BLAB BLAB BLAB </td>
</body>
</html>"""

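# Name the parser explicitly so the result does not depend on which parsers are installed
soup = BeautifulSoup(html, "html.parser")

# One entry per matching td: "Example" if its text contains it, otherwise "Not Present"
foo = ["Example" if "Example" in node.text else "Not Present"
       for node in soup.find_all('td', {'class': 'style8'})]
print(foo)

Output: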

['Example', 'Not Present', 'Not Present', 'Not Present']
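
The list comprehension above produces one entry per matching tag. If a single result for the whole list is wanted, as the question describes, one possible sketch uses the built-in any(). The helper name summarize and the plain-string sample texts are hypothetical, introduced only for illustration; the strings stand in for the .text of each tag, which with Beautiful Soup could be collected as [tag.text for tag in soup.select('.style8')].

```python
def summarize(texts, needle="Example"):
    """Return [needle] if any text contains needle, else ["Not Present"].

    Hypothetical helper for illustration; not part of Beautiful Soup.
    """
    # any() stops at the first match, much like the break in the question's loop.
    if any(needle in text for text in texts):
        return [needle]
    return ["Not Present"]


# Hypothetical stand-ins for the .text of each <td class="style8"> tag.
texts = [" Example BLAB BLAB BLAB ", " BLAB BLAB BLAB ", " BLAB BLAB BLAB "]

foo = summarize(texts)
print(foo)
print(summarize([" BLAB BLAB BLAB "]))
```

The substring test (in) is used rather than ==, because each tag's text also contains surrounding whitespace and other words.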

Answered 2025-04-17 by Python大师

If you just want to check whether something was found, you can use a simple boolean flag, like this:

foo = []
found = False
for tag in content:
    # Substring test: tag.text also holds whitespace and other words, so == would never match
    if 'Example' in tag.text:
        found = True
        foo.append('Example')
        break
if not found:
    foo.append('Not Present')

If I understand what you mean, this may be a simple way to do it, although alecxe's solution looks great.
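
As an alternative to a flag, Python's for loop accepts an else clause that runs only when the loop finishes without hitting break. A minimal sketch under the same assumptions as the question (substring match, a single result); the function name find_example and the sample strings are hypothetical, and the strings stand in for tag.text values.

```python
def find_example(texts, needle="Example"):
    """Return [needle] on the first match, else ["Not Present"].

    Hypothetical helper for illustration; texts stands in for tag.text values.
    """
    foo = []
    for text in texts:
        if needle in text:
            foo.append(needle)
            break
    else:
        # Runs only when the loop was not exited via break (no match found).
        foo.append("Not Present")
    return foo


print(find_example([" BLAB BLAB BLAB ", " Example BLAB BLAB BLAB "]))
print(find_example([" BLAB BLAB BLAB "]))
```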

Answered 2025-04-17 by Python大师
