Parsing with Beautiful Soup in Python

Posted 2024-06-06 01:12:00

I captured the following HTML with BS4, but I can't seem to search for the artist tag. I assigned this markup to a variable named container and then tried

print container.tr.td["artist"]

with no luck. Any suggestions?

<tr class="item">
  <!-- <td class="image"><a href="https://www.stargreen.com/kool-as-the-gang-44415.html" title="KOOL AS THE GANG " class="product-image"><img src="https://www.stargreen.com/media/catalog/product/cache/1/small_image/135x/9df78eab33525d08d6e5fb8d27136e95/K/o/KoolAsTheGang.jpg" width="135" height="135" alt="KOOL AS THE GANG " /></a></td> -->
  <td class="date">Sat, 30 Dec 2017</td>
  <td class="artist">kool as the gang</td>
  <td class="venue">100 club</td>
  <td class="link">
  <p class="availability out-of-stock">
    <span>Off Sale</span></p>
  </td>
</tr>

2 answers

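The lookup is wrong: "artist" is the value of the "class" attribute, not the name of an attribute, so td["artist"] has nothing to find. Try the following: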

from bs4 import BeautifulSoup

html = """
<tr class="item">
<!-- <td class="image"><a href="https://www.stargreen.com/kool-as-the-gang-44415.html" title="KOOL AS THE GANG " class="product-image"><img src="https://www.stargreen.com/media/catalog/product/cache/1/small_image/135x/9df78eab33525d08d6e5fb8d27136e95/K/o/KoolAsTheGang.jpg" width="135" height="135" alt="KOOL AS THE GANG " /></a></td> -->
<td class="date">Sat, 30 Dec 2017</td>
<td class="artist">
                        kool as the gang                     </td>
<td class="venue">100 club</td>
<td class="link">
<p class="availability out-of-stock">
<span>Off Sale</span></p>
</td>
</tr>
"""

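soup = BeautifulSoup(html, 'html.parser')
# Find the first <td> whose class attribute is "artist"
td = soup.find('td', {'class': 'artist'})
# strip() removes the whitespace and newlines around the cell text
print(td.text.strip())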

Output:

kool as the gang
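
To make the difference concrete, here is a minimal sketch of why the subscript in the question fails and how a search by class value behaves. It assumes a trimmed copy of the row from the question as sample data and uses the built-in html.parser; the variable names are only illustrative.

```python
from bs4 import BeautifulSoup

# Trimmed copy of the row from the question (sample data for illustration).
html = """
<tr class="item">
  <td class="date">Sat, 30 Dec 2017</td>
  <td class="artist">kool as the gang</td>
  <td class="venue">100 club</td>
</tr>
"""

container = BeautifulSoup(html, "html.parser")

# Dotted access returns only the FIRST matching tag, here the "date" cell.
first_td = container.tr.td
# Subscripting a tag reads an attribute; class is multi-valued, so this is a list.
first_td_classes = first_td["class"]

try:
    first_td["artist"]  # there is no attribute named "artist" on this tag
    lookup_failed = False
except KeyError:
    lookup_failed = True

# Search by class value instead; class_ is the keyword form of {'class': ...}.
artist = container.find("td", class_="artist").get_text(strip=True)
print(artist)
```

So the subscript is an attribute lookup on whichever tag you already hold, while find searches the tree for a tag whose attributes match.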

Another way.

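Use the select method to find the elements in container whose class is 'artist'. select returns a list because several elements could match; since you know there is only one, take the list's only item and read its text attribute.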

>>> HTML = open('sven.htm').read()
>>> import bs4
>>> container = bs4.BeautifulSoup(HTML, 'lxml')
>>> container.select('.artist')[0].text
'\n                        kool as the gang                     '
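
If the surrounding whitespace in that result is unwanted, a possible variation is select_one together with get_text(strip=True). This is only a sketch: the sample markup is a cut-down copy of the row above, html.parser stands in for lxml so nothing extra needs installing, and "td.price" is a made-up selector used to show the no-match case.

```python
from bs4 import BeautifulSoup

# Cut-down sample row, keeping the padded artist cell from the answer above.
html = """
<tr class="item">
  <td class="artist">
                        kool as the gang                     </td>
</tr>
"""

container = BeautifulSoup(html, "html.parser")

# select_one returns the first match, or None when nothing matches,
# so there is no list to index into.
cell = container.select_one("td.artist")
artist = cell.get_text(strip=True) if cell is not None else None
print(artist)

# Hypothetical selector with no match in this sample: the result is None.
missing = container.select_one("td.price")
```

Checking for None first avoids the IndexError that select(...)[0] would raise when a row has no artist cell.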
