How do I get the title attribute using Python and BeautifulSoup?

2 answers

User

Reply #1 · edited 2024-05-19 21:14:32

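The lxml library is often useful here as well, because it can identify HTML structure with XPath expressions, which tends to produce more compact code.

In this example, the XPath expression //td[@title] requests all td elements but insists that the title attribute be present. In the for loop, you can see there is no need to check whether the attribute exists, because that has already been done.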

>>> from io import StringIO
>>> from lxml import etree
>>> HTML = StringIO('''\
... <td title="title 1" role="gridcell"><a onclick="open" href="#">TEXT</a></td>
... <td role="gridcell"><a onclick="open" href="#">TEXT</a></td>
... <td title="title 2" role="gridcell"><a onclick="open" href="#">TEXT</a></td>
... <td title="title 3" role="gridcell"><a onclick="open" href="#">TEXT</a></td>''')
>>> parser = etree.HTMLParser()
>>> tree = etree.parse(HTML, parser)
>>> tds = tree.xpath('//td[@title]')
>>> tds
[<Element td at 0x7a0888>, <Element td at 0x7a0d08>, <Element td at 0x7ae588>]
>>> for item in tree.xpath('//td[@title]'):
...     item.attrib['title']
...     
'title 1'
'title 2'
'title 3'
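
As a script rather than an interactive session, the same idea might look like the sketch below. It is only an illustration: the sample markup, the `SAMPLE_HTML` name, and the `titles_via_xpath` helper are assumptions made for this example, and the XPath `//td/@title` is a variation that selects the attribute values directly instead of the elements.

```python
from lxml import etree

# Hypothetical sample markup (not from the question): a complete table,
# where one cell has no title attribute.
SAMPLE_HTML = """
<table><tr>
<td title="title 1" role="gridcell"><a href="#">TEXT</a></td>
<td role="gridcell"><a href="#">TEXT</a></td>
<td title="title 2" role="gridcell"><a href="#">TEXT</a></td>
</tr></table>
"""


def titles_via_xpath(html):
    """Return the title attribute of every <td> that has one."""
    root = etree.fromstring(html, etree.HTMLParser())
    # //td/@title yields the attribute values themselves, so cells
    # without a title are skipped without an explicit check.
    return [str(value) for value in root.xpath("//td/@title")]


titles = titles_via_xpath(SAMPLE_HTML)
print(titles)
```

Selecting `@title` in the expression keeps the loop out of the Python code entirely; use `//td[@title]` instead if you also need the elements.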

User

Reply #2 · edited 2024-05-19 21:14:32

To get an attribute of an element, you can treat the element like a dictionary:

soup.find('tag_name')['attribute_name']
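
A small sketch of that dictionary-style access, assuming a made-up two-cell table and the built-in "html.parser" backend (both are choices for this illustration only). It also shows what happens when the attribute is missing:

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the first cell has a title, the second does not.
html = '<table><tr><td title="title 1">A</td><td>B</td></tr></table>'
soup = BeautifulSoup(html, "html.parser")

first_td = soup.find("td")
first_title = first_td["title"]  # subscript access, like a dict
print(first_title)

second_td = first_td.find_next_sibling("td")
try:
    second_td["title"]  # the attribute is missing here
    raised = False
except KeyError:
    raised = True  # subscript access raises KeyError, as a dict would
print(raised)

# .get() returns None for a missing attribute instead of raising.
print(second_td.get("title"))
```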

In your case:

[code snippet not preserved in the source]

Note that I used the .get() method to avoid failing on td elements that have no title attribute.
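
A minimal sketch of what the .get() approach could look like, assuming markup similar to the sample in the first reply. This is not the answerer's original code; the `SAMPLE_HTML` name, the wrapping table, and the choice of "html.parser" are assumptions for illustration.

```python
from bs4 import BeautifulSoup

# Hypothetical sample modeled on the cells shown in the first reply.
SAMPLE_HTML = """<table><tr>
<td title="title 1" role="gridcell"><a onclick="open" href="#">TEXT</a></td>
<td role="gridcell"><a onclick="open" href="#">TEXT</a></td>
<td title="title 2" role="gridcell"><a onclick="open" href="#">TEXT</a></td>
</tr></table>"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

titles = []
for td in soup.find_all("td"):
    title = td.get("title")  # None when the cell has no title attribute
    if title is not None:
        titles.append(title)
print(titles)

# Alternative: ask find_all for only the cells that carry a title,
# after which subscript access is safe.
titles_filtered = [td["title"] for td in soup.find_all("td", title=True)]
print(titles_filtered)
```

Passing `title=True` to `find_all` plays the same role as the `[@title]` predicate in the XPath version: the filtering happens in the query, so the loop body needs no check.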
