How to print tags while iterating with Beautiful Soup

Asked 2025-04-16 09:17

My XML file looks like this, and I want to extract the location values from it.

<?xml version="1.0" encoding="UTF-8"?>
<playlist version="1" xmlns="http://xspf.org/ns/0/">
 <trackList>
  <track>
   <location>file:///home/ashu/Music/Collections/randomPicks/ipod%20on%20sep%2009/Coldplay-Sparks.mp3</location>
   <title>Coldplay-Sparks</title>
  </track>
  <track>
   <location>file:///home/ashu/Music/Collections/randomPicks/gud%201s/Coldplay%20Warning%20sign.mp3</location>
   <title>Coldplay Warning sign</title>
  </track>....

This is what I am trying:

from BeautifulSoup import BeautifulSoup as bs
soup = bs (the_above_xml_text)
for track in soup.tracklist:
    print track.location.string

But this does not work, because I get:

AttributeError: 'NavigableString' object has no attribute 'location'

How can I get this result? Thanks in advance.


2 Answers

Use lxml. It is faster and supports XPath:

>>> import lxml.etree
>>> # yourxml should be bytes: lxml rejects str input that carries an encoding declaration
>>> doc = lxml.etree.fromstring(yourxml)
>>> doc.xpath('//n:location/text()', namespaces={'n': 'http://xspf.org/ns/0/'})
['file:///home/ashu/Music/Collections/randomPicks/ipod%20on%20sep%2009/Coldplay-Sparks.mp3',
'file:///home/ashu/Music/Collections/randomPicks/gud%201s/Coldplay%20Warning%20sign.mp3']
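
If lxml is not installed, roughly the same namespace-aware lookup can be sketched with the standard library's xml.etree.ElementTree, which supports a limited XPath subset. This is a swapped-in alternative rather than the approach above, and the playlist below uses shortened, made-up paths purely for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample playlist; the file paths are placeholders, not the asker's.
xml_bytes = b"""<?xml version="1.0" encoding="UTF-8"?>
<playlist version="1" xmlns="http://xspf.org/ns/0/">
 <trackList>
  <track>
   <location>file:///music/a.mp3</location>
   <title>Track A</title>
  </track>
  <track>
   <location>file:///music/b.mp3</location>
   <title>Track B</title>
  </track>
 </trackList>
</playlist>"""

# Map a prefix of our choosing ("n") to the default namespace declared on <playlist>.
XSPF_NS = {"n": "http://xspf.org/ns/0/"}

root = ET.fromstring(xml_bytes)

# ".//n:location" matches every <location> element at any depth under the root.
locations = [el.text for el in root.findall(".//n:location", XSPF_NS)]

for loc in locations:
    print(loc)
```

The prefix name is arbitrary; what matters is that it maps to the same namespace URI the document declares, because without it the lookup finds nothing.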

Answered 2025-04-16 by Python大师


You can use the findAll method:

>>> for track in soup.findAll('track'):
...     print track.title.string
...     print track.location.string
... 
Coldplay-Sparks
file:///home/ashu/Music/Collections/randomPicks/ipod%20on%20sep%2009/Coldplay-Sparks.mp3
Coldplay Warning sign
file:///home/ashu/Music/Collections/randomPicks/gud%201s/Coldplay%20Warning%20sign.mp3

Answered 2025-04-16 by Python大师

