Parsing XML with regular expressions - Q&A - Python中文网

Posted 2024-04-25 19:55:16

I want to parse some tags.

The pattern is

<div id="tags">blah-blah<a href="http://url/tag">What_I_Want</a></div>

I thought this would work:

re.findall(">"."</a></div>")

but it does not.

What is wrong?

-----------Update I------------- Now I know that re is not a good fit for HTML.

raj gave me this answer:

>>> from bs4 import BeautifulSoup
>>> s = '<div id="tags">blah-blah<a href="http://url/tag">What_I_Want</a></div>'
>>> soup = BeautifulSoup(s, 'html.parser')  # name the parser explicitly
>>> soup.select('div > a:first-of-type')[0].text  # ':first' is not a valid CSS selector
'What_I_Want'

I have one more question. How can I find every

<div id blah blah </div>

in the whole file?

2 answers

User

Answer 1 · edited 2024-04-25 19:55:16

Short answer: you can't.

A different short answer: use a Python XML parser (it even comes with examples).
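
To make the parser suggestion concrete, here is a minimal sketch using the standard library's xml.etree.ElementTree. It assumes the input is well-formed XML, which happens to be true for the snippet in the question; real-world HTML often is not, and an HTML parser is a safer choice there. The variable names are illustrative only.

```python
import xml.etree.ElementTree as ET

# The snippet from the question; it happens to be well-formed XML.
s = '<div id="tags">blah-blah<a href="http://url/tag">What_I_Want</a></div>'

root = ET.fromstring(s)   # root is the <div> element
link = root.find('a')     # first direct <a> child of the <div>
link_text = link.text     # text inside the <a> element

print(link_text)          # What_I_Want
```

ET.fromstring raises xml.etree.ElementTree.ParseError on malformed input, so this approach fails loudly rather than silently matching the wrong text.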

User

Answer 2 · edited 2024-04-25 19:55:16

It looks like you are trying to get the text of the a tag that is a direct child of the parent div tag.

>>> from bs4 import BeautifulSoup
>>> s = '<div id="tags">blah-blah<a href="http://url/tag">What_I_Want</a></div>'
>>> soup = BeautifulSoup(s, 'html.parser')  # name the parser explicitly
>>> soup.select('div > a:first-of-type')[0].text  # ':first' is not a valid CSS selector
'What_I_Want'
>>> soup.select('div > a')[0].text
'What_I_Want'
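
For the follow-up question about finding every div that carries an id across a whole file, one possible approach is sketched below. The html string is a made-up stand-in for the file contents, and the extra div elements are assumptions added purely for illustration.

```python
from bs4 import BeautifulSoup

# Hypothetical document standing in for "the whole file".
html = '''
<html><body>
<div id="tags">blah-blah<a href="http://url/tag">What_I_Want</a></div>
<div>no id here</div>
<div id="other">more text</div>
</body></html>
'''

soup = BeautifulSoup(html, 'html.parser')

# id=True matches any <div> that has an id attribute, whatever its value.
divs_with_id = soup.find_all('div', id=True)
ids = [div['id'] for div in divs_with_id]

print(ids)  # ['tags', 'other']
```

The CSS form soup.select('div[id]') should select the same elements; to read from disk, pass an open file object to BeautifulSoup instead of a string.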
