How do I escape forward slashes in XPath?

Asked 2025-04-17 05:29

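How do I handle slash characters in an XPath query? My tags contain a URL, so I need to be able to do this. I'm using Python's lxml library.

Also, can XPath match a substring of a name in the path? Here is an example: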

xml="""
<entry xmlns="http://www.w3.org/2005/Atom" xmlns:gsa="http://schemas.google.com/gsa/2007">
  <gsa:content name="reportName">bbb</gsa:content>
  <gsa:content name="collectionName">default_collection</gsa:content>
  <gsa:content name="reportDate">date_3_25_2009</gsa:content>
 </entry>
"""

When I run the following code:

from lxml.etree import fromstring

tree = fromstring(xml)
for elt in tree.xpath('//*'):
    elt.tag  # echoed by the interactive interpreter

it returns:

'{http://www.w3.org/2005/Atom}entry'
'{http://schemas.google.com/gsa/2007}content'
'{http://schemas.google.com/gsa/2007}content'
'{http://schemas.google.com/gsa/2007}content'

Running tree.xpath('/entry') returns an empty list.

I need to be able to query for '{http://www.w3.org/2005/Atom}entry' as the tag, or for 'entry' anywhere in the tag.


1 Answer

Take a look at the documentation on namespace prefixes.

To find an element that belongs to the http://schemas.google.com/gsa/2007 namespace, you need to search like this:

import lxml.etree as et

xml="""
<entry xmlns="http://www.w3.org/2005/Atom" xmlns:gsa="http://schemas.google.com/gsa/2007">
  <gsa:content name="reportName">bbb</gsa:content>
  <gsa:content name="collectionName">default_collection</gsa:content>
  <gsa:content name="reportDate">date_3_25_2009</gsa:content>
 </entry>
"""

NS = {'rootns': 'http://www.w3.org/2005/Atom',
      'gsa': 'http://schemas.google.com/gsa/2007'}

tree = et.fromstring(xml)

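# Prefixes in the expression are resolved through the NS mapping
for el in tree.xpath('//gsa:content', namespaces=NS):
    print(el.attrib['name'])

# The default (Atom) namespace needs a prefix too; 'rootns' is an arbitrary label
print(len(tree.xpath('//rootns:entry', namespaces=NS)))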
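
For the other two parts of the question, here is a rough sketch rather than a definitive recipe. It assumes the same sample document as above. The variable names (by_clark, by_uri, by_local, by_substring) and the search fragment 'cont' are made up for illustration. The idea: the '{uri}tag' form is accepted by findall() and iter() but not by xpath(), and inside an XPath string literal the slashes of a URL need no escaping at all. The XPath functions namespace-uri(), local-name() and contains() cover matching by namespace, by bare tag name, and by substring.

```python
import lxml.etree as et

xml = """
<entry xmlns="http://www.w3.org/2005/Atom" xmlns:gsa="http://schemas.google.com/gsa/2007">
  <gsa:content name="reportName">bbb</gsa:content>
  <gsa:content name="collectionName">default_collection</gsa:content>
  <gsa:content name="reportDate">date_3_25_2009</gsa:content>
</entry>
"""

tree = et.fromstring(xml)

# 1. The '{uri}tag' form (as shown by elt.tag) works with findall()/iter().
#    findall() with a bare tag looks at direct children of the root only.
by_clark = tree.findall('{http://schemas.google.com/gsa/2007}content')
print(len(by_clark))  # 3

# 2. In xpath(), a URL inside a quoted string literal needs no escaping.
by_uri = tree.xpath("//*[namespace-uri() = 'http://www.w3.org/2005/Atom']")
print([el.tag for el in by_uri])  # ['{http://www.w3.org/2005/Atom}entry']

# 3. local-name() compares the tag name without its namespace.
by_local = tree.xpath("//*[local-name() = 'entry']")
print(len(by_local))  # 1

# 4. contains() does a substring match on the local name.
by_substring = tree.xpath("//*[contains(local-name(), 'cont')]")
print([el.get('name') for el in by_substring])
# ['reportName', 'collectionName', 'reportDate']
```

The prefix mapping shown earlier is usually the cleaner choice when the namespaces are known up front. The local-name() variants are mainly handy when they are not, at the cost of matching same-named elements from any namespace.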

Answered 2025-04-17 by Python大师
