Getting all nested child elements inside an XML tag in Python

<html> <body> <c> <winforms> <type-conversion> <opacity> </opacity> </type-conversion> </winforms> </c> </body> </html> <html> <body> <css> <css3> <internet-explorer-7> </internet-explorer-7> </css3> </css> </body> </html> <html> <body> <c> <code-generation> <j> <visualj> </visualj> </j> </code-generation> </c> </body> </html>

1 answer

User

#1 · Posted 2024-06-02 06:59:18

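First, the XML specification allows only one root element per document. If this is the actual XML, you need to wrap it in a temporary root element before parsing.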
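As a minimal sketch of why the wrapper is needed: the shortened `fragment` below is a made-up stand-in for the posted data, and the wrapper name `root` is arbitrary. Parsing the fragment directly fails, while the wrapped version parses.

```python
from xml.etree import ElementTree as ET

# Two sibling top-level elements: not a well-formed XML document on its own.
# (Shortened, hypothetical stand-in for the data in the question.)
fragment = '<html><body><c/></body></html><html><body><css/></body></html>'

try:
    ET.fromstring(fragment)
    parsed_unwrapped = True
except ET.ParseError:
    # The parser rejects any content after the first root element closes.
    parsed_unwrapped = False

# Wrapping everything in a temporary root element makes it parseable.
wrapped = ET.fromstring('<root>' + fragment + '</root>')
top_level_tags = [child.tag for child in wrapped]

print(parsed_unwrapped, top_level_tags)
```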

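Now, with well-formed XML, you can parse it with xml.etree and use the simple XPath expression .//body//* to select every element inside the <body> elements, whether a direct child or nested deeper: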

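from xml.etree import ElementTree as et

raw = '''xml string as posted in the question'''
# Wrap the fragments in a temporary root so the input is well-formed XML.
root = et.fromstring('<root>' + raw + '</root>')

# './/body//*' matches every descendant of each <body>;
# './/body/*' would match only the direct children.
target_elements = root.findall('.//body//*')

result = [t.tag for t in target_elements]
print(result)
# output :
# ['c', 'winforms', 'type-conversion', 'opacity', 'css', 'css3', 'internet-explorer-7', 'c', 'code-generation', 'j', 'visualj']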
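If the tags need to stay grouped per <body> rather than flattened into one list, a variant using Element.iter() may help. This is only a sketch: the helper name nested_tags and the shortened sample data are assumptions for illustration, not part of the original answer.

```python
from xml.etree import ElementTree as ET

# Shortened, hypothetical sample in the same shape as the question's data.
raw = (
    '<html><body><c><winforms><type-conversion><opacity/>'
    '</type-conversion></winforms></c></body></html>'
    '<html><body><css><css3><internet-explorer-7/></css3></css></body></html>'
)
root = ET.fromstring('<root>' + raw + '</root>')


def nested_tags(element):
    """Return the tags of all descendants of element, in document order."""
    # Element.iter() yields the element itself first, so skip it.
    return [node.tag for node in element.iter() if node is not element]


# One list of descendant tags per <body> element.
tags_per_body = [nested_tags(body) for body in root.iter('body')]
print(tags_per_body)
```

Each inner list corresponds to one <body>, so it stays clear which document each tag came from.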
