Parsing XML with escaped angle brackets in Python

<?xml version="1.0" encoding="UTF-8"?>
<Group id="RHEL-07-010010">
  <title>SRG-OS-000257-GPOS-00098</title>
  <description>&lt;GroupDescription&gt;&lt;/GroupDescription&gt; </description>
  <Rule id="RHEL-07-010010_rule" severity="high" weight="10.0">
    <version>RHEL-07-010010</version>
    <title>The file permissions, ownership, and group membership of system files and commands must match the vendor values.</title>
    <description>&lt;VulnDiscussion&gt;Discretionary access control is weakened if a user or group has access permissions to system files and directories greater than the default. Satisfies: SRG-OS-000257-GPOS-00098, SRG-OS-000278 GPOS-00108&lt;/VulnDiscussion&gt;
    </description>
  </Rule>
</Group>

    newtree = ET.fromstring(unescapedXML)
  File "/usr/local/Cellar/python/2.7.11/Frameworks/Python.framework/Versions/2.7/lib/python2.7/xml/etree/ElementTree.py", line 1300, in XML
    parser.feed(text)
  File "/usr/local/Cellar/python/2.7.11/Frameworks/Python.framework/Versions/2.7/lib/python2.7/xml/etree/ElementTree.py", line 1640, in feed
    self._parser.Parse(data, 0)
TypeError: must be string or read-only buffer, not Element

1 Answer

Answer #1 · posted 2024-06-08 01:28:57

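I recommend using lxml instead of the standard library's xml module; it is more robust and more convenient. It also unescapes escaped characters in text content automatically when it parses the document, so the text of each description element comes back as plain markup that can be parsed a second time. XPath makes the lookups easier as well.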

from lxml import etree as ET

xml = ET.XML(b"""<?xml version="1.0" encoding="UTF-8"?>
<Group id="RHEL-07-010010">
    <title>SRG-OS-000257-GPOS-00098</title>
    <description>&lt;GroupDescription&gt;&lt;/GroupDescription&gt;    </description>
    <Rule id="RHEL-07-010010_rule" severity="high" weight="10.0">
      <version>RHEL-07-010010</version>
      <title>The file permissions, ownership, and group membership of system files and commands must match the vendor values.</title>
      <description>&lt;VulnDiscussion&gt;Discretionary access control is weakened if a user or group has access permissions to system files and directories greater than the default.

Satisfies: SRG-OS-000257-GPOS-00098, SRG-OS-000278 GPOS-00108&lt;/VulnDiscussion&gt;
      </description>
   </Rule>
 </Group>""")

for description in xml.xpath('//description/text()'):
    vulnDiscussion = next(iter(ET.XML(description).xpath('/VulnDiscussion/text()')), None)
    print(vulnDiscussion)

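The code above prints:

None
Discretionary access control is weakened if a user or group has access permissions to system files and directories greater than the default.

Satisfies: SRG-OS-000257-GPOS-00098, SRG-OS-000278 GPOS-00108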
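For what it's worth, the traceback in the question suggests that ET.fromstring was handed an Element rather than a string. Here is a minimal sketch of the same idea using only the standard library, assuming the goal is to read the text inside the escaped VulnDiscussion element. The helper name inner_text and the shortened sample document are made up for illustration and are not from the question.

```python
import xml.etree.ElementTree as ET

# Shortened, hypothetical sample in the same shape as the question's document.
SAMPLE = """<Group id="RHEL-07-010010">
  <description>&lt;GroupDescription&gt;&lt;/GroupDescription&gt;</description>
  <Rule id="RHEL-07-010010_rule">
    <description>&lt;VulnDiscussion&gt;Example discussion text.&lt;/VulnDiscussion&gt;</description>
  </Rule>
</Group>"""


def inner_text(description, tag):
    """Parse the markup stored in description.text; return the root's text if the root is `tag`."""
    # .text is a string that the parser has already unescaped.
    raw = (description.text or "").strip()
    if not raw:
        return None
    try:
        # Pass the string, not the Element itself, to fromstring.
        inner = ET.fromstring(raw)
    except ET.ParseError:
        # The embedded fragment is not well-formed XML on its own.
        return None
    return inner.text if inner.tag == tag else None


root = ET.fromstring(SAMPLE)
results = [inner_text(d, "VulnDiscussion") for d in root.iter("description")]
print(results)
```

The key point is that description.text already contains real angle brackets by the time you read it, so no separate unescaping step is needed before the second parse.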
