Why does ElementTree eat/ignore namespaces (in attribute values)?

<?xml version='1.0' encoding='UTF-8'?>
<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'
           xmlns='sdformat/pose'
           targetNamespace='sdformat/pose'
           xmlns:pose='sdformat/pose'
           xmlns:types='http://sdformat.org/schemas/types.xsd'>
  <xs:import namespace='sdformat/pose' schemaLocation='./pose.xsd'/>
  <xs:element name='pose' type='poseType' />
  <xs:simpleType name='string'><xs:restriction base='xs:string' /></xs:simpleType>
  <xs:simpleType name='pose'><xs:restriction base='types:pose' /></xs:simpleType>
  <xs:complexType name='poseType'>
    <xs:simpleContent>
      <xs:extension base="pose">
        <xs:attribute name='relative_to' type='string' use='optional' default=''>
        </xs:attribute>
      </xs:extension>
    </xs:simpleContent>
  </xs:complexType>
</xs:schema>

from xml.etree import ElementTree

ElementTree.register_namespace("types", "http://sdformat.org/schemas/types.xsd")
ElementTree.register_namespace("pose", "sdformat/pose")
ElementTree.register_namespace("xs", "http://www.w3.org/2001/XMLSchema")

tree = ElementTree.parse("test.xsd")
tree.write("test_out.xsd")

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" targetNamespace="sdformat/pose">
  <xs:import namespace="sdformat/pose" schemaLocation="./pose.xsd" />
  <xs:element name="pose" type="poseType" />
  <xs:simpleType name="string"><xs:restriction base="xs:string" /></xs:simpleType>
  <xs:simpleType name="pose"><xs:restriction base="types:pose" /></xs:simpleType>
  <xs:complexType name="poseType">
    <xs:simpleContent>
      <xs:extension base="pose">
        <xs:attribute name="relative_to" type="string" use="optional" default="">
        </xs:attribute>
      </xs:extension>
    </xs:simpleContent>
  </xs:complexType>
</xs:schema>

from xml.etree import ElementTree

namespaces = {
    "types": "http://sdformat.org/schemas/types.xsd",
    "pose": "sdformat/pose",
    "xs": "http://www.w3.org/2001/XMLSchema",
}

for prefix, ns in namespaces.items():
    ElementTree.register_namespace(prefix, ns)

tree = ElementTree.parse("test.xsd")
root = tree.getroot()

# Walk every element and re-declare, on the root, any registered prefix
# that shows up in an attribute value.
queue = [root]
while queue:
    element: ElementTree.Element = queue.pop()
    # Copy the values first: root.attrib may be modified inside the loop.
    for value in list(element.attrib.values()):
        prefix, sep, _ = value.partition(":")
        if not sep or prefix not in namespaces:
            continue  # no known prefix, nothing to do
        if prefix == "xs":
            continue  # ElementTree already declares the XMLSchema namespace
        root.attrib[f"xmlns:{prefix}"] = namespaces[prefix]
    queue.extend(element)  # visit the children next

tree.write("test_out.xsd")

1 Answer

User

#1 · Posted 2024-04-28 05:45:51

There is a sound reason for this behavior, but seeing it requires a good understanding of XML Schema concepts.

First, some important facts:

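Your XML document is not just any old XML document. It is an XSD.
An XSD is itself described by a schema (see the schema for schemas).
The attribute xs:restriction/@base is not an xs:string. Its type is xs:QName.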

Based on those facts, we can assert the following:

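If test.xsd is parsed as a plain XML document, with no knowledge of the schema for schemas, the value of the base attribute is treated as a string (technically, CDATA).
If test.xsd is parsed by a validating XML parser, with the schema for schemas supplied as its XSD, the value of the base attribute is parsed as an xs:QName.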
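The first case can be observed directly in ElementTree. The snippet below is a minimal sketch using a trimmed-down copy of the schema from the question; the variable names are illustrative only.

```python
from xml.etree import ElementTree

# Trimmed-down version of test.xsd from the question.
XSD = """\
<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'
           xmlns:types='http://sdformat.org/schemas/types.xsd'>
  <xs:simpleType name='pose'>
    <xs:restriction base='types:pose'/>
  </xs:simpleType>
</xs:schema>
"""

root = ElementTree.fromstring(XSD)
restriction = next(root.iter("{http://www.w3.org/2001/XMLSchema}restriction"))

# Element names are namespace-resolved by the parser ...
print(restriction.tag)          # {http://www.w3.org/2001/XMLSchema}restriction

# ... but the attribute value stays an opaque string; the "types" prefix
# inside it is never looked up.
print(repr(restriction.get("base")))  # 'types:pose'

# On output, only namespaces used by element/attribute names are declared.
serialized = ElementTree.tostring(root, encoding="unicode")
print("xmlns:types" in serialized)    # False
```

Since nothing in the tree refers to the types namespace by name, the serializer sees no reason to declare it.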

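When ElementTree writes the output XML, its behavior should depend on the data type of base. If base is a QName, ElementTree should detect that it uses the namespace prefix "types" and emit the corresponding namespace declaration.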
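ElementTree does behave this way once it is told that a value is a QName: attribute values wrapped in its QName class are namespace-resolved on output. The following is only a sketch of that idea. QNAME_ATTRIBUTES, PREFIXES and promote_qnames are hypothetical names, the attribute list is a hand-picked subset rather than the full set defined by the schema for schemas, and unprefixed values are left untouched.

```python
from xml.etree import ElementTree

# Assumption: a small, hand-picked subset of the attributes that the
# schema for schemas types as xs:QName.
QNAME_ATTRIBUTES = {"base", "type", "ref"}

# Assumption: the prefix -> URI map is known up front (values from the question).
PREFIXES = {
    "types": "http://sdformat.org/schemas/types.xsd",
    "xs": "http://www.w3.org/2001/XMLSchema",
}

XSD = """\
<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'
           xmlns:types='http://sdformat.org/schemas/types.xsd'>
  <xs:simpleType name='string'><xs:restriction base='xs:string'/></xs:simpleType>
  <xs:simpleType name='pose'><xs:restriction base='types:pose'/></xs:simpleType>
</xs:schema>
"""


def promote_qnames(root, prefixes):
    """Replace 'prefix:local' strings in QName-typed attributes with QName objects."""
    for element in root.iter():
        for name, value in list(element.attrib.items()):
            if name not in QNAME_ATTRIBUTES:
                continue
            prefix, sep, local = value.partition(":")
            if sep and prefix in prefixes:
                element.set(name, ElementTree.QName(prefixes[prefix], local))


# Keep the document's own prefixes instead of generated ones (ns0, ns1, ...).
for prefix, uri in PREFIXES.items():
    ElementTree.register_namespace(prefix, uri)

root = ElementTree.fromstring(XSD)
promote_qnames(root, PREFIXES)
output = ElementTree.tostring(root, encoding="unicode")
print(output)
```

With the values promoted, the serializer declares xmlns:types on the root because a QName in the tree now refers to that namespace. The name attributes are deliberately skipped, since they are plain names rather than QNames.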

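If you do not supply the schema for schemas when parsing test.xsd, ElementTree is off the hook, because it cannot possibly know that base should be interpreted as a QName. With xml.etree.ElementTree that is always the situation: it is built on the non-validating Expat parser and cannot load a schema.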
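Given that limitation, one possible workaround is to stop relying on ElementTree to infer the declarations and instead record them during parsing and put them back before writing. The sketch below uses iterparse with "start-ns" events for this. parse_keeping_declarations and restore_declarations are hypothetical helper names, and the code assumes that each prefix is bound to a single URI throughout the document and that every element name carries a prefixed namespace.

```python
import io
from xml.etree import ElementTree

XSD = b"""\
<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'
           xmlns='sdformat/pose'
           xmlns:types='http://sdformat.org/schemas/types.xsd'
           targetNamespace='sdformat/pose'>
  <xs:simpleType name='pose'>
    <xs:restriction base='types:pose'/>
  </xs:simpleType>
</xs:schema>
"""


def parse_keeping_declarations(source):
    """Parse source; return the root and a list of (prefix, uri) declarations."""
    declarations = []
    root = None
    for event, payload in ElementTree.iterparse(source, events=("start-ns", "start")):
        if event == "start-ns":
            declarations.append(payload)  # (prefix, uri); prefix is "" for the default
        elif root is None:
            root = payload  # the first "start" event belongs to the root element
    return root, declarations


def restore_declarations(root, declarations):
    """Re-declare on the root every namespace ElementTree would otherwise drop."""
    # Namespaces used by element or attribute names are declared by
    # ElementTree itself; adding them again would duplicate the attribute.
    used = set()
    for element in root.iter():
        for name in [element.tag, *element.keys()]:
            if isinstance(name, str) and name.startswith("{"):
                used.add(name[1:].partition("}")[0])
    for prefix, uri in declarations:
        if uri in used:
            if prefix:
                # Make the serializer reuse the document's own prefix.
                ElementTree.register_namespace(prefix, uri)
        else:
            root.set(f"xmlns:{prefix}" if prefix else "xmlns", uri)


root, declarations = parse_keeping_declarations(io.BytesIO(XSD))
restore_declarations(root, declarations)
output = ElementTree.tostring(root, encoding="unicode")
print(output)
```

Unlike the loop in the question, this does not need a hard-coded prefix table and also restores the default namespace, which unprefixed QName values such as type='poseType' depend on.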
