xs:dateTime in XSD validation complains about whitespace (lxml)
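I'm trying to validate a document against an XSD, but lxml complains about the whitespace around a dateTime value (even though it should handle that whitespace automatically). I'm not sure whether this is a bug or a problem with what I've specified in the XSD. I've already spent an hour debugging this, so I'm hoping someone has run into something similar before.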
======================================================================
ERROR [0.076s]: test_exports (disqus.importer.tests.tests.SchemaValidation)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/Users/dcramer/Development/disqus/disqus/importer/tests/tests.py", line 1098, in test_exports
xsd.assertValid(export)
File "lxml.etree.pyx", line 2659, in lxml.etree._Validator.assertValid (src/lxml/lxml.etree.c:99498)
DocumentInvalid: Element '{http://disqus.com}createdAt': '
2008-06-10T01:32:08
' is not a valid value of the atomic type 'xs:dateTime'., line 8
Sample XML:
<?xml version="1.0" encoding="utf-8"?>
<disqus xmlns="http://disqus.com" xmlns:dsq="http://disqus.com/disqus-internals" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://disqus.com/api/schemas/1.0/disqus.xsd http://disqus.com/api/schemas/1.0/disqus-internals.xsd">
<post dsq:id="1">
<id />
<message>
<![CDATA["We want happy paintings. Happy paintings. If you want sad things, watch the news."]]>
</message>
<createdAt>
2008-06-10T01:32:08
</createdAt>
<author>
<email>
bob@ross.com
</email>
<name>
bobross
</name>
<isAnonymous>
true
</isAnonymous>
<username>
bobross
</username>
</author>
<ipAddress>
127.0.0.1
</ipAddress>
<thread dsq:id="1"/>
</post>
</disqus>
disqus.xsd:
<?xml version="1.0"?>
<xs:schema targetNamespace="http://disqus.com"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:dsq="http://disqus.com/disqus-internals"
xmlns="http://disqus.com"
elementFormDefault="qualified"
>
<!-- import the dsq namespace -->
<xs:import namespace="http://disqus.com/disqus-internals"
schemaLocation="internals.xsd"/>
<!-- misc types -->
<xs:simpleType name="identifier">
<xs:restriction base="xs:string">
<xs:maxLength value="200"/>
</xs:restriction>
</xs:simpleType>
<!-- root disqus element -->
<xs:element name="disqus">
<xs:complexType>
<xs:sequence>
<xs:element name="category" type="category" minOccurs="0" maxOccurs="unbounded"/>
<xs:element name="thread" type="thread" minOccurs="0" maxOccurs="unbounded"/>
<xs:element name="post" type="post" minOccurs="0" maxOccurs="unbounded"/>
</xs:sequence>
</xs:complexType>
</xs:element>
<!-- category element -->
<xs:complexType name="category">
<xs:all minOccurs="0">
<xs:element name="forum" type="xs:string">
<xs:unique name="categoryID">
<xs:selector xpath="category"/>
<xs:field xpath="@title"/>
</xs:unique>
</xs:element>
<xs:element name="title" type="xs:string"/>
</xs:all>
<xs:attribute ref="dsq:id"/>
</xs:complexType>
<!-- thread element -->
<xs:complexType name="thread">
<xs:all minOccurs="0">
<xs:element name="id" type="identifier" minOccurs="0">
<xs:unique name="threadID">
<xs:selector xpath="thread"/>
<xs:field xpath="@id"/>
</xs:unique>
</xs:element>
<xs:element name="forum" type="xs:string"/>
<xs:element name="category">
<xs:complexType>
<xs:simpleContent>
<xs:extension base="xs:string">
<xs:attribute ref="dsq:id"/>
</xs:extension>
</xs:simpleContent>
</xs:complexType>
</xs:element>
<xs:element name="link" type="xs:anyURI"/>
<xs:element name="title" type="xs:string"/>
<xs:element name="message" type="xs:string" minOccurs="0"/>
<xs:element name="author" type="author" minOccurs="0"/>
<xs:element name="createdAt" type="xs:dateTime"/>
<xs:element name="isClosed" type="xs:boolean" default="false" minOccurs="0"/>
<xs:element name="isDeleted" type="xs:boolean" default="false" minOccurs="0"/>
</xs:all>
<xs:attribute ref="dsq:id"/>
</xs:complexType>
<!-- post element -->
<xs:complexType name="post">
<xs:all minOccurs="0">
<xs:element name="id" type="identifier" minOccurs="0">
<xs:unique name="postID">
<xs:selector xpath="post"/>
<xs:field xpath="@id"/>
</xs:unique>
</xs:element>
<xs:element name="parent" minOccurs="0">
<xs:complexType>
<xs:simpleContent>
<xs:extension base="identifier">
<xs:attribute ref="dsq:id"/>
</xs:extension>
</xs:simpleContent>
</xs:complexType>
</xs:element>
<xs:element name="thread">
<xs:complexType>
<xs:simpleContent>
<xs:extension base="identifier">
<xs:attribute ref="dsq:id"/>
</xs:extension>
</xs:simpleContent>
</xs:complexType>
</xs:element>
<xs:element name="author" type="author" minOccurs="0"/>
<xs:element name="message" type="xs:string"/>
<xs:element name="ipAddress" type="xs:string" minOccurs="0"/>
<xs:element name="createdAt" type="xs:dateTime"/>
<!-- post boolean states -->
<xs:element name="isDeleted" type="xs:boolean" default="false" minOccurs="0"/>
<xs:element name="isApproved" type="xs:boolean" default="true" minOccurs="0"/>
<xs:element name="isFlagged" type="xs:boolean" default="false" minOccurs="0"/>
<xs:element name="isSpam" type="xs:boolean" default="false" minOccurs="0"/>
<xs:element name="isHighlighted" type="xs:boolean" default="false" minOccurs="0"/>
</xs:all>
<xs:attribute ref="dsq:id"/>
</xs:complexType>
<!-- author element -->
<xs:complexType name="author">
<xs:all minOccurs="0">
<xs:element name="name" type="xs:string"/>
<xs:element name="email" type="xs:string"/>
<xs:element name="link" type="xs:anyURI" minOccurs="0"/>
<xs:element name="username" type="xs:string" minOccurs="0"/>
<xs:element name="isAnonymous" type="xs:boolean" default="true" minOccurs="0"/>
</xs:all>
<xs:attribute ref="dsq:id"/>
</xs:complexType>
</xs:schema>
1 Answer
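It looks like the whitespace may be what is causing the problem. Can you remove the whitespace around the createdAt value, so that it becomes
<createdAt>2008-06-10T01:32:08</createdAt>
and see what happens? If that fixes it and you are the one generating the XML, change the generation so that it emits no whitespace there. Otherwise, if you control the schema, you could try setting the xs:whiteSpace facet to "collapse" and see whether that helps. (For xs:dateTime the XML Schema specification already fixes this facet to collapse, so a conforming validator should be trimming the value on its own.)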
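If you cannot change how the XML is generated, one workaround is to trim the values yourself before validating. Here is a minimal sketch of that idea using only the standard library's xml.etree.ElementTree; the helper name strip_leaf_text and the shortened sample document are my own, not part of your code, and I have not run this against your schema. Note that it also trims text such as the message body, which may or may not be what you want.

```python
import xml.etree.ElementTree as ET


def strip_leaf_text(root):
    """Trim surrounding whitespace from the text of elements without children.

    Hypothetical helper for illustration; it modifies the tree in place
    and returns the same root element.
    """
    for elem in root.iter():
        if len(elem) == 0 and elem.text is not None:
            elem.text = elem.text.strip()
    return root


# Shortened stand-in for the export in the question (assumption).
SAMPLE = """<disqus xmlns="http://disqus.com">
  <post>
    <createdAt>
      2008-06-10T01:32:08
    </createdAt>
  </post>
</disqus>"""

root = strip_leaf_text(ET.fromstring(SAMPLE))
created_at = root.find(".//{http://disqus.com}createdAt")
print(repr(created_at.text))
```

The same loop should carry over to a tree parsed with lxml.etree, since it exposes the same iter(), text and len() behaviour on elements, so you could run it on the export right before calling xsd.assertValid(export).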
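Another possibility is that time zone information is required. The value should follow the pattern [-]CCYY-MM-DDThh:mm:ss[Z|(+|-)hh:mm]; the time zone is optional, but you could try appending a 'Z' to see whether that resolves it. That is what this post suggests.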
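To check a value against that pattern outside the validator, something like the following rough sketch may help. The regular expression is a simplified approximation of the xs:dateTime lexical form (it does not range-check months, days or hours), and the names collapse and looks_like_datetime are made up for this example.

```python
import re

# Simplified approximation of [-]CCYY-MM-DDThh:mm:ss[.s+][Z|(+|-)hh:mm].
# Assumption: digit counts only, no range checks on the individual fields.
DATETIME_RE = re.compile(
    r"-?\d{4,}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(\.\d+)?(Z|[+-]\d{2}:\d{2})?"
)


def collapse(value):
    """Mimic whiteSpace="collapse": trim the ends, squeeze inner runs to one space."""
    return " ".join(value.split())


def looks_like_datetime(value):
    """Return True if the collapsed value matches the simplified pattern."""
    return DATETIME_RE.fullmatch(collapse(value)) is not None


print(looks_like_datetime("\n2008-06-10T01:32:08\n"))  # value from the error message
print(looks_like_datetime("2008-06-10T01:32:08Z"))     # same value with a time zone
print(looks_like_datetime("2008-06-10 01:32:08"))      # space instead of 'T'
```

If the value from your document passes once it has been collapsed, both with and without the 'Z', that would suggest the time zone is not what the validator is objecting to.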