IOError when using the lxml etree parse function
My logic looks like this:
    for root, dirs, files in os.walk(os.getcwd()):
        if "info.xml" in files:
            root = lxml.etree.parse("%s/info.xml" % root)
            tag = root.xpath("/info/tagname")[0].text
When I try to parse an info.xml file that sits deep in the directory tree, I get this error:
Traceback (most recent call last):
File "/home/work/mergefile.py", line 365, in <module>
File "/home/work/mergefile.py", line 344, in merge_ejb_files
File "/home/work/mergefile.py", line 63, in __init__
File "/home/work/mergefile.py", line 78, in _parse_info2doc
File "lxml.etree.pyx", line 2698, in lxml.etree.parse (src/lxml/lxml.etree.c:49590)
File "parser.pxi", line 1491, in lxml.etree._parseDocument (src/lxml/lxml.etree.c:71205)
File "parser.pxi", line 1520, in lxml.etree._parseDocumentFromURL (src/lxml/lxml.etree.c:71488)
File "parser.pxi", line 1420, in lxml.etree._parseDocFromFile (src/lxml/lxml.etree.c:70583)
File "parser.pxi", line 975, in lxml.etree._BaseParser._parseDocFromFile (src/lxml/lxml.etree.c:67736)
File "parser.pxi", line 539, in lxml.etree._ParserContext._handleParseResultDoc (src/lxml/lxml.etree.c:63820)
File "parser.pxi", line 625, in lxml.etree._handleParseResult (src/lxml/lxml.etree.c:64741)
File "parser.pxi", line 563, in lxml.etree._raiseParseError (src/lxml/lxml.etree.c:64056)
IOError: Error reading file '/home/work/ci/case/dc_daily/dc/213577/223922/223958/792536/info.xml': failed to load external entity "/home/work/ci/case/dc_daily/dc/213577/223922/223958/792536/info.xml"
But the file "/home/work/ci/case/dc_daily/dc/213577/223922/223958/792536/info.xml"
does exist, and I can parse it successfully with lxml in IPython.
Do you know what the problem is? If so, please help. Thanks!
1 Answer
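Here is my solution, along the lines I described above. I open each file for reading myself and close it right away, so the process never goes over the 1024 open-file limit (1024 is the common default per-process soft limit on Linux).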
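    import os
    import lxml.etree as etree

    for root, dirs, files in os.walk(os.getcwd()):
        if "info.xml" in files:
            # Binary mode lets lxml detect the encoding; the with block closes the file right away.
            with open(os.path.join(root, "info.xml"), "rb") as processfile:
                xml = etree.parse(processfile)
            # Query the parsed tree, not the directory path string held in root.
            tag = xml.xpath("/info/tagname")[0].text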
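
If you want to check whether the open-file limit really is what you are hitting, something like the sketch below may help. It is not part of the original script: the helper names open_file_limit and count_open_fds are made up for illustration, the resource module is Unix-only, and counting entries in /proc/self/fd assumes Linux.

```python
import os
import resource  # Unix-only standard library module


def open_file_limit():
    """Return the (soft, hard) limit on open file descriptors for this process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def count_open_fds():
    """Count descriptors currently open in this process (assumes Linux /proc)."""
    return len(os.listdir("/proc/self/fd"))


if __name__ == "__main__":
    soft, hard = open_file_limit()
    print("soft limit: %s, hard limit: %s" % (soft, hard))
    print("descriptors open now: %d" % count_open_fds())
```

Calling count_open_fds() inside the os.walk loop would show whether the number keeps climbing toward the soft limit. If it does, descriptors are leaking somewhere, and closing each file explicitly as in the solution above should keep the count flat.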