How to find a specific tag in an XML file and access its parent tag using Python and minidom

<root>
  <article>
    <number> 0 </number>
    <DOI> 10.1016/B978-0-12-381015-1.00004-6 </DOI>
    <title> The patagonian toothfish biology, ecology and fishery. </title>
    <abstract> lots of abstract text </abstract>
  </article>
  <article> ...All the article tags as shown above... </article>
</root>

2 Answers

User

Answer 1 · edited 2024-05-16 11:28:01

Is minidom a hard requirement? This would be very easy to parse with lxml and XPath.

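from lxml import etree
# Read as bytes: lxml rejects str input that carries an XML encoding declaration
datasource = open('/Users/philgw/Dropbox/PW-Honours-Project/Code/processed.xml', 'rb').read()
tree = etree.fromstring(datasource)
# If the DOI text really is padded with spaces, use normalize-space(DOI) in the predicate instead
path = tree.xpath('//article[DOI="10.1016/B978-0-12-381015-1.00004-6"]')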

This returns a list of the article elements with the specified DOI.

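Also, there seem to be spaces between the tags and their text. I don't know whether that is just Stack Overflow formatting, but it could be why you can't get a match with minidom.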
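To illustrate the whitespace point, here is a minimal sketch using minidom from the standard library. The `SAMPLE` string is a hypothetical one-article document that mirrors the question's layout, including the spaces around the DOI value; the real file may or may not contain that padding.

```python
from xml.dom import minidom

# Hypothetical sample mirroring the question's layout,
# including the spaces around the DOI value.
SAMPLE = "<root><article><DOI> 10.1016/B978-0-12-381015-1.00004-6 </DOI></article></root>"

doc = minidom.parseString(SAMPLE)
doi_node = doc.getElementsByTagName("DOI")[0]

raw_text = doi_node.firstChild.data   # text exactly as stored, padding included
clean_text = raw_text.strip()         # padding removed

target = "10.1016/B978-0-12-381015-1.00004-6"
print(raw_text == target)    # False: the stored text has a leading and trailing space
print(clean_text == target)  # True
```

If the padding is present in the real file, stripping the text before comparing is enough to make an exact match work.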

User

Answer 2 · edited 2024-05-16 11:28:01

IMHO, just look it up in the Python docs! Try this (untested):

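from xml.dom import minidom

# datasource is the path to the XML file (or an open file object)
xmldoc = minidom.parse(datasource)

def get_xmltext(parent, subnode_name):
    # Serialize the children of the first matching descendant and
    # strip the padding whitespace around the value
    node = parent.getElementsByTagName(subnode_name)[0]
    return "".join(ch.toxml() for ch in node.childNodes).strip()

# Keep only the <article> elements whose DOI matches
matchingNodes = [node for node in xmldoc.getElementsByTagName("article")
                 if get_xmltext(node, "DOI") == '10.1016/B978-0-12-381015-1.00004-6']

for node in matchingNodes:
    print("title:", get_xmltext(node, "title"))
    print("abstract:", get_xmltext(node, "abstract"))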
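The question title asks about going from a tag to its parent. One way to do that with minidom is the `parentNode` attribute: find the matching `DOI` element first, then step up to the enclosing `article`. The sketch below assumes the layout from the question; `SAMPLE`, the second article's DOI, and the helper names `node_text` and `find_parent_by_child_text` are made up for illustration.

```python
from xml.dom import minidom

# Hypothetical two-article sample in the question's layout.
SAMPLE = """<root>
  <article>
    <number> 0 </number>
    <DOI> 10.1016/B978-0-12-381015-1.00004-6 </DOI>
    <title> The patagonian toothfish biology, ecology and fishery. </title>
  </article>
  <article>
    <number> 1 </number>
    <DOI> 10.0000/hypothetical-doi </DOI>
    <title> Another article </title>
  </article>
</root>"""

def node_text(node):
    """Concatenate the text children of `node` and strip padding whitespace."""
    return "".join(ch.data for ch in node.childNodes
                   if ch.nodeType == ch.TEXT_NODE).strip()

def find_parent_by_child_text(doc, child_tag, wanted):
    """Return the parent of the first <child_tag> whose text equals `wanted`, or None."""
    for child in doc.getElementsByTagName(child_tag):
        if node_text(child) == wanted:
            return child.parentNode
    return None

doc = minidom.parseString(SAMPLE)
article = find_parent_by_child_text(doc, "DOI", "10.1016/B978-0-12-381015-1.00004-6")
title = node_text(article.getElementsByTagName("title")[0])
print(article.tagName, "->", title)
```

Starting from the child and walking up avoids scanning every `article` by hand, and the `None` return makes a missing DOI easy to detect before touching `article`.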
