Extracting child nodes with minidom

Posted 2024-04-29 08:27:35


I have some XML:

<sentence id="1086415:2">
 <text>$6 and there is much tasty food, all of it fresh and continually refilled.</text>
  <Opinions>
   <Opinion to="31" from="27" polarity="positive" category="FOOD#STYLE_OPTIONS" target="food"/>
   <Opinion to="31" from="27" polarity="positive" category="FOOD#QUALITY" target="food"/>
   <Opinion to="31" from="27" polarity="positive" category="FOOD#PRICES" target="food"/>
  </Opinions>
</sentence>
<sentence id="1086415:3">
 <text>I am not a vegetarian but, almost all the dishes were great.</text>
  <Opinions>
   <Opinion to="48" from="42" polarity="positive" category="FOOD#QUALITY" target="dishes"/>
  </Opinions>
</sentence>

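I am trying to extract everything inside the Opinion tags and combine it with the sentence text in a tuple. How can I do this with minidom? At the moment, opinion comes back as '\n'.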

from xml.dom import minidom
xmldoc = minidom.parse("ABSA16_Restaurants_Train_SB1_v2.xml")
sentences = xmldoc.getElementsByTagName("sentence")
for sentence in sentences:
   text = sentence.getElementsByTagName("text")[0].firstChild.data
   opinion = sentence.getElementsByTagName("Opinions")[0].firstChild.data

Thanks.


1 answer
User
#1 · Posted 2024-04-29 08:27:35

Are you sure you need minidom?

From the documentation:

Users who are not already proficient with the DOM should consider using the xml.etree.ElementTree module for their XML processing instead.

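Unless you have a good reason to stay with minidom, don't waste your time: use the standard library's xml.etree.ElementTree. Its documentation has enough examples to solve your task. If you run into any problems, feel free to leave a comment.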
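As a rough illustration of the ElementTree route, here is a minimal sketch. It assumes the fragment from the question sits under a single root element (the `<Review>` wrapper below is hypothetical, added only so the sample parses), and that an opinion's "content" means its attributes, since the `<Opinion>` elements carry no text. The helper name `extract_pairs` is made up for this example.

```python
import xml.etree.ElementTree as ET

# Sample data copied from the question. The <Review> root is an assumption:
# it is only here so the fragment parses as one well-formed document.
SAMPLE = """<Review>
<sentence id="1086415:2">
 <text>$6 and there is much tasty food, all of it fresh and continually refilled.</text>
  <Opinions>
   <Opinion to="31" from="27" polarity="positive" category="FOOD#STYLE_OPTIONS" target="food"/>
   <Opinion to="31" from="27" polarity="positive" category="FOOD#QUALITY" target="food"/>
   <Opinion to="31" from="27" polarity="positive" category="FOOD#PRICES" target="food"/>
  </Opinions>
</sentence>
<sentence id="1086415:3">
 <text>I am not a vegetarian but, almost all the dishes were great.</text>
  <Opinions>
   <Opinion to="48" from="42" polarity="positive" category="FOOD#QUALITY" target="dishes"/>
  </Opinions>
</sentence>
</Review>"""


def extract_pairs(root):
    """Return one (text, [opinion attribute dicts]) tuple per <sentence>."""
    pairs = []
    for sentence in root.iter("sentence"):
        # findtext returns the text of the first matching child element.
        text = sentence.findtext("text", default="")
        # Each <Opinion> keeps its data in attributes, so copy them into a dict.
        opinions = [dict(op.attrib) for op in sentence.iter("Opinion")]
        pairs.append((text, opinions))
    return pairs


root = ET.fromstring(SAMPLE)
pairs = extract_pairs(root)
for text, opinions in pairs:
    print(text, opinions)
```

To read the file from the question instead of the inline sample, `ET.parse("ABSA16_Restaurants_Train_SB1_v2.xml").getroot()` should give an equivalent `root`, assuming that file is well-formed.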

Beyond that, if you work with XML often, I recommend the third-party lxml package. It is a more powerful tool and comes with batteries included.
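If minidom really is a requirement, the '\n' in the question most likely comes from the fact that the first child of `<Opinions>` is a whitespace text node (the line break and indentation before the first `<Opinion>`), not an element. A minimal sketch of one way around it, assuming the opinion data of interest lives in the attributes; the helper name `extract_pairs_minidom` and the `<Review>` wrapper are made up for this example:

```python
from xml.dom import minidom

# Shortened sample from the question, wrapped in a hypothetical <Review> root
# so that it parses as a single document.
SAMPLE = """<Review>
<sentence id="1086415:3">
 <text>I am not a vegetarian but, almost all the dishes were great.</text>
  <Opinions>
   <Opinion to="48" from="42" polarity="positive" category="FOOD#QUALITY" target="dishes"/>
  </Opinions>
</sentence>
</Review>"""


def extract_pairs_minidom(doc):
    """Return one (text, [opinion attribute dicts]) tuple per <sentence>."""
    pairs = []
    for sentence in doc.getElementsByTagName("sentence"):
        text_node = sentence.getElementsByTagName("text")[0].firstChild
        text = text_node.data if text_node is not None else ""
        # Ask for the <Opinion> elements directly instead of taking
        # Opinions.firstChild, which is only the whitespace before them.
        opinions = [
            dict(op.attributes.items())
            for op in sentence.getElementsByTagName("Opinion")
        ]
        pairs.append((text, opinions))
    return pairs


doc = minidom.parseString(SAMPLE)
pairs_md = extract_pairs_minidom(doc)

# The node the original code was reading: a text node holding only whitespace.
first = doc.getElementsByTagName("Opinions")[0].firstChild
print(repr(first.data))
print(pairs_md)
```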
