Parsing words from an XML file

<SpeechSegment spkid="S0"> <Word dur="0.22" stime="0.44">oh</Word> <Word dur="0.27" stime="1.67">bedankt</Word> <Word dur="0.3" stime="2.03">voor</Word> <Word dur="0.53" stime="2.61">deelname</Word> <Word dur="0.22" stime="3.15">aan</Word> <Word dur="0.23" stime="3.39">de</Word> <Word dur="0.14" stime="6.15">want</Word> <Word dur="0.07" stime="6.29">ik</Word> <Word dur="0.09" stime="6.36">wil</Word> <Word dur="0.06" stime="6.45">je</Word> <Word dur="0.42" stime="6.51">graag</Word> <Word dur="0.2" stime="7.52">en</Word> </SpeechSegment>

2 answers

User

#1 · Edited 2024-06-16 11:19:03

Use xml.etree.ElementTree.

Solution:

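import xml.etree.ElementTree as ET

# xml_string is assumed to hold the XML from the question
root = ET.fromstring(xml_string)
# Iterating over root yields its direct children, here the <Word> elements
required_list = [child.text for child in root]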
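Since the question mentions an XML file, here is a minimal sketch of the same list comprehension applied to a file with ET.parse instead of ET.fromstring. The shortened sample XML and the temporary file are assumptions made only so the example runs on its own; in practice you would pass the path of your own file.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

# Hypothetical sample: a shortened version of the XML in the question.
xml_string = (
    '<SpeechSegment spkid="S0">'
    '<Word dur="0.22" stime="0.44">oh</Word>'
    '<Word dur="0.27" stime="1.67">bedankt</Word>'
    '</SpeechSegment>'
)

# Write the sample to a temporary file so the example is self-contained.
with tempfile.NamedTemporaryFile(
    'w', suffix='.xml', delete=False, encoding='utf-8'
) as f:
    f.write(xml_string)
    path = f.name

try:
    # ET.parse returns an ElementTree; getroot() gives the <SpeechSegment> element.
    root = ET.parse(path).getroot()
    words_from_file = [child.text for child in root]
finally:
    os.remove(path)

print(words_from_file)
```

Note that iterating over root returns every direct child regardless of its tag, so this only gives a clean word list when the segment contains nothing but Word elements.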

User

#2 · Edited 2024-06-16 11:19:03

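I'm not sure why you are calling findall('type') when the XML does not contain any <type> elements. Based on the XML you posted, it should be findall('Word'). Here is a minimal but complete demo: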

raw = '''<SpeechSegment spkid="S0">
    <Word dur="0.22" stime="0.44">oh</Word>
    <Word dur="0.27" stime="1.67">bedankt</Word>
    <Word dur="0.3" stime="2.03">voor</Word>
    <Word dur="0.53" stime="2.61">deelname</Word>
    <Word dur="0.22" stime="3.15">aan</Word>
    <Word dur="0.23" stime="3.39">de</Word>
    <Word dur="0.14" stime="6.15">want</Word>
    <Word dur="0.07" stime="6.29">ik</Word>
    <Word dur="0.09" stime="6.36">wil</Word>
    <Word dur="0.06" stime="6.45">je</Word>
    <Word dur="0.42" stime="6.51">graag</Word>
    <Word dur="0.2" stime="7.52">en</Word>
</SpeechSegment>'''

from xml.etree import ElementTree as ET

root = ET.fromstring(raw)
# Collect the text of every <Word> that is a direct child of the root
result = [word.text for word in root.findall('Word')]
print(result)


Output:

['oh', 'bedankt', 'voor', 'deelname', 'aan', 'de', 'want', 'ik', 'wil', 'je', 'graag', 'en']
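To illustrate why findall('type') gives nothing back, here is a small sketch comparing three lookups. The sample is hypothetical: the <Phrase> wrapper is not in the posted XML and is invented only to show that findall('Word') matches direct children, while iter('Word') also reaches nested elements.

```python
from xml.etree import ElementTree as ET

# Hypothetical sample: the <Phrase> wrapper is invented for illustration.
sample = '''<SpeechSegment spkid="S0">
    <Word dur="0.22" stime="0.44">oh</Word>
    <Phrase>
        <Word dur="0.27" stime="1.67">bedankt</Word>
    </Phrase>
</SpeechSegment>'''

root = ET.fromstring(sample)

# No <type> elements exist, so findall returns an empty list (no error is raised).
missing = root.findall('type')

# findall with a bare tag name only matches direct children of root.
direct = [w.text for w in root.findall('Word')]

# iter walks the whole subtree in document order, so it also finds nested <Word>s.
nested = [w.text for w in root.iter('Word')]

print(missing)  # []
print(direct)   # ['oh']
print(nested)   # ['oh', 'bedankt']
```

For the flat XML in the question both findall('Word') and iter('Word') give the same list; the difference only matters if the words are wrapped in further elements.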
