Basic Python XML data extraction

<group id="34427465497@N01" iconserver="1" iconfarm="1" lang="en-us" ispoolmoderated="0">
    <name>GNEverybody</name>
    <description>The group for GNE players</description>
    <members>69</members>
    <privacy>3</privacy>
    <throttle count="10" mode="month" remaining="3" />
    <restrictions photos_ok="1" videos_ok="1" images_ok="1" screens_ok="1" art_ok="1" safe_ok="1" moderate_ok="0" restricted_ok="0" has_geo="0" />
</group>

3 answers

User

Answer 1 · edited 2024-05-16 21:30:34

Simple XML

A solution is already available in this question:

Same problem

XML parsing in Python

Reference

http://docs.python.org/2/library/xml.dom.minidom
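
For readers who want to see what the minidom route referenced above might look like, here is a minimal sketch rather than the answerer's own code. It assumes a trimmed copy of the question's XML (the restrictions element is left out for brevity), and first_text is a hypothetical helper introduced only for this illustration.

```python
from xml.dom import minidom

# Trimmed copy of the sample XML from the question (assumption: restrictions omitted).
XML = """<group id="34427465497@N01" iconserver="1" iconfarm="1" lang="en-us" ispoolmoderated="0">
    <name>GNEverybody</name>
    <description>The group for GNE players</description>
    <members>69</members>
    <privacy>3</privacy>
    <throttle count="10" mode="month" remaining="3" />
</group>"""


def first_text(parent, tag):
    """Return the text of the first <tag> element under parent (hypothetical helper)."""
    node = parent.getElementsByTagName(tag)[0]
    # In the DOM, element text is stored in child text nodes, so join their data.
    return "".join(
        child.data for child in node.childNodes
        if child.nodeType == child.TEXT_NODE
    )


doc = minidom.parseString(XML)
group = doc.documentElement  # the root <group> element

group_id = group.getAttribute("id")
name = first_text(group, "name")
members = int(first_text(group, "members"))
throttle_count = group.getElementsByTagName("throttle")[0].getAttribute("count")

print(group_id, name, members, throttle_count)
```

minidom ships with the standard library, so nothing has to be installed, but its DOM-style API is more verbose than the element-based APIs used in the other answers.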

User

Answer 2 · edited 2024-05-16 21:30:34

Extracting elements from this item requires slightly more detailed syntax.

>>> from lxml import etree
>>> example = etree.fromstring("""
<group id="34427465497@N01" iconserver="1" iconfarm="1" lang="en-us" ispoolmoderated="0">
    <name>GNEverybody</name>
    <description>The group for GNE players</description>
    <members>69</members>
    <privacy>3</privacy>
    <throttle count="10" mode="month" remaining="3" />
    <restrictions photos_ok="1" videos_ok="1" images_ok="1" screens_ok="1" art_ok="1" safe_ok="1" moderate_ok="0" restricted_ok="0" has_geo="0" />
</group>
""")

# Attributes can be accessed in two ways:
>>> example.attrib  # Returns a dict-like mapping of attribute names to values
{'iconserver': '1', 'lang': 'en-us', 'ispoolmoderated': '0', 'id': '34427465497@N01', 'iconfarm': '1'}
>>> example.get('id')  # Looks up a single attribute by name.
'34427465497@N01'

# Child elements are accessed by iterating over the element
# (getchildren() also works but is deprecated):
>>> list(example)  # Returns a list of child elements.
[<Element name at 0x1007c7140>, <Element description at 0x1007c7190>, <Element members at 0x1007c71e0>, <Element privacy at 0x1007c7230>, <Element throttle at 0x1007c7280>, <Element restrictions at 0x1007c72d0>]

Another way to extract child elements is to use XPath.
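
As an illustration of XPath-style lookups on this document, here is a minimal sketch. To keep it runnable without third-party packages it swaps in the standard library's xml.etree.ElementTree, which supports only a limited XPath subset through find, findall and findtext. It is not the answerer's lxml code; with lxml the first lookup would be written as example.xpath('//name').

```python
import xml.etree.ElementTree as ET

# Sample XML from the question.
XML = """<group id="34427465497@N01" iconserver="1" iconfarm="1" lang="en-us" ispoolmoderated="0">
    <name>GNEverybody</name>
    <description>The group for GNE players</description>
    <members>69</members>
    <privacy>3</privacy>
    <throttle count="10" mode="month" remaining="3" />
    <restrictions photos_ok="1" videos_ok="1" images_ok="1" screens_ok="1" art_ok="1" safe_ok="1" moderate_ok="0" restricted_ok="0" has_geo="0" />
</group>"""

root = ET.fromstring(XML)

# All descendant <name> elements (lxml equivalent: example.xpath('//name')).
names = root.findall(".//name")

# Text of the first direct child called <name>.
name_text = root.findtext("name")

# A single child element, then one of its attributes.
throttle = root.find("throttle")
remaining = throttle.get("remaining")

# Attribute predicate: any descendant whose safe_ok attribute equals "1".
safe = root.findall(".//*[@safe_ok='1']")

print([e.tag for e in names], name_text, remaining, [e.tag for e in safe])
```

Paths passed to ElementTree are relative to the element they are called on, which is why they start with "." here, whereas lxml's xpath() also accepts absolute expressions such as '//name'.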

Accessing the properties of the description element works just like it does for the parent node:

>>> desc = example.xpath(u'//description')
>>> desc[0].tag
'description'
>>> desc[0].attrib  # This node has no attributes.
{}

Other elements may have attributes:

>>> example.xpath(u'//restrictions')[0].attrib
{'photos_ok': '1', 'images_ok': '1', 'safe_ok': '1', 'has_geo': '0', 'screens_ok': '1', 'videos_ok': '1', 'moderate_ok': '0', 'restricted_ok': '0', 'art_ok': '1'}

See dir(example) for the full list of methods available on an lxml.etree.Element.
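
Putting the attribute and child access from this answer together, here is a minimal sketch that walks the children of the group element and collects them into a plain dict. It uses the standard library's xml.etree.ElementTree so it runs without lxml; lxml.etree elements support the same iteration, .tag, .text and .attrib. The summarize helper is hypothetical and exists only for this illustration.

```python
import xml.etree.ElementTree as ET

# Sample XML from the question.
XML = """<group id="34427465497@N01" iconserver="1" iconfarm="1" lang="en-us" ispoolmoderated="0">
    <name>GNEverybody</name>
    <description>The group for GNE players</description>
    <members>69</members>
    <privacy>3</privacy>
    <throttle count="10" mode="month" remaining="3" />
    <restrictions photos_ok="1" videos_ok="1" images_ok="1" screens_ok="1" art_ok="1" safe_ok="1" moderate_ok="0" restricted_ok="0" has_geo="0" />
</group>"""


def summarize(element):
    """Map each child tag to its text, or to its attributes if it has no text (hypothetical helper)."""
    summary = {}
    for child in element:  # iterating an element yields its direct children
        text = (child.text or "").strip()
        summary[child.tag] = text if text else dict(child.attrib)
    return summary


root = ET.fromstring(XML)
info = summarize(root)

print(info["name"], info["members"], info["throttle"]["count"])
```

All values come back as strings, so numeric fields such as members still need an explicit int() conversion.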

User

Answer 3 · edited 2024-05-16 21:30:34

A variation on @VooDooNOFX's answer is to use lxml.objectify:

>>> import lxml.objectify
>>> group = lxml.objectify.fromstring("""<group id="34427465497@N01" iconserver="1" iconfarm="1" lang="en-us" ispoolmoderated="0">
...     <name>GNEverybody</name>
...     <description>The group for GNE players</description>
...     <members>69</members>
...     <privacy>3</privacy>
...     <throttle count="10" mode="month" remaining="3" />
...     <restrictions photos_ok="1" videos_ok="1" images_ok="1" screens_ok="1" art_ok="1" safe_ok="1" moderate_ok="0" restricted_ok="0" has_geo="0" />
... </group>""")
>>> group.get("id")
'34427465497@N01'
>>> group.name
'GNEverybody'
>>> group.restrictions.get("safe_ok")
'1'
