Basic Python XML data extraction

Published 2024-04-16 19:14:58

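I've been having a serious brain fart for the last few days. Although I'm sure I could have done this a few months ago, I can't work out how to extract the data elements from this output ;(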

The value returned by Grp = flickr.groups_getInfo( group_id = gid ) gives

Grp =

<group id="34427465497@N01" iconserver="1" iconfarm="1" lang="en-us" ispoolmoderated="0">
    <name>GNEverybody</name>
    <description>The group for GNE players</description>
    <members>69</members>
    <privacy>3</privacy>
    <throttle count="10" mode="month" remaining="3" />
    <restrictions photos_ok="1" videos_ok="1" images_ok="1" screens_ok="1" art_ok="1" safe_ok="1" moderate_ok="0" restricted_ok="0" has_geo="0" />
</group>

To extract the individual data elements, should it be:

^{pr2}$

and so on?


3 answers

Simple XML

There is already a solution in this question:

Same question

XML parsing in Python

Reference

http://docs.python.org/2/library/xml.dom.minidom
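
For readers who want to see what the minidom approach in that reference looks like on the data from the question, here is a minimal sketch. It assumes the response is available as an XML string (the question does not show what type flickr.groups_getInfo actually returns), and child_text is a hypothetical helper introduced only for this illustration.

```python
from xml.dom import minidom

# Sample response copied from the question; assumed to be available as a string.
GROUP_XML = """<group id="34427465497@N01" iconserver="1" iconfarm="1" lang="en-us" ispoolmoderated="0">
    <name>GNEverybody</name>
    <description>The group for GNE players</description>
    <members>69</members>
    <privacy>3</privacy>
    <throttle count="10" mode="month" remaining="3" />
    <restrictions photos_ok="1" videos_ok="1" images_ok="1" screens_ok="1" art_ok="1" safe_ok="1" moderate_ok="0" restricted_ok="0" has_geo="0" />
</group>"""


def child_text(parent, tag):
    """Return the text of the first <tag> element under parent (hypothetical helper)."""
    node = parent.getElementsByTagName(tag)[0]
    # In the DOM, text lives in child text nodes, not on the element itself.
    return "".join(c.data for c in node.childNodes if c.nodeType == c.TEXT_NODE)


group = minidom.parseString(GROUP_XML).documentElement

group_id = group.getAttribute("id")            # attribute on the root element
name = child_text(group, "name")               # text of a child element
members = int(child_text(group, "members"))    # values arrive as strings
throttle_mode = group.getElementsByTagName("throttle")[0].getAttribute("mode")

print(group_id, name, members, throttle_mode)
```

minidom is part of the standard library, so nothing needs to be installed, but the text-node handling is more verbose than with the element-tree style APIs shown in the other answers.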

Extracting elements from that item requires a slightly more verbose syntax.

>>> from lxml import etree
>>> example = etree.fromstring("""
<group id="34427465497@N01" iconserver="1" iconfarm="1" lang="en-us" ispoolmoderated="0">
    <name>GNEverybody</name>
    <description>The group for GNE players</description>
    <members>69</members>
    <privacy>3</privacy>
    <throttle count="10" mode="month" remaining="3" />
    <restrictions photos_ok="1" videos_ok="1" images_ok="1" screens_ok="1" art_ok="1" safe_ok="1" moderate_ok="0" restricted_ok="0" has_geo="0" />
</group>
""")

# Attributes can be accessed in two ways:
>>> example.attrib  # Returns a dictionary of key, value pairs
{'iconserver': '1', 'lang': 'en-us', 'ispoolmoderated': '0', 'id': '34427465497@N01', 'iconfarm': '1'}
>>> example.get('id')  # Grabs a specific key in the attribs dict.
'34427465497@N01'

# Child elements are accessed using the getchildren() method
# (deprecated in lxml; list(example) gives the same list):
>>> example.getchildren()  # Returns a list of items.
[<Element name at 0x1007c7140>, <Element description at 0x1007c7190>, <Element members at 0x1007c71e0>, <Element privacy at 0x1007c7230>, <Element throttle at 0x1007c7280>, <Element restrictions at 0x1007c72d0>]

Another way to extract child objects is to use xpath:

^{pr2}$
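
As an illustration of path-based lookup (not the snippet from this answer), here is a minimal sketch that uses the limited XPath subset supported by the standard library's xml.etree.ElementTree rather than lxml, so it runs without third-party packages. It assumes the response is available as a string; the variable names are chosen only for this example.

```python
import xml.etree.ElementTree as ET

# Sample response copied from the question; assumed to be available as a string.
GROUP_XML = """<group id="34427465497@N01" iconserver="1" iconfarm="1" lang="en-us" ispoolmoderated="0">
    <name>GNEverybody</name>
    <description>The group for GNE players</description>
    <members>69</members>
    <privacy>3</privacy>
    <throttle count="10" mode="month" remaining="3" />
    <restrictions photos_ok="1" videos_ok="1" images_ok="1" screens_ok="1" art_ok="1" safe_ok="1" moderate_ok="0" restricted_ok="0" has_geo="0" />
</group>"""

root = ET.fromstring(GROUP_XML)

# findall() takes a path expression and returns a list of matching elements.
names = root.findall("./name")
name_text = names[0].text

# findtext() returns the text of the first match directly.
description = root.findtext("description")

# find() returns the first match (or None); attributes are read with get().
safe_ok = root.find("restrictions").get("safe_ok")

# Attribute predicates are part of the supported XPath subset.
monthly = root.findall("./throttle[@mode='month']")
remaining = monthly[0].get("remaining")

print(name_text, description, safe_ok, remaining)
```

lxml's xpath() method accepts full XPath 1.0 expressions, whereas ElementTree only understands a subset, so more complex queries may need lxml.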

Accessing the items of the description element works just like it does for the parent node:

>>> desc = example.xpath(u'//description')
>>> desc[0].tag
'description'
>>> desc[0].attrib  # This node has no attributes.
{}

Other items may have attributes:

>>> example.xpath(u'//restrictions')[0].attrib
{'photos_ok': '1', 'images_ok': '1', 'safe_ok': '1', 'has_geo': '0', 'screens_ok': '1', 'videos_ok': '1', 'moderate_ok': '0', 'restricted_ok': '0', 'art_ok': '1'}

Check dir(example) for the full list of methods available on an lxml.etree.Element.

A variant of @VooDooNOFX's answer is to use ^{}

>>> import lxml.objectify
>>> group = lxml.objectify.fromstring("""<group id="34427465497@N01" iconserver="1" iconfarm="1" lang="en-us" ispoolmoderated="0">
...     <name>GNEverybody</name>
...     <description>The group for GNE players</description>
...     <members>69</members>
...     <privacy>3</privacy>
...     <throttle count="10" mode="month" remaining="3" />
...     <restrictions photos_ok="1" videos_ok="1" images_ok="1" screens_ok="1" art_ok="1" safe_ok="1" moderate_ok="0" restricted_ok="0" has_geo="0" />
... </group>""")
>>> group.get("id")
'34427465497@N01'
>>> group.name
'GNEverybody'
>>> group.restrictions.get("safe_ok")
'1'
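
If attribute-style access with typed values is the goal and only the standard library is available, something similar can be approximated by hand. The sketch below is not lxml.objectify; GroupInfo and parse_group are hypothetical names, and treating "1" as True for the restriction flags is an assumption based on the sample data.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass

# Sample response copied from the question; assumed to be available as a string.
GROUP_XML = """<group id="34427465497@N01" iconserver="1" iconfarm="1" lang="en-us" ispoolmoderated="0">
    <name>GNEverybody</name>
    <description>The group for GNE players</description>
    <members>69</members>
    <privacy>3</privacy>
    <throttle count="10" mode="month" remaining="3" />
    <restrictions photos_ok="1" videos_ok="1" images_ok="1" screens_ok="1" art_ok="1" safe_ok="1" moderate_ok="0" restricted_ok="0" has_geo="0" />
</group>"""


@dataclass
class GroupInfo:
    """Typed view of the <group> element (hypothetical container)."""
    group_id: str
    name: str
    description: str
    members: int
    restrictions: dict


def parse_group(xml_text):
    """Parse the XML string into a GroupInfo (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    rest = root.find("restrictions")
    return GroupInfo(
        group_id=root.get("id"),
        name=root.findtext("name"),
        description=root.findtext("description"),
        members=int(root.findtext("members")),
        # Assumption: the "1"/"0" flags in the sample mean true/false.
        restrictions={key: value == "1" for key, value in rest.attrib.items()},
    )


info = parse_group(GROUP_XML)
print(info.name, info.members, info.restrictions["safe_ok"])
```

Doing the conversion in one place keeps the string-to-int and string-to-bool handling out of the rest of the code, at the cost of listing the fields explicitly.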