How to add subtree data to the main tree using XPath in Python/Django
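I'm using etree to parse an external XML file. I want to extract the listing data from the XML below and add the agency data to it. I can retrieve the listing and agency data separately, but I don't know how to merge them so that each listing gets the correct agency information.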
xml:
<response>
    <listing>
        <bathrooms>2.1</bathrooms>
        <bedrooms>3</bedrooms>
        <agency>
            <name>Bob's Realty</name>
            <phone>555-693-4356</phone>
        </agency>
    </listing>
    <listing>
        <bathrooms>3.1</bathrooms>
        <bedrooms>5</bedrooms>
        <agency>
            <name>Larry's Homes</name>
            <phone>555-324-6532</phone>
        </agency>
    </listing>
</response>
python:
import lxml.etree

tree = lxml.etree.parse("http://www.someurl.com?random=blahblahblah")
listings = tree.xpath("/response/listing")
agencies = tree.xpath("/response/listing/agency")
listings_info = []
for listing in listings:
    this_value = {
        "bedrooms": listing.findtext("bedrooms"),
        "bathrooms": listing.findtext("bathrooms"),
    }
    # Attempted merge: loops over ALL agencies for every listing
    for agency in agencies:
        this_value['agency'] = agency.findtext("name")
    listings_info.append(this_value)
I tried adding some code before listings_info.append(this_value), but that is wrong: it just assigns the last agency's value to every listing.
I output the data as JSON, and the result looks like this (you can see that one agency's information ends up in both results):
{"listings": [{"agency": "Bob's Realty", "phone": "555-693-4356", "bathrooms": "2.1", "bedrooms": "3"}, {"agency": "Bob's Realty", "phone": "555-693-4356", "bathrooms": "3.1", "bedrooms": "5"}]}
How can I merge the response/listing/agency data with the response/listing data inside the original for loop?
1 Answer
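While iterating over your listings, you can use listing.xpath('agency/name/text()')[0] to get the agency name for that particular listing. Because the path is relative, it is evaluated from the current listing element rather than from the document root.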
for listing in listings:
    this_value = {
        'bedrooms': listing.findtext('bedrooms'),
        'bathrooms': listing.findtext('bathrooms'),
        'agency': listing.xpath('agency/name/text()')[0]
    }
    listings_info.append(this_value)
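For reference, here is a self-contained sketch of the same idea that can be run without fetching the remote URL. It makes a few assumptions that are not in the question: the sample XML is embedded as a string (SAMPLE_XML), the helper name extract_listings is made up for illustration, and it uses findtext('agency/name') instead of xpath(...)[0] so that a listing without an agency yields None rather than raising IndexError. It also picks up the phone number, which appears in the desired JSON. If lxml is not installed, it falls back to the standard library's ElementTree, which offers the same findall/findtext calls.

```python
import json

try:
    from lxml import etree  # the library used in the question
except ImportError:
    # Assumption: stdlib fallback; findall/findtext behave the same for these paths
    import xml.etree.ElementTree as etree

# Sample data copied from the question, embedded so the sketch runs offline
SAMPLE_XML = """
<response>
  <listing>
    <bathrooms>2.1</bathrooms>
    <bedrooms>3</bedrooms>
    <agency>
      <name>Bob's Realty</name>
      <phone>555-693-4356</phone>
    </agency>
  </listing>
  <listing>
    <bathrooms>3.1</bathrooms>
    <bedrooms>5</bedrooms>
    <agency>
      <name>Larry's Homes</name>
      <phone>555-324-6532</phone>
    </agency>
  </listing>
</response>
"""


def extract_listings(xml_text):
    """Hypothetical helper: build one dict per <listing>, including its own agency."""
    root = etree.fromstring(xml_text.strip())
    listings_info = []
    for listing in root.findall("listing"):
        listings_info.append({
            "bedrooms": listing.findtext("bedrooms"),
            "bathrooms": listing.findtext("bathrooms"),
            # Relative paths are resolved from this listing, not from the root
            "agency": listing.findtext("agency/name"),
            "phone": listing.findtext("agency/phone"),
        })
    return listings_info


listings_info = extract_listings(SAMPLE_XML)
print(json.dumps({"listings": listings_info}, indent=2))
```

The key point is that no second loop over all agencies is needed: each lookup starts at the current listing element, so it can only see that listing's own agency child.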