How to add subtree data to the main tree using XPath in Python/Django

1 vote
1 answer
551 views
Asked 2025-04-17 02:57

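I'm using etree to parse an external XML file. I want to extract the listing data from the XML below and add each listing's agency data to it. I can retrieve the listing data and the agency data separately, but I don't know how to merge them so that each listing ends up with the correct agency information.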

xml:

<response>
    <listing>
        <bathrooms>2.1</bathrooms>
        <bedrooms>3</bedrooms>
        <agency>
            <name>Bob's Realty</name>
            <phone>555-693-4356</phone>
        </agency>
    </listing>
    <listing>
        <bathrooms>3.1</bathrooms>
        <bedrooms>5</bedrooms>
        <agency>
            <name>Larry's Homes</name>
            <phone>555-324-6532</phone>
        </agency>
    </listing>
</response>

python:

import lxml.etree

tree = lxml.etree.parse("http://www.someurl.com?random=blahblahblah")
listings = tree.xpath("/response/listing")
agencies = tree.xpath("/response/listing/agency")

listings_info = []

for listing in listings:
    this_value = {
        "bedrooms": listing.findtext("bedrooms"),
        "bathrooms": listing.findtext("bathrooms"),
    }

    for agency in agencies:
        this_value['agency'] = agency.findtext("name")

    listings_info.append(this_value)

I previously put some code just before listings_info.append(this_value), but that was wrong: it only added the last agency's value to every listing.

I output the data as JSON, and the result looks like this (you can see that one agency's information ends up in both results):

    {"listings": [{"agency": "Bob's Realty", "phone": "555-693-4356", "bathrooms": "2.1", "bedrooms": "3"}, {"agency": "Bob's Realty", "phone": "555-693-4356", "bathrooms": "3.1", "bedrooms": "5"}]}

How can I merge the response/listing/agency data with the response/listing data inside the original for loop?

1 Answer

1

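While iterating over your listings, you can use listing.xpath('agency/name/text()')[0] to get the agency name for that listing. The path is evaluated relative to the current listing element, so it only matches that listing's own agency rather than every agency in the document.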

for listing in listings:
    this_value = {
        'bedrooms': listing.findtext('bedrooms'),
        'bathrooms': listing.findtext('bathrooms'),
        'agency': listing.xpath('agency/name/text()')[0]
    }
    listings_info.append(this_value)
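
For reference, here is a self-contained sketch of the same idea that also picks up the phone number. It is not the code from the question: it parses the sample XML from an inline string rather than the URL, the helper name extract_listings is made up for illustration, and it uses the standard library's xml.etree.ElementTree so it runs without extra packages (lxml.etree provides the same findtext() call). It uses findtext('agency/name') instead of xpath(...)[0], which returns None rather than raising IndexError when a listing has no agency.

```python
import json
import xml.etree.ElementTree as ET  # stdlib stand-in; lxml.etree also has findtext()

# Sample data copied from the question, inlined so the sketch is runnable offline.
XML = """<response>
    <listing>
        <bathrooms>2.1</bathrooms>
        <bedrooms>3</bedrooms>
        <agency>
            <name>Bob's Realty</name>
            <phone>555-693-4356</phone>
        </agency>
    </listing>
    <listing>
        <bathrooms>3.1</bathrooms>
        <bedrooms>5</bedrooms>
        <agency>
            <name>Larry's Homes</name>
            <phone>555-324-6532</phone>
        </agency>
    </listing>
</response>"""


def extract_listings(xml_text):
    """Hypothetical helper: build one dict per <listing>, including its own agency."""
    root = ET.fromstring(xml_text)  # root is the <response> element
    listings_info = []
    for listing in root.findall("listing"):
        listings_info.append({
            "bedrooms": listing.findtext("bedrooms"),
            "bathrooms": listing.findtext("bathrooms"),
            # Paths are relative to this listing, so each lookup only sees
            # the agency nested inside it. Missing elements give None.
            "agency": listing.findtext("agency/name"),
            "phone": listing.findtext("agency/phone"),
        })
    return listings_info


listings_info = extract_listings(XML)
print(json.dumps({"listings": listings_info}, indent=2))
```

The key point is the same as in the answer: look the agency up from the listing element inside the loop, instead of collecting all agencies up front with an absolute path and looping over them again.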
