Iterating over partial results to build JSON without lists

Posted 2024-04-27 05:02:39

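I'm building a spider with Scrapy, and my Python skills clearly aren't up to it yet.

I want to build JSON that contains no lists at all, but because the page I'm scraping has several "room types" under a single "room name", I end up with lists.

Right now this code...: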

def parse(self, response):
    for romtyper in response.selector.xpath(".//div[@class='room__collapsable']"):
            fradato = romtyper.xpath("//input[@type='hidden' and @name='fromDate']/@value").extract_first()
            personer = romtyper.xpath("//*[@id='booking-widget-guest-count-hotelnav-widget']/span/ng-pluralize/text()").extract_first()
            romnavn = romtyper.xpath(".//h2[@class='room__heading-level1']/text()[1]").extract_first()

            for prisboks in response.selector.xpath(".//div[@class='room__rates l-price-box l-price-box--selectable']"):
                romtype = prisboks.xpath(".//h3[@class='room-price-info__rate']/text()").extract_first()
                rompris = prisboks.xpath(".//span[@class='price']/text()").extract_first()
                yield {"fradato": fradato, "personer": personer, "romnavn": romnavn, "romtype": romtype, "rompris": rompris}

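...returns only one room type (and price). If I switch from extract_first() to extract() on the last two extraction lines (counting from the bottom), I get lists again.

This is the result I want: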

[
{"fradato": "2018-12-03", "personer": "1 Voksen", "romnavn": "A room name", "romtype": "Room type A", "rompris": "1088 "}, 
{"fradato": "2018-12-03", "personer": "1 Voksen", "romnavn": "A room name", "romtype": "Room type B", "rompris": "1288 "}]
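
One way to read this target: when extract() returns parallel lists of room types and prices, pairing them item by item gives one flat record per rate. The following is a minimal sketch in plain Python, assuming the two lists have the same length and matching order. The helper name flatten_rates is made up for illustration and is not part of Scrapy.

```python
import json


def flatten_rates(fradato, personer, romnavn, romtyper, rompriser):
    """Yield one flat dict per (room type, price) pair.

    Hypothetical helper: assumes romtyper and rompriser are parallel lists,
    such as the ones extract() returns for a single room block.
    """
    for romtype, rompris in zip(romtyper, rompriser):
        yield {
            "fradato": fradato,
            "personer": personer,
            "romnavn": romnavn,
            "romtype": romtype,
            "rompris": rompris,
        }


# Sample values copied from the desired output in the question.
records = list(
    flatten_rates(
        "2018-12-03",
        "1 Voksen",
        "A room name",
        ["Room type A", "Room type B"],
        ["1088 ", "1288 "],
    )
)
print(json.dumps(records, indent=2))
```

Note that zip() stops at the shorter list, so a rate with a missing price would silently shift or drop pairs. Looping over each price box separately avoids that problem.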

Thanks for helping a beginner with a basic problem.


Tags: text, name, type, extract, price, xpath, class, first
1 answer
User
#1 · Posted 2024-04-27 05:02:39

Try:

def parse(self, response):
    # Date and guest count come from the booking form, once per page.
    formsel = response.css('form[name=bookingWidget]')
    fradato = formsel.css('input[name=fromDate]::attr(value)').get()
    personer = formsel.css('input[name="room[0].adults"]::attr(value)').get()
    for room in response.css('div.room__collapsable'):
        romnavn = room.css('h2::text').get()
        # Search inside the current room only, so rates stay with their room.
        for prisboks in room.css('div.room-price-info'):
            romtype = prisboks.css('h3::text').get()
            rompris = prisboks.css('span.price::text').get()
            # Skip price boxes that lack a rate name or a price.
            if not romtype or not rompris:
                continue
            yield {"fradato": fradato, "personer": personer, "romnavn": romnavn, "romtype": romtype, "rompris": rompris}
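
The important detail in the code above is that the inner loop queries room rather than response, so each room only sees its own price boxes. The sketch below illustrates that scoping, with the standard library's xml.etree.ElementTree standing in for Scrapy selectors so that it runs without a crawl. The markup is a made-up, simplified stand-in, since the real page is not shown in the question.

```python
import xml.etree.ElementTree as ET

# Made-up, simplified markup; the real page structure is an assumption.
PAGE = """
<html>
  <div class="room__collapsable">
    <h2>A room name</h2>
    <div class="room-price-info"><h3>Room type A</h3><span class="price">1088</span></div>
    <div class="room-price-info"><h3>Room type B</h3><span class="price">1288</span></div>
  </div>
  <div class="room__collapsable">
    <h2>Another room name</h2>
    <div class="room-price-info"><h3>Room type A</h3><span class="price">1500</span></div>
  </div>
</html>
"""

root = ET.fromstring(PAGE)
rows = []
for room in root.findall(".//div[@class='room__collapsable']"):
    romnavn = room.findtext("h2")
    # Query relative to `room`, not `root`, so only this room's rates match.
    for prisboks in room.findall(".//div[@class='room-price-info']"):
        rows.append({
            "romnavn": romnavn,
            "romtype": prisboks.findtext("h3"),
            "rompris": prisboks.findtext("span[@class='price']"),
        })

for row in rows:
    print(row)
```

Querying root in the inner loop instead would pair every room with all three price boxes. The same applies in Scrapy: an XPath that starts with // searches the whole document even when called on a sub-selector, so relative queries should start with .// or use css() on the current element.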
