Flatten a JSON string in a DataFrame column into multiple rows, or move it to another DataFrame

Posted 2024-04-30 06:03:24


I have a DataFrame in a Jupyter notebook with the following columns:

id_pedido   filial  transportadora  cep endereco    regiao  data_despacho   peso    numero_skus source  delivery_combinations   delivery_options

Each row of the "delivery_combinations" column holds a JSON string like this:

[{"cost": 10.81, "default": true, "position": 3, "calculator": "EconomyCalculatorWithQuotationDifference", "deliveries": [{"delivery_items": [{"id": 139083369, "quantity": 1}], "delivery_rates": [{"cost": 10.81, "selected": true, "carrier_cost": 10.1, "initial_cost": 10.81, "quotation_id": 13125739277, "delivery_note": null, "cd_business_days": 1, "shipping_carrier": "Sequoia Transportadora", "shipping_service": "Sequoia Redespacho - PE", "quotation_difference": 0.0, "estimate_business_days": 15, "external_delivery_method_id": 9279, "estimate_transit_time_business_days": 14}], "stock_location": "Recife", "stock_location_external_id": 9}], "initial_cost": 10.81, "combination_id": 164762042, "delivery_method": "economy", "estimate_business_days": 15, "estimate_delivery_date": "15/10/20"}, {"cost": 41.71, "default": false, "position": 4, "calculator": "ExpressCalculator", "deliveries": [{"delivery_items": [{"id": 139083369, "quantity": 1}], "delivery_rates": [{"cost": 41.71, "selected": true, "carrier_cost": 38.98, "initial_cost": 41.71, "quotation_id": 13125730459, "delivery_note": null, "cd_business_days": 1, "shipping_carrier": "JadLog", "shipping_service": "JadLog Standard", "quotation_difference": 0.0, "estimate_business_days": 9, "external_delivery_method_id": 22, "estimate_transit_time_business_days": 8}], "stock_location": "Extrema", "stock_location_external_id": 3}], "initial_cost": 41.71, "combination_id": 164762042, "delivery_method": "express", "estimate_business_days": 9, "estimate_delivery_date": "06/10/20"}]

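Since each row has its own JSON string, I would like to create a column for each field and fill in the values row by row. I have tried several approaches.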

Using ast.literal_eval, I got the error:

malformed node or string: <_ast.Subscript object at 0x7f414c2eb190>

Using json.loads, I got the error:

Expecting ',' delimiter: line 1 column 16383 (char 16382)

Is there a way to flatten this data into additional columns, or to append it to another DataFrame?


1 answer
User
#1 · Posted 2024-04-30 06:03:24

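First, the true/false/null values in the JSON need to be changed. Pasted into Python as-is, they are treated as undefined variable names because they are unquoted, which is also why ast.literal_eval rejects the string. (json.loads performs this conversion itself when the string is complete, valid JSON.) Once they are converted to True/False/None, pd.json_normalize parses the structure without trouble: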

import pandas as pd

val = [{
    "cost": 10.81,
    "default": True,
    "position": 3,
    "calculator": "EconomyCalculatorWithQuotationDifference",
    "deliveries": [{
            "delivery_items": [{
                    "id": 139083369,
                    "quantity": 1
                }
            ],
            "delivery_rates": [{
                    "cost": 10.81,
                    "selected": True,
                    "carrier_cost": 10.1,
                    "initial_cost": 10.81,
                    "quotation_id": 13125739277,
                    "delivery_note": None,
                    "cd_business_days": 1,
                    "shipping_carrier": "Sequoia Transportadora",
                    "shipping_service": "Sequoia Redespacho - PE",
                    "quotation_difference": 0.0,
                    "estimate_business_days": 15,
                    "external_delivery_method_id": 9279,
                    "estimate_transit_time_business_days": 14
                }
            ],
            "stock_location": "Recife",
            "stock_location_external_id": 9
        }
    ],
    "initial_cost": 10.81,
    "combination_id": 164762042,
    "delivery_method": "economy",
    "estimate_business_days": 15,
    "estimate_delivery_date": "15/10/20"
}, {
    "cost": 41.71,
    "default": False,
    "position": 4,
    "calculator": "ExpressCalculator",
    "deliveries": [{
            "delivery_items": [{
                    "id": 139083369,
                    "quantity": 1
                }
            ],
            "delivery_rates": [{
                    "cost": 41.71,
                    "selected": True,
                    "carrier_cost": 38.98,
                    "initial_cost": 41.71,
                    "quotation_id": 13125730459,
                    "delivery_note": None,
                    "cd_business_days": 1,
                    "shipping_carrier": "JadLog",
                    "shipping_service": "JadLog Standard",
                    "quotation_difference": 0.0,
                    "estimate_business_days": 9,
                    "external_delivery_method_id": 22,
                    "estimate_transit_time_business_days": 8
                }
            ],
            "stock_location": "Extrema",
            "stock_location_external_id": 3
        }
    ],
    "initial_cost": 41.71,
    "combination_id": 164762042,
    "delivery_method": "express",
    "estimate_business_days": 9,
    "estimate_delivery_date": "06/10/20"
}

]

# One row per delivery combination; the nested "deliveries" list stays in a single column
output = pd.json_normalize(val)
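
To apply the same idea to a whole column instead of one hand-edited literal, something along these lines might work. It is only a minimal sketch: the two-row `df`, its shortened JSON strings, and the helper name `flatten_json_column` are made up for illustration, and it assumes every cell holds complete, valid JSON (a truncated string would still raise the "Expecting ',' delimiter" error from the question).

```python
import json

import pandas as pd

# Hypothetical miniature of the real frame: one JSON string per row.
df = pd.DataFrame({
    "id_pedido": [101, 102],
    "delivery_combinations": [
        '[{"cost": 10.81, "default": true, "delivery_method": "economy"},'
        ' {"cost": 41.71, "default": false, "delivery_method": "express"}]',
        '[{"cost": 5.0, "default": true, "delivery_method": "economy"}]',
    ],
})


def flatten_json_column(frame, json_col, key_col):
    """Return one row per JSON object, tagged with the key of its source row."""
    parts = []
    for key, raw in zip(frame[key_col], frame[json_col]):
        records = json.loads(raw)          # true/false/null become True/False/None
        flat = pd.json_normalize(records)  # nested dicts become dotted column names
        flat.insert(0, key_col, key)       # remember which source row this came from
        parts.append(flat)
    return pd.concat(parts, ignore_index=True)


flat_df = flatten_json_column(df, "delivery_combinations", "id_pedido")
print(flat_df)
```

The result can be kept as a separate DataFrame or merged back onto the original on `id_pedido`. For the nested lists in the real data, `pd.json_normalize` also accepts `record_path` and `meta` arguments, for example `record_path=["deliveries", "delivery_rates"]` with `meta=["combination_id", "delivery_method"]`, which should give one row per delivery rate.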
