Creating a dict from only some of the items in a list of lists

Posted 2024-06-07 23:19:54


I have a list of lists read from a JSON file, where each entry looks like this:

{'Engine_Information': {'Transmission': '6 Speed Automatic Select Shift', 'Engine_Type': 'Audi 3.2L 6 cylinder 250hp 236ft-lbs', 'Engine_Statistics': {'Horsepower': 250, 'Torque': 236}, 'Hybrid': False, 'Number of Forward Gears': 6, 'Driveline': 'All-wheel drive'}, 'Identification': {'Make': 'Audi', 'Model_Year': '2009 Audi A3', 'ID': '2009 Audi A3 3.2', 'Classification': 'Automatic transmission', 'Year': 2009}, 'Dimensions': {'Width': 202, 'Length': 143, 'Height': 140}, 'Fuel_Information': {'Highway_mpg': 25, 'City_mpg': 18, 'Fuel_Type': 'Gasoline'}}

I need to put this into a dict in which the model is the key and the value contains the following fields:

(Identification.:Model_year,Identification.:Model_id, Engine Information: horsepower,Engine Information:hybrid, Fuel Information:highway_mpg, Fuel Information:city_mpg, Dimensions:width)

I tried this code, but it doesn't work:

  cars_dict = dict((file[0], car[len:]) for car in file)

Tags: list, model, information, type, year, automatic, dict, engine
1 answer
User
#1 · Posted 2024-06-07 23:19:54

From your question, I'm not sure what you are trying to achieve with:

cars_dict = dict((file[0], car[len:]) for car in file)

But looking at your output string, I think this is how I would get it:

car_info = {
    'Engine_Information': {
        'Transmission': '6 Speed Automatic Select Shift', 
        'Engine_Type': 'Audi 3.2L 6 cylinder 250hp 236ft-lbs', 
        'Engine_Statistics': {'Horsepower': 250, 'Torque': 236}, 
        'Hybrid': False, 
        'Number of Forward Gears': 6, 
        'Driveline': 'All-wheel drive'
    }, 
    'Identification': {
        'Make': 'Audi', 
        'Model_Year': '2009 Audi A3',
        'ID': '2009 Audi A3 3.2', 
        'Classification': 'Automatic transmission', 
        'Year': 2009
    }, 
    'Dimensions': {
        'Width': 202, 
        'Length': 143, 
        'Height': 140
    }, 
    'Fuel_Information': {
        'Highway_mpg': 25, 
        'City_mpg': 18, 
        'Fuel_Type': 'Gasoline'
    }
}

output_elements = {
    'Identification': ['Model_Year', 'ID'],
    'Engine_Information': ['Engine_Statistics', 'Hybrid'],
    'Fuel_Information': ['Highway_mpg', 'City_mpg'],
    'Dimensions': ['Width']
}

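def flat(k, v):
    # Keep only the values whose keys are listed in output_elements[k],
    # each wrapped in a one-item dict keyed by the section name.
    return [{k: b} for a, b in v.items() if a in output_elements[k]]

# Collect the selected values from every top-level section.
output = []
for section, fields in car_info.items():
    output.extend(flat(section, fields))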
print(output)

Here is the output:

(python3) ➜ Desktop python stack.py

[{'Engine_Information': {'Horsepower': 250, 'Torque': 236}}, {'Engine_Information': False}, {'Identification': '2009 Audi A3'}, {'Identification': '2009 Audi A3 3.2'}, {'Dimensions': 202}, {'Fuel_Information': 25}, {'Fuel_Information': 18}]

You will need some more formatting, depending on the final output you want.
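One possible way to do that formatting in plain Python is sketched below. It is only a sketch under a few assumptions: the JSON file holds a list of records shaped like the one in the question, the dict should be keyed by `Identification` → `ID` (the question only says "model", so that choice is a guess), and the names `FIELD_PATHS`, `get_path` and `build_cars_dict` are made up for this example.

```python
# Hypothetical mapping: output field name -> path of keys inside one car record.
FIELD_PATHS = {
    "Model_Year": ("Identification", "Model_Year"),
    "ID": ("Identification", "ID"),
    "Horsepower": ("Engine_Information", "Engine_Statistics", "Horsepower"),
    "Hybrid": ("Engine_Information", "Hybrid"),
    "Highway_mpg": ("Fuel_Information", "Highway_mpg"),
    "City_mpg": ("Fuel_Information", "City_mpg"),
    "Width": ("Dimensions", "Width"),
}


def get_path(record, path):
    """Follow a sequence of keys into nested dicts; return None if a key is missing."""
    for key in path:
        if not isinstance(record, dict) or key not in record:
            return None
        record = record[key]
    return record


def build_cars_dict(cars):
    """Return {car ID: {field name: value}} for a list of car records."""
    return {
        car["Identification"]["ID"]: {
            name: get_path(car, path) for name, path in FIELD_PATHS.items()
        }
        for car in cars
    }


# Sample data: a single record copied from the question.
cars = [{
    "Engine_Information": {
        "Transmission": "6 Speed Automatic Select Shift",
        "Engine_Type": "Audi 3.2L 6 cylinder 250hp 236ft-lbs",
        "Engine_Statistics": {"Horsepower": 250, "Torque": 236},
        "Hybrid": False,
        "Number of Forward Gears": 6,
        "Driveline": "All-wheel drive",
    },
    "Identification": {
        "Make": "Audi",
        "Model_Year": "2009 Audi A3",
        "ID": "2009 Audi A3 3.2",
        "Classification": "Automatic transmission",
        "Year": 2009,
    },
    "Dimensions": {"Width": 202, "Length": 143, "Height": 140},
    "Fuel_Information": {"Highway_mpg": 25, "City_mpg": 18, "Fuel_Type": "Gasoline"},
}]

cars_dict = build_cars_dict(cars)
print(cars_dict)
```

Keeping the wanted fields as key paths means nested values such as `Horsepower` come out flat, and adding or dropping a field only touches `FIELD_PATHS`. If two records share the same `ID`, the later one overwrites the earlier one.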

If you are open to PySpark, here is a cleaner solution that applies a schema to the cars data:

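from pyspark.sql import SparkSession
from pyspark.sql.types import (
    StringType,
    StructField,
    StructType,
    IntegerType
)

# Reuse the active session (e.g. in the pyspark shell) or start a new one.
spark = SparkSession.builder.getOrCreate()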

car_list = [{
    "Engine_Information": {
        "Transmission": "6 Speed Automatic Select Shift",
        "Engine_Type": "Audi 3.2L 6 cylinder 250hp 236ft-lbs",
        "Engine_Statistics": {
            "Horsepower": 250,
            "Torque": 236
        },
        "Hybrid": False,
        "Number of Forward Gears": 6,
        "Driveline": "All-wheel drive"
    },
    "Identification": {
        "Make": "Audi",
        "Model_Year": "2009 Audi A3",
        "ID": "2009 Audi A3 3.2",
        "Classification": "Automatic transmission",
        "Year": 2009
    },
    "Dimensions": {
        "Width": 202,
        "Length": 143,
        "Height": 140
    },
    "Fuel_Information": {
        "Highway_mpg": 25,
        "City_mpg": 18,
        "Fuel_Type": "Gasoline"
    }
}]

engine_stat_info_schema = StructType([
    StructField('Horsepower', IntegerType(), True),
    StructField('Torque', IntegerType(), True),
])
engine_info_schema = StructType([
    StructField('Transmission', StringType(), True),
    StructField('Engine_Type', StringType(), True),
    StructField('Engine_Statistics', engine_stat_info_schema, True),
    StructField('Hybrid', StringType(), True),
    # The field name must match the key in the data, otherwise the column is null.
    StructField('Number of Forward Gears', IntegerType(), True),
    StructField('Driveline', StringType(), True),
])
identification_schema = StructType([
    StructField('Make', StringType(), True),
    StructField('Model_Year', StringType(), True),
    StructField('ID', StringType(), True),
    StructField('Classification', StringType(), True),
    StructField('Year', IntegerType(), True),
])
dimension_schema = StructType([
    StructField('Width', IntegerType(), True),
    StructField('Length', IntegerType(), True),
    StructField('Height', IntegerType(), True),
])
fuel_info_schema = StructType([
    StructField('Highway_mpg', IntegerType(), True),
    StructField('City_mpg', IntegerType(), True),
    StructField('Fuel_Type', StringType(), True),   
])

car_schema = StructType([
    StructField('Engine_Information', engine_info_schema, True),
    StructField('Identification', identification_schema, True),
    StructField('Fuel_Information', fuel_info_schema, True),
    StructField('Dimensions', dimension_schema, True)
])

df = spark.createDataFrame(car_list, schema=car_schema)
df.select('Identification.Model_Year',
    'Identification.ID',
    'Engine_Information.Engine_Statistics.Horsepower',
    'Engine_Information.Hybrid',
    'Fuel_Information.Highway_mpg',
    'Fuel_Information.City_mpg',
    'Dimensions.Width'
).show()

This prints the selected columns as a table, which you can format any way you like.
