Filter a list of dictionaries based on two keys

Published 2024-06-07 04:16:37


import csv
with open('test.csv', newline='') as f:
    # DictReader already yields one dict per row, keyed by the CSV header
    list_of_dicts = list(csv.DictReader(f, skipinitialspace=True))

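Hello, I have a CSV file from which I build a list of dictionaries. I want to filter that list by ASIN, removing duplicates where they exist, based on 'Merchant_1_Price': for each duplicated ASIN I want to keep only the entry with the lowest Merchant_1_Price. Not every item has a duplicate, so the non-duplicated items should be kept as they are, with everything collected in a new list. Here is a sample of the list: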

{'Product Name': 'NFL Buffalo Bills Bedding Set, Twin', 'Amazon Price': '84.99', 'ASIN': 'B004B3M5UU', 'Merchant_1': 'Homedepot', 'Merchant_1_Price': '72.65', 'Merchant_1_Stock': 'False', 'Merchant_1_Link': 'https://www.homedepot.com/p/Jaguars-2-PIECE-Draft-Multi-Twin-Comforter-Set-1NFL862000014RET/303181069', 'Amazon Image': '=IMAGE("{temp}",4,100,100)', 'Merchant_1_Image': '=IMAGE("{temp}",4,100,100)'}
{'Product Name': 'NFL Buffalo Bills Bedding Set, Twin', 'Amazon Price': '84.99', 'ASIN': 'B004B3M5UU', 'Merchant_1': 'Overstock', 'Merchant_1_Price': '61.64', 'Merchant_1_Stock': 'False', 'Merchant_1_Link': 'https://www.overstock.com/Bedding-Bath/The-Northwest-Company-NFL-Buffalo-Bills-Draft-Twin-2-piece-Comforter-Set/13330480/product.html', 'Amazon Image': '=IMAGE("{temp}",4,100,100)', 'Merchant_1_Image': '=IMAGE("{temp}",4,100,100)'}
{'Product Name': 'EGO Power+ HT2400 24-Inch 56-Volt Lithium-ion Cordless Hedge Trimmer - Battery and Charger Not Included', 'Amazon Price': '129.0', 'ASIN': 'B00N0A4S1O', 'Merchant_1': 'Homedepot', 'Merchant_1_Price': '129.00', 'Merchant_1_Stock': 'True', 'Merchant_1_Link': 'https://www.homedepot.com/p/EGO-24-in-56-Volt-Lithium-Ion-Cordless-Hedge-Trimmer-Battery-and-Charger-Not-Included-HT2400/205163108', 'Amazon Image': '=IMAGE("{temp}",4,100,100)', 'Merchant_1_Image': '=IMAGE("{temp}",4,100,100)'}

I have tried many variations of two nested for loops, but I can't seem to find the right logic.

Thanks for your help.


1 answer
Answer 1 · posted 2024-06-07 04:16:37

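The simplest way to deduplicate a list of dicts is to build a dictionary keyed by the unique field ('ASIN' in this case). When you find a duplicate, keep whichever entry has the lower 'Merchant_1_Price' field: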

by_asin = {}
for item in list_of_dicts:
    asin = item['ASIN']
    if (
        asin not in by_asin or
        float(item['Merchant_1_Price']) < float(by_asin[asin]['Merchant_1_Price'])
    ):
        by_asin[asin] = item

deduplicated_list_of_dicts = list(by_asin.values())

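Inside the loop, we first extract the asin from the current item, since we use it several times. We then check whether that ASIN is not yet in the by_asin dictionary or, if it is already there, whether the new item's price is lower than the stored item's price. In either case, we put the new item into by_asin (replacing the previous value, if there was one).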
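As a quick check of that logic, here is a minimal sketch that wraps the same loop in a function and runs it on a trimmed-down version of the sample rows from the question. The function name `dedupe_by_lowest_price` and its `key` / `price_field` parameters are hypothetical, introduced only for illustration, and the sample rows keep just three of the original columns.

```python
def dedupe_by_lowest_price(rows, key="ASIN", price_field="Merchant_1_Price"):
    """Keep one row per `key`, preferring the row with the lowest price.

    Hypothetical helper: same logic as the loop above, just parameterised.
    """
    best = {}
    for row in rows:
        k = row[key]
        # Prices read from a CSV are strings, so convert before comparing;
        # as plain strings, "129.00" would sort before "72.65".
        if k not in best or float(row[price_field]) < float(best[k][price_field]):
            best[k] = row
    return list(best.values())


# Trimmed copies of the three sample rows from the question.
sample = [
    {"ASIN": "B004B3M5UU", "Merchant_1": "Homedepot", "Merchant_1_Price": "72.65"},
    {"ASIN": "B004B3M5UU", "Merchant_1": "Overstock", "Merchant_1_Price": "61.64"},
    {"ASIN": "B00N0A4S1O", "Merchant_1": "Homedepot", "Merchant_1_Price": "129.00"},
]

result = dedupe_by_lowest_price(sample)
for row in result:
    print(row["ASIN"], row["Merchant_1"], row["Merchant_1_Price"])
# B004B3M5UU Overstock 61.64
# B00N0A4S1O Homedepot 129.00
```

Because a dict preserves insertion order and replacing a value does not move its key, the result keeps each ASIN in the position where it first appeared. Note that `float()` raises `ValueError` on an empty or non-numeric price, so real data with missing prices would need extra handling.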
