Removing dictionaries that match a condition from a list

Posted 2024-03-29 03:43:18


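I have the list of dictionaries below. I need to remove the dictionaries that share the same received_on and customer_group values, keeping one random item from each group of duplicates.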

data = [
    {
        'id': '16e26a4a9f97fa4f',
        'received_on': '2019-11-01 11:05:51',
        'customer_group': 'Life-time Buyer'
    },
    {
        'id': '16db0dd4a42673e2',
        'received_on': '2019-10-09 14:12:29',
        'customer_group': 'Lead'
    },
    {
        'id': '16db0dd4199f5897',
        'received_on': '2019-10-09 14:12:29',
        'customer_group': 'Lead'
    }
]

Expected output:

[
    {
        'id': '16e26a4a9f97fa4f',
        'received_on': '2019-11-01 11:05:51',
        'customer_group': 'Life-time Buyer'
    },
    {
        'id': '16db0dd4199f5897',
        'received_on': '2019-10-09 14:12:29',
        'customer_group': 'Lead'
    }
]

3 answers

Here is one idea:

import random

data = [
    {
        'id': '16e26a4a9f97fa4f',
        'received_on': '2019-11-01 11:05:51',
        'customer_group': 'Life-time Buyer'
    },
    {
        'id': '16db0dd4a42673e2',
        'received_on': '2019-10-09 14:12:29',
        'customer_group': 'Lead'
    },
    {
        'id': '16db0dd4199f5897',
        'received_on': '2019-10-09 14:12:29',
        'customer_group': 'Lead'
    }
]


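r_data = data.copy()
random.shuffle(r_data)  # shuffle a copy so the original order is preserved in data
# Later items overwrite earlier ones with the same key, so iterate over the
# shuffled copy (not data) to keep a random id for each key.
unique_data = {(elem['received_on'], elem['customer_group']): elem['id']
               for elem in r_data}
new_data = [{'id': val, 'received_on': key[0], 'customer_group': key[1]}
            for key, val in unique_data.items()]
new_data = sorted(new_data, key=lambda x: data.index(x))  # optional: restore the original order
print(new_data)

Output (one possible result, since the kept item is chosen at random):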

[{'id': '16e26a4a9f97fa4f', 'received_on': '2019-11-01 11:05:51', 'customer_group': 'Life-time Buyer'}, {'id': '16db0dd4199f5897', 'received_on': '2019-10-09 14:12:29', 'customer_group': 'Lead'}]

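Here is a way to keep the first item for each unique datetime. If you want a random item instead, you can shuffle the list first, for example with random.shuffle.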

data = [
    {
        'id': '16e26a4a9f97fa4f',
        'received_on': '2019-11-01 11:05:51',
        'customer_group': 'Life-time Buyer'
    },
    {
        'id': '16db0dd4a42673e2',
        'received_on': '2019-10-09 14:12:29',
        'customer_group': 'Lead'
    },
    {
        'id': '16db0dd4199f5897',
        'received_on': '2019-10-09 14:12:29',
        'customer_group': 'Lead'
    }
]

seen = set()  # received_on values that have already been kept
result = []
for d in data:
    dt = d['received_on']
    if dt not in seen:
        result.append(d)
        seen.add(dt)
result

Output:

[{'id': '16e26a4a9f97fa4f',
  'received_on': '2019-11-01 11:05:51',
  'customer_group': 'Life-time Buyer'},
 {'id': '16db0dd4a42673e2',
  'received_on': '2019-10-09 14:12:29',
  'customer_group': 'Lead'}]
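
The shuffle-first variant mentioned in this answer could look roughly like the sketch below. It is only an illustration under a few assumptions: the function name keep_random_per_datetime and its optional seed parameter are made up for this example and do not come from the original answer. It shuffles a copy, so the caller's list keeps its order, and the result comes back in shuffled order.

```python
import random

def keep_random_per_datetime(items, seed=None):
    """Keep one randomly chosen dict per distinct 'received_on' value.

    Hypothetical helper for illustration; `seed` only makes runs repeatable.
    """
    rng = random.Random(seed)
    shuffled = list(items)   # shallow copy, the input list is not reordered
    rng.shuffle(shuffled)
    seen = set()             # received_on values already kept
    result = []
    for d in shuffled:
        dt = d['received_on']
        if dt not in seen:
            seen.add(dt)
            result.append(d)
    return result

data = [
    {'id': '16e26a4a9f97fa4f', 'received_on': '2019-11-01 11:05:51',
     'customer_group': 'Life-time Buyer'},
    {'id': '16db0dd4a42673e2', 'received_on': '2019-10-09 14:12:29',
     'customer_group': 'Lead'},
    {'id': '16db0dd4199f5897', 'received_on': '2019-10-09 14:12:29',
     'customer_group': 'Lead'},
]

picked = keep_random_per_datetime(data)
print(picked)
```

Which of the two 'Lead' entries survives varies between runs unless a seed is passed.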

Building on the ideas above, I also wanted to use customer_group as a second condition alongside received_on. This gave me the expected result.

conditions, result = set(), []  # conditions holds the (received_on, customer_group) pairs already kept
for d in data:
    condition = (d['received_on'], d['customer_group'])
    if condition not in conditions:
        result.append(d)
        conditions.add(condition)
print(len(result))
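
Note that a first-seen loop like this always keeps the first item of each group rather than a random one. If a random pick per group is really needed, one possible approach is to group the items first and then call random.choice on each group. The sketch below assumes Python 3.7+ (dicts keep insertion order); the name dedupe_random and its keys parameter are hypothetical, chosen only for this example.

```python
import random

def dedupe_random(items, keys=('received_on', 'customer_group')):
    """Return one randomly chosen dict per distinct combination of `keys`.

    Hypothetical helper for illustration, not from the original thread.
    """
    groups = {}
    for d in items:
        # Group dicts by the tuple of values that defines a duplicate.
        groups.setdefault(tuple(d[k] for k in keys), []).append(d)
    # Groups come out in order of first appearance; pick one item from each.
    return [random.choice(group) for group in groups.values()]

data = [
    {'id': '16e26a4a9f97fa4f', 'received_on': '2019-11-01 11:05:51',
     'customer_group': 'Life-time Buyer'},
    {'id': '16db0dd4a42673e2', 'received_on': '2019-10-09 14:12:29',
     'customer_group': 'Lead'},
    {'id': '16db0dd4199f5897', 'received_on': '2019-10-09 14:12:29',
     'customer_group': 'Lead'},
]

deduped = dedupe_random(data)
print(deduped)
```

With the sample data this returns two dicts: the 'Life-time Buyer' entry, followed by one of the two 'Lead' entries chosen at random.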
