Optimized way to create a list of unique dicts

2 Answers

User
Answer 1 · edited 2024-06-16 10:23:50

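Just build an intermediate defaultdict that maps each name to a set of hobbies; this lets you do the merge in linear time. Then convert back to the structure you want at the end.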
    from collections import defaultdict

    inp = [
        {'name': 'John', 'hobbies': ['Reading', 'Swimming']},
        {'name': 'Gina', 'hobbies': ['Skating', 'Cooking']},
        {'name': 'John', 'hobbies': ['Gardening', 'Swimming']},
    ]

    # Collect each person's hobbies in a set, which drops duplicates.
    temp = defaultdict(set)
    for d in inp:
        temp[d['name']].update(d['hobbies'])

    # Convert back to a list of dicts.
    result = [{'name': k, 'hobbies': list(v)} for k, v in temp.items()]
Output (the order of hobbies within each list may vary, because sets are unordered):
    [{'name': 'John', 'hobbies': ['Gardening', 'Reading', 'Swimming']},
     {'name': 'Gina', 'hobbies': ['Cooking', 'Skating']}]
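If the order of the hobbies matters, one possible variation on the approach above is to use a dict in place of the set, since dicts keep insertion order in Python 3.7+. This is only a sketch and not part of the original answer; the function name `merge_hobbies` is made up for illustration.

```python
from collections import defaultdict


def merge_hobbies(records):
    """Merge records by name, keeping hobbies in first-seen order.

    Hypothetical helper: an inner dict is used as an ordered set,
    relying on dict insertion order (guaranteed since Python 3.7).
    """
    merged = defaultdict(dict)
    for record in records:
        # dict.fromkeys builds {hobby: None, ...}; update() leaves keys
        # that already exist in their original position.
        merged[record['name']].update(dict.fromkeys(record['hobbies']))
    return [{'name': name, 'hobbies': list(hobbies)}
            for name, hobbies in merged.items()]


inp = [
    {'name': 'John', 'hobbies': ['Reading', 'Swimming']},
    {'name': 'Gina', 'hobbies': ['Skating', 'Cooking']},
    {'name': 'John', 'hobbies': ['Gardening', 'Swimming']},
]
print(merge_hobbies(inp))
# [{'name': 'John', 'hobbies': ['Reading', 'Swimming', 'Gardening']},
#  {'name': 'Gina', 'hobbies': ['Skating', 'Cooking']}]
```

This still runs in linear time under the same hashing assumptions, and the result is deterministic.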

User
Answer 2 · edited 2024-06-16 10:23:50

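If you are fine with changing the output structure to a mapping from names to sets of hobbies, this can be done in linear time (ignoring the edge case of a large number of hash collisions):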
    from collections import defaultdict

    data = [
        {'name': 'John', 'hobbies': ['Reading', 'Swimming']},
        {'name': 'Gina', 'hobbies': ['Skating', 'Cooking']},
        {'name': 'John', 'hobbies': ['Gardening', 'Swimming']},
    ]

    output = defaultdict(set)
    for d in data:
        output[d['name']].update(d['hobbies'])

    print(output)
    # defaultdict(<class 'set'>, {'John': {'Reading', 'Swimming', 'Gardening'},
    #                             'Gina': {'Cooking', 'Skating'}})
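If you insist on a list of dicts, you can still get almost linear time (the membership test on each hobbies list is still O(n)) by keeping a dict that maps each name to its index in the output list: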
    data = [
        {'name': 'John', 'hobbies': ['Reading', 'Swimming']},
        {'name': 'Gina', 'hobbies': ['Skating', 'Cooking']},
        {'name': 'John', 'hobbies': ['Gardening', 'Swimming']},
    ]

    output = []
    names_to_indices = {}  # name -> position of that name's dict in output
    for d in data:
        if d['name'] not in names_to_indices:
            # Copy the list so that later appends do not mutate the input data.
            output.append({'name': d['name'], 'hobbies': list(d['hobbies'])})
            names_to_indices[d['name']] = len(output) - 1
        else:
            index = names_to_indices[d['name']]
            for hobby in d['hobbies']:
                if hobby not in output[index]['hobbies']:  # O(n) list lookup
                    output[index]['hobbies'].append(hobby)

    print(output)
    # [{'name': 'John', 'hobbies': ['Reading', 'Swimming', 'Gardening']},
    #  {'name': 'Gina', 'hobbies': ['Skating', 'Cooking']}]
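The O(n) list lookup in the loop above can be avoided while still returning plain lists by keeping a companion set of already-seen hobbies for each output entry. The following is a minimal sketch under that assumption, not part of the original answer; the function name `merge_keep_lists` and the `seen` list are hypothetical.

```python
def merge_keep_lists(data):
    """Merge records by name; hobbies stay lists in first-seen order.

    Hypothetical helper: seen[i] mirrors output[i]['hobbies'] as a set,
    so each membership test is O(1) on average instead of O(n).
    """
    output = []
    names_to_indices = {}
    seen = []
    for d in data:
        name = d['name']
        if name not in names_to_indices:
            names_to_indices[name] = len(output)
            output.append({'name': name, 'hobbies': []})
            seen.append(set())
        index = names_to_indices[name]
        for hobby in d['hobbies']:
            if hobby not in seen[index]:
                seen[index].add(hobby)
                output[index]['hobbies'].append(hobby)
    return output


data = [
    {'name': 'John', 'hobbies': ['Reading', 'Swimming']},
    {'name': 'Gina', 'hobbies': ['Skating', 'Cooking']},
    {'name': 'John', 'hobbies': ['Gardening', 'Swimming']},
]
print(merge_keep_lists(data))
# [{'name': 'John', 'hobbies': ['Reading', 'Swimming', 'Gardening']},
#  {'name': 'Gina', 'hobbies': ['Skating', 'Cooking']}]
```

The trade-off is extra memory for the sets. In return, duplicates inside a single record's hobbies list are also dropped, and the input lists are never modified.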
If you are fine with the hobbies being a set, you can make this truly linear time (again, ignoring the possibility of excessive hash collisions):
    data = [
        {'name': 'John', 'hobbies': ['Reading', 'Swimming']},
        {'name': 'Gina', 'hobbies': ['Skating', 'Cooking']},
        {'name': 'John', 'hobbies': ['Gardening', 'Swimming']},
    ]

    output = []
    names_to_indices = {}
    for d in data:
        if d['name'] not in names_to_indices:
            output.append({'name': d['name'], 'hobbies': set(d['hobbies'])})
            names_to_indices[d['name']] = len(output) - 1
        else:
            index = names_to_indices[d['name']]
            output[index]['hobbies'].update(d['hobbies'])

    print(output)
    # [{'name': 'John', 'hobbies': {'Gardening', 'Swimming', 'Reading'}},
    #  {'name': 'Gina', 'hobbies': {'Skating', 'Cooking'}}]
