Building a dictionary of associations from a list of lists (performance issue)

base = [['a', 5001, 1, 4, 8],
        ['b', 5002, 2, 5],
        ['c', 5002, 2, 5],
        ['d', 5003, 2, 6, 7],
        ['e', 5004, 3, 6, 9]]
uniques = [1, 2, 3, 4, 5, 6, 7, 8, 9]

uniques_dict = {}
for item in uniques:
    uniques_dict[item] = list(set([records[1] for records in base if item in records[2:]]))

print(uniques_dict)

Output:

{1: [5001], 2: [5002, 5003], 3: [5004], 4: [5001], 5: [5002], 6: [5003, 5004], 7: [5003], 8: [5001], 9: [5004]}

1 Answer

User

#1 · Posted 2024-04-23 14:31:45

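Rather than looping over all the records again and again, invert the loop. Make uniques a set for fast membership testing, and loop over the records just once.

Better still, the dictionary keys can serve as that set: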

uniques_dict = {u: [] for u in uniques}

for record in base:
    key, values = record[1], record[2:]
    for unique in uniques_dict.keys() & values:  # the intersection
        uniques_dict[unique].append(key)

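In Python 3, dict.keys() returns a dictionary view object, which behaves like a set. You can use the & operator to take the intersection with that set. If you are using Python 2, replace uniques_dict.keys() with uniques_dict.viewkeys() for exactly the same behaviour.

Set intersection is fast and efficient. Each element of record[2:] still has to be matched against the set of keys, but this is an O(N) loop rather than an O(NK) loop, where N is the total number of values across all records, because each key test is an O(1) operation independent of K = len(uniques), the number of keys.

Demo: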

>>> base = [['a', 5001, 1, 4, 8],
...         ['b', 5002, 2, 5],
...         ['c', 5002, 2, 5],
...         ['d', 5003, 2, 6, 7],
...         ['e', 5004, 3, 6, 9]]
>>> uniques = [1,2,3,4,5,6,7,8,9]
>>> uniques_dict = {u: [] for u in uniques}
>>> for record in base:
...     key, values = record[1], record[2:]
...     for unique in uniques_dict.keys() & values:  # the intersection
...         uniques_dict[unique].append(key)
... 
>>> uniques_dict
{1: [5001], 2: [5002, 5002, 5003], 3: [5004], 4: [5001], 5: [5002, 5002], 6: [5003, 5004], 7: [5003], 8: [5001], 9: [5004]}
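Note that the inverted loop keeps duplicate keys (5002 appears twice under 2 and 5), unlike the set()-based version in the question. As a rough check that the two approaches agree once duplicates are ignored, and to get a feel for the difference in cost, here is a minimal sketch. The function names original_approach and inverted_loop and the synthetic data sizes are assumptions made for illustration; they are not part of the question, and the timings will vary by machine.

```python
import random
import timeit


def original_approach(base, uniques):
    # The question's version: rescans every record once per unique value.
    return {u: set(rec[1] for rec in base if u in rec[2:]) for u in uniques}


def inverted_loop(base, uniques):
    # The answer's version: one pass over base, one set intersection per record.
    result = {u: [] for u in uniques}
    for record in base:
        key, values = record[1], record[2:]
        for unique in result.keys() & values:
            result[unique].append(key)
    return result


base = [['a', 5001, 1, 4, 8],
        ['b', 5002, 2, 5],
        ['c', 5002, 2, 5],
        ['d', 5003, 2, 6, 7],
        ['e', 5004, 3, 6, 9]]
uniques = [1, 2, 3, 4, 5, 6, 7, 8, 9]

slow = original_approach(base, uniques)
fast = inverted_loop(base, uniques)

# The associations match once the duplicates in the lists are collapsed.
same = all(slow[u] == set(fast[u]) for u in uniques)
print(same)

# Rough timing on synthetic data (hypothetical sizes, chosen only for illustration).
random.seed(0)
big_uniques = list(range(300))
big_base = [['x', 5000 + i] + random.sample(big_uniques, 5) for i in range(1000)]
print(timeit.timeit(lambda: original_approach(big_base, big_uniques), number=1))
print(timeit.timeit(lambda: inverted_loop(big_base, big_uniques), number=1))
```

On most machines the second timing should be noticeably smaller, since the inverted loop touches each record once instead of once per unique value.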

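If uniques contains exactly the values that occur in base[*][2:] (as it does here), you don't even have to compute it up front. Just create the dictionary keys as you go, and call set() on each record[2:] list so that each value is processed only once per record. The uniques_dict values should be sets too, to eliminate the duplicate keys that would otherwise be added: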

uniques_dict = {}

for record in base:
    key, values = record[1], record[2:]
    for unique in set(values):
        uniques_dict.setdefault(unique, set()).add(key)

Now list(uniques_dict) is the list of uniques, built up while processing base:

>>> uniques_dict = {}
>>> for record in base:
...     key, values = record[1], record[2:]
...     for unique in set(values):
...         uniques_dict.setdefault(unique, set()).add(key)
... 
>>> uniques_dict
{1: {5001}, 2: {5002, 5003}, 3: {5004}, 4: {5001}, 5: {5002}, 6: {5003, 5004}, 7: {5003}, 8: {5001}, 9: {5004}}
>>> list(uniques_dict)
[1, 2, 3, 4, 5, 6, 7, 8, 9]
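
The result above maps each value to a set, whereas the output in the question used lists. If lists are needed, one possible finishing step is sketched below. It swaps setdefault for collections.defaultdict (a stylistic alternative, not something the answer itself uses) and sorts each set so the output is deterministic. The name as_lists is hypothetical.

```python
from collections import defaultdict

base = [['a', 5001, 1, 4, 8],
        ['b', 5002, 2, 5],
        ['c', 5002, 2, 5],
        ['d', 5003, 2, 6, 7],
        ['e', 5004, 3, 6, 9]]

# Missing keys get an empty set automatically on first access.
uniques_dict = defaultdict(set)
for record in base:
    key, values = record[1], record[2:]
    for unique in set(values):
        uniques_dict[unique].add(key)

# Convert to plain lists; sorting gives a stable order because sets are unordered.
as_lists = {unique: sorted(keys) for unique, keys in sorted(uniques_dict.items())}
print(as_lists)
# {1: [5001], 2: [5002, 5003], 3: [5004], 4: [5001], 5: [5002], 6: [5003, 5004], 7: [5003], 8: [5001], 9: [5004]}
```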
