列表搜索优化

Question

first = [(1, text, text, 1, 2, 3), 
         (1, text, text, 1, 0, 3), ... (6054, text, text, 2, 2, 3)]
second = (1, 2, 3, 4, 5 ... 5412)

data = [x for x in first if x[0] in second]

有没有更快的方法来做到这一点：

Answer 1

也许你只想要这个，而不是用 in 来检查：

data = [x for x in first if 1 <= x[0] <= 5412 ]

Answer 2

试试这个：

first = [(1, text, text, 1, 2, 3), 
         (1, text, text, 1, 0, 3), ... (1054, text, text, 2, 2, 3)]
second = (1, 2, 3, 4, 5 ... 5412)
second_set = set(second)
data = [x for x in first if x[0] in second_set]

假设第一个集合有 m 个元素，第二个集合有 n 个元素。

集合是通过哈希表来存储的，所以查找集合里的元素几乎是 O(1) 的时间，这样整体效率就是 O(m)。而如果把第二个集合当作列表来查找，那查找的时间是 O(n)，这样整体效率就变成 O(m * n) 了。

列表搜索优化

2 个回答

撰写回答