What is a faster way to count the occurrences of elements in one list using another list?

Published 2024-07-21 10:29:48


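If I have two lists, List_A and List_B, and I want to count how many times each element of List_B occurs in List_A and store the results in a new List_C, what is a faster way to do it? Normally I'd use a list comprehension, but once the number of elements in List_A and List_B grows beyond 100000, it starts taking a lot of time.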

List_A = ['a', 'd','f','h','g','e','f','a','f','h','h','d','b','c','c','a']
List_B = ['a', 'b','c','d','e','f','g','h']
List_C = [List_A.count(x) for x in List_B]
List_C
#Output:
#List_C = [3, 1, 2, 2, 1, 3, 1, 3]

2 Answers
from collections import Counter  # needed by the Counter-based functions below

def a():
    List_A = ['a', 'd','f','h','g','e','f','a','f','h','h','d','b','c','c','a']
    List_B = ['a', 'b','c','d','e','f','g','h', 'z']

    count = Counter(List_A)
    List_C = [count.get(x) for x in List_B]

100000 loops, best of 5: 2.96 usec per loop
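Because List_B here contains 'z', which never occurs in List_A, `count.get(x)` yields None for it. As a minimal sketch (not part of the timed functions), indexing the Counter instead returns 0 for a missing key:

```python
from collections import Counter

List_A = ['a', 'd','f','h','g','e','f','a','f','h','h','d','b','c','c','a']
List_B = ['a', 'b','c','d','e','f','g','h', 'z']

count = Counter(List_A)

# .get() behaves like dict.get(): a missing key gives None
with_get = [count.get(x) for x in List_B]

# Indexing a Counter gives 0 for a missing key and does not insert it
with_index = [count[x] for x in List_B]

print(with_get)
print(with_index)
```

Which form is preferable depends on whether a missing element should show up as 0 or be treated specially.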


Using Counter() and checking for None:

def b():
    List_A = ['a', 'd','f','h','g','e','f','a','f','h','h','d','b','c','c','a']
    List_B = ['a', 'b','c','d','e','f','g','h', 'z']

    count = Counter(List_A)

    List_C = []

    for x in List_B:
        val = count.get(x)
        if val is not None:
            List_C.append(val)

100000 loops, best of 5: 3.45 usec per loop


Without checking for None values:

def c():
    List_A = ['a', 'd','f','h','g','e','f','a','f','h','h','d','b','c','c','a']
    List_B = ['a', 'b','c','d','e','f','g','h', 'z']

    count = Counter(List_A)

    List_C = []

    for x in List_B:
        List_C.append(count.get(x))

100000 loops, best of 5: 3.04 usec per loop


Using @Mafa's solution, which does not work if a value in List_B does not appear in List_A:

def d():
    List_A = ['a', 'd','f','h','g','e','f','a','f','h','h','d','b','c','c','a']
    List_B = ['a', 'b','c','d','e','f','g','h']
    occ = dict()
    for x in List_A:
        occ.setdefault(x, 0)
        occ[x] += 1
    List_C = [occ[x] for x in List_B]

100000 loops, best of 5: 2.59 usec per loop


Mafa's solution with a check for missing values:

def e():
    List_A = ['a', 'd','f','h','g','e','f','a','f','h','h','d','b','c','c','a']
    List_B = ['a', 'b','c','d','e','f','g','h']
    occ = dict()
    for x in List_A:
        occ.setdefault(x, 0)
        occ[x] += 1
    List_C = [occ.get(x, 0) for x in List_B]

100000 loops, best of 5: 3.1 usec per loop


The next two functions were suggested by @Alex Waygood:

def f():
    List_A = ['a', 'd','f','h','g','e','f','a','f','h','h','d','b','c','c','a']
    List_B = ['a', 'b','c','d','e','f','g','h']
    c = Counter(filter(set(List_B).__contains__, List_A))
    List_C = [v for k, v in sorted(c.items())]

50000 loops, best of 5: 4.45 usec per loop
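One thing to keep in mind with this variant: `sorted(c.items())` returns counts in sorted key order and only for keys that actually occur, so the result lines up with List_B only when List_B is itself sorted and every element of it appears in List_A. The sketch below uses a hypothetical, unsorted List_B with a missing element to show the difference; it is an illustration, not one of the benchmarked functions.

```python
from collections import Counter

List_A = ['a', 'd','f','h','g','e','f','a','f','h','h','d','b','c','c','a']
List_B = ['h', 'b', 'z']  # hypothetical input: unsorted, and 'z' never occurs

wanted = set(List_B)
c = Counter(x for x in List_A if x in wanted)

# Sorted key order: counts for 'b' then 'h'; 'z' is silently dropped
sorted_order = [v for k, v in sorted(c.items())]

# List_B order: one count per element of List_B, 0 when it is missing
list_b_order = [c[x] for x in List_B]

print(sorted_order)
print(list_b_order)
```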


def g():
    List_A = ['a', 'd','f','h','g','e','f','a','f','h','h','d','b','c','c','a']
    List_B = ['a', 'b','c','d','e','f','g','h', 'z']
    c = Counter(filter(List_B.__contains__, List_A))
    List_C = [v for k, v in sorted(c.items())]

50000 loops, best of 5: 4.78 usec per loop

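(I was not sure whether these last two timings need to be divided by 2 because there were 50000 loops instead of 100000. Apparently not, since the figures are already per loop; if they did, we would have a clear winner here.)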
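To make the per-loop point concrete, here is a rough sketch, assuming the figures above come from Python's `timeit` module (the output format suggests so, but the answer does not say). The total time of the best run is divided by the loop count, so results obtained with different loop counts remain comparable. The value of `number` is an arbitrary choice for illustration.

```python
import timeit
from collections import Counter

List_A = ['a', 'd','f','h','g','e','f','a','f','h','h','d','b','c','c','a']
List_B = ['a', 'b','c','d','e','f','g','h', 'z']

def a():
    count = Counter(List_A)
    return [count.get(x) for x in List_B]

number = 10000  # loops per run, chosen arbitrarily for this sketch

# repeat=5 gives five total times; the best one is the minimum
best_total = min(timeit.repeat(a, number=number, repeat=5))

# Normalise by the loop count to get microseconds per loop
per_loop_usec = best_total / number * 1e6
print(f"{number} loops, best of 5: {per_loop_usec:.2f} usec per loop")
```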

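With your solution (the list comprehension), the number of operations performed is len(List_A) * len(List_B), because List_A.count(x) scans all of List_A once for every element of List_B.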

Instead, count the occurrences first and then build the result with a list comprehension:

List_A = ['a', 'd','f','h','g','e','f','a','f','h','h','d','b','c','c','a']
List_B = ['a', 'b','c','d','e','f','g','h']
occ = dict()
for x in List_A:
    occ.setdefault(x, 0)
    occ[x] += 1
List_C = [occ.get(x, 0) for x in List_B]

This way you iterate over List_A once and over List_B once.

[Edit]

Updated the list comprehension on the last line to handle the case where x is not in List_A (see @AchilleG's comment).
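The tiny lists above hide the difference between the two approaches. As a rough sketch on larger inputs, the following compares the original `list.count` comprehension with a count-first version. The list sizes, the random data, and the function names are assumptions made for illustration, and `collections.Counter` stands in for the hand-written dict loop.

```python
import random
import timeit
from collections import Counter

random.seed(0)  # fixed seed so the generated data is reproducible
List_B = [str(i) for i in range(1000)]
List_A = [random.choice(List_B) for _ in range(5000)]

def quadratic():
    # Scans all of List_A once per element of List_B
    return [List_A.count(x) for x in List_B]

def linear():
    # One pass over List_A to count, then one pass over List_B
    count = Counter(List_A)
    return [count[x] for x in List_B]

# Both approaches must produce the same counts
assert quadratic() == linear()

t_quadratic = timeit.timeit(quadratic, number=3)
t_linear = timeit.timeit(linear, number=3)
print(f"list.count: {t_quadratic:.4f} s, count first: {t_linear:.4f} s")
```

Actual timings depend on the machine and the data, so the printed numbers are only indicative.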
