How do I split a list based on a condition?

392 votes

40 answers

294326 views

Asked 2025-04-15 12:02

I have some code like this:

good = [x for x in mylist if x in goodvals]
bad = [x for x in mylist if x not in goodvals]

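My goal is to split the contents of mylist into two separate lists, depending on whether each item meets a condition.

How can I do this more elegantly? Is there a way to avoid iterating over mylist twice, and would that improve performance?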

performance optimization, traversal, algorithm, conditional filtering, list splitting

40 Answers

124

Here is a lazy iterator approach:

from itertools import tee

def split_on_condition(seq, condition):
    l1, l2 = tee((condition(item), item) for item in seq)
    return (i for p, i in l1 if p), (i for p, i in l2 if not p)

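It evaluates the condition only once per item and returns two generators: the first yields the values from the sequence for which the condition is true, the second yields those for which it is false.

Because it is lazy, you can use it on any iterator, even an infinite one: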

from itertools import count, islice

def is_prime(n):
    return n > 1 and all(n % i for i in range(2, n))

primes, not_primes = split_on_condition(count(), is_prime)
print("First 10 primes", list(islice(primes, 10)))
print("First 10 non-primes", list(islice(not_primes, 10)))
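
To check the claim that the condition runs only once per item, here is a minimal sketch that counts predicate calls. The `calls` list and the `is_even` predicate are hypothetical helpers added for illustration. Note that `tee` has to buffer every pair that one generator has consumed and the other has not, so draining one side completely before touching the other stores the whole input in memory.

```python
from itertools import tee

def split_on_condition(seq, condition):
    # Same lazy splitter as above: pair each item with its test result once,
    # then share those pairs between two generators via tee.
    l1, l2 = tee((condition(item), item) for item in seq)
    return (i for p, i in l1 if p), (i for p, i in l2 if not p)

calls = []  # hypothetical: records every predicate call

def is_even(n):
    calls.append(n)
    return n % 2 == 0

evens, odds = split_on_condition(range(6), is_even)
evens_list = list(evens)  # drives the shared iterator; tee buffers pairs for odds
odds_list = list(odds)    # served from tee's buffer, no further predicate calls

print(evens_list)  # [0, 2, 4]
print(odds_list)   # [1, 3, 5]
print(len(calls))  # 6
```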

Usually, though, it is better to skip the lazy approach and return lists directly:

def split_on_condition(seq, condition):
    a, b = [], []
    for item in seq:
        (a if condition(item) else b).append(item)
    return a, b
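
Applied to the question's case, a usage sketch might look like this. The sample values for `mylist` and `goodvals` are made up for illustration.

```python
def split_on_condition(seq, condition):
    # Single pass: each item is tested once and appended to one of two lists.
    a, b = [], []
    for item in seq:
        (a if condition(item) else b).append(item)
    return a, b

mylist = [3, 1, 4, 1, 5, 9, 2, 6]  # sample data, assumed
goodvals = {1, 2, 3}               # sample data, assumed

good, bad = split_on_condition(mylist, lambda x: x in goodvals)
print(good)  # [3, 1, 1, 2]
print(bad)   # [4, 5, 9, 6]
```

Order and duplicates from `mylist` are preserved in both results.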

Edit: for your more specific use case of splitting items into different lists by some key, here is a generic function that does that:

DROP_VALUE = lambda _:_
def split_by_key(seq, resultmapping, keyfunc, default=DROP_VALUE):
    """Split a sequence into lists based on a key function.

        seq - input sequence
        resultmapping - a dictionary that maps from target lists to keys that go to that list
        keyfunc - function to calculate the key of an input value
        default - the target where items that don't have a corresponding key go, by default they are dropped
    """
    result_lists = dict((key, []) for key in resultmapping)
    appenders = dict((key, result_lists[target].append) for target, keys in resultmapping.items() for key in keys)

    if default is not DROP_VALUE:
        result_lists.setdefault(default, [])
        default_action = result_lists[default].append
    else:
        default_action = DROP_VALUE

    for item in seq:
        appenders.get(keyfunc(item), default_action)(item)

    return result_lists

Usage:

def file_extension(f):
    return f[2].lower()

split_files = split_by_key(files, {'images': IMAGE_TYPES}, keyfunc=file_extension, default='anims')
print(split_files['images'])
print(split_files['anims'])
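
The usage above relies on `files` and `IMAGE_TYPES` being defined elsewhere. A self-contained sketch, with the function restated and sample data assumed to follow the `(name, size, extension)` tuple layout, could look like this:

```python
DROP_VALUE = lambda _: _

def split_by_key(seq, resultmapping, keyfunc, default=DROP_VALUE):
    # Restated from the answer above so this sketch runs on its own.
    result_lists = {key: [] for key in resultmapping}
    appenders = {key: result_lists[target].append
                 for target, keys in resultmapping.items() for key in keys}
    if default is not DROP_VALUE:
        result_lists.setdefault(default, [])
        default_action = result_lists[default].append
    else:
        default_action = DROP_VALUE  # no-op: unmatched items are dropped
    for item in seq:
        appenders.get(keyfunc(item), default_action)(item)
    return result_lists

# Sample data, assumed for illustration: (name, size in bytes, extension)
files = [('file1.jpg', 33, '.jpg'), ('file2.avi', 999, '.avi'), ('file3.GIF', 0, '.GIF')]
IMAGE_TYPES = ('.jpg', '.jpeg', '.gif', '.bmp', '.png')

def file_extension(f):
    return f[2].lower()

split_files = split_by_key(files, {'images': IMAGE_TYPES},
                           keyfunc=file_extension, default='anims')
print(split_files['images'])  # file1.jpg and file3.GIF
print(split_files['anims'])   # file2.avi

# Without a default target, items with no matching key are simply dropped.
only_images = split_by_key(files, {'images': IMAGE_TYPES}, keyfunc=file_extension)
print(sorted(only_images))    # ['images']
```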

Answered 2025-04-15 by Python大师

328

Iterate manually, using the condition to select which list each element is appended to:

good, bad = [], []
for x in mylist:
    (bad, good)[x in goodvals].append(x)
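
The indexing trick works because `bool` is a subclass of `int` in Python, so `False` selects index 0 and `True` selects index 1. A small sketch with made-up sample data:

```python
mylist = [3, 1, 4, 1, 5]  # sample data, assumed
goodvals = {1, 3}         # sample data, assumed

good, bad = [], []
for x in mylist:
    # `x in goodvals` is False (0) or True (1), so it indexes the tuple:
    # index 0 -> bad, index 1 -> good.
    (bad, good)[x in goodvals].append(x)

print(good)  # [3, 1, 1]
print(bad)   # [4, 5]
print(isinstance(True, int), int(True), int(False))  # True 1 0
```

The tuple order `(bad, good)` matters: swapping it would silently invert the result.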

Answered 2025-04-15 by Python大师

154

good = [x for x in mylist if x in goodvals]
bad  = [x for x in mylist if x not in goodvals]
Is there a more elegant way to do this?

That code is perfectly readable and extremely clear!

# files looks like: [ ('file1.jpg', 33, '.jpg'), ('file2.avi', 999, '.avi'), ... ]
IMAGE_TYPES = ('.jpg','.jpeg','.gif','.bmp','.png')
images = [f for f in files if f[2].lower() in IMAGE_TYPES]
anims  = [f for f in files if f[2].lower() not in IMAGE_TYPES]

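Again, this is fine!

Using sets might improve performance slightly, but the difference is negligible. I also find the list comprehension easier to read, and you don't have to worry about the order getting scrambled, duplicates being removed, and so on.

In fact, I might go another step "backward" and just use a simple for loop: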
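
As a sketch of the trade-off, assuming "using sets" means set intersection and difference (the answer does not spell this out), compare them with list comprehensions that use a set only for membership tests. The sample data is made up.

```python
mylist = ['b', 'a', 'b', 'c', 'a']  # sample data, assumed
goodvals = ['a', 'b']               # sample data, assumed

# Set operations: duplicates collapse and the result has no defined order.
good_set = set(mylist) & set(goodvals)
bad_set = set(mylist) - set(goodvals)

# List comprehensions, with a set used only to speed up the `in` test:
# order and duplicates from mylist are preserved.
lookup = set(goodvals)
good = [x for x in mylist if x in lookup]
bad = [x for x in mylist if x not in lookup]

print(sorted(good_set), sorted(bad_set))  # ['a', 'b'] ['c']
print(good)  # ['b', 'a', 'b', 'a']
print(bad)   # ['c']
```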

images, anims = [], []

for f in files:
    if f[2].lower() in IMAGE_TYPES:
        images.append(f)
    else:
        anims.append(f)

A list comprehension or set() is fine until you need to add another check or some extra logic. Say you want to drop all 0-byte JPEGs; you just add something like...

if f[1] == 0:
    continue
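
Put together, the loop with the extra check might look like the sketch below. The sample `files` data is assumed, and placing the check at the top of the loop is one possible choice. As written, the check skips every 0-byte file, not only JPEGs; to target JPEGs alone it would need to move inside the image branch.

```python
# Sample data, assumed for illustration: (name, size in bytes, extension)
files = [('file1.jpg', 33, '.jpg'), ('empty.jpg', 0, '.jpg'), ('file2.avi', 999, '.avi')]
IMAGE_TYPES = ('.jpg', '.jpeg', '.gif', '.bmp', '.png')

images, anims = [], []

for f in files:
    if f[1] == 0:
        continue  # skip empty files before classifying them
    if f[2].lower() in IMAGE_TYPES:
        images.append(f)
    else:
        anims.append(f)

print(images)  # [('file1.jpg', 33, '.jpg')]
print(anims)   # [('file2.avi', 999, '.avi')]
```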

Answered 2025-04-15 by Python大师