如何创建@iterator_cache装饰器?

0 投票
2 回答
27 浏览
提问于 2025-04-14 16:47

问题:如何创建一个 iterator_cache 装饰器?

@iterator_cache
def cheap_eval(numbers: Tuple[int]) -> Tuple[int]:
    print("cheap eval of {numbers}")
    return expensive_eval(numbers)

def expensive_eval(numbers: Tuple[int]) -> Tuple[int]:
    time.sleep(5 + len(numbers))  # Example of consuming time
    print(f"expensive eval of {numbers}")
    return tuple(numb**2 for numb in numbers)

描述:我有一个来自某个库的 expensive_eval 函数,我需要多次调用它。

我想创建一个 cheap_eval 函数,它可以使用缓存,并且能给出和 expensive_eval 一样的结果,这样可以减少总的执行时间。

我最初的想法是使用一个辅助函数 eval_one,并加上 @lru_cache,但是每次调用 expensive_eval 都会有5秒的固定延迟,这样做并不好。

我该如何创建这个装饰器,让它记住所有可迭代的输入,而不是整个元组呢?

下面是一个带有输入和期望输出的代码示例:

# Input
print(cheap_eval([1, 2]))
print(cheap_eval([1, 3]))
print(cheap_eval([1, 2, 3]))

# Output without iterator_cache
# total time = 22 seconds
cheap eval of [1, 2]
expensive eval of [1, 2]
[1, 4]
cheap eval of [1, 3]
expensive eval of [1, 3]
[1, 9]
cheap eval of [1, 2, 3]
expensive eval of [1, 2, 3]
[1, 4, 9]

# Output with iterator_cache
# total time = 13 seconds
cheap eval of [1, 2]
expensive eval of [1, 2]
[1, 4]
cheap eval of [1, 3]
expensive eval of [3]
[1, 9]
cheap eval of [1, 2, 3]
[1, 4, 9]

2 个回答

0

这个解决方案满足了需求,而且也可以用于其他类型,不仅仅是 tuple

from collections.abc import Sequence
from typing import Any
import types

def iterator_cache(func):  # Should be sequence_cache
    uncastable = (types.GeneratorType, range)
    cache = {}
    def wrapper(keys: Sequence[Any]) -> Sequence[Any]:
        new_keys = (key for key in keys if key not in cache)
        if not isinstance(keys, uncastable):
            new_keys = keys.__class__(new_keys)
        number_new_keys = sum(1 for _ in new_keys)
        if number_new_keys:
            new_vals = func(new_keys)
            for key, val in zip(new_keys, new_vals):
                cache[key] = val
        values_generator = (cache[key] for key in keys)
        if not isinstance(keys, uncastable):
            values_generator = keys.__class__(values_generator)
        return values_generator
    return wrapper

不过要注意 IterableSequence 之间的区别,具体可以参考这个 SO回答

在Python中,最常见的结构有 generatortuplelistrangesetdictionary

  • 保持输入和输出的顺序:generatortuplelistrange 都能保持顺序。而 setdictionary 由于输出的 expensive_eval 的哈希值,不会尊重输入的顺序。
  • 保持输入和输出的类型一致:generatortuplelistset 都是这样。根据输出的值,range 返回的是一个生成器,而不是 range 类型。

需要注意的是,我们假设键是可哈希的。如果不是,那么就不能用字典来存储这些值。

0

所以,简单来说,就是把没有缓存的输入分开,然后对那些未见过的输入调用一个比较耗时的函数。用这个结果来更新缓存。接着,利用缓存来填充输出。

from collections.abc import Sequence

_cache = {}
def cheap_eval(numbers: Sequence[int]) -> list[int]:
    new = [n for n in numbers if n not in _cache]
    # take care of the case where there are no new inputs
    new_results = expensive_eval(new) if new else []

    for n, r in zip(new, new_results):
        _cache[n] = r

    return [_cache[n] for n in numbers]

要确保处理好没有新输入的情况,避免传递一个空列表,这样就不会无缘无故浪费5秒钟的时间。

如果你想的话,可以把上面的逻辑抽象成一个装饰器。但这就是基本的思路。

撰写回答