如何创建@iterator_cache装饰器？

0 投票

2 回答

27 浏览

提问于 2025-04-14 16:47

问题：如何创建一个 iterator_cache 装饰器？

@iterator_cache
def cheap_eval(numbers: Tuple[int]) -> Tuple[int]:
    print("cheap eval of {numbers}")
    return expensive_eval(numbers)

def expensive_eval(numbers: Tuple[int]) -> Tuple[int]:
    time.sleep(5 + len(numbers))  # Example of consuming time
    print(f"expensive eval of {numbers}")
    return tuple(numb**2 for numb in numbers)

描述：我有一个来自某个库的 expensive_eval 函数，我需要多次调用它。

我想创建一个 cheap_eval 函数，它可以使用缓存，并且能给出和 expensive_eval 一样的结果，这样可以减少总的执行时间。

我最初的想法是使用一个辅助函数 eval_one，并加上 @lru_cache，但是每次调用 expensive_eval 都会有5秒的固定延迟，这样做并不好。

我该如何创建这个装饰器，让它记住所有可迭代的输入，而不是整个元组呢？

下面是一个带有输入和期望输出的代码示例：

# Input
print(cheap_eval([1, 2]))
print(cheap_eval([1, 3]))
print(cheap_eval([1, 2, 3]))

# Output without iterator_cache
# total time = 22 seconds
cheap eval of [1, 2]
expensive eval of [1, 2]
[1, 4]
cheap eval of [1, 3]
expensive eval of [1, 3]
[1, 9]
cheap eval of [1, 2, 3]
expensive eval of [1, 2, 3]
[1, 4, 9]

# Output with iterator_cache
# total time = 13 seconds
cheap eval of [1, 2]
expensive eval of [1, 2]
[1, 4]
cheap eval of [1, 3]
expensive eval of [3]
[1, 9]
cheap eval of [1, 2, 3]
[1, 4, 9]

可迭代对象装饰器缓存性能提升代码示例函数优化结果缓存延迟

2 个回答

这个解决方案满足了需求，而且也可以用于其他类型，不仅仅是 tuple。

from collections.abc import Sequence
from typing import Any
import types

def iterator_cache(func):  # Should be sequence_cache
    uncastable = (types.GeneratorType, range)
    cache = {}
    def wrapper(keys: Sequence[Any]) -> Sequence[Any]:
        new_keys = (key for key in keys if key not in cache)
        if not isinstance(keys, uncastable):
            new_keys = keys.__class__(new_keys)
        number_new_keys = sum(1 for _ in new_keys)
        if number_new_keys:
            new_vals = func(new_keys)
            for key, val in zip(new_keys, new_vals):
                cache[key] = val
        values_generator = (cache[key] for key in keys)
        if not isinstance(keys, uncastable):
            values_generator = keys.__class__(values_generator)
        return values_generator
    return wrapper

不过要注意 Iterable 和 Sequence 之间的区别，具体可以参考这个 SO回答

在Python中，最常见的结构有 generator、tuple、list、range、set 和 dictionary。

保持输入和输出的顺序：generator、tuple、list 和 range 都能保持顺序。而 set 和 dictionary 由于输出的 expensive_eval 的哈希值，不会尊重输入的顺序。
保持输入和输出的类型一致：generator、tuple、list 和 set 都是这样。根据输出的值，range 返回的是一个生成器，而不是 range 类型。

需要注意的是，我们假设键是可哈希的。如果不是，那么就不能用字典来存储这些值。

回答于 2025-04-14 由 Python大师

分享举报

所以，简单来说，就是把没有缓存的输入分开，然后对那些未见过的输入调用一个比较耗时的函数。用这个结果来更新缓存。接着，利用缓存来填充输出。

from collections.abc import Sequence

_cache = {}
def cheap_eval(numbers: Sequence[int]) -> list[int]:
    new = [n for n in numbers if n not in _cache]
    # take care of the case where there are no new inputs
    new_results = expensive_eval(new) if new else []

    for n, r in zip(new, new_results):
        _cache[n] = r

    return [_cache[n] for n in numbers]

要确保处理好没有新输入的情况，避免传递一个空列表，这样就不会无缘无故浪费5秒钟的时间。

如果你想的话，可以把上面的逻辑抽象成一个装饰器。但这就是基本的思路。

回答于 2025-04-14 由 Python大师

分享举报

如何创建@iterator_cache装饰器？

2 个回答

撰写回答