Why does setting a flag on an ndarray view cause allocations? Are they necessarily bounded?
Consider the following code:
import numpy as np
import itertools


def get_view(arr):
    view = arr.view()
    view.flags.writeable = False  # this line causes memory to leak?
    return view


def main():
    for _ in itertools.count():
        get_view(np.zeros(1000))


if __name__ == "__main__":
    main()
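It appears that the line that makes the view non-writeable causes a memory leak, though I'm not sure whether the growth is bounded.
- Why does this happen?
- Is this necessarily bounded? Or is it a numpy bug? Or are the objects perhaps reference-counted, but for some reason manually invoking the garbage collector does not reclaim them?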
Here is the same program with tracemalloc logic added; it prints the allocation differences once every 100,000 calls to get_view.
import numpy as np
import tracemalloc
import itertools
import gc


def log_diff(snapshot, prev_snapshot):
    diff = snapshot.compare_to(prev_snapshot, "lineno")
    reported = 0
    for stat in diff:
        if "tracemalloc.py" in stat.traceback[0].filename:
            continue
        if stat.size_diff <= 0:
            continue
        print(f"#{reported}: {stat}")
        reported += 1
    print("---")


def get_view(arr):
    view = arr.view()
    view.flags.writeable = False  # this line causes memory to leak?
    return view


def main():
    tracemalloc.start()
    prev_snapshot = None
    for i in itertools.count():
        get_view(np.zeros(1000))
        if i % 100000 == 0:
            gc.collect(generation=2)
            snapshot = tracemalloc.take_snapshot()
            if prev_snapshot is not None:
                log_diff(snapshot, prev_snapshot)
            prev_snapshot = snapshot


if __name__ == "__main__":
    main()
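On Linux with Python 3.11.6 and numpy 1.26.4, the number of allocations attributed to that line appears to be nondeterministic, but the largest count I have seen is around 250. It grows quickly at first and much more slowly afterwards.
If I comment out the line that sets view.flags.writeable, memory usage does not grow.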
#0: /home/sami/bug.py:22: size=3534 B (+3477 B), count=62 (+61), average=57 B
#1: /home/sami/bug.py:29: size=84 B (+28 B), count=2 (+1), average=42 B
---
#0: /home/sami/bug.py:22: size=5871 B (+2337 B), count=103 (+41), average=57 B
#1: /home/sami/bug.py:15: size=72 B (+72 B), count=1 (+1), average=72 B
---
---
#0: /home/sami/bug.py:22: size=6270 B (+399 B), count=110 (+7), average=57 B
---
#0: /home/sami/bug.py:22: size=6327 B (+57 B), count=111 (+1), average=57 B
---
#0: /home/sami/bug.py:22: size=7638 B (+1311 B), count=134 (+23), average=57 B
---
#0: /home/sami/bug.py:22: size=7809 B (+171 B), count=137 (+3), average=57 B
---
---
#0: /home/sami/bug.py:22: size=8436 B (+627 B), count=148 (+11), average=57 B
---
#0: /home/sami/bug.py:22: size=8664 B (+228 B), count=152 (+4), average=57 B
---
#0: /home/sami/bug.py:22: size=8892 B (+228 B), count=156 (+4), average=57 B
---
---
#0: /home/sami/bug.py:22: size=9120 B (+228 B), count=160 (+4), average=57 B
---
---
#0: /home/sami/bug.py:22: size=9177 B (+114 B), count=161 (+2), average=57 B
---
...
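One way to probe the reference-counting hypothesis raised above is to check whether a view is actually freed once its last reference goes away. The following is only a minimal sketch: `view_is_released` is a hypothetical helper that is not part of the program above, and its result is not claimed here.

```python
import gc
import weakref

import numpy as np


def view_is_released():
    """Return True if the view is freed after its only strong reference is dropped.

    Hypothetical helper for illustration; not part of the original program.
    """
    arr = np.zeros(1000)
    view = arr.view()
    view.flags.writeable = False  # the line under suspicion
    ref = weakref.ref(view)       # weak reference: does not keep the view alive
    del view                      # drop the only strong reference
    gc.collect()                  # also give the cycle collector a chance
    return ref() is None


if __name__ == "__main__":
    print("view released:", view_is_released())
```

If this reports that the view is released, the arrays themselves are not being retained, which would suggest the growth comes from some other, smaller object.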
1 Answer
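I'm not sure whether this is a memory leak, but I can offer an alternative that does not allocate:
view.setflags(write=False)
Running it under tracemalloc shows that this line does not allocate any memory.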
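To compare the two spellings side by side, something like the following could be used. It is a rough sketch rather than a definitive benchmark: `view_via_flags`, `view_via_setflags`, and `net_growth` are hypothetical names introduced here, and the numbers it prints may vary between runs and between numpy versions.

```python
import gc
import tracemalloc

import numpy as np


def view_via_flags(arr):
    """Make a read-only view through the flags attribute (as in the question)."""
    view = arr.view()
    view.flags.writeable = False
    return view


def view_via_setflags(arr):
    """Make a read-only view through ndarray.setflags (as in this answer)."""
    view = arr.view()
    view.setflags(write=False)
    return view


def net_growth(make_view, iterations=10_000):
    """Return the net change in traced bytes after repeatedly calling make_view."""
    gc.collect()
    tracemalloc.start()
    baseline, _ = tracemalloc.get_traced_memory()
    for _ in range(iterations):
        make_view(np.zeros(1000))  # result is discarded immediately
    gc.collect()
    current, _ = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return current - baseline


if __name__ == "__main__":
    print("flags.writeable:", net_growth(view_via_flags), "bytes")
    print("setflags:       ", net_growth(view_via_setflags), "bytes")
```

Both helpers produce a view with the same read-only behaviour, so any difference in the reported growth comes from how the flag is set rather than from the resulting array.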