Object with a circular reference is not garbage collected

Posted 2024-05-23 21:00:06


I have a small convenience class that I use a lot in my code, shown below:

class Structure(dict):
    def __init__(self, **kwargs):
        dict.__init__(self, **kwargs)
        self.__dict__ = self

The nice thing about it is that you can access attributes using either dictionary key syntax or the usual object style:

[code example not preserved in the source]
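A minimal sketch of what such usage might look like, given the class definition above. The field names and values (`name`, `size`, `color`) are hypothetical and chosen only for illustration; they are not taken from the original post.

```python
class Structure(dict):
    def __init__(self, **kwargs):
        dict.__init__(self, **kwargs)
        # The instance dict *is* the mapping itself, so keys double as attributes.
        self.__dict__ = self

# Hypothetical example values.
s = Structure(name="foo", size=3)

by_key = s["name"]    # dictionary key syntax
by_attr = s.name      # object attribute syntax

# Assigning an attribute also creates the matching key, and vice versa.
s.color = "red"
s["size"] = 4
```

Both access styles read and write the same underlying storage, which is exactly what creates the self-reference discussed below.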

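Today I noticed that my application's memory consumption went up slightly in a situation where I expected it to go down. It looks to me as if instances created from the Structure class are not being garbage collected. Here is a small snippet that illustrates this: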

import gc

class Structure(dict):
    def __init__(self, **kwargs):
        dict.__init__(self, **kwargs)
        self.__dict__ = self

structures = [Structure(name="__{0}".format(str(value))) for value in range(4096)]
print "Structure name: ", structures[16].name
print "Structure name: ", structures[16]["name"]
del structures
gc.collect()
print "Structures count: ", len([obj for obj in gc.get_objects() if type(obj) is Structure])

The output is:

Structure name:  __16
Structure name:  __16
Structures count:  4096

As you can see, the Structure instance count is still 4096.

I then commented out the line that creates the convenient self-reference:

import gc

class Structure(dict):
    def __init__(self, **kwargs):
        dict.__init__(self, **kwargs)
        # self.__dict__ = self

structures = [Structure(name="__{0}".format(str(value))) for value in range(4096)]
# print "Structure name: ", structures[16].name
print "Structure name: ", structures[16]["name"]
del structures
gc.collect()
print "Structures count: ", len([obj for obj in gc.get_objects() if type(obj) is Structure])

Now that the circular reference is gone, the output makes sense:

Structure name:  __16
Structures count:  0

I pushed the test further using Meliae to analyze memory consumption:

import gc
import pprint
from meliae import scanner
from meliae import loader

class Structure(dict):
    def __init__(self, **kwargs):
        dict.__init__(self, **kwargs)
        self.__dict__ = self

structures = [Structure(name="__{0}".format(str(value))) for value in range(4096)]
print "Structure name: ", structures[16].name
print "Structure name: ", structures[16]["name"]
del structures
gc.collect()
print "Structures count: ", len([obj for obj in gc.get_objects() if type(obj) is Structure])

scanner.dump_all_objects("Test_001.json")
om = loader.load("Test_001.json")
summary = om.summarize()
print summary

structures = om.get_all("Structure")
if structures:
    pprint.pprint(structures[0].c)

This produces the following output:

Structure name:  __16
Structure name:  __16
Structures count:  4096
loading... line 5001, 5002 objs,   0.6 /   1.8 MiB read in 0.2s
loading... line 10002, 10003 objs,   1.1 /   1.8 MiB read in 0.3s
loading... line 15003, 15004 objs,   1.7 /   1.8 MiB read in 0.5s
loaded line 16405, 16406 objs,   1.8 /   1.8 MiB read in 0.5s        
checked        1 /    16406 collapsed        0    
checked    16405 /    16406 collapsed      157    
compute parents        0 /    16249        
compute parents    16248 /    16249        
set parents    16248 /    16249        
collapsed in 0.2s
Total 16249 objects, 58 types, Total size = 3.2MiB (3306183 bytes)
 Index   Count   %      Size   % Cum     Max Kind
     0    4096  25   1212416  36  36     296 Structure
     1     390   2    536976  16  52   49432 dict
     2    5135  31    417550  12  65   12479 str
     3      82   0    290976   8  74   12624 module
     4     235   1    212440   6  80     904 type
     5     947   5    121216   3  84     128 code
     6    1008   6    120960   3  88     120 function
     7    1048   6     83840   2  90      80 wrapper_descriptor
     8     654   4     47088   1  92      72 builtin_function_or_method
     9     562   3     40464   1  93      72 method_descriptor
    10     517   3     37008   1  94     216 tuple
    11     139   0     35832   1  95    2280 set
    12     351   2     30888   0  96      88 weakref
    13     186   1     23200   0  97    1664 list
    14      63   0     21672   0  97     344 WeakSet
    15      21   0     18984   0  98     904 ABCMeta
    16     197   1     14184   0  98      72 member_descriptor
    17     188   1     13536   0  99      72 getset_descriptor
    18     284   1      6816   0  99      24 int
    19      14   0      5296   0  99    2280 frozenset
[Structure(4312707312 296B 2refs 2par),
 type(4298634592 904B 4refs 100par 'Structure')]

Memory usage is 3.2 MiB. Removing the self-reference line results in the following output:

Structure name:  __16
Structures count:  0
loading... line 5001, 5002 objs,   0.6 /   1.4 MiB read in 0.1s
loading... line 10002, 10003 objs,   1.1 /   1.4 MiB read in 0.3s
loaded line 12308, 12309 objs,   1.4 /   1.4 MiB read in 0.4s        
checked       12 /    12309 collapsed        0    
checked    12308 /    12309 collapsed      157    
compute parents        0 /    12152        
compute parents    12151 /    12152        
set parents    12151 /    12152        
collapsed in 0.1s
Total 12152 objects, 57 types, Total size = 2.0MiB (2093714 bytes)
 Index   Count   %      Size   % Cum     Max Kind
     0     390   3    536976  25  25   49432 dict
     1    5134  42    417497  19  45   12479 str
     2      82   0    290976  13  59   12624 module
     3     235   1    212440  10  69     904 type
     4     947   7    121216   5  75     128 code
     5    1008   8    120960   5  81     120 function
     6    1048   8     83840   4  85      80 wrapper_descriptor
     7     654   5     47088   2  87      72 builtin_function_or_method
     8     562   4     40464   1  89      72 method_descriptor
     9     517   4     37008   1  91     216 tuple
    10     139   1     35832   1  92    2280 set
    11     351   2     30888   1  94      88 weakref
    12     186   1     23200   1  95    1664 list
    13      63   0     21672   1  96     344 WeakSet
    14      21   0     18984   0  97     904 ABCMeta
    15     197   1     14184   0  98      72 member_descriptor
    16     188   1     13536   0  98      72 getset_descriptor
    17     284   2      6816   0  99      24 int
    18      14   0      5296   0  99    2280 frozenset
    19      22   0      2288   0  99     104 classobj

This confirms that the Structure instances have been destroyed, and memory usage drops to 2.0 MiB.

Do you have any idea how I can make sure this class is garbage collected properly? By the way, all of this was run on Python 2.7.2 (Darwin).

Cheers,

Thomas


1 answer
User
#1 · Posted 2024-05-23 21:00:06

You can implement the Structure class more directly by using __getattr__ and __setattr__ to route attribute access to the underlying dict:

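class Structure(dict):
    def __getattr__(self, k):
        # Called only when normal attribute lookup fails; fall back to the dict.
        try:
            return self[k]
        except KeyError:
            # AttributeError keeps getattr() with a default working as expected.
            raise AttributeError(k)
    def __setattr__(self, k, v):
        self[k] = v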
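As a rough check that this version creates no self-reference, the sketch below tracks one instance through a weak reference. The class is repeated so the snippet runs on its own, and it raises AttributeError for missing keys so that `getattr` with a default works. The immediate release relies on CPython's reference counting; other interpreters may behave differently. The values used are hypothetical.

```python
import weakref

class Structure(dict):
    def __getattr__(self, k):
        try:
            return self[k]
        except KeyError:
            raise AttributeError(k)
    def __setattr__(self, k, v):
        self[k] = v

s = Structure(name="__16")   # hypothetical example value
s.size = 3                   # stored as s["size"]

same_value = s.name == s["name"]
missing = getattr(s, "no_such_key", None)   # falls back to the default

# Track the instance without keeping it alive.
ref = weakref.ref(s)
del s
# In CPython the object is released here, without calling gc.collect().
freed_without_gc = ref() is None
```

Because nothing inside the instance points back at it, dropping the last outside reference is enough to free it.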

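In Python, reference cycles are garbage collected, but only periodically (unlike ordinary reference-counted objects, which are collected as soon as their reference count drops to 0).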
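The difference between the two collection paths can be observed in CPython with weak references. This is a minimal sketch using a generic self-referencing object; `Node` is a hypothetical helper class, not the Structure class from the question, and the automatic collector is disabled during the demonstration so that the timing is deterministic.

```python
import gc
import weakref

class Node(object):
    """Hypothetical helper class used only for this illustration."""
    pass

gc.disable()  # stop the cyclic collector from running on its own during the demo
try:
    # No cycle: the object is freed as soon as the last reference goes away.
    plain = Node()
    plain_ref = weakref.ref(plain)
    del plain
    plain_freed_immediately = plain_ref() is None

    # Cycle: the object refers to itself, so its reference count never reaches 0.
    cyclic = Node()
    cyclic.me = cyclic
    cyclic_ref = weakref.ref(cyclic)
    del cyclic
    cyclic_alive_before_collect = cyclic_ref() is not None

    # Only an explicit (or periodic) run of the cycle detector reclaims it.
    gc.collect()
    cyclic_freed_after_collect = cyclic_ref() is None
finally:
    gc.enable()
```

In a real program the cycle detector runs automatically once enough allocations have accumulated, so cyclic garbage may linger for a while before it is reclaimed.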

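Avoiding the cycle (as the Structure class using __getattr__ and __setattr__ does) means you get better GC behavior. You may also want to look at collections.namedtuple as a good alternative: it does not do exactly what you have implemented, but it may suit your purposes.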
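For comparison, here is a short sketch of the namedtuple alternative. The type name `Record` and its fields are hypothetical. Note the differences from Structure: the fields are fixed when the type is defined, instances are immutable, and items are reached by attribute or position rather than by string key.

```python
from collections import namedtuple

# Hypothetical record type; the field names are chosen for illustration only.
Record = namedtuple("Record", ["name", "size"])

r = Record(name="__16", size=3)

by_attr = r.name           # attribute access
by_index = r[0]            # positional access (no r["name"])
as_dict = r._asdict()      # mapping view when key access is needed
updated = r._replace(size=4)   # returns a new instance; r itself is unchanged
```

A namedtuple holds no reference to itself, so instances are released through ordinary reference counting.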
