Where is the dictionary's size changing?

Posted 2024-04-26 09:27:51


In my Python Utilities GitHub repo, I have a function that strips non-printable characters and invalid Unicode bytes from strings, mappings, and sequences:

import unicodedata


def filterCharacters(s):
    """
    Strip non printable characters

    @type s dict|list|tuple|bytes|string
    @param s Object to remove non-printable characters from

    @rtype dict|list|tuple|bytes|string
    @return An object that corresponds with the original object, nonprintable characters removed.
    """

    validCategories = (
        'Lu', 'Ll', 'Lt', 'LC', 'Lm', 'Lo', 'L', 'Mn', 'Mc', 'Me', 'M', 'Nd', 'Nl', 'No', 'N', 'Pc',
        'Pd', 'Ps', 'Pe', 'Pi', 'Pf', 'Po', 'P', 'Sm', 'Sc', 'Sk', 'So', 'S', 'Zs', 'Zl', 'Zp', 'Z'
    )
    convertToBytes = False

    if isinstance(s, dict):
        new = {}
        for k,v in s.items(): # This is the offending line
            new[k] = filterCharacters(v)
        return new

    if isinstance(s, list):
        new = []
        for item in s:
            new.append(filterCharacters(item))
        return new

    if isinstance(s, tuple):
        new = []
        for item in s:
            new.append(filterCharacters(item))
        return tuple(new)

    if isinstance(s, bytes):
        s = s.decode('utf-8')
        convertToBytes = True

    if isinstance(s, str):
        s = ''.join(c for c in s if unicodedata.category(c) in validCategories)
        if convertToBytes:
            s = s.encode('utf-8')
        return s

    else:
        return None

Occasionally this function raises an exception:

Traceback (most recent call last):
  File "./util.py", line 56, in filterCharacters
    for k,v in s.items():
RuntimeError: dictionary changed size during iteration

I don't see where I am modifying the dictionary that is passed in as the argument. So why is this exception being raised?

Thanks!


1 answer
User
#1 · Posted 2024-04-26 09:27:51

In Python 3, dict.items() returns a dict view object (not a list, as in Python 2). Looking through the CPython source, I noticed comments like this one:

Objects/dictobject.c

dict_items(register PyDictObject *mp) 
{
    ...
    /* Preallocate the list of tuples, to avoid allocations during
     * the loop over the items, which could trigger GC, which
     * could resize the dict. :-(
     */
    ...

    if (n != mp->ma_used) {
        /* Durnit.  The allocations caused the dict to resize.
         * Just start over, this shouldn't normally happen.
         */
        Py_DECREF(v);
        goto again;
    }
    ...
}

So, judging by this comment, it is not only dict deletions and insertions that can lead to this error, but any allocation!
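For reference, the error message itself is easy to reproduce in Python 3 with an explicit insertion, because the object returned by dict.items() is a live view of the dictionary rather than a copy. The following is only a minimal sketch; the helper name mutate_while_iterating is made up for illustration and is not part of the question's code.

```python
def mutate_while_iterating(d):
    """Insert a key while looping over the live items() view.

    Returns the error message if Python raises RuntimeError, else None.
    (Hypothetical helper, used only to demonstrate the error.)
    """
    try:
        for key, value in d.items():
            # Adding a key changes the dict's size while the view is being iterated.
            d[key + "_new"] = value
    except RuntimeError as exc:
        return str(exc)
    return None


view_type = type({}.items()).__name__          # the view class, not a list
message = mutate_while_iterating({"a": 1})     # the text of the RuntimeError
print(view_type, "/", message)
```

This only shows the most direct trigger. It does not by itself explain where the size change comes from in the question's function.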

The resize procedure is also interesting. Take a look at:

static int
dictresize(PyDictObject *mp, Py_ssize_t minused)
{
    ...
}

But these are all internals.

Solution

Try converting the dict view to a list before iterating:

if isinstance(s, dict):
    new = {}
    items = list(s.items())  # snapshot the view so the loop does not depend on the live dict
    for k,v in items:
        new[k] = filterCharacters(v)
    return new
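The effect of taking a snapshot can be checked in isolation. This is a small sketch under the assumption that the dictionary is changed while the loop is running; the helper name snapshot_items and the sample data are hypothetical.

```python
def snapshot_items(d):
    """Return a list copy of the dict's (key, value) pairs.

    The list is detached from the dict, so later insertions or
    deletions do not affect a loop over it. (Hypothetical helper.)
    """
    return list(d.items())


data = {"a": 1, "b": 2}
seen = []
for key, value in snapshot_items(data):
    # Mutating the dict here is safe: the loop walks the list copy.
    data[key + "_copy"] = value
    seen.append(key)

print(seen, len(data))
```

The trade-off is one extra list allocation per call. The loop also sees the dictionary as it was when the snapshot was taken, not any entries added afterwards.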
