Can someone explain this list comprehension?

Posted 2024-04-18 12:49:22


def unpack_dict(matrix, map_index_to_word):
    # Words ordered by their index, so table[word_id] gives the word
    table = sorted(map_index_to_word, key=map_index_to_word.get)
    # Raw arrays of the sparse matrix (CSR layout assumed)
    data = matrix.data
    indices = matrix.indices
    indptr = matrix.indptr
    num_doc = matrix.shape[0]
    return [{k: v for k, v in zip([table[word_id] for word_id in
                                   indices[indptr[i]:indptr[i+1]]],
                                  data[indptr[i]:indptr[i+1]].tolist())}
            for i in range(num_doc)]

wiki['tf_idf'] = unpack_dict(tf_idf, map_index_to_word)


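map_index_to_word is a dictionary of word: index pairs with a few thousand words. tf_idf is the sparse TF-IDF matrix. The DataFrame wiki is shown in the screenshot.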


2 Answers
[{k: v for k, v in zip([table[word_id] for word_id in indices[indptr[i]:indptr[i + 1]]],data[indptr[i]:indptr[i + 1]].tolist())} for i in range(num_doc)]

is the same as:

final_list = []
for i in range(num_doc):
    new_list = []
    for word_id in indices[indptr[i]:indptr[i + 1]]:
        new_list.append(table[word_id])

    new_dict = {}
    for k, v in zip(new_list, data[indptr[i]:indptr[i + 1]].tolist()):
        new_dict[k] = v
    final_list.append(new_dict)
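
To check the equivalence on something small, here is a minimal sketch. The arrays are hand-written stand-ins for matrix.indices, matrix.indptr and matrix.data (plain lists rather than NumPy arrays, so the .tolist() call is dropped), and the three-word table is made up for illustration.

```python
# Hypothetical stand-ins for the sparse matrix arrays (not the real wiki data)
table = ["apple", "banana", "cherry"]   # table[word_id] -> word
indices = [0, 2, 1]                     # word ids of the stored values
indptr = [0, 2, 3]                      # row i owns positions indptr[i]:indptr[i+1]
data = [0.5, 1.5, 2.0]                  # stored values
num_doc = len(indptr) - 1

# One-line version
as_comprehension = [
    {k: v for k, v in zip([table[word_id] for word_id in indices[indptr[i]:indptr[i + 1]]],
                          data[indptr[i]:indptr[i + 1]])}
    for i in range(num_doc)
]

# Expanded version
final_list = []
for i in range(num_doc):
    new_list = []
    for word_id in indices[indptr[i]:indptr[i + 1]]:
        new_list.append(table[word_id])

    new_dict = {}
    for k, v in zip(new_list, data[indptr[i]:indptr[i + 1]]):
        new_dict[k] = v
    final_list.append(new_dict)

print(as_comprehension == final_list)  # True
print(final_list)  # [{'apple': 0.5, 'cherry': 1.5}, {'banana': 2.0}]
```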

This one?

[{k: v for k, v in zip([table[word_id] for word_id in
                        indices[indptr[i]:indptr[i+1]]],
                       data[indptr[i]:indptr[i+1]].tolist())}
 for i in range(num_doc)]

The outer comprehension is

[... for i in range(num_doc) ]

which is just a simple loop that runs num_doc times.

Inside it is a dictionary comprehension:

{k:v for k,v in zip()}

zip takes the keys k from:

[table[word_id] for word_id in indices[indptr[i]:indptr[i+1]] ]

The values v come from:

data[indptr[i]:indptr[i+1]].tolist()

So i, the outer variable, determines the slice range indptr[i]:indptr[i+1].
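
As a rough illustration of that slicing, assuming the matrix uses the CSR layout (which the indptr/indices/data attributes suggest), the sketch below uses made-up plain lists instead of the real arrays. Each row i gets the stretch of indices and data between indptr[i] and indptr[i+1]; an empty row simply has equal start and stop.

```python
# Hypothetical CSR arrays for a 3-row matrix (plain lists for illustration)
indices = [0, 2, 1]
data = [0.5, 1.5, 2.0]
indptr = [0, 2, 3, 3]   # the last row is empty: indptr[2] == indptr[3]

rows = []
for i in range(len(indptr) - 1):
    start, stop = indptr[i], indptr[i + 1]
    # Column ids and values stored for row i
    rows.append((indices[start:stop], data[start:stop]))

print(rows)  # [([0, 2], [0.5, 1.5]), ([1], [2.0]), ([], [])]
```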

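So it builds a list of dictionaries. The dictionary keys come from table[word_id], where word_id ranges over the slice of indices, and the values are the corresponding slice of data.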
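
The table lookup itself relies on sorted(map_index_to_word, key=map_index_to_word.get), which returns the words ordered by their index. A small sketch, assuming the indices in the dictionary are exactly 0..n-1 with no gaps (the three-word dictionary here is invented for the example):

```python
# Hypothetical word -> index mapping; indices assumed to be 0..n-1 without gaps
map_index_to_word = {"banana": 1, "apple": 0, "cherry": 2}

# Sorting the keys by their value puts each word at the position of its index
table = sorted(map_index_to_word, key=map_index_to_word.get)
print(table)  # ['apple', 'banana', 'cherry']

# So table[word_id] recovers the word for a given column id
word_ids = [0, 2]
print([table[word_id] for word_id in word_ids])  # ['apple', 'cherry']
```

If the indices had gaps, table[word_id] would no longer line up with the original mapping, so this only works under that assumption.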
