Applying a clustering algorithm to group similar actors

Posted 2024-04-25 16:40:04


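import pandas as pd
from gensim.models import Word2Vec

data = pd.read_csv('movie_actor_network.csv', index_col=False, names=['movie', 'actor'])
# walks is the list of random walks generated earlier (not shown in the question)
# size= is the gensim 3.x keyword; gensim 4.x renamed it to vector_size=
model = Word2Vec(walks, size=128, window=5)
model.wv.vectors.shape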

Output:

(4703, 128)

Node IDs

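# index2word is the gensim 3.x attribute; gensim 4.x renamed it to index_to_key
node_ids = model.wv.index2word  # list of node IDs
node_embeddings = model.wv.vectors  # numpy.ndarray of shape (number of nodes, embedding dimensionality)
# A is presumably the networkx graph built from data (not shown); A.node was removed in networkx 2.4, A.nodes works in 2.x and later
node_targets = [A.nodes[node_id]['label'] for node_id in node_ids]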

Now I use this function:

def data_split(node_ids,node_targets,node_embeddings):
    '''In this function, we will split the node embeddings into actor_embeddings , movie_embeddings '''
    actor_nodes,movie_nodes=[],[]
    actor_embeddings,movie_embeddings=[],[]
    # split the node_embeddings into actor_embeddings,movie_embeddings based on node_ids
    actor_embedding = [x for i,x in enumerate(node_embeddings) if node_targets[i]=='actor']
    actor_embeddings.append(actor_embedding)
    actor_node = [x for i,x in enumerate(node_ids) if node_targets[i]=='actor']
    actor_nodes.append(actor_node)
    movie_embedding = [x for i,x in enumerate(node_embeddings) if node_targets[i]=='movies']
    movie_embeddings.append(movie_embedding)
    movie_node = [x for i,x in enumerate(node_ids) if node_targets[i]=='movie']
    movie_nodes.append(movie_node)

    # By using node_embedding and node_targets, we can extract actor_embedding and movie embedding
    # By using node_ids and node_targets, we can extract actor_nodes and movie nodes

    return actor_nodes,movie_nodes,actor_embeddings,movie_embeddings

Grader function 1

def grader_actors(data):
    assert(len(data)==3411)
    return True
grader_actors(actor_nodes)

Now I get this error:

NameError                                 Traceback (most recent call last)
<ipython-input-30-ee1852cb1df5> in <module>
      2     assert(len(data)==3411)
      3     return True
----> 4 grader_actors(actor_nodes)

NameError: name 'actor_nodes' is not defined

How can I fix it?


1 Answer
User
#1 · Posted 2024-04-25 16:40:04

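The function returns each result as a list wrapped inside another list, so len() of every returned value is 1 and you would still get the assertion error raised by grader_actors(data). If you want grader_actors to pass without an assertion error, use the following code: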

def data_split(node_ids, node_targets, node_embeddings):
    '''Split the node IDs and node embeddings into actor and movie groups.'''
    # Build flat lists (not lists of lists) so that len() counts individual nodes
    actor_nodes = [x for i, x in enumerate(node_ids) if node_targets[i] == 'actor']
    movie_nodes = [x for i, x in enumerate(node_ids) if node_targets[i] == 'movie']
    actor_embeddings = [x for i, x in enumerate(node_embeddings) if node_targets[i] == 'actor']
    # The posted code compared against 'movies' here; 'movie' is assumed to be the
    # actual label, since that is the value used when selecting movie_nodes
    movie_embeddings = [x for i, x in enumerate(node_embeddings) if node_targets[i] == 'movie']

    return actor_nodes, movie_nodes, actor_embeddings, movie_embeddings
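
Note that the NameError in the question has a separate cause: data_split is defined but never called, and the names inside it are local to the function, so actor_nodes does not exist when grader_actors(actor_nodes) runs. Below is a minimal sketch of the call-and-unpack step. The toy_ids, toy_targets, and toy_embeddings values are made-up stand-ins for the real node_ids, node_targets, and node_embeddings, and the label strings 'actor' and 'movie' are an assumption about how the graph nodes were labeled.

```python
def data_split(node_ids, node_targets, node_embeddings):
    """Split node IDs and embeddings into flat actor and movie lists by label."""
    actor_nodes, movie_nodes = [], []
    actor_embeddings, movie_embeddings = [], []
    for node_id, label, embedding in zip(node_ids, node_targets, node_embeddings):
        if label == 'actor':      # assumed label for actor nodes
            actor_nodes.append(node_id)
            actor_embeddings.append(embedding)
        elif label == 'movie':    # assumed label for movie nodes
            movie_nodes.append(node_id)
            movie_embeddings.append(embedding)
    return actor_nodes, movie_nodes, actor_embeddings, movie_embeddings


# Hypothetical toy data standing in for the real inputs
toy_ids = ['a1', 'm1', 'a2']
toy_targets = ['actor', 'movie', 'actor']
toy_embeddings = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]

# Calling the function and unpacking its return value is what creates
# the module-level name actor_nodes that the grader expects.
actor_nodes, movie_nodes, actor_embeddings, movie_embeddings = data_split(
    toy_ids, toy_targets, toy_embeddings)

print(actor_nodes)       # ['a1', 'a2']
print(len(actor_nodes))  # 2
```

With the real data, the same unpacking line would be called with node_ids, node_targets, and node_embeddings before running grader_actors(actor_nodes).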
