word2vec error: '_Token' object is not iterable

Posted 2024-04-19 14:45:20


I am trying to feed a list of sentences into gensim.models.Word2Vec one after another, but it raises TypeError: '_Token' object is not iterable. What should I do?

    embedding_model= Word2Vec()
    for index, sentence_list in df.iterrows():
        embedding_model = Word2Vec(sentence_list, size=100, window=5, min_count=2, workers=2)
        embedding_model.train(sentence_list, total_examples=len(sentence_list), epochs=10)
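
A TypeError of this kind usually points at the shape of the input: Word2Vec expects an iterable of sentences, where each sentence is itself a list of string tokens. The sketch below is plain Python and makes no gensim call. `FakeToken` and `consume_corpus` are hypothetical names invented for illustration, and the nested loop only mimics how a corpus is walked (outer items as sentences, inner items as words). It is an assumption about the likely cause, not a reproduction of the exact failure.

```python
# Hypothetical stand-in for a tokenizer's token object; not a real library class.
class FakeToken:
    def __init__(self, surface):
        self.surface = surface


def consume_corpus(corpus):
    """Mimic corpus consumption: outer items are sentences, inner items are words."""
    return [[word for word in sentence] for sentence in corpus]


tokens = ['こんにちは', '!', '掲示', '板']

# Flat token list: each token is treated as a "sentence" and split into characters.
flat = consume_corpus(tokens)

# List of token lists: each inner list is one sentence, which is the intended shape.
nested = consume_corpus([tokens])

# A token object sitting where a sentence is expected cannot be iterated at all.
try:
    consume_corpus([FakeToken('こんにちは')])
    error = None
except TypeError as exc:
    error = str(exc)

print(flat[0])   # the first token broken into single characters
print(nested)    # one sentence holding the original tokens
print(error)     # the "not iterable" message
```

If the tokenizer returns token objects rather than strings, converting each one to its surface string and wrapping every sentence's tokens in its own list would give the nested shape shown in `nested`.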

1 answer
User
#1 · Posted 2024-04-19 14:45:20

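Thanks for the quick response. My mistake was passing raw sentences instead of lists of tokens. However, I am still struggling to feed input into Word2Vec sequentially. Here are my sample data, code, and error: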

tokenized_contents: ['こんにちは', '!', '掲示', '板', 'が', 'でき', 'まし', 'た', 'ね', '!', 'これ', 'から', 'も', 'よろしく', 'お', '願い', 'し', 'ます', '!']

    embedding_model= Word2Vec()
    for index, tokenized_contents in df.iterrows():
        embedding_model = Word2Vec(tokenized_contents, size=100, window=5, min_count=1, workers=4)
        embedding_model.build_vocab(tokenized_contents)

    embedding_model.train(tokenized_contents, total_examples=len(tokenized_contents), epochs=10)

Error message:

    Traceback (most recent call last):

        embedding_model.build_vocab(tokenized_contents)
      File "/anaconda3/envs/japan/lib/python3.6/site-packages/gensim/models/base_any2vec.py", line 484, in build_vocab
        trim_rule=trim_rule, **kwargs)
      File "/anaconda3/envs/japan/lib/python3.6/site-packages/gensim/models/word2vec.py", line 1318, in prepare_vocab
        self.sort_vocab(wv)
      File "/anaconda3/envs/japan/lib/python3.6/site-packages/gensim/models/word2vec.py", line 1184, in sort_vocab
        raise RuntimeError("cannot sort vocabulary after model weights already initialized.")
    RuntimeError: cannot sort vocabulary after model weights already initialized.
