How to make a word cloud from GuidedLDA output

Posted 2024-04-24 23:20:49


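I used the guidedlda package (https://github.com/vi3k6i5/GuidedLDA) to build a topic model with some initial seed words. The results look good. Now I would like to see the word frequency distribution and a word cloud for each topic. How can I do that?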

This is how I access the top 10 words of each topic:

>>> import numpy as np
>>> n_top_words = 10
>>> topic_word = model.topic_word_  # shape (n_topics, n_words)
>>> for i, topic_dist in enumerate(topic_word):
...     # take the n_top_words highest-weighted words, highest first
...     topic_words = np.array(vocab)[np.argsort(topic_dist)][:-(n_top_words+1):-1]
...     print('Topic {}: {}'.format(i, ' '.join(topic_words)))
...
Topic 0: game play team win season player second point start victory
Topic 1: company percent market price business sell executive pay plan sale
Topic 2: play life man music place write turn woman old book
Topic 3: official government state political leader states issue case member country
Topic 4: school child city program problem student state study family group

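But how do I find out how many times each word occurs in a given topic, and generate a word cloud for that topic? I am not sure whether the model captures word frequencies at all.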

Thanks in advance.
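One possible approach, sketched under assumptions rather than taken from the GuidedLDA docs: each row of `model.topic_word_` already gives a weight for every vocabulary word in that topic, so it can be turned into a `{word: weight}` dict and passed to the third-party `wordcloud` package, whose `WordCloud.generate_from_frequencies` accepts such a dict. The helper names `topic_frequencies` and `render_topic_cloud` are made up for this illustration, and the small `vocab` / `topic_word` values are toy data standing in for the real model.

```python
def topic_frequencies(topic_row, vocab, n_top_words=50):
    """Map the n_top_words highest-weighted words of one topic to their weights."""
    # Pair every vocabulary word with its weight, then sort by weight, highest first.
    pairs = sorted(zip(vocab, topic_row), key=lambda p: p[1], reverse=True)
    return {word: float(weight) for word, weight in pairs[:n_top_words]}


def render_topic_cloud(frequencies, path):
    """Draw a word cloud from a {word: weight} dict and save it as an image file."""
    # Imported here so the helper above works without the package installed.
    from wordcloud import WordCloud  # third-party: pip install wordcloud

    cloud = WordCloud(width=800, height=400, background_color="white")
    cloud.generate_from_frequencies(frequencies)
    cloud.to_file(path)


# Toy stand-in for model.topic_word_ (2 topics x 4 words); not real model output.
vocab = ["game", "team", "market", "price"]
topic_word = [
    [0.5, 0.3, 0.1, 0.1],
    [0.05, 0.05, 0.5, 0.4],
]

for i, row in enumerate(topic_word):
    print("Topic {}: {}".format(i, topic_frequencies(row, vocab, n_top_words=2)))
```

With the fitted model, something like `render_topic_cloud(topic_frequencies(model.topic_word_[i], vocab), 'topic_{}.png'.format(i))` should produce one image per topic. Note that `topic_word_` holds probabilities, not counts. If raw counts are wanted, guidedlda is a fork of the `lda` package, which exposes the topic-word assignment counts as `nzw_`; assuming that attribute is still present, `model.nzw_[i]` could be passed in place of `model.topic_word_[i]`.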

