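I'm printing the output of my script, but to do that I have to write a lot of print statements. Is there a way to print all the topics without writing each one out by hand?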
import pandas
import mglearn
import numpy as np
import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation
dataset = pandas.read_csv('text.csv', encoding = 'utf-8')
comments = dataset['comments']
comments_list = remove_small_words.values.tolist()
vector = CountVectorizer()
X = vector.fit_transform(comments_list)
lda = LatentDirichletAllocation(n_components = 30, learning_method = "batch", max_iter = 25, random_state = 0)
document_topics = lda.fit_transform(X)
sorting = np.argsort(lda.components_, axis = 1)[:, ::-1]
feature_names = np.array(vector.get_feature_names())
topics = mglearn.tools.print_topics(topics = range(30), feature_names = feature_names, sorting = sorting, topics_per_chunk = 5, n_words = 10)
print(topics)
print("Topic 0:")
docs = np.argsort(document_topics[:, 0])[::-1]
for i in docs[:]:
    print(" ".join(comments_list[i].encode('utf-8').split(",")[:2]) + "\n")
print()
print()
print("Topic 1:")
docs = np.argsort(document_topics[:, 1])[::-1]
for i in docs[:]:
    print(" ".join(comments_list[i].encode('utf-8').split(",")[:2]) + "\n")
print()
print()
...
print("Topic 40:")
docs = np.argsort(document_topics[:, 40])[::-1]
for i in docs[:]:
    print(" ".join(comments_list[i].encode('utf-8').split(",")[:2]) + "\n")
print()
print()
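For example, instead of writing the print block 40 times, can I print everything in a loop? To print these 40 topics I need 240 lines of code. Imagine if I needed to print 100... This is the output I have, and I want to keep it: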
Topic 0:
blabla
blabla
Topic 1:
blabla
blabla
Topic 3:
blabla
blabla
...
You can use string formatting to build the string that is printed for each topic.
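The answer's original snippet is not preserved here, so the following is only a minimal sketch of the idea. The helper name topic_header and the topic count of 3 are assumptions chosen for illustration, not part of the original post.

```python
def topic_header(i):
    """Build the header line for topic number i with string formatting."""
    return "Topic {}:".format(i)


# Assumption: a small topic count, just to show the pattern.
n_topics = 3

# One formatted header per topic index, instead of one hard-coded print each.
headers = [topic_header(i) for i in range(n_topics)]

for header in headers:
    print(header)
```

The loop variable replaces the hard-coded number in each "Topic N:" line, so the same statement covers any number of topics.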
Then, since you have i, you can add the other statements inside the same loop.
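A hedged sketch of what that full loop might look like follows. The toy comments_list and document_topics values are made up so the example runs on its own, and format_topics and n_docs are hypothetical names. The sketch also assumes Python 3, where calling .split(",") on the bytes returned by .encode('utf-8') raises a TypeError, so the encode call is left out.

```python
import numpy as np

# Toy stand-ins (assumptions) for the objects built in the question.
comments_list = [
    "good food,nice staff,slow",
    "bad service,cold food",
    "great view,would return",
]
document_topics = np.array([
    [0.7, 0.3],
    [0.2, 0.8],
    [0.1, 0.9],
])


def format_topics(document_topics, comments_list, n_docs=None):
    """Return the output lines for every topic, one block per topic.

    n_docs limits how many documents are listed per topic (None lists all).
    """
    lines = []
    # One column of document_topics per topic, so loop over the columns.
    for i in range(document_topics.shape[1]):
        lines.append("Topic {}:".format(i))
        # Document indices sorted by descending weight for topic i.
        docs = np.argsort(document_topics[:, i])[::-1]
        for j in docs[:n_docs]:
            # Keep the first two comma-separated parts of the comment.
            lines.append(" ".join(comments_list[j].split(",")[:2]))
        lines.append("")  # blank line between topics
    return lines


output_lines = format_topics(document_topics, comments_list, n_docs=2)
print("\n".join(output_lines))
```

With the question's own objects, document_topics.shape[1] equals the n_components value passed to LatentDirichletAllocation, so the loop needs no change when the number of topics grows to 100.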