I am trying to apply biclustering to a dataset. I am following this guide
import numpy as np
from matplotlib import pyplot as plt
import pandas as pd
from sklearn.datasets import make_biclusters
# SpectralCoclustering is imported from sklearn.cluster; the old
# sklearn.cluster.bicluster path is gone in recent scikit-learn releases
from sklearn.cluster import SpectralCoclustering
# make some fake data for this question
data, rows, columns = make_biclusters(
    shape=(20, 20), n_clusters=2, noise=5,
    shuffle=False, random_state=0)
# shuffle rows and columns with plain NumPy instead of the private sg._shuffle helper
rng = np.random.RandomState(0)
row_idx = rng.permutation(data.shape[0])
col_idx = rng.permutation(data.shape[1])
data = data[row_idx][:, col_idx]
# my real data is in a pandas df WITH column names. These are of course just placeholders
df = pd.DataFrame(data)
colum_names = ['a','b','c','d','e','f','g','h','i','j','k','l','m','n','o','p','q','r','s','t']
df.columns = colum_names
# Converting from pandas to np removes the column labels
data = df.to_numpy()
# show the data, with column labels.
# There was no re-ordering, the labels are still correct
plt.imshow(data)
plt.xticks(range(len(colum_names)), colum_names)
plt.yticks(range(len(colum_names)), colum_names)
plt.title("Original dataset")
Now I apply the biclustering model. This "shuffles" the rows/columns, so the axis labels are no longer correct
model = SpectralCoclustering(2)
model.fit(data)
fit_data = data[np.argsort(model.row_labels_)]
fit_data = fit_data[:, np.argsort(model.column_labels_)]
plt.imshow(fit_data)
plt.title("After biclustering; rearranged to show biclusters")
plt.xticks(range(0,len(colum_names)),colum_names)
plt.yticks(range(0,len(colum_names)),colum_names)
plt.colorbar()
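My question: how do I apply the same reordering to the labels that was applied to the columns, so that the labels in the reordered plot are correct?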
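One possible approach, sketched here rather than offered as a definitive answer: compute the sort order once with `np.argsort` and use that same index array on both the data and the label list. The helper name `reorder_with_labels` and the toy arrays are hypothetical. The cluster arrays stand in for `model.row_labels_` and `model.column_labels_` so the snippet runs without fitting a model.

```python
import numpy as np

def reorder_with_labels(data, row_names, col_names, row_clusters, col_clusters):
    """Sort rows and columns by cluster id and carry the names along."""
    # A stable sort keeps the original relative order inside each cluster.
    row_order = np.argsort(row_clusters, kind="stable")
    col_order = np.argsort(col_clusters, kind="stable")
    # The same index arrays are applied to the data and to the names.
    sorted_data = np.asarray(data)[row_order][:, col_order]
    sorted_rows = np.asarray(row_names)[row_order]
    sorted_cols = np.asarray(col_names)[col_order]
    return sorted_data, sorted_rows, sorted_cols

# Tiny stand-in for the real data: 3 rows, 4 named columns.
toy = np.array([[1, 2, 3, 4],
                [5, 6, 7, 8],
                [9, 10, 11, 12]])
toy_rows = ["r0", "r1", "r2"]
toy_cols = ["a", "b", "c", "d"]
# Hypothetical cluster assignments (assumption: in the question these
# would come from model.row_labels_ and model.column_labels_).
toy_row_clusters = np.array([1, 0, 1])
toy_col_clusters = np.array([1, 0, 1, 0])

fit_toy, fit_rows, fit_cols = reorder_with_labels(
    toy, toy_rows, toy_cols, toy_row_clusters, toy_col_clusters)
print(fit_cols.tolist())
print(fit_rows.tolist())
```

With the variables from the question, the call would presumably look like `fit_data, fit_rows, fit_cols = reorder_with_labels(data, colum_names, colum_names, model.row_labels_, model.column_labels_)`, followed by `plt.xticks(range(len(fit_cols)), fit_cols)` and `plt.yticks(range(len(fit_rows)), fit_rows)`. For a real DataFrame the row names would more likely come from `df.index` than from the column names.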