Computing class probabilities from cluster labels in hierarchical clustering?

Posted 2024-05-16 14:41:39


I have a DataFrame with two classes, "yes" and "no". Using scipy hierarchical clustering, I found 2 clusters. Here is my code:

import numpy as np
import pandas as pd
from scipy.cluster.hierarchy import dendrogram, fcluster, linkage
from scipy.spatial.distance import pdist

# x_Minmax: min-max scaled feature matrix; df: DataFrame with a 0/1 'status' column
k = 2  # number of clusters (the question reports 2)
Mdist_matrix = pdist(x_Minmax, metric='cityblock')
# linkage ignores `metric` when it is given a condensed distance matrix
MSlink = linkage(Mdist_matrix, method='single')
crsm = fcluster(MSlink, k, criterion='maxclust')
arr = np.unique(crsm, return_counts=True)
# print(arr)
dfcluster = dfcluster.copy()
dfcluster['Clabels'] = pd.Series(crsm, index=dfcluster.index)
# Count cluster labels within each class
No = dfcluster[df['status'] == 0]['Clabels'].value_counts()
print("CNO\n", No)
Yes = dfcluster[df['status'] == 1]['Clabels'].value_counts()
print("Cyes\n", Yes)

The output looks like this one

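I want to compute the entropy and the purity of each cluster. How can I compute the probability of "yes" and "no" in each cluster? I tried to follow "Fastest way to compute entropy in python", but it is not clear to me.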


1 answer

Anonymous user

#1 · Posted 2024-05-16 14:41:39

I'll answer the purity part. Your contingency matrix (see this if you are not familiar with the term) looks like this:

      |    1 |   2 |
 -----+------+-----+
 CNO  | 7244 | 544 |
 -----+------+-----+
 CYES | 2136 |  76 |
 -----+------+-----+
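
For reference, a table like the one above can be built directly from the class column and the cluster labels. The sketch below uses `pandas.crosstab` on hypothetical stand-in data that only reproduces the counts in the table; the column names `status` and `Clabels` follow the question, and the 0 = "no" / 1 = "yes" encoding is an assumption based on the question's variable names.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in data that reproduces the counts in the table above.
# Assumed encoding: status 0 = "no", 1 = "yes"; Clabels = fcluster labels (1 or 2).
counts = [7244, 544, 2136, 76]
status = np.repeat([0, 0, 1, 1], counts)
clabels = np.repeat([1, 2, 1, 2], counts)
dfcluster = pd.DataFrame({"status": status, "Clabels": clabels})

# Rows are the true classes, columns are the cluster labels.
contingency = pd.crosstab(dfcluster["status"], dfcluster["Clabels"])
print(contingency)
```

With real data, the same call on the question's `dfcluster` should give the matrix without the two separate `value_counts()` steps.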

Then there is a formula to compute purity from the contingency matrix:

purity_score = np.sum(np.amax(contingency_matrix, axis=0)) / np.sum(contingency_matrix)
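
The question also asks for the per-cluster probabilities and entropy, which can be read off the same matrix. The following is a minimal sketch, assuming rows are classes ("no", "yes") and columns are clusters (1, 2) as in the table above; the names `class_probs`, `cluster_entropy` and `cluster_purity` are illustrative, and base-2 logarithms are one possible choice of unit.

```python
import numpy as np

# Contingency matrix from the answer: rows = classes (no, yes), columns = clusters (1, 2).
contingency_matrix = np.array([[7244, 544],
                               [2136, 76]])

# P(class | cluster): divide each column by the size of that cluster.
cluster_sizes = contingency_matrix.sum(axis=0)
class_probs = contingency_matrix / cluster_sizes

# Shannon entropy (in bits) of the class distribution inside each cluster.
# Zero probabilities are replaced by 1 inside the log so they contribute 0.
safe_probs = np.where(class_probs > 0, class_probs, 1.0)
cluster_entropy = -(class_probs * np.log2(safe_probs)).sum(axis=0)

# Per-cluster purity: the share of the majority class in each cluster.
cluster_purity = class_probs.max(axis=0)

# Overall purity, using the formula from the answer.
purity_score = np.sum(np.amax(contingency_matrix, axis=0)) / np.sum(contingency_matrix)

print("P(class | cluster):\n", class_probs)
print("entropy per cluster:", cluster_entropy)
print("purity per cluster:", cluster_purity)
print("overall purity:", purity_score)
```

For the matrix above, the overall purity is (7244 + 544) / 10000 = 0.7788. Each column of `class_probs` sums to 1, so the first row gives the probability of "no" and the second the probability of "yes" in each cluster.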
