Initial centroids for scikit-learn KMeans clustering

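import numpy as np
import sklearn.cluster as sk  # assumed alias: the original code calls sk.KMeans

# NumPy array of initial centroids: 8 rows (clusters) x 4 columns (features)
startpts = np.array([[-0.12, 0.939, 0.321, 0.011],
                     [0.0, 0.874, -0.486, 0.862],
                     [0.0, 1.0, 0.0, 0.033],
                     [0.12, 0.939, 0.321, -0.7],
                     [0.0, 1.0, 0.0, -0.203],
                     [0.12, 0.939, -0.321, 0.25],
                     [0.0, 0.874, 0.486, -0.575],
                     [-0.12, 0.939, -0.321, 0.961]],
                    np.float64)

# n_init=1: run once, starting from the explicit centroids
centroids = sk.KMeans(n_clusters=8, init=startpts, n_init=1)
centroids.fit(actual_data_points)

# get the array of fitted centroids
centroids_array = centroids.cluster_centers_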

1 Answer

Answer 1 · posted 2024-05-19 22:11:50

Yes, setting the initial centroids through init should work. Here is a quote from the scikit-learn documentation:

 init : {‘k-means++’, ‘random’ or an ndarray}

     Method for initialization, defaults to ‘k-means++’:   

     If an ndarray is passed, it should be of shape (n_clusters, n_features)
     and gives the initial centers.

What is the shape (n_clusters, n_features) referring to?

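The shape requirement means that init must have exactly n_clusters rows, and the number of elements in each row must match the dimensionality of actual_data_points: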

>>> init = np.array([[-0.12, 0.939, 0.321, 0.011],
...                  [0.0, 0.874, -0.486, 0.862],
...                  [0.0, 1.0, 0.0, 0.033],
...                  [0.12, 0.939, 0.321, -0.7],
...                  [0.0, 1.0, 0.0, -0.203],
...                  [0.12, 0.939, -0.321, 0.25],
...                  [0.0, 0.874, 0.486, -0.575],
...                  [-0.12, 0.939, -0.321, 0.961]],
...                 np.float64)
>>> init.shape[0] == 8  # n_clusters
True
>>> init.shape[1] == actual_data_points.shape[1]  # n_features
True
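
To make the shape check concrete, here is a minimal end-to-end sketch. The question does not show the real data, so actual_data_points is replaced by randomly generated 4-dimensional samples; that stand-in, the seed, and the sample count are assumptions for illustration only.

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical stand-in for actual_data_points: 200 random 4-D samples.
rng = np.random.default_rng(0)
actual_data_points = rng.normal(size=(200, 4))

# The initial centroids from the question: 8 rows, 4 columns.
startpts = np.array([[-0.12, 0.939, 0.321, 0.011],
                     [0.0, 0.874, -0.486, 0.862],
                     [0.0, 1.0, 0.0, 0.033],
                     [0.12, 0.939, 0.321, -0.7],
                     [0.0, 1.0, 0.0, -0.203],
                     [0.12, 0.939, -0.321, 0.25],
                     [0.0, 0.874, 0.486, -0.575],
                     [-0.12, 0.939, -0.321, 0.961]],
                    dtype=np.float64)

n_clusters = 8

# init must have shape (n_clusters, n_features).
assert startpts.shape == (n_clusters, actual_data_points.shape[1])

# n_init=1: a single run that starts from the given centroids.
kmeans = KMeans(n_clusters=n_clusters, init=startpts, n_init=1)
kmeans.fit(actual_data_points)

# Fitted centroids keep the same (n_clusters, n_features) shape.
centroids_array = kmeans.cluster_centers_
print(centroids_array.shape)
```

The fitted cluster_centers_ array has the same (n_clusters, n_features) layout as the init array, so the two can be compared row by row.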

What is n_features?

n_features is the dimensionality of the samples. For example, if you are clustering points in a 2D plane, n_features would be 2.
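
A small sketch of the 2D case may help. The points and starting centroids below are made up for illustration. The second half assumes that scikit-learn rejects an init array whose column count does not match the data by raising a ValueError, which is how current versions appear to behave.

```python
import numpy as np
from sklearn.cluster import KMeans

# Illustrative 2-D points (made up): two loose groups in the plane.
points_2d = np.array([[0.0, 0.1], [0.2, 0.0], [0.1, 0.3],
                      [5.0, 5.1], [5.2, 4.9], [4.9, 5.3]])

# Two clusters in two dimensions, so init has shape (2, 2).
init_2d = np.array([[0.0, 0.0],
                    [5.0, 5.0]])

model = KMeans(n_clusters=2, init=init_2d, n_init=1).fit(points_2d)
n_features = model.cluster_centers_.shape[1]  # number of columns in the data

# An init array with 4 columns does not fit 2-D data.
bad_init = np.zeros((2, 4))
try:
    KMeans(n_clusters=2, init=bad_init, n_init=1).fit(points_2d)
    mismatch_rejected = False
except ValueError:
    mismatch_rejected = True

print(n_features, mismatch_rejected)
```

In other words, n_features is simply the number of columns in the data array, and each row of init needs that many entries.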
