Use array.reshape(-1, 1) if your data has a single feature

Posted 2024-04-29 07:48:28


How do I use metrics.silhouette_score on a dataset of 1300 images, for which I have ResNet50 feature vectors (each of length 2048) and a discrete class label between 1 and 9?

import pandas as pd
import numpy as np
from sklearn.metrics import pairwise_distances
from sklearn import cluster, datasets, preprocessing, metrics
from sklearn.cluster import KMeans
df = pd.read_csv("master.csv")
labels = list(df['Q3 Theme1'])
labels_reshaped = np.ndarray(labels).reshape(-1,1)
X = open('entire_dataset__resnet50_feature_vectors.txt')
X_Data = X.read()
print('Silhouette Score:', metrics.silhouette_score(X_Data, labels_reshaped,
                                                    metric='cosine'))

I get this error:

(error output missing from the source)

With this other code:

import pandas as pd
import numpy as np
from sklearn.metrics import pairwise_distances
from sklearn import cluster, datasets, preprocessing, metrics
from sklearn.cluster import KMeans
df = pd.read_csv("master.csv")
labels = list(df['Q3 Theme1'])
labels_reshaped = np.ndarray(labels).reshape(1,-1)
X = open('entire_dataset__resnet50_feature_vectors.txt')
X_Data = X.read()
print('Silhouette Score:', metrics.silhouette_score(X_Data, labels_reshaped,
                                                    metric='cosine'))

I get this error:

Traceback (most recent call last):
  File "/dataset/silouhette_score.py", line 8, in <module>
    labels_reshaped = np.ndarray(labels).reshape(1,-1)
ValueError: sequence too large; cannot be greater than 32

Process finished with exit code 1
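
A note on this traceback: np.ndarray is NumPy's low-level array constructor, and it interprets its first argument as a shape rather than as data. A list of 1300 labels is therefore read as a request for an array with 1300 dimensions, which is most likely what triggers the "sequence too large" message. The sketch below contrasts it with np.array, which converts a list's values into an array. The short label list is a hypothetical stand-in for the 'Q3 Theme1' column.

```python
import numpy as np

# Hypothetical stand-in for list(df['Q3 Theme1']).
labels = [1, 3, 9, 2, 5]

# np.array copies the values of the list into a new 1-D array.
labels_arr = np.array(labels)
print(labels_arr.shape)  # (5,)

# np.ndarray takes a shape as its first argument and returns an
# uninitialized array of that shape; it does not look at any data.
shape_arr = np.ndarray((2, 3))
print(shape_arr.shape)  # (2, 3)
```

With np.array(labels) the labels are already a flat array with one entry per sample, so no reshape should be needed before passing them to silhouette_score.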

If I run this other code:

import pandas as pd
from sklearn import metrics
df = pd.read_csv("master.csv")
labels = list(df['Q3 Theme1'])
X = open('entire_dataset__resnet50_feature_vectors.txt')
X_Data = X.read()
print('Silhouette Score:', metrics.silhouette_score(X_Data, labels,
                                                    metric='cosine'))

I get this as output: https://pastebin.com/raw/hk2axdWL

How can I fix this code so that it prints a single silhouette score?

Reshape your data either using array.reshape(-1, 1) if your data has a single feature or array.reshape(1, -1) if it contains a single sample.

Process finished with exit code 1
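
The hint in this message refers to array shapes: scikit-learn expects the feature matrix as a 2-D array of shape (n_samples, n_features), while silhouette_score takes the labels as a flat sequence with one entry per sample. A small sketch with made-up data (6 samples, 4 features, 2 classes; none of these values come from the real dataset) shows shapes that are accepted, along with what the two reshape calls named in the message produce.

```python
import numpy as np
from sklearn import metrics

rng = np.random.default_rng(0)

# Hypothetical toy data: 6 samples with 4 features each, in 2 classes.
X = rng.normal(size=(6, 4))
labels = np.array([1, 1, 1, 2, 2, 2])

print(X.shape)       # (6, 4): one row per sample
print(labels.shape)  # (6,): one label per sample

score = metrics.silhouette_score(X, labels, metric='cosine')
print('Silhouette Score:', score)

# What the two reshape calls from the error message do to a flat array.
flat = np.arange(6)
single_feature = flat.reshape(-1, 1)  # 6 samples, 1 feature each
single_sample = flat.reshape(1, -1)   # 1 sample with 6 features
print(single_feature.shape)  # (6, 1)
print(single_sample.shape)   # (1, 6)
```

Neither reshape applies to a raw string returned by read(); the text has to be parsed into numbers first.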

I have pasted one line of my feature vector file (a .txt file) here: https://pastebin.com/raw/hk2axdWL (it consists of 2048 numbers separated by spaces).


1 answer
User
#1 · Posted 2024-04-29 07:48:28

I finally figured it out. I needed to build the feature vectors in the format that sklearn expects:

import pandas as pd
from sklearn import metrics


df = pd.read_csv("master.csv")
labels = list(df['Q3 Theme1'])

# Build one row of floats per image: each line of the file holds one
# feature vector as whitespace-separated numbers.
fv = []
with open('entire_dataset__resnet50_feature_vectors.txt') as X:
    for line in X:
        # split() with no argument also copes with trailing spaces
        tmp_arr = [float(value) for value in line.split()]
        if tmp_arr:  # skip blank lines
            fv.append(tmp_arr)

print('Silhouette Score:', metrics.silhouette_score(fv, labels,
                                                    metric='cosine'))
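
As an alternative sketch, not the approach used in the answer above: NumPy's loadtxt can parse a whitespace-separated text file directly into a 2-D float array, which avoids the manual loop. The example is self-contained, so it writes a small made-up feature file to a temporary directory first. The file name features.txt, the 6 x 4 size, and the labels are all assumptions for illustration and do not reflect the real 1300 x 2048 data.

```python
import os
import tempfile

import numpy as np
from sklearn import metrics

# Hypothetical toy data standing in for the real feature file and labels.
rng = np.random.default_rng(0)
features = rng.random((6, 4))
labels = [1, 1, 1, 2, 2, 2]

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'features.txt')
    # One feature vector per line, values separated by single spaces.
    np.savetxt(path, features, delimiter=' ')
    # loadtxt splits on whitespace by default and returns floats.
    fv = np.loadtxt(path)

# Check that there is exactly one row per label before scoring.
if fv.shape[0] != len(labels):
    raise ValueError('number of feature rows and labels differ')

score = metrics.silhouette_score(fv, labels, metric='cosine')
print('Silhouette Score:', score)
```

Checking fv.shape before scoring is a cheap way to confirm that the file was read as (n_samples, n_features) and lines up with the label column.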
