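I am using Python with approximately 4000 images of watches (examples: watch_1, watch_2). The images are RGB with a resolution of 450x450. My aim is to find the most similar watches among them. For this I am using scikit_learn's IncrementalPCA, as in the code below: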
import cv2
import numpy as np
import os
from glob import glob
from sklearn.decomposition import IncrementalPCA
from sklearn import neighbors
from sklearn import preprocessing

data = []

# Read images from file #
for filename in glob('Watches/*.jpg'):
    img = cv2.imread(filename)
    height, width = img.shape[:2]
    img = np.array(img)

    # Check that all my images are of the same resolution
    if height == 450 and width == 450:
        # Reshape each image so that it is stored in one line
        img = np.concatenate(img, axis=0)
        img = np.concatenate(img, axis=0)
        data.append(img)

# Normalise data #
data = np.array(data)
Norm = preprocessing.Normalizer()
Norm.fit(data)
data = Norm.transform(data)

# IncrementalPCA model #
ipca = IncrementalPCA(n_components=6)

length = len(data)
chunk_size = 4
pca_data = np.zeros(shape=(length, ipca.n_components))

for i in range(0, length // chunk_size):
    ipca.partial_fit(data[i*chunk_size : (i+1)*chunk_size])
    pca_data[i * chunk_size: (i + 1) * chunk_size] = ipca.transform(data[i*chunk_size : (i+1)*chunk_size])

# K-Nearest neighbours #
knn = neighbors.NearestNeighbors(n_neighbors=4, algorithm='ball_tree', metric='minkowski').fit(data)
distances, indices = knn.kneighbors(data)
print(indices)
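However, when I run this program with 40 watch images, I get an error when i = 1. Yet I clearly set n_components to 6 when writing ipca = IncrementalPCA(n_components=6), but for some reason ipca seems to use a different number of components when i = 0, and this then changes at the next step.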
Why is that?
How can I fix it?
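This seems to follow from the math behind PCA, since the problem is ill-conditioned for n_components > n_samples. You might be interested in reading this (the introduction of the error message) and some discussion behind it.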
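To make the n_components > n_samples point concrete, here is a minimal NumPy-only sketch. The sizes (4 samples, 50 features) and the random data are assumptions chosen for illustration, not the watch images. It shows that a single chunk of 4 samples cannot yield 6 principal directions: the SVD of the chunk has only 4 singular values, and after mean-centering at most 3 of them are non-zero.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical chunk: 4 samples, 50 features (stand-in for 4 flattened images)
batch = rng.normal(size=(4, 50))

# PCA works on mean-centered data
centered = batch - batch.mean(axis=0)

# The SVD returns min(n_samples, n_features) singular values
singular_values = np.linalg.svd(centered, compute_uv=False)

# Count the directions that actually carry variance (relative tolerance)
tolerance = 1e-10 * singular_values.max()
n_directions = int(np.sum(singular_values > tolerance))

print("singular values available:", singular_values.shape[0])
print("directions with variance: ", n_directions)
```

With chunk_size = 4, the first call to partial_fit therefore has fewer directions available than the 6 that were requested.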
Try increasing the batch size / chunk size (or lowering n_components).
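A minimal sketch of that suggestion, assuming random data in place of the flattened images and a hypothetical chunk_size of 10 (any value of at least n_components should work). It also splits fitting and transforming into two passes, which is a design choice of this sketch rather than something the question requires: in a single-pass loop, each chunk is projected with a model that has only seen the chunks so far.

```python
import numpy as np
from sklearn.decomposition import IncrementalPCA

rng = np.random.default_rng(0)

# Assumption: 40 samples x 100 features stands in for the 40 flattened images
data = rng.normal(size=(40, 100))

n_components = 6
chunk_size = 10  # hypothetical value; the point is chunk_size >= n_components

ipca = IncrementalPCA(n_components=n_components)
starts = range(0, len(data), chunk_size)

# First pass: update the model chunk by chunk
for start in starts:
    chunk = data[start:start + chunk_size]
    if len(chunk) < n_components:
        continue  # a trailing chunk smaller than n_components is skipped for fitting
    ipca.partial_fit(chunk)

# Second pass: project every chunk with the fully fitted model
pca_data = np.vstack([ipca.transform(data[s:s + chunk_size]) for s in starts])

print(pca_data.shape)
```

The reduced pca_data, rather than the raw data, is what one would normally pass on to a nearest-neighbour search.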
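(In general, I am also a bit sceptical about this approach. I hope you have tested it on some small example dataset with batch PCA. Your watches do not seem to be preprocessed with regard to geometry: cropping, and maybe histogram/color normalization.)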
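One way such a sanity check could look, as a sketch under stated assumptions: the dataset is synthetic (200 samples, 30 features, built from 6 latent factors plus a little noise), and the batch_size of 20 is arbitrary. It fits ordinary batch PCA and IncrementalPCA on the same small dataset and compares the explained variance ratios. A small gap suggests the incremental fit is tracking the batch solution.

```python
import numpy as np
from sklearn.decomposition import PCA, IncrementalPCA

rng = np.random.default_rng(0)

# Assumption: small synthetic dataset with 6 underlying factors plus noise
latent = rng.normal(size=(200, 6))
mixing = rng.normal(size=(6, 30))
small = latent @ mixing + 0.01 * rng.normal(size=(200, 30))

# Reference: ordinary batch PCA on the whole dataset at once
batch_pca = PCA(n_components=6).fit(small)

# Incremental version: fit() feeds the data in mini-batches of batch_size rows
inc_pca = IncrementalPCA(n_components=6, batch_size=20).fit(small)

# Largest difference between the two explained-variance profiles
gap = np.abs(
    batch_pca.explained_variance_ratio_ - inc_pca.explained_variance_ratio_
).max()

print("max gap in explained variance ratio:", gap)
```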