How can I successfully run an ML algorithm on a medium-sized dataset on an ordinary laptop?

import sys

import numpy as np
from modshogun import LMNN, RealFeatures, MulticlassLabels
from sklearn.datasets import load_svmlight_file


def main():
    # Get training file name from the command line
    traindatafile = sys.argv[1]

    # The training file is in libSVM format
    tr_data = load_svmlight_file(traindatafile)
    Xtr = tr_data[0].toarray()  # Converts sparse matrices to dense
    Ytr = tr_data[1]  # The training labels

    # Cast data to Shogun format to work with LMNN
    features = RealFeatures(Xtr.T)
    labels = MulticlassLabels(Ytr.astype(np.float64))

    # Number of target neighbours per example - tune this using validation
    k = 18

    # Initialize the LMNN package
    lmnn = LMNN(features, labels, k)
    init_transform = np.eye(Xtr.shape[1])

    # Choose an appropriate timeout
    lmnn.set_maxiter(200000)
    lmnn.train(init_transform)

    # Let LMNN do its magic and return a linear transformation
    # corresponding to the Mahalanobis metric it has learnt
    L = lmnn.get_linear_transform()
    M = np.matrix(np.dot(L.T, L))

    # Save the model for use in testing phase
    # Warning: do not change this file name
    np.save("model.npy", M)


if __name__ == '__main__':
    main()

1 answer

User

#1 · Posted 2024-04-25 17:01:44

Exact k-NN has scalability problems.
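To make the scalability issue concrete: brute-force k-NN computes a distance from every query point to every training point, so the work grows with n_query × n_train × n_features, and a naive implementation also holds the whole distance matrix in memory at once. The following is a minimal NumPy sketch, assuming dense arrays and Euclidean distance, that keeps the search exact but bounds peak memory by processing queries in chunks. The function name `knn_indices_chunked` and the `chunk_size` parameter are illustrative and do not come from the question's code.

```python
import numpy as np


def knn_indices_chunked(X_train, X_query, k, chunk_size=256):
    """Exact brute-force k-NN, processing the queries chunk by chunk.

    Each chunk materialises a (chunk_size, n_train) distance matrix, so peak
    memory depends on chunk_size rather than on the number of queries.
    Assumes k <= number of training points.
    """
    train_sq = np.einsum("ij,ij->i", X_train, X_train)  # squared norms of training rows
    out = np.empty((X_query.shape[0], k), dtype=np.intp)
    for start in range(0, X_query.shape[0], chunk_size):
        Q = X_query[start:start + chunk_size]
        query_sq = np.einsum("ij,ij->i", Q, Q)
        # Squared Euclidean distance: ||q||^2 - 2 q.x + ||x||^2
        d2 = query_sq[:, None] - 2.0 * (Q @ X_train.T) + train_sq[None, :]
        # argpartition selects the k smallest per row without a full sort
        nearest = np.argpartition(d2, k - 1, axis=1)[:, :k]
        # Sort only those k candidates by distance
        order = np.argsort(np.take_along_axis(d2, nearest, axis=1), axis=1)
        out[start:start + chunk_size] = np.take_along_axis(nearest, order, axis=1)
    return out
```

Chunking only addresses memory. The running time is still proportional to the number of query-training pairs, which is why exact k-NN becomes slow as the dataset grows.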

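Scikit-learn has a documentation page (scaling strategies) that explains what to do in this situation. Many of its estimators have a partial_fit method for incremental training, but unfortunately kNN does not.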
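For estimators that do support it, incremental training with partial_fit looks roughly like this. This is a sketch under stated assumptions: it uses `SGDClassifier` (a linear model, not k-NN or LMNN) purely to illustrate the pattern, and the helper name `train_in_chunks` is hypothetical. In practice each chunk would be read from disk instead of sliced from an array that is already in memory.

```python
import numpy as np
from sklearn.linear_model import SGDClassifier


def train_in_chunks(X, y, chunk_size=1000, random_state=0):
    """Fit a linear classifier incrementally, one chunk at a time."""
    clf = SGDClassifier(random_state=random_state)
    # partial_fit must be told the full set of labels up front,
    # because a single chunk may not contain every class.
    classes = np.unique(y)
    for start in range(0, X.shape[0], chunk_size):
        stop = start + chunk_size
        clf.partial_fit(X[start:stop], y[start:stop], classes=classes)
    return clf
```

Only one chunk has to be resident in memory during each call, which is what makes this approach workable on a laptop.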

If you are willing to trade some accuracy for speed, you can run something like approximate nearest neighbors.
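One way to picture the accuracy-for-speed trade is random-hyperplane hashing, a simple form of locality-sensitive hashing. The sketch below is an illustration only and not the specific tool the answer links to: points are hashed by which side of a few random hyperplanes they fall on, and a query is compared only against the points in its own bucket. The class name `RandomHyperplaneIndex` and its parameters are hypothetical, and the code assumes dense data that is roughly centered on the origin. Dedicated libraries such as Annoy or FAISS are the more realistic choice for real workloads.

```python
from collections import defaultdict

import numpy as np


class RandomHyperplaneIndex:
    """Approximate nearest-neighbour index using random-hyperplane hashing."""

    def __init__(self, X, n_bits=8, seed=0):
        rng = np.random.default_rng(seed)
        self.X = X
        # One random hyperplane (through the origin) per hash bit
        self.planes = rng.normal(size=(X.shape[1], n_bits))
        self.buckets = defaultdict(list)
        for i, key in enumerate(self._hash(X)):
            self.buckets[key].append(i)

    def _hash(self, X):
        # A bit is 1 when the point lies on the positive side of a hyperplane
        bits = (X @ self.planes) > 0
        weights = 2 ** np.arange(bits.shape[1])
        return (bits * weights).sum(axis=1).tolist()  # one integer key per row

    def query(self, q, k):
        """Return up to k candidate neighbours of q from q's own bucket."""
        key = self._hash(np.atleast_2d(q))[0]
        candidates = np.array(self.buckets.get(key, []), dtype=np.intp)
        if candidates.size == 0:
            # Empty bucket: a fuller implementation would probe nearby buckets
            return candidates
        d2 = ((self.X[candidates] - q) ** 2).sum(axis=1)
        return candidates[np.argsort(d2)[:k]]
```

Each query scans one bucket instead of the whole training set, so it is much cheaper, but a true neighbour that lands in a different bucket is missed. More bits give smaller buckets and faster queries at the cost of lower recall.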
