Python implementation of an RBF kernel using the Gram matrix?

Posted 2024-03-29 14:48:29


Information and examples on precomputed kernels in Python are quite limited. sklearn provides only a small example for the linear kernel: http://scikit-learn.org/stable/modules/svm.html

Here is the code for the linear kernel:

import numpy as np
from scipy.spatial.distance import cdist
from sklearn import svm
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

# import data
iris = load_iris()
X = iris.data
Y = iris.target

X_train, X_test, y_train, y_test = train_test_split(X, Y)

clf = svm.SVC(kernel='precomputed')

# Linear kernel: Gram matrix of dot products between training samples
G_train = np.dot(X_train, X_train.T)
clf.fit(G_train, y_train)

# Test Gram matrix: rows are test samples, columns are training samples
G_test = np.dot(X_test, X_train.T)
y_pred = clf.predict(G_test)

This is not of much help for understanding how to implement other, non-trivial kernels, e.g. the RBF kernel, which would be something like:

K = np.exp(-gamma * cdist(X, X, 'sqeuclidean'))

How can the same train/test split be done and the precomputed kernel implemented for this RBF case?
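For the plain RBF case, a minimal sketch along the lines of the linear example above (the value of `gamma` is an assumed, hand-picked kernel width, not one given in the question):

```python
import numpy as np
from scipy.spatial.distance import cdist
from sklearn import svm
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, random_state=0)

gamma = 0.5  # assumed kernel width, tune per dataset

# RBF Gram matrix between all pairs of training samples
G_train = np.exp(-gamma * cdist(X_train, X_train, 'sqeuclidean'))
clf = svm.SVC(kernel='precomputed')
clf.fit(G_train, y_train)

# Rows are test samples, columns are training samples,
# exactly as in the linear example
G_test = np.exp(-gamma * cdist(X_test, X_train, 'sqeuclidean'))
y_pred = clf.predict(G_test)
```

The only change from the linear version is how the two Gram matrices are built; the fit/predict orientation (test rows against training columns) stays the same.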

It gets harder still if the kernel depends on further parameters that have to be computed in a separate function, e.g. for a parameter alpha >= 0:

K(X, X') = alpha('some function depending on X_train, X_test') * np.exp(np.divide(-cdist(X, X, 'euclidean'), 2*np.std(X**2)))

Examples of these non-trivial kernels are needed. Any suggestions would be appreciated.
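For kernels with extra parameters, one common pattern (a sketch; `alpha` and `sigma` here are illustrative placeholders, not the question's actual kernel) is to bake the parameters into a closure that returns the kernel function:

```python
import numpy as np
from scipy.spatial.distance import cdist

def make_kernel(alpha, sigma):
    """Build a kernel function with alpha and sigma baked in.

    alpha and sigma stand in for whatever extra parameters the
    real kernel needs; replace the body with the actual formula.
    """
    def kernel(A, B):
        # Scaled RBF kernel between the rows of A and the rows of B
        return alpha * np.exp(-cdist(A, B, 'sqeuclidean') / (2 * sigma ** 2))
    return kernel

kernel = make_kernel(alpha=1.0, sigma=1.0)

rng = np.random.RandomState(0)
A = rng.rand(5, 3)
G = kernel(A, A)  # 5x5 Gram matrix, usable with SVC(kernel='precomputed')
```

As an alternative to precomputing, sklearn's `SVC` also accepts a callable as `kernel`, so the same `kernel` function can be passed as `SVC(kernel=kernel)` and sklearn will build the Gram matrices itself.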


1 answer

We can write kernel PCA by hand. Let's start with the polynomial kernel.

from sklearn.datasets import make_circles
from scipy.spatial.distance import pdist, squareform
from scipy.linalg import eigh
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

X_c, y_c = make_circles(n_samples=100, random_state=654)

plt.figure(figsize=(8,6))

plt.scatter(X_c[y_c==0, 0], X_c[y_c==0, 1], color='red')
plt.scatter(X_c[y_c==1, 0], X_c[y_c==1, 1], color='blue')


plt.ylabel('y coordinate')
plt.xlabel('x coordinate')

plt.show()

The data:
(plot: two concentric circles, one class per color)

The `degree_pca` transform used below was lost from the post; reconstructed here by analogy with the RBF version further down, using the polynomial kernel (gamma * X.dot(X.T) + 1)**degree:

def degree_pca(X, gamma, degree, n_components):
    """Kernel PCA with a polynomial kernel."""
    # Polynomial kernel matrix
    K = (gamma * X.dot(X.T) + 1) ** degree

    # Center the symmetric NxN kernel matrix
    N = K.shape[0]
    one_n = np.ones((N, N)) / N
    K = K - one_n.dot(K) - K.dot(one_n) + one_n.dot(K).dot(one_n)

    # eigh returns eigenvalues in ascending order
    eigvals, eigvecs = eigh(K)

    # Collect the eigenvectors of the n_components largest eigenvalues
    X_pc = np.column_stack([eigvecs[:, -i] for i in range(1, n_components + 1)])

    return X_pc

Now transform the data and plot it:

X_c1 = degree_pca(X_c, gamma=5, degree=2, n_components=2)

plt.figure(figsize=(8,6))

plt.scatter(X_c1[y_c==0, 0], X_c1[y_c==0, 1], color='red')
plt.scatter(X_c1[y_c==1, 0], X_c1[y_c==1, 1], color='blue')


plt.ylabel('y coordinate')
plt.xlabel('x coordinate')

plt.show()

Linearly separable:
(plot: the transformed circles)

The points can now be separated linearly.

Next we write the RBF kernel. For the demonstration, let's look at the moons dataset.

from sklearn.datasets import make_moons
X, y = make_moons(n_samples=100, random_state=654)

plt.figure(figsize=(8,6))

plt.scatter(X[y==0, 0], X[y==0, 1], color='red')
plt.scatter(X[y==1, 0], X[y==1, 1], color='blue')


plt.ylabel('y coordinate')
plt.xlabel('x coordinate')

plt.show()

The moons:
(plot: two interleaving half-moons, one class per color)

The kernel PCA transform:

def stepwise_kpca(X, gamma, n_components):
    """
    X: A MxN dataset as NumPy array where the samples are stored as rows (M), features as columns (N).
    gamma: coefficient for the RBF kernel.
    n_components: number of components to be returned.

    """
    # Calculating the squared Euclidean distances for every pair of points
    # in the MxN dimensional dataset.
    sq_dists = pdist(X, 'sqeuclidean')

    # Converting the pairwise distances into a symmetric MxM matrix.
    mat_sq_dists = squareform(sq_dists)

    K = np.exp(-gamma * mat_sq_dists)

    # Centering the symmetric MxM kernel matrix.
    N = K.shape[0]
    one_n = np.ones((N, N)) / N
    K = K - one_n.dot(K) - K.dot(one_n) + one_n.dot(K).dot(one_n)

    # eigh returns the eigenvalues of the symmetric matrix in
    # ascending order, so the leading eigenvectors sit at the end.
    eigvals, eigvecs = eigh(K)

    # Collect the eigenvectors corresponding to the n_components
    # largest eigenvalues (column_stack needs a list, not a generator).
    X_pc = np.column_stack([eigvecs[:, -i] for i in range(1, n_components + 1)])

    return X_pc
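As a quick sanity check on the hand-written transform (this check is not part of the original answer): for components with nonzero eigenvalue, the double-centering makes each projected column zero-mean, and `eigh` returns orthonormal eigenvectors, so the returned columns should be orthonormal:

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform
from scipy.linalg import eigh
from sklearn.datasets import make_moons

def stepwise_kpca(X, gamma, n_components):
    # RBF kernel matrix from pairwise squared Euclidean distances
    K = np.exp(-gamma * squareform(pdist(X, 'sqeuclidean')))
    # Double-center the kernel matrix
    N = K.shape[0]
    one_n = np.ones((N, N)) / N
    K = K - one_n.dot(K) - K.dot(one_n) + one_n.dot(K).dot(one_n)
    # Eigenvectors of the largest eigenvalues (eigh sorts ascending)
    eigvals, eigvecs = eigh(K)
    return np.column_stack([eigvecs[:, -i] for i in range(1, n_components + 1)])

X, y = make_moons(n_samples=100, random_state=654)
X_4 = stepwise_kpca(X, gamma=15, n_components=2)

print(np.abs(X_4.mean(axis=0)).max())            # ~0: columns are centered
print(np.abs(X_4.T.dot(X_4) - np.eye(2)).max())  # ~0: columns are orthonormal
```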

Let's plot it:

X_4 = stepwise_kpca(X, gamma=15, n_components=2)

plt.scatter(X_4[y==0, 0], X_4[y==0, 1], color='red')
plt.scatter(X_4[y==1, 0], X_4[y==1, 1], color='blue')


plt.ylabel('y coordinate')
plt.xlabel('x coordinate')

plt.show()

Result:
(plot: RBF kernel PCA projection of the moons)
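To tie this back to the original question about SVC: the same RBF Gram-matrix construction used inside `stepwise_kpca` can feed a precomputed-kernel SVM directly (a sketch, reusing gamma=15 from the kernel PCA example above):

```python
import numpy as np
from scipy.spatial.distance import cdist
from sklearn import svm
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split

X, y = make_moons(n_samples=100, random_state=654)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=654)

gamma = 15  # same kernel width as in the kernel PCA example

# Train-vs-train Gram matrix for fitting
G_train = np.exp(-gamma * cdist(X_train, X_train, 'sqeuclidean'))
clf = svm.SVC(kernel='precomputed').fit(G_train, y_train)

# Test-vs-train Gram matrix for prediction
G_test = np.exp(-gamma * cdist(X_test, X_train, 'sqeuclidean'))
accuracy = clf.score(G_test, y_test)
```

On this noiseless moons data the precomputed RBF SVM separates the classes cleanly.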
