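I thought eigenvectors had to be orthogonal to each other. The following seems to violate that, so I would like to check whether I did something wrong. Thanks for any insight.
Here is the PCA code (the data is at the bottom of the post):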
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from numpy import array
from numpy import mean
from numpy import cov
from numpy.linalg import eig
#calculate the mean of each column
M = mean(df.T, axis=1)
# center columns by subtracting column means
C = df - M
# calculate covariance matrix of centered matrix
V = cov(df.T)
# eigendecomposition of covariance matrix
values, vectors = eig(V)
# project data
P = vectors.T.dot(C.T)
#Make a list of (eigenvalue, eigenvector) tuples
eig_pairs = [(np.abs(values[i]), vectors[:,i]) for i in range(len(values))]
# Sort the (eigenvalue, eigenvector) tuples from high to low
eig_pairs.sort(key=lambda x: x[0], reverse=True)
matrix_w = np.hstack((eig_pairs[0][1].reshape(20,1), eig_pairs[1][1].reshape(20,1)))
#print('Matrix W:\n', matrix_w)
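All I did here to plot the eigenvectors was grab the first two rows of the matrix. Is that right? I just typed them by hand into the array M. Is my matrix wrong, or did I pick the vectors for the first two principal components incorrectly?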
M = np.array([[0.00747255, 0.16222854],[-0.18394907, 0.12426324]])
rows,cols = M.T.shape
#Get absolute maxes for axis ranges to center origin
maxes = 1.1*np.amax(abs(M), axis = 0)
for i in range(cols):
    xs = [0, M[i, 0]]
    ys = [0, M[i, 1]]
    plt.plot(xs, ys)
plt.plot(0,0,'ok') #<-- plot a black point at the origin
plt.axis('equal') #<-- set the axes to the same scale
plt.legend(['V'+str(i+1) for i in range(cols)]) #<-- give a legend
plt.grid(True, which='major') #<-- plot grid lines
plt.show()
This is what the plotted vectors look like, but they are not orthogonal.
Here is the data (already normalized with np.log):
data = [[1.954242509439325,
1.6901960800285136,
1.9444826721501687,
1.2787536009528289,
1.7558748556724915,
1.7075701760979363,
1.2787536009528289,
1.3222192947339193,
1.4313637641589874,
1.3222192947339193,
1.9084850188786497,
1.8750612633917,
1.6434526764861874,
1.8512583487190752,
1.3424226808222062,
1.9590413923210936,
1.9294189257142926,
1.8692317197309762,
1.4771212547196624,
1.414973347970818],
[1.9138138523837167,
1.0,
1.7781512503836436,
0.3010299956639812,
1.7403626894942439,
1.6127838567197355,
0.47712125471966244,
0.3010299956639812,
0.6020599913279624,
0.3010299956639812,
1.8260748027008264,
1.8512583487190752,
0.9542425094393249,
1.662757831681574,
1.9030899869919435,
1.8195439355418688,
1.380211241711606,
1.9731278535996986,
0.6989700043360189,
1.255272505103306],
[1.9444826721501687,
1.6232492903979006,
1.7993405494535817,
0.6020599913279624,
1.8808135922807914,
1.724275869600789,
1.0413926851582251,
1.3617278360175928,
1.0413926851582251,
0.6989700043360189,
1.9395192526186185,
1.9242792860618816,
1.6020599913279623,
1.6532125137753437,
1.9444826721501687,
1.9731278535996986,
1.6720978579357175,
1.5563025007672873,
1.7558748556724915,
0.47712125471966244],
[1.9822712330395684,
1.792391689498254,
1.9912260756924949,
1.505149978319906,
1.792391689498254,
1.8260748027008264,
1.6334684555795864,
0.8450980400142568,
1.146128035678238,
1.146128035678238,
1.919078092376074,
1.9493900066449128,
1.7853298350107671,
1.9084850188786497,
1.1760912590556813,
1.4913616938342726,
1.9867717342662448,
1.1139433523068367,
1.724275869600789,
1.1760912590556813],
[1.9731278535996986,
1.5797835966168101,
1.6812412373755872,
1.0413926851582251,
1.8692317197309762,
1.568201724066995,
1.3617278360175928,
0.9542425094393249,
1.1139433523068367,
1.0791812460476249,
1.8808135922807914,
1.8808135922807914,
1.6232492903979006,
1.7558748556724915,
1.462397997898956,
1.9242792860618816,
1.9030899869919435,
1.919078092376074,
1.3010299956639813,
0.6989700043360189],
[1.9867717342662448,
1.7853298350107671,
1.9344984512435677,
1.4471580313422192,
1.8976270912904414,
1.863322860120456,
1.0791812460476249,
0.8450980400142568,
1.414973347970818,
1.3617278360175928,
1.9294189257142926,
1.9731278535996986,
1.919078092376074,
1.3010299956639813,
1.9590413923210936,
1.9731278535996986,
1.9731278535996986,
1.9242792860618816,
1.4913616938342726,
1.380211241711606],
[1.4313637641589874,
1.9344984512435677,
1.99563519459755,
1.3424226808222062,
1.9590413923210936,
1.7403626894942439,
1.8808135922807914,
1.2304489213782739,
1.3010299956639813,
1.380211241711606,
1.8808135922807914,
1.8325089127062364,
1.9493900066449128,
1.9590413923210936,
1.0413926851582251,
1.9777236052888478,
1.9731278535996986,
1.7558748556724915,
1.0413926851582251,
1.4471580313422192],
[1.8573324964312685,
1.414973347970818,
1.8864907251724818,
0.3010299956639812,
1.3424226808222062,
1.5314789170422551,
0.0,
0.6989700043360189,
1.3010299956639813,
0.47712125471966244,
1.3424226808222062,
1.7075701760979363,
0.9030899869919435,
1.2041199826559248,
1.9493900066449128,
1.8129133566428555,
1.8920946026904804,
1.9637878273455553,
0.7781512503836436,
0.9542425094393249],
[1.7403626894942439,
1.4913616938342726,
1.7853298350107671,
1.1760912590556813,
1.462397997898956,
1.5185139398778875,
0.0,
0.6989700043360189,
1.1760912590556813,
1.0413926851582251,
1.6901960800285136,
1.6232492903979006,
1.146128035678238,
1.6127838567197355,
1.7075701760979363,
1.7075701760979363,
1.8573324964312685,
1.4471580313422192,
1.1139433523068367,
1.0413926851582251],
[1.863322860120456,
1.8573324964312685,
1.9294189257142926,
1.3979400086720377,
1.4913616938342726,
1.8388490907372552,
1.0,
1.2304489213782739,
1.2787536009528289,
1.1760912590556813,
1.8976270912904414,
1.845098040014257,
1.662757831681574,
1.7853298350107671,
1.806179973983887,
1.9138138523837167,
1.6812412373755872,
1.7853298350107671,
1.6812412373755872,
1.4771212547196624],
[1.9822712330395684,
1.2304489213782739,
1.9637878273455553,
1.5440680443502757,
1.8195439355418688,
1.505149978319906,
1.2304489213782739,
1.0413926851582251,
1.7075701760979363,
1.6232492903979006,
1.9084850188786497,
1.8573324964312685,
1.6989700043360187,
1.806179973983887,
1.0413926851582251,
1.9637878273455553,
1.9590413923210936,
1.4771212547196624,
1.0413926851582251,
1.5314789170422551],
[1.9637878273455553,
1.2304489213782739,
1.919078092376074,
1.1139433523068367,
1.792391689498254,
1.7075701760979363,
0.6020599913279624,
1.2304489213782739,
1.4771212547196624,
1.1760912590556813,
1.7853298350107671,
1.8573324964312685,
1.5314789170422551,
1.7075701760979363,
1.0413926851582251,
1.7993405494535817,
1.9731278535996986,
1.4471580313422192,
0.3010299956639812,
1.792391689498254],
[1.4771212547196624,
1.7160033436347992,
1.99563519459755,
1.0413926851582251,
1.9030899869919435,
1.8750612633917,
1.255272505103306,
0.3010299956639812,
0.6989700043360189,
0.47712125471966244,
1.7558748556724915,
1.7160033436347992,
1.662757831681574,
1.9493900066449128,
0.6989700043360189,
1.9867717342662448,
1.3979400086720377,
1.4913616938342726,
0.47712125471966244,
0.9542425094393249]]
df = pd.DataFrame(data, columns=['Real coffee', 'Instant coffee', 'Tea', 'Sweetener', 'Biscuits',
'Powder soup', 'Tin soup', 'Potatoes', 'Frozen fish', 'Frozen veggies',
'Apples', 'Oranges', 'Tinned fruit', 'Jam', 'Garlic', 'Butter',
'Margarine', 'Olive oil', 'Yoghurt', 'Crisp bread'])
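A simple way to check whether two vectors are orthogonal is to see whether their dot product is zero. In your case, the vectors that should be orthogonal are the columns of `vectors` (that is, the eigenvectors of the covariance matrix). For example, a loop over every pair of columns with `col_i < col_j` that asserts the dot product of the two columns is (numerically) zero should run without raising an error.
A faster approach is to remember that a matrix product is just the dot products of the rows of the first matrix with the columns of the second, so all of those dot products are collected in `vectors.T @ vectors`. We then want to check that the lower triangle of this result, excluding the diagonal, is all zero. Because the product is symmetric, this covers the same pairs as the `if col_i < col_j` condition in the loop. That check should return `True`.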
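The two checks described above can be sketched roughly as follows. This is an illustration under stated assumptions rather than the original answer's code: the data is a small random stand-in (not the shopping data from the question), the helper names `columns_orthogonal_loop` and `columns_orthogonal_matrix` are made up for this example, and the sketch uses `np.linalg.eigh` instead of the question's `eig`, since `eigh` is the NumPy solver intended for symmetric matrices such as a covariance matrix.

```python
import numpy as np

# Stand-in data (an assumption for this sketch): 50 samples, 5 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 5))

# Covariance matrix of the features (symmetric), then its eigendecomposition.
# eigh is swapped in for eig here because the matrix is symmetric.
V = np.cov(X.T)
values, vectors = np.linalg.eigh(V)

def columns_orthogonal_loop(vecs, tol=1e-10):
    """Check every distinct pair of columns with an explicit dot product."""
    n = vecs.shape[1]
    for col_i in range(n):
        for col_j in range(n):
            if col_i < col_j:
                if abs(vecs[:, col_i] @ vecs[:, col_j]) > tol:
                    return False
    return True

def columns_orthogonal_matrix(vecs, tol=1e-10):
    """Same check in one shot: off-diagonal entries of vecs.T @ vecs must vanish."""
    gram = vecs.T @ vecs                # entry (i, j) is column i dotted with column j
    off_diag = np.tril(gram, k=-1)      # strictly lower triangle, diagonal excluded
    return bool(np.allclose(off_diag, 0.0, atol=tol))

print(columns_orthogonal_loop(vectors))
print(columns_orthogonal_matrix(vectors))
```

The tolerance matters because floating-point dot products of orthogonal vectors are rarely exactly zero; the value `1e-10` is an arbitrary choice for this sketch.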
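The reason your plot does not look orthogonal is that you took two 20-dimensional vectors and arbitrarily projected them down to 2D. When you do that, there is no guarantee that they stay orthogonal. As an example, consider the usual XYZ axes plot:
You know the z axis is orthogonal to the x axis, but once it is projected down to 2D, the angle you see depends on the projection angle and is generally no longer a right angle.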
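To make the projection argument concrete, here is a small sketch. It assumes a random orthonormal basis of a 20-dimensional space (built with a QR decomposition, not taken from the question's data) and a hypothetical helper `angle_deg`. Two basis vectors are orthogonal in 20D, but keeping only their first two coordinates, which is effectively what the plot in the question does, generally gives a different angle.

```python
import numpy as np

rng = np.random.default_rng(1)

# Orthonormal basis of a 20-dimensional space via QR of a random matrix
# (a stand-in for the eigenvector matrix; an assumption of this sketch).
Q, _ = np.linalg.qr(rng.normal(size=(20, 20)))
u, v = Q[:, 0], Q[:, 1]

def angle_deg(a, b):
    """Angle between two vectors in degrees."""
    cos = a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
    return np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))

full_angle = angle_deg(u, v)            # angle using all 20 coordinates
sliced_angle = angle_deg(u[:2], v[:2])  # angle using only the first 2 coordinates

print(full_angle)
print(sliced_angle)
```

The first printed angle is a right angle up to rounding, while the second depends entirely on which two coordinates were kept.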