My perceptron algorithm keeps giving me the wrong separating line for 2D linearly separable data

Asked 2025-04-14 17:59

I am implementing a perceptron model for some simple 2D data. Here is how I generate the data.

import numpy as np
import matplotlib.pyplot as plt

# number of points
n = 15
# f(x) = w0 + a*x1 + b*x2
# so if f(x) = 0, then
# x2 = (-w0 - a*x1)/b
intercept = 30
a = 4
b = 2
# generate random points in [-20, 20)
x1 = np.random.uniform(-20, 20, n)  # returns a NumPy array
x2 = np.random.uniform(-20, 20, n)
y = []
#plot f(x)
plt.plot(x1, (-intercept - a*x1)/b, 'k-') 
plt.ylabel("x2")
plt.xlabel("x1")

#plot colored points
for i in range(0, len(x1)):
    f = intercept + a * x1[i] + b * x2[i]
    if (f <= 0):
        plt.plot(x1[i], x2[i], 'ro')
        y.append(-1)
    if (f > 0):
        plt.plot(x1[i], x2[i], 'bo')
        y.append(1)
y = np.array(y)
# Add x0 for threshold
x0 = np.ones(n)
stacked_x = np.stack((x0,x1,x2))
stacked_x

This is what the data looks like.

[Image: scatter plot of the generated points with the line f(x) = 0]

Here is my perceptron model.

class PLA():
    def __init__(self, numPredictors):
        self.w = np.random.rand(1,numPredictors+1) #(1, numPredictors+1)
        self.iter = 0
    def fitModel(self, xData, yData):
        while(True): 
            yhat = np.matmul(self.w, xData).squeeze()  # from shape (1, n) to (n,)
            compare = np.sign(yhat) == yData          
            ind = [i for i in range(0,len(compare)) if compare[i] == False] #misclassified index
            print(len(ind))
            if len(ind) == 0:    
                break
            for i in ind:
                update = yData[i]* xData[:, i] #1d array
                self.w = self.w + np.transpose(update[:,np.newaxis]) #transpose to match the weight's shape
            self.iter += 1

This is how I visualize the model.

pla1 = PLA(2)
pla1.fitModel(stacked_x, y)
#plot colored points
for i in range(0, len(x1)):
    if (y[i] == -1):
        plt.plot(x1[i], x2[i], 'ro')
    if (y[i] == 1):
        plt.plot(x1[i], x2[i], 'bo')
plt.plot(x1, (-pla1.w[0][0] - pla1.w[0][1]*x1)/(pla1.w[0][1]), 'g-', label = "PLA")
plt.plot(x1, (-intercept - a*x1)/b, 'k-', label = "f(x)")
plt.xlabel("x1")
plt.ylabel("x2")
plt.legend()

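The line I get from the perceptron algorithm is not correct.

[Image: plot of the data with the "PLA" line and the "f(x)" line]

Here is another run with different data parameters and a different sample size (n = 30).

[Image: plot of the second run]

I tried printing the update at each iteration, and it works as I expected. But I am not sure what causes my algorithm to stop even though some points are still misclassified. I have been stuck on this for several days and would really appreciate any suggestions.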

1 Answer

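I modified the code and it now works correctly. I split the coefficients into a weight term and a bias term. I wasn't sure about the way you update the coefficients, so I changed that part so that an update is made for every instance.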
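As a rough illustration of the per-instance update described above, here is a minimal sketch of a single step. The helper name `perceptron_step` is hypothetical and only for illustration; the arithmetic mirrors the `error = yi - yhat` update used in the full script.

```python
import numpy as np

def perceptron_step(w, b, xi, yi):
    """One per-sample update (hypothetical helper); returns (w, b, made_mistake)."""
    yhat = 1 if np.dot(w, xi) + b > 0 else -1
    # error is 0 when the prediction is right, +2 or -2 when it is wrong,
    # so only misclassified samples move the weights and the bias
    error = yi - yhat
    return w + error * xi, b + error, yhat != yi

w0, b0 = np.zeros(2), 0.0
x, label = np.array([1.0, 2.0]), 1

# First pass: the score is 0, so the prediction is -1 and the sample is misclassified
w1, b1, mistake1 = perceptron_step(w0, b0, x, label)   # w1 = [2., 4.], b1 = 2.0

# Second pass on the same sample: now classified correctly, nothing changes
w2, b2, mistake2 = perceptron_step(w1, b1, x, label)
```

The complete modified script is below.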

import numpy as np
import matplotlib.pyplot as plt

# number of points
n = 300
# f(x) = w0 + a*x1 + b*x2
# so if f(x) = 0, then
# x2 = (-w0 - a*x1)/b
intercept = 30
a = 4
b = 2
# generate random points in [-20, 20)
x1 = np.random.uniform(-20, 20, n)  # returns a NumPy array
x2 = np.random.uniform(-20, 20, n)
y = []
#plot f(x)
plt.plot(x1, (-intercept - a*x1)/b, 'k--', label='ground truth') 
plt.ylabel("x2")
plt.xlabel("x1")

#plot colored points
for i in range(0, len(x1)):
    f = intercept + a * x1[i] + b * x2[i]
    if (f <= 0):
        plt.scatter(x1[i], x2[i], c='tab:red')
        y.append(-1)
    if (f > 0):
        plt.scatter(x1[i], x2[i], c='tab:blue')
        y.append(1)
y = np.array(y)
# Stack the two features into shape (2, n); the bias is kept separately in PLA.b
stacked_x = np.vstack((x1, x2))

class PLA():
    def __init__(self, numPredictors):
        self.w = np.random.rand(numPredictors)
        self.b = 0
        self.numPredictors = numPredictors
        
    def fitModel(self, xData, yData):
        n_errors = np.inf
        while(n_errors):
            n_errors = 0
            for xi, yi in zip(xData.T, yData):
                linear = np.dot(xi, self.w) + self.b
                yhat = 1 if (linear > 0) else -1

                # error is 0 for a correct prediction and +2 or -2 for a mistake,
                # so correctly classified samples leave w and b unchanged
                error = yi - yhat
                self.w = self.w + error * xi
                self.b = self.b + error

                if yhat != yi:
                    n_errors += 1

pla1 = PLA(2)
pla1.fitModel(stacked_x, y)

plt.plot(x1, (-pla1.b - pla1.w[0]*x1)/pla1.w[1], 'g-', alpha=0.5, linewidth=5, label = "PLA")
plt.xlabel("x1")
plt.ylabel("x2")
plt.legend()
plt.gcf().set_size_inches(8, 3)
plt.ylim(-22, 22) #clip y limits
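
It may also be worth checking the plotting line in the question. The question's own comment gives the boundary as x2 = (-w0 - a*x1)/b, which divides by the x2 coefficient, yet the plotting call divides by `pla1.w[0][1]`, the x1 coefficient. If the training loop prints 0 misclassified points and the drawn line still looks wrong, that denominator is a possible cause. The sketch below separates the two concerns: it checks the classifier numerically and computes the boundary from the x2 coefficient. The helper names `training_accuracy` and `boundary_x2` are hypothetical, and the data is a fresh random sample rather than the asker's.

```python
import numpy as np

def training_accuracy(w, b, X, y):
    """Fraction of samples whose predicted label matches y. X has shape (2, n)."""
    scores = w @ X + b
    preds = np.where(scores > 0, 1, -1)
    return np.mean(preds == y)

def boundary_x2(w, b, x1):
    """x2 on the line b + w[0]*x1 + w[1]*x2 = 0 (assumes w[1] != 0)."""
    return (-b - w[0] * x1) / w[1]

# Ground-truth separator from the post: 30 + 4*x1 + 2*x2 = 0
w_true, b_true = np.array([4.0, 2.0]), 30.0

rng = np.random.default_rng(0)
X = rng.uniform(-20, 20, size=(2, 50))
y = np.where(w_true @ X + b_true > 0, 1, -1)

acc = training_accuracy(w_true, b_true, X, y)       # 1.0 for the true separator
x2_at_zero = boundary_x2(w_true, b_true, 0.0)       # -15.0
```

With a fitted model, passing `pla1.w` and `pla1.b` to the same two helpers would show whether a wrong-looking line comes from training or from plotting.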

Answered 2025-04-14 by Python大师
