Is it possible to predict with a single variable in multiple regression?

import numpy as np
from sklearn.linear_model import LinearRegression

x = [[0, 1], [5, 1], [15, 2], [25, 5], [35, 11], [45, 15], [55, 34], [60, 35]]
y = [4, 5, 20, 14, 32, 22, 38, 43]
x, y = np.array(x), np.array(y)

model = LinearRegression().fit(x, y)

test_x = np.array([5, 20, 14, 32, 22, 38])
model.predict(test_x.reshape(-1, 1))

2 Answers

User

Answer 1 · edited 2024-04-26 14:17:24

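The goal of linear regression is to find a linear relationship between the input values and the output values.

Basically, it is y = θx + Ɛ, where y is your prediction, θ the model parameters (tuned during training), x your input, and Ɛ an error coefficient. The goal of training is to find the best θ and Ɛ so that your predictions are as accurate as possible.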

To illustrate with a picture, θ and Ɛ define the red curve. [Figure not available]
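The formula can be checked against a fitted scikit-learn model. The sketch below reuses the data from the question and assumes that `coef_` corresponds to θ and `intercept_` to the Ɛ term; that mapping is my reading of the answer, not something it states.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Data from the question: 8 observations, 2 features each.
x = np.array([[0, 1], [5, 1], [15, 2], [25, 5],
              [35, 11], [45, 15], [55, 34], [60, 35]])
y = np.array([4, 5, 20, 14, 32, 22, 38, 43])

model = LinearRegression().fit(x, y)

theta = model.coef_        # shape (2,): one weight per input feature
offset = model.intercept_  # scalar; assumed here to play the role of Ɛ

# Apply the formula by hand and compare with model.predict.
manual = x @ theta + offset
print(theta.shape)
print(np.allclose(manual, model.predict(x)))
```

Because θ holds one weight per feature, an input with a different number of columns cannot be multiplied by it.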

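You cannot train a linear regression model with one set of dimensions (input and output) and then predict with inputs of a different dimension: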

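In your example you have two inputs, so in the formula x is a (2,1) matrix used to determine the price y, which is a scalar. θ should therefore be a (1,2) matrix and Ɛ a scalar.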

If you want to use only price or only horsepower, you have to create a separate model for each kind of input.
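A minimal sketch of both points, using the data from the question: predicting with one column on a model fitted on two columns fails, while a separate model per column works. The names `two_feature_model`, `model_first`, and `model_second` are illustrative only.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

x = np.array([[0, 1], [5, 1], [15, 2], [25, 5],
              [35, 11], [45, 15], [55, 34], [60, 35]])
y = np.array([4, 5, 20, 14, 32, 22, 38, 43])

two_feature_model = LinearRegression().fit(x, y)

# A one-column input does not match the two features seen during fit.
try:
    two_feature_model.predict(np.array([[5], [20]]))
    mismatch_error = None
except ValueError as err:
    mismatch_error = err
print(type(mismatch_error).__name__)

# One model per input; x[:, [0]] keeps the column two-dimensional.
model_first = LinearRegression().fit(x[:, [0]], y)
model_second = LinearRegression().fit(x[:, [1]], y)

new_values = np.array([[5], [20]])
pred_first = model_first.predict(new_values)
pred_second = model_second.predict(new_values)
print(pred_first, pred_second)
```

Each single-feature model learns its own θ and Ɛ, so its coefficients generally differ from the corresponding entries of the two-feature model.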

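User

Answer 2 · edited 2024-04-26 14:17:24

Each observation in the feature matrix consists of 2 values (one for each of the 2 features). You are trying to pass 6 values at once instead of splitting those 6 values into 3 arrays of 2 values each (each array representing one observation in the data).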

import numpy as np
from sklearn.linear_model import LinearRegression

x = [[0, 1], [5, 1], [15, 2], [25, 5], [35, 11], [45, 15], [55, 34], [60, 35]]
y = [4, 5, 20, 14, 32, 22, 38, 43]
x, y = np.array(x), np.array(y)

model = LinearRegression().fit(x, y)

test_x = np.array([[5, 20], [14, 32], [22, 38]])
model.predict(test_x)
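If the six values arrive as a flat array, as in the question, `reshape(-1, 2)` produces the same three observations without retyping them. This is a small sketch; the name `flat_values` is introduced here for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

x = np.array([[0, 1], [5, 1], [15, 2], [25, 5],
              [35, 11], [45, 15], [55, 34], [60, 35]])
y = np.array([4, 5, 20, 14, 32, 22, 38, 43])
model = LinearRegression().fit(x, y)

flat_values = np.array([5, 20, 14, 32, 22, 38])

# -1 lets NumPy infer the number of rows; 2 is the number of features.
test_x = flat_values.reshape(-1, 2)
predictions = model.predict(test_x)

print(test_x.shape)       # (3, 2)
print(predictions.shape)  # (3,)
```

By contrast, `reshape(-1, 1)` yields six observations with one feature each, which does not match a model fitted on two features.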

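I can suggest two approaches for what you want:

You can pass zero for the columns you do not want to use when predicting the output (note that this predicts for an observation whose other feature equals 0; it does not ignore that feature)
You can train the model on only the features you want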
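The first approach can be sketched with the data from the question. The sketch pads the unused column with zeros and confirms that the result equals intercept plus the first coefficient times the value; the names `first_only` and `padded` are illustrative.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

x = np.array([[0, 1], [5, 1], [15, 2], [25, 5],
              [35, 11], [45, 15], [55, 34], [60, 35]])
y = np.array([4, 5, 20, 14, 32, 22, 38, 43])
model = LinearRegression().fit(x, y)

# Values for the first feature only.
first_only = np.array([5, 20, 14])

# Fill the second column with zeros so the input has two features again.
padded = np.column_stack([first_only, np.zeros_like(first_only)])
zero_padded_pred = model.predict(padded)

# With the second feature at 0, its weight drops out of the formula.
expected = model.intercept_ + model.coef_[0] * first_only
print(np.allclose(zero_padded_pred, expected))
```

The weight of the first feature was still estimated jointly with the second one, so these predictions generally differ from those of a model trained on the first feature alone.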

"""create dummy data"""

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt  # needed for the scatter plot further down

# construct a few features
features = np.array([[2, 2],
                     [4, 6],
                     [9, 1],
                     [6, 2]])

# construct a target
target = np.array([15, 20, 50, 18])

# construct a dataframe
dataframe = pd.DataFrame()

dataframe['Price'] = features[:, 0]

dataframe['HorsePower'] = features[:, 1]

dataframe['Cost'] = target

# p.s. I've used the long method to construct my dataframe, you may pass data using the 'data' parameter.
print(dataframe)
print(' ')

# separate features matrix and target vector
features = dataframe.iloc[:, 0:2]
target = dataframe.iloc[:, -1]

# import package
from sklearn.linear_model import LinearRegression

# create instance of LR
algorithm = LinearRegression()

# train the model on both features
model = algorithm.fit(features, target)

# view parameters and hyperparameters
print(model)

# create observation passing values for both features
observation = [[9, 1]]

# obtain predictions
predictions = model.predict(observation)

# print prediction
print(predictions)

plt.scatter(dataframe.index, target, color='crimson', marker='v', edgecolors='black', label='Target_Value')
plt.scatter(dataframe.index, model.predict(features), color='silver', marker='d', edgecolors='black', label='Predicted_Value')
plt.title('Scatter Plot Showing Predicted Target Values Vs Actual Target Values', color='blue')
plt.xlabel('Observation Number', color='blue')
plt.ylabel('Value', color='blue')
plt.legend(numpoints=1, loc='best')
plt.show()

# train model, this time on desired feature (s)
model = algorithm.fit(np.array(features.iloc[:, 0]).reshape(-1, 1), 
    target)

# obtain prediction
prediction = model.predict([[2]])

# print predictions
print(prediction)
