ValueError: x and y must be the same size

y = data['Yearly Amount Spent']
x = data[['Avg. Session Length', 'Time on App', 'Time on Website', 'Length of Membership', 'Yearly Amount Spent']]

from sklearn.model_selection import train_test_split
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.3, random_state=101)

# training the model
from sklearn.linear_model import LinearRegression
lm = LinearRegression()
lm.fit(x_train, y_train)
lm.coef_
predictions = lm.predict(X_test)

# here the problem starts:
plt.scatter(y_test, predictions)

1 Answer

User

#1 · Posted on 2024-04-20 12:34:17

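It looks like you are using the EcommerceCustomers.csv dataset (link here).

In your original post, the 'Yearly Amount Spent' column is used as y and is also included in x. That is a mistake: the target column should not be one of the features.

The following should work: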

import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

data = pd.read_csv("EcommerceCustomers.csv")

y = data['Yearly Amount Spent']
X = data[['Avg. Session Length', 'Time on App','Time on Website', 'Length of Membership']]

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=101)


# ## Training the Model
lm = LinearRegression()
lm.fit(X_train,y_train)

# The coefficients
print('Coefficients: \n', lm.coef_)

# ## Predicting Test Data
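predictions = lm.predict(X_test)

# Plot actual vs. predicted values; both come from the same split, so their sizes match
import matplotlib.pyplot as plt
plt.scatter(y_test, predictions)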
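
The question's snippet splits into `x_test` (lowercase) but predicts on `X_test` (uppercase), so `predictions` may have come from a different, older split than `y_test`. That is one possible way to end up with arrays of different sizes. As a minimal sketch, a small check like the one below can surface the mismatch before plotting. The helper name `check_same_size` and the toy numbers are hypothetical and are not taken from the dataset.

```python
def check_same_size(y_true, y_pred):
    """Hypothetical helper: fail early if the two sequences differ in length."""
    n_true, n_pred = len(y_true), len(y_pred)
    if n_true != n_pred:
        raise ValueError(
            f"y_true has {n_true} values but y_pred has {n_pred}; "
            "were they produced from the same train/test split?"
        )
    return n_true


# Toy stand-ins for y_test and predictions (made-up numbers).
y_test_demo = [500.0, 610.5, 432.1]
stale_predictions = [498.2, 605.0]         # e.g. computed from an older X_test
fresh_predictions = [498.2, 605.0, 440.3]  # computed from the current split

check_same_size(y_test_demo, fresh_predictions)  # sizes match, no error

try:
    check_same_size(y_test_demo, stale_predictions)
except ValueError as err:
    message = str(err)  # describes the 3-versus-2 mismatch
```

In the real script, calling `check_same_size(y_test, predictions)` just before `plt.scatter(y_test, predictions)` would play the same role.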

See also this.
