Invalid argument error when training in Keras

Posted 2024-03-29 13:03:04


I am trying to build a movie recommendation model with Keras:

import pandas as pd
from sklearn.model_selection import train_test_split
import keras
from keras.layers import Input, Embedding, Dot, Flatten

rating = pd.read_csv("./ratings.csv",usecols=[0,1,2])
users = len(rating.userId.unique())
movies = len(rating.movieId.unique())
embed_size = 3
train, test = train_test_split(rating, test_size=0.2)

movie_input = Input(shape=[1], name="movie_in")
movie_embed = Embedding(movies, embed_size, name="movie_embed")(movie_input)
movie_vector = Flatten(name="flatten_movies")(movie_embed)

user_input = Input(shape=[1], name="user_in")
user_embed = Embedding(users, embed_size, name="user_embed")(user_input)
user_vector = Flatten(name="flatten_users")(user_embed)

prod = Dot(axes=-1, name="dot-product")([movie_vector, user_vector])

model = keras.Model(inputs=[user_input, movie_input], outputs=prod)
model.compile(optimizer='adam', loss='mse')
model.fit(x=[train.userId, train.movieId], y=train.rating,epochs=10, 
verbose=0)

When I try to train the model, I get the following error:

tensorflow.python.framework.errors_impl.InvalidArgumentError: 
indices[15,0]= 7438 is not in [0, 5000)

[[{{node movie_embed/embedding_lookup}} = GatherV2[Taxis=DT_INT32, 
Tindices=DT_INT32, Tparams=DT_FLOAT, _class=["loc:@training/Adam/Assign_2"], 
_device="/job:localhost/replica:0/task:0/device:CPU:0"] 
(movie_embed/embeddings/read, movie_embed/Cast, 
training/Adam/gradients/movie_embed/embedding_lookup_grad/concat/axis)]]

Yet most online tutorials use the same code, and it works fine for them.


1 answer
User
#1 · Posted 2024-03-29 13:03:04

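Your movie_embed embedding layer (essentially a lookup table) has 5000 rows, so it expects integer inputs in the range [0, 5000), that is, 0 through 4999. You fed it 7438, which caused the error. rating.movieId may well contain 5000 unique values, but some of those values clearly fall outside [0, 5000). To make this work, you need to map the movieId integers onto that interval before passing them to the model. The same constraint applies to userId and the user_embed layer.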
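One way to do that remapping, as a minimal sketch rather than the only approach: build a dictionary from each distinct raw ID to a dense index. The helper name build_index and the toy ID list are assumptions made for illustration; they are not part of the question's code.

```python
def build_index(ids):
    """Map each distinct raw ID to a dense index in [0, n_unique)."""
    return {raw: idx for idx, raw in enumerate(sorted(set(ids)))}

# Toy stand-in for rating.movieId (hypothetical data): the IDs are sparse,
# so the largest ID is far bigger than the number of unique IDs.
raw_movie_ids = [1, 7438, 31, 7438, 1, 260]

movie_index = build_index(raw_movie_ids)
dense_movie_ids = [movie_index[m] for m in raw_movie_ids]

# The number of distinct IDs is the row count to give the Embedding layer.
n_movies = len(movie_index)

# Every remapped ID is now a valid row of an n_movies-row lookup table.
assert 0 <= min(dense_movie_ids) and max(dense_movie_ids) < n_movies

# With the DataFrame from the question, the same mapping could be applied as
#   rating["movieId"] = rating["movieId"].map(build_index(rating["movieId"]))
# and likewise for userId.
```

Apply the mapping to the full rating table before train_test_split, so that IDs appearing only in the test set are covered as well. Keep the dictionaries if you later need to translate model outputs back to the original IDs.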
