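I'm new to Python and am trying to figure out where in my code I'm creating index 0. How should I fix it so that the code runs and shows the 50 movies most related to "Avatar"? I know the error is probably in my "get_index_from_title" function, but I'm not sure how to resolve it.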
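import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def get_title_from_index(index):
    return df[df.index == index]["title"].values[0]

def get_index_from_title(title):
    return df[df.title == title]["index"].values[0]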
##################################################
# Step 1. Read CSV File
df = pd.read_csv("movies.csv",quoting=3, error_bad_lines=False)
#Step2: Select Features
features = ['keywords', 'cast','genres','director']
#Step 3: Create a column in DF which combines all selected features
for feature in features:
    df[feature] = df[feature].fillna('')
def combine_features(row):
    try:
        return row['keywords'] +" "+row['cast']+" "+row["genres"] +" "+row["director"]
    except:
        print("Error:", row)
df["combined_features"] = df.apply(combine_features,axis=1)
# print("Combined Features:", df["combined_features"].head())
#Step 4: Create count matrix from this new combined column
cv = CountVectorizer()
count_matrix = cv.fit_transform(df["combined_features"])
#Step 5: Compute the Cosine Similarity based on the count_matrix
cosine_sim = cosine_similarity(count_matrix)
movie_user_likes = "Avatar"
#Step 6: Get index of this movie from its title
movie_index = get_index_from_title(movie_user_likes)
similar_movies = list(enumerate(cosine_sim[movie_index]))
#Step 7: Get a list of similar movies in descending order of similarity score
sorted_similar_movies = sorted(similar_movies, key= lambda x:x[1], reverse=True)
#Step 8: Print title of first 50 movies
i = 0
for movie in sorted_similar_movies:
    print(get_title_from_index(movie[0]))
    i = i + 1
    if i > 50:
        break
Try this:
This is the correct way to select a value from a DataFrame given an index and a column.
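A minimal sketch of that kind of lookup, under a few assumptions: the small DataFrame below is a hypothetical stand-in for movies.csv, and the `matches` variable and the `KeyError` messages are illustrative choices rather than anything from the question. When the boolean filter matches no row, `.values[0]` is applied to an empty array and raises `IndexError` for index 0, so the sketch checks for an empty selection first. One possible reason for an empty match in the question's code is the `read_csv` call: `quoting=3` leaves quote characters inside fields, and `error_bad_lines=False` silently skips rows that fail to parse, so the "Avatar" row may not be present under that exact title.

```python
import pandas as pd

# Hypothetical stand-in for movies.csv; the real file's columns may differ.
df = pd.DataFrame(
    {
        "index": [0, 1, 2],
        "title": ["Avatar", "Spectre", "John Carter"],
    }
)


def get_title_from_index(index):
    # Select by row condition and column label in a single .loc call.
    matches = df.loc[df["index"] == index, "title"]
    if matches.empty:
        raise KeyError(f"no movie with index {index!r}")
    return matches.iloc[0]


def get_index_from_title(title):
    # An empty selection is what makes .values[0] raise IndexError,
    # so report a missing title explicitly instead.
    matches = df.loc[df["title"] == title, "index"]
    if matches.empty:
        raise KeyError(f"no movie titled {title!r}")
    return int(matches.iloc[0])


print(get_index_from_title("Avatar"))
print(get_title_from_index(2))
```

If the lookup reports a missing title, printing `df["title"].head()` right after loading the file shows whether the titles were read the way you expect.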