How do I build a content-based recommender system with multiple text features of differing importance (weights)?

Posted 2024-04-25 22:19:48

Male | A programmer who enjoys writing Python code.

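I am trying to build a simple content-based system that recommends holiday properties to customers based on the properties they have previously stayed at. My DataFrame has several columns, starting with the property name. It also has 4 columns that I want the system to "read" and base its recommendations on: property description (long text), property taglines (one-line descriptions), property type (e.g. "2 bedroom villa with private pool") and property destination (location). All 4 features matter for making an informed recommendation, but they are not equally important. How can I give my features different weights? For example, similarity in property type and location matters more than similarity in description and/or taglines.

I have reached the point where the system takes all of these features into account, but it does not distinguish between them, so no feature counts for more than another.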

import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import linear_kernel

# Concatenate the four text columns into one corpus string per property
text_columns = ['property_destination', 'proptype', 'property_taglines', 'property_description']
df['corpus'] = df[text_columns].fillna('').astype(str).agg(' '.join, axis=1)

# TF-IDF over unigrams, bigrams and trigrams, ignoring English stop words
tfidf = TfidfVectorizer(analyzer='word', ngram_range=(1, 3), stop_words='english')
tfidf_matrix = tfidf.fit_transform(df['corpus'])

# TF-IDF rows are L2-normalised by default, so the linear kernel equals cosine similarity
cosine_sim = linear_kernel(tfidf_matrix, tfidf_matrix)

# Map each property name to its row index (assumes df has a default 0..n-1 index)
indices = pd.Series(df.index, index=df['property_name']).drop_duplicates()


# Function that takes in Accommodation as input and outputs most similar properties
def recommend(property_name, cosine_sim=cosine_sim):

    # Get the index of the Accommodation that matches the name
    idx = indices[property_name]

    # Get the pairwise similarity scores of all properties with that property
    sim_scores = list(enumerate(cosine_sim[idx]))

    # Sort the properties based on the similarity scores
    sim_scores = sorted(sim_scores, key=lambda x: x[1], reverse=True)

    # Get the scores of the 4 most similar properties (position 0 is the property itself)
    sim_scores = sim_scores[1:5]
    print(sim_scores)

    # Get the property indices
    property_indices = [i[0] for i in sim_scores]

    # Return the top 4 most similar properties
    return df['property_name'].iloc[property_indices]


recommend("PROPERTY NAME")

[(13, 0.11228911954301364), (25, 0.07511964503440056), (22, 0.07428739026394662), (18, 0.07371643598349838)]

This is a recommendation of 4 properties, along with their indices and their similarity scores relative to the input property.
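To make output like this easier to check by eye, the (index, score) pairs could be joined back to the property names. The snippet below is only a sketch: `scores_to_frame` is a hypothetical helper that is not part of my code, and the property names are placeholders. The scores are the ones printed above.

```python
import pandas as pd

# Placeholder names standing in for df['property_name'] (assumption for illustration)
property_names = pd.Series([f"Property {i}" for i in range(30)], name="property_name")

# The (row position, similarity) pairs printed by recommend() above
sim_scores = [
    (13, 0.11228911954301364),
    (25, 0.07511964503440056),
    (22, 0.07428739026394662),
    (18, 0.07371643598349838),
]


def scores_to_frame(names, sim_scores):
    """Hypothetical helper: pair each row position with its name and score."""
    positions = [position for position, _ in sim_scores]
    scores = [score for _, score in sim_scores]
    return pd.DataFrame(
        {"property_name": names.iloc[positions].to_numpy(), "similarity": scores},
        index=positions,
    )


result = scores_to_frame(property_names, sim_scores)
print(result)
```

With the real DataFrame, `df['property_name']` would take the place of `property_names`.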

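It isn't bad. I can judge that from the properties it actually recommends, because I personally know all of them (over 200). But I believe it could be better, which will matter more as the inventory grows.

How can I add some kind of weight to my features "property description", "property taglines", "property destination" and "property type"?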
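One direction that might work is to fit a separate TF-IDF model per column and combine the per-column cosine similarity matrices as a weighted sum, instead of concatenating everything into one corpus. This is a minimal sketch under assumptions, not a tested solution. The toy data, the `weights` values and the `weighted_similarity` helper are all hypothetical, and only the column names come from my code.

```python
import numpy as np
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import linear_kernel

# Hypothetical toy data; only the column names match the real DataFrame
df = pd.DataFrame({
    "property_name": ["Villa Azul", "Villa Verde", "City Loft", "Beach Hut"],
    "property_destination": ["Bali", "Bali", "Paris", "Phuket"],
    "proptype": [
        "2 bedroom villa with private pool",
        "3 bedroom villa with private pool",
        "1 bedroom apartment",
        "1 bedroom bungalow",
    ],
    "property_taglines": [
        "Quiet hideaway in the rice fields",
        "Family retreat close to the beach",
        "Stylish base in the heart of the city",
        "Wake up to the sound of the waves",
    ],
    "property_description": [
        "A secluded villa surrounded by rice terraces with a large garden.",
        "Spacious villa with a playground, a short walk from the sand.",
        "Modern loft near museums, cafes and the metro.",
        "Simple wooden bungalow directly on the shoreline.",
    ],
})

# Illustrative weights (assumed values): type and destination count for more
weights = {
    "property_destination": 0.35,
    "proptype": 0.35,
    "property_taglines": 0.10,
    "property_description": 0.20,
}


def weighted_similarity(frame, weights):
    """Weighted sum of per-column TF-IDF cosine similarity matrices."""
    total = sum(weights.values())
    combined = np.zeros((len(frame), len(frame)))
    for column, weight in weights.items():
        tfidf = TfidfVectorizer(analyzer="word", ngram_range=(1, 3), stop_words="english")
        matrix = tfidf.fit_transform(frame[column].fillna(""))
        # Rows are L2-normalised, so the linear kernel is the cosine similarity
        combined += (weight / total) * linear_kernel(matrix, matrix)
    return combined


cosine_sim = weighted_similarity(df, weights)

# Most similar property to the first row, skipping the property itself
best_match = int(np.argsort(-cosine_sim[0])[1])
print(df["property_name"].iloc[best_match])
```

The resulting `cosine_sim` has the same shape as the one in my code, so it could be passed to `recommend` unchanged. The weights would still need tuning by hand against properties I know. A row with an empty value in some column gets zero similarity for that column, including to itself.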


0 answers

No answers yet.
