Cosine similarity based on a DataFrame

Posted 2024-05-29 10:39:44

I want to compute similarity by hand from a DataFrame I built:

df_SimC = pd.DataFrame(df, columns = ['reviewerName','overall','Anger','Disgust','Fear','Joy','Sadness','Surprise'])

Output

overall  Anger     Disgust   Fear      Joy       Sadness   Surprise
1        0.229007  0.489583  0.190617  0.006204  0.075759  0.008829
4        0.001024  0.000020  0.052685  0.945093  0.000062  0.001116

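Now I want to write a function that loops over all rows of the df and computes the cosine similarity. I wrote the function below, but for some reason I get an error. I think the problem is in the function itself (either in what I return, or the function cannot read the row values from the DataFrame).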

def SimC(nominator, denominator_Anger, denominator_Disgust, denominator_Fear, denominator_Joy, 
    denominator_Sadness, denominator_Surprise):

    (nominator, denominator_overall, denominator_Anger, denominator_Disgust, denominator_Fear,
     denominator_Joy, denominator_Sadness, denominator_Surprise) = 0, 0, 0, 0, 0, 0, 0, 0

    for i in range(len(df_SimC)):
        overall_sim = df_SimC.iloc[i]["overall"]
        print(overall_sim)
        Anger_sim = df_SimC.iloc[i]["Anger_sim"]
        Disgust_sim = df_SimC.iloc[i]["Disgust_sim"]
        Fear_sim = df_SimC.iloc[i]["Fear_sim"]
        Joy_sim = df_SimC.iloc[i]["Joy_sim"]
        Sadness_sim = df_SimC.iloc[i]["Sadness"]
        Surprise_sim = df_SimC.iloc[i]["Surprise"]

        denominator_overall += overall_sim * overall_sim
        print(denominator_overall)
        denominator_Anger += Anger_sim * Anger_sim
        denominator_Disgust += Disgust_sim * Disgust_sim
        denominator_Fear += Fear_sim * Fear_sim
        denominator_Joy += Joy_sim * Joy_sim
        denominator_Sadness += Sadness_sim * Sadness_sim
        denominator_Surprise += Surprise_sim * Surprise_sim

        nominator += (denominator_overall * denominator_Anger * denominator_Disgust * denominator_Fear *
                      denominator_Joy * denominator_Sadness * denominator_Surprise)

        return (nominator / sqrt(denominator_overall * denominator_Anger * denominator_Disgust * 
        denominator_Fear * denominator_Joy * denominator_Sadness * denominator_Surprise))

The error I get

TypeError: ("SimC() missing 6 required positional arguments: 'denominator_Anger', 'denominator_Disgust', 'denominator_Fear', 'denominator_Joy', 'denominator_Sadness', and 'denominator_Surprise'", 'occurred at index overall')
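The "occurred at index overall" part of the message suggests (this is a guess, since the call itself is not shown) that SimC was passed to something like df_SimC.apply(SimC). apply hands the function a single column or row, so the other six required parameters are never supplied. Separately, the loop reads columns such as "Anger_sim" that do not exist in df_SimC, the return sits inside the for loop so only the first row would be processed, and the numerator multiplies sums of squares rather than summing products of matching components. The following is a minimal sketch of one possible reading of the goal: treat the six emotion scores of each row as a vector and compute the cosine similarity dot(a, b) / (|a| * |b|) between every pair of rows. The names EMOTIONS, cosine_similarity and pairwise_row_similarity are illustrative and not from the question, and leaving "overall" out of the vector is an assumption.

```python
import numpy as np
import pandas as pd

# The two rows shown in the question's output (reviewerName omitted).
df_SimC = pd.DataFrame(
    {
        "overall": [1, 4],
        "Anger": [0.229007, 0.001024],
        "Disgust": [0.489583, 0.000020],
        "Fear": [0.190617, 0.052685],
        "Joy": [0.006204, 0.945093],
        "Sadness": [0.075759, 0.000062],
        "Surprise": [0.008829, 0.001116],
    }
)

# Assumption: only the emotion scores form the vector; "overall" is left out.
EMOTIONS = ["Anger", "Disgust", "Fear", "Joy", "Sadness", "Surprise"]


def cosine_similarity(a, b):
    """Cosine similarity of two 1-D vectors: dot(a, b) / (|a| * |b|)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    # Assumption: similarity with an all-zero vector is reported as 0.0.
    return float(a @ b / denom) if denom else 0.0


def pairwise_row_similarity(frame, columns):
    """Return a square DataFrame of cosine similarities between all rows."""
    values = frame[columns].to_numpy(dtype=float)
    norms = np.linalg.norm(values, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # avoid division by zero; zero rows stay zero
    unit = values / norms    # scale every row to unit length
    return pd.DataFrame(unit @ unit.T, index=frame.index, columns=frame.index)


sim = pairwise_row_similarity(df_SimC, EMOTIONS)
print(sim)
```

The diagonal of sim is 1 (each row compared with itself), and sim.loc[i, j] is the similarity between rows i and j. If the intent was instead to compare each row against one fixed reference vector, cosine_similarity can be applied row by row with df_SimC[EMOTIONS].apply(lambda row: cosine_similarity(row, reference), axis=1), where reference is a hypothetical vector of six scores.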
