Python: aggregating a DataFrame with agg

Posted 2024-04-25 00:42:43

I have a DataFrame with UserID and SharedNews columns, and I want to count how many shared news items each user has. Here is my code:

import pandas as pd
import numpy as np
...

def aggr_new_userlevel_shares_dataset():
    new_userlevel_shares_df = new_userlevel_shares_dataset()
    id_shared_df = new_userlevel_shares_df[["UserID","PostTitle"]].values
    array_shared = []

    for row in id_shared_df:
        array_shared.append([row[0],sharedNews(row[1])])

    shared_df = pd.DataFrame(array_shared,columns = ["UserIDTemp","SharedNews"])
    concat_df = pd.concat([new_userlevel_shares_df,shared_df],axis = 1)
    concat_df.drop("UserIDTemp",axis = 1,inplace = True)
    print("before sum:")
    print(concat_df)

    concat_df = concat_df.groupby(["UserID"],sort = False).agg({"SharedNews",np.sum}).reset_index()
    print("after sum:")
    print(concat_df)

def sharedNews(post_title):
    countSharedNews = 0
    keywords = ['via', 'shared \'s', 'shared a', 'commented on', 'likes', 'published']
    for i in keywords:
        if (i in post_title and "photo" not in post_title) and (i in post_title and "video" not in post_title):
            countSharedNews = 1
    return countSharedNews 

However, it fails with this error:

 Traceback (most recent call last):
  File "F:/MyDocument/F/My Document/Training/Python/PyCharmProject/FaceBookCrawl/FB_group_user_hierarchicalClustering.py", line 747, in <module>
    aggr_new_userlevel_shares_dataset()
  File "F:/MyDocument/F/My Document/Training/Python/PyCharmProject/FaceBookCrawl/FB_group_user_hierarchicalClustering.py", line 710, in aggr_new_userlevel_shares_dataset
    concat_df = concat_df.groupby(["UserID"],sort = False).agg({"SharedNews",np.sum}).reset_index()

    ...
    AttributeError: 'SeriesGroupBy' object has no attribute 'SharedNews'

Can you tell me what causes this and how to fix it?
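
One likely cause, going by the traceback: `{"SharedNews",np.sum}` uses a comma, so Python builds a set rather than a dict. pandas then appears to treat both elements as aggregation functions to apply to each column, and tries to look up "SharedNews" as a method on the grouped Series, which would explain the AttributeError. A minimal sketch of the dict form follows. The small `concat_df` here is made-up sample data standing in for the real frame, not the original dataset.

```python
import pandas as pd

# Hypothetical stand-in for concat_df; the real frame comes from
# new_userlevel_shares_dataset(), which is not shown in the question.
concat_df = pd.DataFrame({
    "UserID": ["u2", "u1", "u2", "u1", "u3"],
    "SharedNews": [1, 0, 1, 1, 0],
})

# Map the column name to the aggregation with a colon (a dict),
# not a comma (which would build a set).
totals = (
    concat_df.groupby("UserID", sort=False)
    .agg({"SharedNews": "sum"})
    .reset_index()
)

# Equivalent alternative: select the column first, then sum it.
totals_alt = (
    concat_df.groupby("UserID", sort=False)["SharedNews"]
    .sum()
    .reset_index()
)

print(totals)
```

With `sort=False` the groups keep the order in which each UserID first appears. Passing the string "sum" instead of `np.sum` is a choice made here, not something the question requires; both request a per-group sum.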

