import pandas as pd
# I will create a data frame from a dictionary for this example
dict_df = {
"Code": ["A","B","C","D","C","B","B","B","A","A"],
"Age": [14, 16, 17, 4, 15, 16, 8, 10, 90, 99],
"Sex": [0, 1, 1, 1, 0, 0, 0, 0, 0, 1]
}
data = pd.DataFrame.from_dict(dict_df)
# Group by column code
data_bycode = data.groupby(["Code"]).size()
# Sort data_bycode in decreasing order
data_bycode.sort_values(ascending = False, inplace = True)
data_bycode
Another approach is to extract the column of interest and use Counter from collections:
from collections import Counter
# Collect data into a list
codes = data["Code"].tolist()
# Get frequencies with Counter and convert the result to a dict
freq_codes = dict(Counter(codes))
# Get a dictionary to create a data frame with columns Code and Count
dict_df = {"Code": [], "Count": []}
for key, value in freq_codes.items():
    dict_df["Code"].append(key)
    dict_df["Count"].append(value)
# Create df from dictionary
df = pd.DataFrame.from_dict(dict_df)
# Sort values in df
df.sort_values(by="Count", ascending=False, inplace=True)  # "by" is needed here because df has more than one column
df
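This sums the number of deaths by code and sex, producing one total per category. It groups by code and sex, then sorts by number of deaths in descending order.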
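The description above could be sketched roughly as follows. The original question's data is not shown here, so the frame `deaths` and its columns `Code`, `Sex`, and `Deaths` are made-up names and values used only for illustration:

```python
import pandas as pd

# Hypothetical toy data; column names and values are assumptions for illustration.
deaths = pd.DataFrame({
    "Code": ["A", "A", "B", "B", "A", "C"],
    "Sex": [0, 1, 0, 0, 1, 1],
    "Deaths": [3, 5, 2, 3, 1, 7],
})

# Sum deaths within each (Code, Sex) category, then sort from high to low.
deaths_by_code_sex = (
    deaths.groupby(["Code", "Sex"], as_index=False)["Deaths"]
    .sum()
    .sort_values("Deaths", ascending=False)
)
print(deaths_by_code_sex)
```

Passing `as_index=False` keeps `Code` and `Sex` as ordinary columns, which is usually easier to work with afterwards than a MultiIndex.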
Here are two examples that may help you:
I hope this is useful :)
The first step is to create a list of the codes you are looking for, then use it to build a mask and filter your data frame.
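A minimal sketch of the filtering step, assuming the mask is built with `Series.isin`. The frame `records`, its columns, and the list `wanted_codes` are hypothetical names with made-up values:

```python
import pandas as pd

# Hypothetical toy data and code list; names and values are assumptions.
records = pd.DataFrame({
    "Code": ["A", "B", "C", "D", "C", "B"],
    "Deaths": [10, 4, 7, 1, 3, 6],
})
wanted_codes = ["B", "C"]

# Boolean mask: True where the row's code is in the list of interest.
mask = records["Code"].isin(wanted_codes)

# Indexing with the mask keeps only the matching rows.
filtered = records[mask]
print(filtered)
```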
Then it sounds like what you want is to group the data by cause of death and sum the total number of deaths for each cause:
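The answer's own code for this step is not preserved here, so the following is only a sketch of grouping and summing. The column names `Cause` and `Deaths` and all values are assumptions:

```python
import pandas as pd

# Hypothetical toy data; column names and values are assumptions.
records = pd.DataFrame({
    "Cause": ["flu", "fall", "flu", "fire", "fall", "flu"],
    "Deaths": [10, 4, 7, 1, 3, 6],
})

# One row per cause, holding the total deaths attributed to it.
totals = records.groupby("Cause")["Deaths"].sum()
print(totals)
```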
Then you can use sort_values() to order the data frame from highest to lowest, and slice it to keep the top ten.
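As a rough sketch of the sorting and slicing step: pandas has no `sortby()` method, so this uses `DataFrame.sort_values` followed by `head(10)`. The frame `totals` and its twelve made-up causes are assumptions chosen so that the slice actually drops rows:

```python
import pandas as pd

# Hypothetical per-cause totals; twelve made-up causes so the top-ten slice drops two.
totals = pd.DataFrame({
    "Cause": [f"cause_{i}" for i in range(12)],
    "Deaths": [5, 40, 12, 33, 8, 27, 19, 2, 50, 14, 36, 21],
})

# Sort from highest to lowest, then keep the first ten rows.
top_ten = totals.sort_values("Deaths", ascending=False).head(10)
print(top_ten)
```

`totals.nlargest(10, "Deaths")` is an equivalent shortcut when only the top rows are needed.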
Hope this helps.
Update: I tried this on a toy sample on my machine, with the following result: