import pandas as pd
# Setting the data
all_rows = [["7D","Sara","8A"],
["8A","Rosa","Na"],
["4D","Jess","8A"],
["6B","Veronica","Na"],
["8L","Sophia","6B"],
["7N","iria","Na"],
["7D","Sara","8A"],
["8A","Rosa","Na"]]
df = pd.DataFrame(all_rows, columns=["registered","name","daughter_of"])
# Auxiliary DataFrame: drop duplicate rows, then count the children listed under each "daughter_of" value
df_grouped = df.drop_duplicates().groupby(["daughter_of"])["daughter_of"].count().reset_index(name="children")
# Renaming columns so the join is made correctly
df_grouped.columns = ["registered", "children"]
# Left join keeps every original row; the "Na" placeholder is filtered out so it is not treated as a parent
df = pd.merge(df,df_grouped[df_grouped["registered"]!="Na"],on=["registered"],how='left')
This is the output I get:
  registered      name daughter_of  children
0         7D      Sara          8A       NaN
1         8A      Rosa          Na       2.0
2         4D      Jess          8A       NaN
3         6B  Veronica          Na       1.0
4         8L    Sophia          6B       NaN
5         7N      iria          Na       NaN
6         7D      Sara          8A       NaN
7         8A      Rosa          Na       2.0
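If the original data is stored in a DataFrame (df), you can build an auxiliary DataFrame that holds the number of children for each "registered" value and then merge it back into the original DataFrame, as the code above does.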
Because duplicate rows are dropped before counting, each "registered" entry is counted only once.
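If you would rather see 0 than NaN for people without children, one possible variation is sketched below. It is not the approach used above: it swaps the merge for `value_counts` plus `map`, and it assumes that "Na" is only a placeholder for "no parent" and that integer counts are wanted.

```python
import pandas as pd

all_rows = [["7D", "Sara", "8A"],
            ["8A", "Rosa", "Na"],
            ["4D", "Jess", "8A"],
            ["6B", "Veronica", "Na"],
            ["8L", "Sophia", "6B"],
            ["7N", "iria", "Na"],
            ["7D", "Sara", "8A"],
            ["8A", "Rosa", "Na"]]
df = pd.DataFrame(all_rows, columns=["registered", "name", "daughter_of"])

# Drop duplicate rows so each child is counted once, and ignore the "Na" placeholder
unique_rows = df.drop_duplicates()
has_parent = unique_rows["daughter_of"] != "Na"
child_counts = unique_rows.loc[has_parent, "daughter_of"].value_counts()

# Look up each row's "registered" value in the counts; rows with no match become 0
df["children"] = df["registered"].map(child_counts).fillna(0).astype(int)

print(df)
```

Using `map` avoids renaming columns for the join, and `fillna(0).astype(int)` keeps the `children` column as integers instead of floats.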