I have a dataset that looks like this:
import pandas as pd
df = pd.DataFrame([["1001","Category1"],["1001","Category1"],["1001","Category2"],["1002","Category1"],["1002","Category3"],["1001","Category3"],["1002","Category2"],["1003","Category3"],["1001","Category4"]], columns=['Id','Cat'])
     Id        Cat
0  1001  Category1
1  1001  Category1
2  1001  Category2
3  1002  Category1
4  1002  Category3
5  1001  Category3
6  1002  Category2
7  1003  Category3
8  1001  Category4
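What I want is a kind of co-occurrence table like the one below: for the IDs that appear in a given category (say Category1), a count of how many of those IDs also appear in each of the other categories.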
df2 = pd.DataFrame([[2,2,2,1],[2,2,2,1],[2,2,3,1],[1,1,1,1]], index=['Category1','Category2','Category3','Category4'], columns=['Category1','Category2','Category3','Category4'])
           Category1  Category2  Category3  Category4
Category1          2          2          2          1
Category2          2          2          2          1
Category3          2          2          3          1
Category4          1          1          1          1
I was thinking of using .groupby() to achieve this, but I'm not sure how to get a row and a column for every category, as in my example.
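One possible approach, offered as a sketch rather than the definitive answer: build an Id-by-category presence table with pd.crosstab and multiply it by its own transpose. This assumes each Id should be counted at most once per category (so the duplicate 1001/Category1 row counts once) and that row 7 has Id 1003, as in the printed table. The names presence and cooc are illustrative only.

```python
import pandas as pd

# Sample data from the question (row 7 assumed to be Id 1003, as printed).
df = pd.DataFrame(
    [["1001", "Category1"], ["1001", "Category1"], ["1001", "Category2"],
     ["1002", "Category1"], ["1002", "Category3"], ["1001", "Category3"],
     ["1002", "Category2"], ["1003", "Category3"], ["1001", "Category4"]],
    columns=["Id", "Cat"],
)

# One row per Id, one column per category; clip turns raw counts into
# 0/1 flags so repeated (Id, Cat) rows are only counted once.
presence = pd.crosstab(df["Id"], df["Cat"]).clip(upper=1)

# (categories x Ids) dot (Ids x categories): each cell is the number of
# distinct Ids that appear in both the row category and the column category.
cooc = presence.T.dot(presence)

print(cooc)
```

The diagonal holds the number of distinct Ids in each category on its own. A groupby-based solution is likely possible as well, but the matrix product avoids looping over category pairs.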