Creating groups based on column values

user_id  total_usage
1        10
2        10
3        20
4        20
5        30
6        30
7        40
8        40
9        50
10       50
11       60
12       60
13       70
14       70
15       80
16       80
17       90
18       90
19       100
20       100

user_id  total_usage  user_group
1        10           10th_group
2        10           10th_group
3        20           9th_group
4        20           9th_group
5        30           8th_group
6        30           8th_group
7        40           7th_group
8        40           7th_group
9        50           6th_group
10       50           6th_group
11       60           5th_group
12       60           5th_group
13       70           4th_group
14       70           4th_group
15       80           3th_group
16       80           3th_group
17       90           2nd_group
18       90           2nd_group
19       100          1st_group
20       100          1st_group
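The answers below all operate on a DataFrame named df, which the thread never constructs. As a minimal sketch for reproducing them, assuming the input is exactly the 20 rows shown in the question, it could be built like this:

```python
import pandas as pd

# Rebuild the question's input: user_id 1..20, with each usage value
# 10, 20, ..., 100 appearing on two consecutive rows.
df = pd.DataFrame({
    'user_id': range(1, 21),
    'total_usage': [v for v in range(10, 101, 10) for _ in range(2)],
})
print(df.head())
```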

3 Answers

User

Answer 1 · edited 2024-05-14 04:30:06

Use pd.qcut on the negated values to reverse the order, and Series.map to supply the suffixes for the 1st and 2nd groups:

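import numpy as np
import pandas as pd

# Negating the values puts the highest usage in the first decile; labels=False returns bin numbers 0..9
s = pd.qcut(-df['total_usage'], np.arange(0, 1.1, 0.1), labels=False) + 1
d = {1: 'st', 2: 'nd'}  # every other group number falls back to 'th'
df['user_group'] = s.astype(str) + s.map(d).fillna('th') + '_group'
print(df)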
    user_id  total_usage  user_group
0         1           10  10th_group
1         2           10  10th_group
2         3           20   9th_group
3         4           20   9th_group
4         5           30   8th_group
5         6           30   8th_group
6         7           40   7th_group
7         8           40   7th_group
8         9           50   6th_group
9        10           50   6th_group
10       11           60   5th_group
11       12           60   5th_group
12       13           70   4th_group
13       14           70   4th_group
14       15           80   3th_group
15       16           80   3th_group
16       17           90   2nd_group
17       18           90   2nd_group
18       19          100   1st_group
19       20          100   1st_group
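The expected output in the question writes 3th_group, and the {1: 'st', 2: 'nd'} mapping reproduces that spelling. If conventional English ordinals (3rd, 11th, 22nd) are preferred instead, a small helper could generate the suffix. This is only a sketch: the function name ordinal is hypothetical and not part of pandas or of the original answer.

```python
def ordinal(n: int) -> str:
    """Return n with its English ordinal suffix, e.g. 1 -> '1st', 11 -> '11th'."""
    if 10 <= n % 100 <= 20:
        # 11th, 12th and 13th are irregular; the rest of 10..20 take 'th' anyway
        suffix = 'th'
    else:
        suffix = {1: 'st', 2: 'nd', 3: 'rd'}.get(n % 10, 'th')
    return f'{n}{suffix}'

# Labels for ten groups, in the same '<ordinal>_group' shape as the answer
labels = [ordinal(n) + '_group' for n in range(1, 11)]
print(labels)
```

With an integer Series of group numbers such as s, the labels could then be built as s.map(ordinal) + '_group'.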

User

Answer 2 · edited 2024-05-14 04:30:06

Try using pd.Series together with np.repeat, np.arange, pd.DataFrame.groupby, pd.Series.astype, pd.Series.map and pd.Series.fillna:

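import numpy as np
import pandas as pd
# Assumes df is already sorted by total_usage in ascending order
x = df.groupby('total_usage')
s = pd.Series(np.repeat(np.arange(x.ngroups), [len(i) for i in x.groups.values()]) + 1)  # 1, 1, 2, 2, ...
df['user_group'] = (s.astype(str) + s.map({1: 'st', 2: 'nd'}).fillna('th') + '_Group').values[::-1]  # reversed: highest usage is 1st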

Now:

print(df)

gives:

    user_id  total_usage  user_group
0         1           10  10th_Group
1         2           10  10th_Group
2         3           20   9th_Group
3         4           20   9th_Group
4         5           30   8th_Group
5         6           30   8th_Group
6         7           40   7th_Group
7         8           40   7th_Group
8         9           50   6th_Group
9        10           50   6th_Group
10       11           60   5th_Group
11       12           60   5th_Group
12       13           70   4th_Group
13       14           70   4th_Group
14       15           80   3th_Group
15       16           80   3th_Group
16       17           90   2nd_Group
17       18           90   2nd_Group
18       19          100   1st_Group
19       20          100   1st_Group
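The groupby approach numbers the groups by position, so it relies on the rows already being sorted by total_usage. A possible alternative, offered here as a sketch rather than as part of the original answer, is a descending dense rank, which assigns one group per distinct value regardless of row order. Unlike qcut, it does not split the data into quantiles, so the two only agree when every distinct value forms its own group, as in this data set.

```python
import pandas as pd

# Same input as in the question (rebuilt here so the example is self-contained)
df = pd.DataFrame({
    'user_id': range(1, 21),
    'total_usage': [v for v in range(10, 101, 10) for _ in range(2)],
})

# Dense rank, descending: the highest usage gets rank 1, ties share a rank,
# and ranks have no gaps. rank() returns floats, hence the cast to int.
rank = df['total_usage'].rank(method='dense', ascending=False).astype(int)
suffix = rank.map({1: 'st', 2: 'nd'}).fillna('th')
df['user_group'] = rank.astype(str) + suffix + '_Group'
print(df)
```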

User

Answer 3 · edited 2024-05-14 04:30:06

It looks like you are looking for qcut, but in reverse order:

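# qcut codes run 0..9 from the lowest to the highest decile, so 10 - code puts the highest usage in group 1
df['user_group'] = 10 - pd.qcut(df['total_usage'], np.arange(0,1.1, 0.1)).cat.codes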

Output. It is not in ordinal form, but I hope it will do:

0     10
1     10
2      9
3      9
4      8
5      8
6      7
7      7
8      6
9      6
10     5
11     5
12     4
13     4
14     3
15     3
16     2
17     2
18     1
19     1
dtype: int8
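If the labelled form from the question is needed, the integer group numbers can be turned into strings afterwards. The following is a minimal sketch under two assumptions: the input matches the question's 20 rows, and q=10 is passed to pd.qcut as an equivalent way of requesting ten equal-sized quantile bins.

```python
import pandas as pd

# Same input as in the question (rebuilt here so the example is self-contained)
df = pd.DataFrame({
    'user_id': range(1, 21),
    'total_usage': [v for v in range(10, 101, 10) for _ in range(2)],
})

# Decile codes 0..9 (lowest to highest usage), flipped so the highest usage is group 1
codes = (10 - pd.qcut(df['total_usage'], q=10).cat.codes).astype(int)

# Append the suffix used in the question: 'st' for 1, 'nd' for 2, 'th' otherwise
suffix = codes.map({1: 'st', 2: 'nd'}).fillna('th')
df['user_group'] = codes.astype(str) + suffix + '_group'
print(df)
```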
