Create a new column based on a condition on the first and last values of each group (Python groupby)

date        email           level
01/01/2000  john@abc.com    mgr
05/06/2000  john@abc.com    mgr
10/01/2001  john@abc.com    mgr
14/02/2000  kimdo@abc.com   emp
19/10/2001  kimdo@abc.com   mgr
12/05/2000  waint@abc.com   emp
08/08/2000  waint@abc.com   emp
14/04/2001  waint@abc.com   emp
22/05/2000  neds@abc.com    mgr
08/11/2000  neds@abc.com    mgr
12/06/2001  neds@abc.com    emp

date        email           level   status
01/01/2000  john@abc.com    mgr     hired as mgr
10/01/2001  john@abc.com    mgr     hired as mgr
14/02/2000  kimdo@abc.com   emp     promoted to mgr
19/10/2001  kimdo@abc.com   mgr     promoted to mgr
12/05/2000  waint@abc.com   emp     hired as emp
14/04/2001  waint@abc.com   emp     hired as emp
22/05/2000  neds@abc.com    mgr     status change
12/06/2001  neds@abc.com    emp     status change
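
For anyone reproducing this, the input table can be rebuilt roughly as follows. This is only a sketch: the names `rows`, `parsed` and `ordered` are illustrative, and the day/month/year date format is an assumption inferred from values such as 14/02/2000. Since "first" and "last" depend on row order, the snippet also checks that each email's rows are already chronological.

```python
import pandas as pd

# Sample data copied from the input table of the question
rows = [
    ("01/01/2000", "john@abc.com", "mgr"),
    ("05/06/2000", "john@abc.com", "mgr"),
    ("10/01/2001", "john@abc.com", "mgr"),
    ("14/02/2000", "kimdo@abc.com", "emp"),
    ("19/10/2001", "kimdo@abc.com", "mgr"),
    ("12/05/2000", "waint@abc.com", "emp"),
    ("08/08/2000", "waint@abc.com", "emp"),
    ("14/04/2001", "waint@abc.com", "emp"),
    ("22/05/2000", "neds@abc.com", "mgr"),
    ("08/11/2000", "neds@abc.com", "mgr"),
    ("12/06/2001", "neds@abc.com", "emp"),
]
df = pd.DataFrame(rows, columns=["date", "email", "level"])

# Assumption: dates are day/month/year
parsed = pd.to_datetime(df["date"], format="%d/%m/%Y")

# True if, within every email, the rows are already in date order
ordered = parsed.groupby(df["email"]).apply(lambda s: s.is_monotonic_increasing).all()
print(ordered)
```

If the check printed False for other data, sorting by date within each email before grouping would be the safer first step.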

2 Answers

User

Reply #1 · edited 2024-06-17 10:50:00

Try creating a dict to map each first/last level pair to a status:

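# Keep only the first and last row of a group
fl = lambda s: s.iloc[[0,-1]]
# Map each "first-last" level pair to a status label
d = {'mgr-mgr': 'hired as mgr', 'emp-mgr': 'promoted to mgr', 'emp-emp': 'hired as emp', 'mgr-emp': 'status change'}
# Per email: build the "first-last" key on the last row, back-fill it onto the first row, then map it
res = df.groupby('email', as_index=False)['level'].apply(lambda x: (fl(x).shift(1) + "-" + (fl(x))).bfill()).map(d)
# Drop the outer group level so res lines up with df's original index
res.index= res.index.droplevel()
df['status'] = res
# Rows that are neither first nor last received no status; drop them
df.dropna(inplace=True)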
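
To see what the lambda above does, here is a minimal sketch that runs the same shift/concatenate/back-fill steps on one group. The three-row `levels` Series is made up for illustration, and `first_last`, `shifted`, `key` and `status` are names introduced here, not part of the answer.

```python
import pandas as pd

# Hypothetical single group: an employee promoted on the third record
levels = pd.Series(["emp", "emp", "mgr"], index=[0, 1, 2])
d = {"mgr-mgr": "hired as mgr", "emp-mgr": "promoted to mgr",
     "emp-emp": "hired as emp", "mgr-emp": "status change"}

first_last = levels.iloc[[0, -1]]            # rows 0 and 2: emp, mgr
shifted = first_last.shift(1)                # rows 0 and 2: NaN, emp
key = (shifted + "-" + first_last).bfill()   # rows 0 and 2: emp-mgr, emp-mgr
status = key.map(d)                          # look the pair up in the dict

print(status)
```

The middle row (index 1) never appears in `status`, which is why assigning the result back to `df` leaves NaN there and the final `dropna` removes it.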

User

Reply #2 · edited 2024-06-17 10:50:00

You can try:

df2 = df.groupby('email', as_index=False).nth([0,-1])

d={'mgr:mgr':'hired as mgr','emp:mgr':'promoted to mgr','emp:emp':'hired as emp','mgr:emp':'status change'}
# dict mapping each "first:last" level pair to a status

Finally:

df2.loc[:,'status']=df2.groupby('email')['level'].transform(':'.join).map(d)

Output of df2:

    date        email           level   status
0   01/01/2000  john@abc.com    mgr     hired as mgr
2   10/01/2001  john@abc.com    mgr     hired as mgr
3   14/02/2000  kimdo@abc.com   emp     promoted to mgr
4   19/10/2001  kimdo@abc.com   mgr     promoted to mgr
5   12/05/2000  waint@abc.com   emp     hired as emp
7   14/04/2001  waint@abc.com   emp     hired as emp
8   22/05/2000  neds@abc.com    mgr     status change
10  12/06/2001  neds@abc.com    emp     status change
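
As a cross-check, the same table can be reproduced with a slightly different route: aggregate the first and last level per email, map the pair, then broadcast it back. This is a variant sketch, not the code from either reply. The names `ends`, `key`, `pos`, `size` and `out` are illustrative, and the date column is left out because only row order matters here.

```python
import pandas as pd

# Levels per email, in the same row order as the question's table
df = pd.DataFrame({
    "email": ["john@abc.com"] * 3 + ["kimdo@abc.com"] * 2
             + ["waint@abc.com"] * 3 + ["neds@abc.com"] * 3,
    "level": ["mgr", "mgr", "mgr", "emp", "mgr",
              "emp", "emp", "emp", "mgr", "mgr", "emp"],
})
d = {"mgr:mgr": "hired as mgr", "emp:mgr": "promoted to mgr",
     "emp:emp": "hired as emp", "mgr:emp": "status change"}

# One row per email holding its first and last level
ends = df.groupby("email")["level"].agg(["first", "last"])
key = ends["first"] + ":" + ends["last"]

# Broadcast the mapped status back to every row of that email
df["status"] = df["email"].map(key.map(d))

# Keep only the first and last row of each group
pos = df.groupby("email").cumcount()
size = df.groupby("email")["email"].transform("size")
out = df[(pos == 0) | (pos == size - 1)]

print(out)
```

One difference worth noting: here a group with a single row still gets a status (its level paired with itself), whereas joining the levels of a one-row group produces a key that is not in the dict.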
