Get the first matching row within each group defined by one column, ordered by another column

import pandas as pd

df = pd.DataFrame({"ID": ['1', '1', '1', '2', '2'],
                   "LINE": ['1', '3', '2', '1', '2'],
                   "TYPE": ['0', '1', '1', '1', '0']})

# print results
print(df.head())

# a function to label the first type 1 for each ID sorted by line
# currently it only filters to type 1
def label(row):
    if row.TYPE == '1':
        return True

# add the label in the dataframe
df['label'] = df.apply(lambda row: label(row), axis=1)

# print results
print(df.head())

1 Answer

Anonymous user

#1 · Posted 2024-04-19 14:49:40

Use query to keep only the rows where TYPE == "1", sort_values to order them by LINE, and finally GroupBy.head to take the first occurrence within each ID:

s = df.query('TYPE == "1"').sort_values('LINE').groupby('ID')['TYPE'].head(1)
df['label'] = df.index.isin(s.index)
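One caveat with the snippet above: LINE holds strings in the sample data, so sort_values orders it lexicographically ('10' would sort before '2'). The following is a minimal sketch of the same query / sort / head idea with a numeric sort. It assumes every LINE value parses as an integer and that the DataFrame index is unique; the helper name label_first_match and the temporary column _line are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "ID":   ['1', '1', '1', '2', '2'],
    "LINE": ['1', '3', '2', '1', '2'],
    "TYPE": ['0', '1', '1', '1', '0'],
})

def label_first_match(frame):
    """Mark, for each ID, the TYPE == '1' row with the numerically smallest LINE.

    Hypothetical helper: assumes LINE values are integer-like strings
    and that frame.index is unique.
    """
    candidates = frame[frame["TYPE"] == "1"]
    # Sort on an integer copy of LINE so that '10' comes after '2'.
    ordered = (
        candidates
        .assign(_line=candidates["LINE"].astype(int))
        .sort_values("_line", kind="mergesort")  # mergesort is a stable sort
    )
    # head(1) keeps the first row of each ID group in the sorted order.
    first = ordered.groupby("ID").head(1)
    return pd.Series(frame.index.isin(first.index), index=frame.index)

df["label"] = label_first_match(df)
print(df)
```

On the sample data this labels the same rows as the string-based sort, because every LINE value there is a single digit.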

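Alternatively, use drop_duplicates on ID after sorting; it is worth timing both variants to see which is more efficient on your data: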

s = df.query('TYPE == "1"').sort_values('LINE').drop_duplicates('ID')
df['label'] = df.index.isin(s.index)

  ID LINE TYPE  label
0  1    1    0  False
1  1    3    1  False
2  1    2    1   True
3  2    1    1   True
4  2    2    0  False
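As a quick sanity check, the two approaches can be run side by side on the sample data to confirm that they pick the same rows. This is only a sketch; the variable names matches, via_head, via_drop and same_rows are chosen here for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "ID":   ['1', '1', '1', '2', '2'],
    "LINE": ['1', '3', '2', '1', '2'],
    "TYPE": ['0', '1', '1', '1', '0'],
})

# Shared first step: keep TYPE == "1" rows and order them by LINE.
matches = df.query('TYPE == "1"').sort_values('LINE')

# Approach 1: first row of each ID group.
via_head = matches.groupby('ID')['TYPE'].head(1)

# Approach 2: drop every repeated ID, keeping its first (lowest LINE) row.
via_drop = matches.drop_duplicates('ID')

# Both should select the same original row labels.
same_rows = sorted(via_head.index) == sorted(via_drop.index)
print(same_rows)

df['label'] = df.index.isin(via_drop.index)
print(df)
```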
