Generating a column based on the values in another column in pandas (Python)

drug_id      WD
lexapro.1    flu-like symptoms
lexapro.1    dizziness
lexapro.1    headache
lexapro.14   Dizziness
lexapro.14   headaches
lexapro.23   extremely difficult
lexapro.32   cry at anything
lexapro.32   Anxiety

id   drug_id      WD
1    lexapro.1    flu-like symptoms
1    lexapro.1    dizziness
1    lexapro.1    headache
2    lexapro.14   Dizziness
2    lexapro.14   headaches
3    lexapro.23   extremely difficult
4    lexapro.32   cry at anything
4    lexapro.32   Anxiety

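2 Answers

User

#1 · Edited 2024-05-19 00:42:25

The shift + cumsum pattern that Boud mentions works well; just make sure to sort by drug_id first, so that equal values are adjacent. For example: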

df = df.sort_values('drug_id')
df['id'] = (df['drug_id'] != df['drug_id'].shift()).cumsum()
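
To see why the sort matters, here is a minimal sketch on made-up data (the values "a" and "b" are hypothetical, not from the question). The shift + cumsum pattern starts a new id whenever the value differs from the previous row, so a value that appears in two separate runs gets two different ids unless the rows are sorted first:

```python
import pandas as pd

# Hypothetical data: "a" appears in two separate runs.
df = pd.DataFrame({"drug_id": ["a", "b", "a"]})

# Without sorting, every change of value starts a new id,
# so the two "a" rows end up with different ids.
unsorted_ids = (df["drug_id"] != df["drug_id"].shift()).cumsum()

# After sorting, equal values are adjacent and share one id.
df_sorted = df.sort_values("drug_id")
sorted_ids = (df_sorted["drug_id"] != df_sorted["drug_id"].shift()).cumsum()

print(unsorted_ids.tolist())  # [1, 2, 3]
print(sorted_ids.tolist())    # [1, 1, 2]
```

Note that after sorting, the ids follow the sorted order of drug_id rather than the order in which the values first appear.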

Another approach, which does not require sorting the DataFrame, is to map a number to each unique drug_id:

# unique() returns the distinct values in order of first appearance
uid = df['drug_id'].unique()
# number the distinct values 1, 2, 3, ...
id_map = {x: i for i, x in enumerate(uid, start=1)}
df['id'] = df['drug_id'].map(id_map)
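
As a possible shortcut for the same idea, pandas also provides pd.factorize, which returns integer codes for the values in order of first appearance. The sketch below is not from the original answer; it assumes the ids should start at 1 and uses a small made-up frame in which lexapro.1 reappears after another value:

```python
import pandas as pd

# Made-up sample: "lexapro.1" reappears after "lexapro.14".
df = pd.DataFrame(
    {"drug_id": ["lexapro.1", "lexapro.1", "lexapro.14", "lexapro.1"]}
)

# pd.factorize returns integer codes (starting at 0, in order of first
# appearance) plus the unique values; add 1 so the ids start at 1.
codes, uniques = pd.factorize(df["drug_id"])
df["id"] = codes + 1

print(df["id"].tolist())  # [1, 1, 2, 1]
```

Like the mapping above, this gives the same id to every occurrence of a value, whether or not the rows are contiguous.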

User

#2 · Edited 2024-05-19 00:42:25

Use the shift + cumsum pattern:

(df.drug_id!=df.drug_id.shift()).cumsum()
Out[5]: 
0    1
1    1
2    1
3    2
4    2
5    3
6    4
7    4
Name: drug_id, dtype: int32
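
For anyone who wants to reproduce this, here is a self-contained sketch that rebuilds the question's data by hand (the DataFrame construction is an assumption about how the data is loaded) and attaches the result as an id column:

```python
import pandas as pd

# The question's sample data, typed in by hand.
df = pd.DataFrame({
    "drug_id": ["lexapro.1", "lexapro.1", "lexapro.1",
                "lexapro.14", "lexapro.14",
                "lexapro.23",
                "lexapro.32", "lexapro.32"],
    "WD": ["flu-like symptoms", "dizziness", "headache",
           "Dizziness", "headaches",
           "extremely difficult",
           "cry at anything", "Anxiety"],
})

# A row starts a new group when its drug_id differs from the previous row;
# the cumulative sum of those "new group" flags is the running id.
df["id"] = (df["drug_id"] != df["drug_id"].shift()).cumsum()

# Put id first, as in the desired output.
df = df[["id", "drug_id", "WD"]]
print(df)
```

This works here without sorting because the rows for each drug_id are already contiguous.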
