Python: How do I select columns based on a condition?

df

     id        d1        d2        d3       a1       a2    a3
0   474  0.000243  0.000243  0.001395     bank     bank   atm
1   964  0.000239  0.000239  0.000899     bank     bank  bank
2  4823  0.000472  0.000472  0.000834     fuel     fuel  fuel
3  7225  0.002818  0.002818  0.023900     bank     bank  fuel
4  7747  0.001036  0.001036  0.001415  dentist  dentist  bank
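For anyone who wants to run the answers below, the sample frame can be rebuilt roughly like this. The values are copied from the table in the question; nothing else is assumed.

```python
import pandas as pd

# Sample data copied from the question: numeric columns d1..d3 and
# label columns a1..a3 that share the same numeric suffixes.
df = pd.DataFrame({
    "id": [474, 964, 4823, 7225, 7747],
    "d1": [0.000243, 0.000239, 0.000472, 0.002818, 0.001036],
    "d2": [0.000243, 0.000239, 0.000472, 0.002818, 0.001036],
    "d3": [0.001395, 0.000899, 0.000834, 0.023900, 0.001415],
    "a1": ["bank", "bank", "fuel", "bank", "dentist"],
    "a2": ["bank", "bank", "fuel", "bank", "dentist"],
    "a3": ["atm", "bank", "fuel", "fuel", "bank"],
})
print(df)
```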

3 Answers

User

#1 · edited 2024-06-07 04:13:55

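If you want to select the columns by list, get the column names with idxmin, map them from the d columns to the matching a columns, and then build the new columns with assign, using min and lookup: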

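col1 = ['d1','d2','d3']
col2 = ['a1','a2','a3']

# for each row: name of the d column holding the minimum, mapped to its a column
pos = df[col1].idxmin(axis=1).map(dict(zip(col1, col2)))

# note: DataFrame.lookup was deprecated in pandas 1.2 and removed in 2.0
df = df[['id']].assign(d = df[col1].min(axis=1), a = df.lookup(df.index, pos))
print(df)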
     id         d        a
0   474  0.000243     bank
1   964  0.000239     bank
2  4823  0.000472     fuel
3  7225  0.002818     bank
4  7747  0.001036  dentist
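DataFrame.lookup was deprecated in pandas 1.2 and removed in pandas 2.0, so the snippet above only runs on older releases. Below is a minimal sketch of one possible replacement that uses NumPy integer indexing instead. It assumes col1 and col2 list the d and a columns in matching order; the names labels and out are illustrative and do not come from the answer.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "id": [474, 964, 4823, 7225, 7747],
    "d1": [0.000243, 0.000239, 0.000472, 0.002818, 0.001036],
    "d2": [0.000243, 0.000239, 0.000472, 0.002818, 0.001036],
    "d3": [0.001395, 0.000899, 0.000834, 0.023900, 0.001415],
    "a1": ["bank", "bank", "fuel", "bank", "dentist"],
    "a2": ["bank", "bank", "fuel", "bank", "dentist"],
    "a3": ["atm", "bank", "fuel", "fuel", "bank"],
})

col1 = ["d1", "d2", "d3"]
col2 = ["a1", "a2", "a3"]

# Position (0, 1 or 2) of the smallest d value in each row; ties pick the first.
pos = df[col1].to_numpy().argmin(axis=1)

# Take the a value at that same position in each row (assumed matching order).
labels = df[col2].to_numpy()[np.arange(len(df)), pos]

out = df[["id"]].assign(d=df[col1].min(axis=1), a=labels)
print(out)
```

Working with positions rather than column labels avoids the label lookup entirely, at the cost of relying on the two column lists being aligned.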

User

#2 · edited 2024-06-07 04:13:55

@yatu's solution was the inspiration here. Wherever I see a wide-to-long reshape, I test whether stack on a MultiIndex would fit :):

# set id as the index
df = df.set_index('id')

# split the column names on the digits; expand=True turns them into a MultiIndex
# drop the last level, which holds only empty strings
df.columns = df.columns.str.split(r"(\d+)", expand=True).droplevel(-1)

# for each id, index label of the row with the smallest d
ind = df.stack().groupby('id')['d'].idxmin()

# select those rows and drop the numeric suffix level
df.stack().loc[ind].droplevel(-1)


         a          d
id      
474     bank    0.000243
964     bank    0.000239
4823    fuel    0.000472
7225    bank    0.002818
7747    dentist 0.001036

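User

#3 · edited 2024-06-07 04:13:55

You can use pd.wide_to_long here to get a long-format DataFrame, specifying ['d', 'a'] as the stubnames. Then group by id and index the frame with the idxmin of d: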

import pandas as pd

df = (pd.wide_to_long(df, stubnames=['d','a'], suffix=r'\d+', i='id', j='j')
        .reset_index().drop(columns='j'))
df = df.loc[df.groupby('id').d.idxmin().values]

print(df)

     id         d        a
0   474  0.000243     bank
1   964  0.000239     bank
2  4823  0.000472     fuel
3  7225  0.002818     bank
4  7747  0.001036  dentist

Here, the DataFrame produced by the pd.wide_to_long call above is:

pd.wide_to_long(df, stubnames=['d','a'], suffix=r'\d+', i='id', j='j')

              d        a
id   j                   
474  1  0.000243     bank
964  1  0.000239     bank
4823 1  0.000472     fuel
7225 1  0.002818     bank
7747 1  0.001036  dentist
474  2  0.000243     bank
964  2  0.000239     bank
4823 2  0.000472     fuel
7225 2  0.002818     bank
7747 2  0.001036  dentist
474  3  0.001395      atm
964  3  0.000899     bank
4823 3  0.000834     fuel
7225 3  0.023900     fuel
7747 3  0.001415     bank

so we only need to group by id and find the index of the minimum d.
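As a variation on that last step, the (id, j) MultiIndex can be kept instead of being reset, which also shows which suffix was picked for each id. This is a minimal sketch rather than the answerer's code; the names long_df and best are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "id": [474, 964, 4823, 7225, 7747],
    "d1": [0.000243, 0.000239, 0.000472, 0.002818, 0.001036],
    "d2": [0.000243, 0.000239, 0.000472, 0.002818, 0.001036],
    "d3": [0.001395, 0.000899, 0.000834, 0.023900, 0.001415],
    "a1": ["bank", "bank", "fuel", "bank", "dentist"],
    "a2": ["bank", "bank", "fuel", "bank", "dentist"],
    "a3": ["atm", "bank", "fuel", "fuel", "bank"],
})

# Long format indexed by (id, j), where j is the numeric column suffix.
long_df = pd.wide_to_long(df, stubnames=["d", "a"], i="id", j="j", suffix=r"\d+")

# For each id, idxmin gives the (id, j) label of the row with the smallest d;
# on ties the first occurrence within the group is returned.
best = long_df.loc[long_df.groupby("id")["d"].idxmin().tolist()]

print(best.reset_index())
```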
