How to select the group with the fewest null values in a groupby?

row_number | id | firstname | middlename | lastname
0          | 1  | John      | NULL       | Doe
1          | 1  | John      | Jacob      | Doe
2          | 2  | Alison    | Marie      | Smith
3          | 2  | NULL      | Marie      | Smith
4          | 2  | Alison    | Marie      | Smith

2 Answers

User

#1 · edited 2024-05-17 19:34:28

Oh, you want the rows with the fewest NULL values. I suggest:

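select t.*
from (select t.*,
             -- rank rows within each id by how many of the name columns are NULL
             dense_rank() over (partition by id
                                order by (case when firstname is null then 1 else 0 end) +
                                         (case when middlename is null then 1 else 0 end) +
                                         (case when lastname is null then 1 else 0 end)
                               ) as seqnum
      from t
     ) t
where seqnum = 1;  -- ties within an id are all returned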

This is ANSI-standard SQL.
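To try this kind of query on the question's sample rows, here is a minimal sketch using Python's built-in sqlite3 module. It assumes the ranking should restart for each id (hence the `partition by id` clause) and that the bundled SQLite is version 3.25 or newer, which is when window functions were added. The table name `t` comes from the answer; the helper `fewest_null_rows` is a hypothetical name used only for illustration.

```python
import sqlite3

# Sample rows from the question; None is stored as SQL NULL.
rows = [
    (0, 1, "John", None, "Doe"),
    (1, 1, "John", "Jacob", "Doe"),
    (2, 2, "Alison", "Marie", "Smith"),
    (3, 2, None, "Marie", "Smith"),
    (4, 2, "Alison", "Marie", "Smith"),
]

# Assumption: rank within each id, so every group keeps its own best rows.
QUERY = """
select "row_number", id, firstname, middlename, lastname
from (select t.*,
             dense_rank() over (partition by id
                                order by (case when firstname is null then 1 else 0 end) +
                                         (case when middlename is null then 1 else 0 end) +
                                         (case when lastname is null then 1 else 0 end)
                               ) as seqnum
      from t
     ) t
where seqnum = 1
order by "row_number"
"""


def fewest_null_rows(data):
    """Load the rows into an in-memory table and run the ranking query."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            'create table t ("row_number" integer, id integer, '
            "firstname text, middlename text, lastname text)"
        )
        conn.executemany("insert into t values (?, ?, ?, ?, ?)", data)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


result = fewest_null_rows(rows)
print(result)
```

With this sample, rows 1, 2 and 4 come back: `dense_rank` gives every tied row the same rank, so id 2 keeps both of its rows that have no NULLs.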

User

#2 · edited 2024-05-17 19:34:28

If that is what you want, you can do it like this:

nc = df.isnull().sum(axis=1)  # number of nulls in each row
df[nc == nc.groupby(df['id']).transform('min')]  # keep rows that match their group's minimum

Output:

   row_number  id firstname middlename lastname
1           1   1      John      Jacob      Doe
2           2   2    Alison      Marie    Smith
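For a self-contained version of this filter, the sketch below rebuilds the question's five sample rows (an assumption about how `df` was constructed; the answer does not show it) and compares each row's null count with the minimum for its id. The names `null_count`, `group_min` and `fewest` are illustrative only.

```python
import pandas as pd

# Sample data from the question; None marks a missing value.
df = pd.DataFrame({
    "row_number": [0, 1, 2, 3, 4],
    "id": [1, 1, 2, 2, 2],
    "firstname": ["John", "John", "Alison", None, "Alison"],
    "middlename": [None, "Jacob", "Marie", "Marie", "Marie"],
    "lastname": ["Doe", "Doe", "Smith", "Smith", "Smith"],
})

# Count the nulls in every row.
null_count = df.isnull().sum(axis=1)

# Broadcast each group's smallest null count back onto its rows.
group_min = null_count.groupby(df["id"]).transform("min")

# Keep the rows whose null count equals their group's minimum.
fewest = df[null_count == group_min]
print(fewest)
```

Because rows 2 and 4 of the sample both have zero nulls for id 2, this filter keeps both of them along with row 1. An equality test against the group minimum cannot choose between tied rows.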

To break ties:

Add a row that creates a tie:

df.loc[4,['row_number','id','firstname','middlename','lastname']] = ['4',2,'Mary','Maxine','Maxwell']

Then use groupby, transform, and idxmin:

nc = df.isnull().sum(axis=1)  # number of nulls in each row
df[df.index == nc.groupby(df['id']).transform('idxmin')]  # keep only the first minimal row per id

Output:

  row_number id firstname middlename lastname
1          1  1      John      Jacob      Doe
2          2  2    Alison      Marie    Smith
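A variation on the tie-breaking step, sketched under the assumption that "first row wins" is an acceptable rule: instead of comparing the index against a transformed `idxmin`, take the `idxmin` label of each group and pass those labels to `df.loc`. This swaps in a different technique from the one in the answer. The frame below uses the first four rows from the question plus the tie row added above, entered directly in the constructor with an integer `row_number`.

```python
import pandas as pd

# First four question rows plus the extra tie row for id 2.
df = pd.DataFrame({
    "row_number": [0, 1, 2, 3, 4],
    "id": [1, 1, 2, 2, 2],
    "firstname": ["John", "John", "Alison", None, "Mary"],
    "middlename": [None, "Jacob", "Marie", "Marie", "Maxine"],
    "lastname": ["Doe", "Doe", "Smith", "Smith", "Maxwell"],
})

# Count the nulls in every row.
null_count = df.isnull().sum(axis=1)

# idxmin returns the index label of the first minimum in each group,
# so a tie is resolved in favour of the row that appears first.
first_min_labels = null_count.groupby(df["id"]).idxmin()

# Select exactly one row per id.
first_best = df.loc[first_min_labels]
print(first_best)
```

Here rows 2 and 4 tie for id 2 with zero nulls each, and only row 2 is kept because it comes first. If a different rule is wanted, sorting the frame before grouping changes which row counts as first.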
