Delete DataFrame rows that do not have the longest list

df = pd.DataFrame(data = [["a", "b", "c", ["d", "e"]],
                          ["a", "b", "c", ["e"]],
                          ["l", "m", "n", ["o"]]],
                  columns = ["c1", "c2", "c3", "c4"])

# max doesn't evaluate length ~ this is wrong
df.groupby(by=["c1", "c2", "c3"])["c4"].apply(max)

c1  c2  c3
a   b   c     [e]
l   m   n     [o]
Name: c4, dtype: object

# but length does ~ but using an int to equate to another row isn't guaranteed
df.groupby(by=["c1", "c2", "c3"])["c4"].apply(len)

c1  c2  c3
a   b   c     2
l   m   n     1
Name: c4, dtype: int64
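For reference, a small sketch of why the two attempts above give those results. It reproduces only the c4 values of the (a, b, c) group in a standalone Series; the variable names are illustrative and not part of the question.

```python
import pandas as pd

# The c4 values that fall into group (a, b, c) in the question.
group = pd.Series([["d", "e"], ["e"]])

# Built-in max compares lists element by element (lexicographically),
# so ["e"] wins over ["d", "e"] because "e" > "d".
lexicographic_max = max(group)

# len of a Series is its number of rows, not the length of any list in it.
row_count = len(group)

# Passing key=len makes max pick the longest list instead.
longest = max(group, key=len)
```

So apply(max) picks by list contents, and apply(len) reports how many rows each group has rather than how long its lists are.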

1 answer

User

#1 · Posted on 2024-04-25 00:32:22

How about this:

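import pandas as pd

df = pd.DataFrame(data =[["a", "b", "c", ["d", "e"]],
                         ["a", "b", "c", ["e"]],
                         ["l", "m", "n", ["o"]]],
                  columns = ["c1", "c2", "c3", "c4"])

# Helper column: the length of each list in c4
df['len'] = df['c4'].apply(len)

# Keep the rows whose list length equals the maximum length within their (c1, c2, c3) group
max_groups = df[df.groupby(['c1', 'c2', 'c3'])['len'].transform('max') == df['len']]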

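We add an extra column holding the length of each list in c4, then filter the DataFrame down to the records whose c4 length equals the maximum c4 length within their group. This returns max_groups as: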

  c1 c2 c3      c4  len
0  a  b  c  [d, e]    2
2  l  m  n     [o]    1
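
One thing to be aware of: the filter above keeps every row that ties for the longest list in its group. If exactly one row per group is wanted, a possible variant (a sketch, not part of the original answer) is to look up the row label of the first maximum with idxmax. The names lengths, longest_idx and longest_rows are illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    data=[["a", "b", "c", ["d", "e"]],
          ["a", "b", "c", ["e"]],
          ["l", "m", "n", ["o"]]],
    columns=["c1", "c2", "c3", "c4"],
)

# Length of each list in c4, kept as a separate Series instead of a new column.
lengths = df["c4"].apply(len)

# Row label of the first longest list within each (c1, c2, c3) group.
longest_idx = lengths.groupby([df["c1"], df["c2"], df["c3"]]).idxmax()

# Select those rows, sorted back into the original row order.
longest_rows = df.loc[sorted(longest_idx)]
```

This variant also leaves df without the extra len column, at the cost of silently dropping tied rows.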
