Applying a function to a subset of columns in a pandas groupby

import pandas as pd
import numpy as np

np.random.seed(123)  # reproducible example
df = pd.DataFrame(
    data={
        "a": np.arange(10),
        "b": np.arange(10)[::-1],
        "c": np.random.choice(a=np.arange(10), size=10),
    },
    index=pd.Index(data=np.random.choice(a=[1, 2, 3], size=10), name="id"),
)

# create a dict for all columns other than "c" and the function to do the transform
fmap = {k: lambda x: (x - x.mean()) / x.std() for k in df.columns if k != "c"}
df.groupby("id").transform(fmap)  # yields error that "dict" is unhashable

1 Answer

User

#1 · Posted 2024-04-23 23:06:37

One possible solution is to first filter the column names with Index.difference, because groupby transform does not yet accept a dict:

cols = df.columns.difference(['c'])
print(cols)
Index(['a', 'b'], dtype='object')

fmap = lambda x: (x - x.mean()) / x.std()
df[cols] = df.groupby("id")[cols].transform(fmap)
print(df)
           a         b  c
id                       
3  -1.000000  1.000000  2
2  -1.091089  1.091089  2
1  -1.134975  1.134975  6
3   0.000000  0.000000  1
1  -0.529655  0.529655  3
2   0.218218 -0.218218  9
3   1.000000 -1.000000  6
2   0.872872 -0.872872  1
1   0.680985 -0.680985  0
1   0.983645 -0.983645  1
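
As a rough sanity check of the approach above, the sketch below repeats the same column-subset transform on a small frame and confirms that each group ends up with mean 0 and standard deviation 1 in the transformed columns, while "c" is left untouched. The fixed `id` index, the named helper `zscore`, and the `result` copy are assumptions introduced for illustration and are not part of the original post.

```python
import numpy as np
import pandas as pd

# Illustrative data: a fixed "id" index (an assumption, chosen so every
# group has at least three rows and the standard deviation is defined).
rng = np.random.RandomState(123)
df = pd.DataFrame(
    {
        "a": np.arange(10),
        "b": np.arange(10)[::-1],
        "c": rng.choice(np.arange(10), size=10),
    },
    index=pd.Index([1, 2, 3, 1, 2, 3, 1, 2, 3, 1], name="id"),
)


def zscore(x):
    """Standardize values within a group (hypothetical helper name)."""
    return (x - x.mean()) / x.std()


# Every column except "c", as a plain list of names.
cols = list(df.columns.difference(["c"]))

# Work on a copy so the original frame stays available for comparison.
result = df.copy()
result[cols] = df.groupby("id")[cols].transform(zscore)

# Per-group mean and standard deviation of the transformed columns.
group_means = result.groupby("id")[cols].mean()
group_stds = result.groupby("id")[cols].std()
print(group_means.round(6))
print(group_stds.round(6))
```

Selecting the columns on the groupby object, rather than passing a mapping to transform, keeps a single function applied to exactly the columns of interest; columns that need different functions could be handled with one such assignment per function.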
