Lambda function referencing column values not specified in the function

def fill_MSZoning(row):
    if row['MSZoning'] == 'C':
        return 69.7
    elif row['MSZoning'] == 'FV':
        return 59.49
    elif row['MSZoning'] == 'RH':
        return 58.92
    elif row['MSZoning'] == 'RL':
        return 74.68
    else:
        return 52.4

1 answer

User

#1 · Posted 2024-04-20 07:48:55

You can do it like this:

import pandas as pd
import numpy as np
from IPython.display import display  # built into Jupyter; use print() elsewhere

## creating dummy data
np.random.seed(100)

raw = {
    "group": np.random.choice("A B C".split(), 10),
    "value": [np.nan if np.random.rand()>0.8 else np.random.choice(100) for _ in range(10)]
}

df = pd.DataFrame(raw)
display(df)

## calculate mean
means = df.groupby("group").mean()
display(means)

Fill the missing values with the group mean:

## fill with mean value
def fill_group_mean(x):
    # x.name is the key of the group currently being processed
    group_mean = means.loc[x.name, "value"]
    # replace only the missing entries with that group's mean
    return x["value"].mask(x["value"].isna(), group_mean)


r = df.groupby("group").apply(fill_group_mean)
r.reset_index(level=0)

Output (the original data first, then the filled result):

group   value
0   A   NaN
1   A   24.0
2   A   60.0
3   C   9.0
4   C   2.0
5   A   NaN
6   C   NaN
7   B   83.0
8   C   91.0
9   C   7.0



group   value
0   A   42.00
1   A   24.00
2   A   60.00
5   A   42.00
7   B   83.00
3   C   9.00
4   C   2.00
6   C   27.25
8   C   91.00
9   C   7.00
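
As a possible alternative to the custom function, the same group-mean fill can be sketched with `groupby(...).transform("mean")` followed by `fillna`. This is only a minimal sketch on a small hypothetical frame (the values below are made up for illustration and are not the random data from the answer); it keeps the original row order instead of regrouping the rows.

```python
import numpy as np
import pandas as pd

# Small hypothetical frame with the same columns as the dummy data above
df = pd.DataFrame({
    "group": ["A", "A", "A", "B", "C", "C"],
    "value": [np.nan, 24.0, 60.0, 83.0, np.nan, 9.0],
})

# transform("mean") returns a Series aligned with df's index that holds
# each row's group mean (missing values are skipped when averaging)
group_means = df.groupby("group")["value"].transform("mean")

# fillna with an index-aligned Series only touches the missing entries
filled = df.assign(value=df["value"].fillna(group_means))
print(filled)
```

Because `transform` returns a result aligned with the original index, no `reset_index` step is needed afterwards.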
