Pandas: adding mean, max, and min as columns in a DataFrame

         statistics  s_values
year
1999  cigarette use       100
1999  cellphone use       310
1999   internet use       101
1999    alcohol use       100
1999       soda use       215
2000  cigarette use       315
2000  cellphone use       317
2000   internet use       325
2000    alcohol use       108
2000       soda use       200
2001  cigarette use       122
2001  cellphone use       311
2001   internet use       112
2001    alcohol use       144
2001       soda use       689

         statistics  s_values   mean    min    max
year
1999    alcohol use     100.0  104.0  100.0  108.0
1999  cellphone use     310.0  313.5  310.0  317.0
1999  cigarette use     100.0  207.5  100.0  315.0
1999   internet use     101.0  213.0  101.0  325.0
1999       soda use     215.0  207.5  200.0  215.0
2000    alcohol use     108.0  104.0  100.0  108.0
2000  cellphone use     317.0  313.5  310.0  317.0
2000  cigarette use     315.0  207.5  100.0  315.0
2000   internet use     325.0  213.0  101.0  325.0
2000       soda use     200.0  207.5  200.0  215.0
2001    alcohol use     144.0  104.0  100.0  108.0
2001  cellphone use     311.0  313.5  310.0  317.0
2001  cigarette use     122.0  207.5  100.0  315.0
2001   internet use     112.0  213.0  101.0  325.0
2001       soda use     689.0  207.5  200.0  215.0

gen_mean = df.groupby('statistics').mean()
gen_min = df.groupby('statistics').min()
gen_max = df.groupby('statistics').max()
df.insert(2, 'Gen Avg', gen_mean)
df.insert(3, 'Gen Max', gen_max)
df.insert(4, 'Gen Min', gen_min)

2 answers

User

Answer 1 · edited 2024-05-14 10:09:15

Try groupby aggregate + join:

df = df.join(
    df.groupby('statistics')['s_values'].aggregate(['mean', 'min', 'max']),
    on='statistics'
)

df:

         statistics  s_values        mean  min  max
year                                               
1999  cigarette use       100  179.000000  100  315
1999  cellphone use       310  312.666667  310  317
1999   internet use       101  179.333333  101  325
1999    alcohol use       100  117.333333  100  144
1999       soda use       215  368.000000  200  689
2000  cigarette use       315  179.000000  100  315
2000  cellphone use       317  312.666667  310  317
2000   internet use       325  179.333333  101  325
2000    alcohol use       108  117.333333  100  144
2000       soda use       200  368.000000  200  689
2001  cigarette use       122  179.000000  100  315
2001  cellphone use       311  312.666667  310  317
2001   internet use       112  179.333333  101  325
2001    alcohol use       144  117.333333  100  144
2001       soda use       689  368.000000  200  689

DataFrame used:

import pandas as pd

df = pd.DataFrame({
    'year': [1999, 1999, 1999, 1999, 1999, 2000, 2000, 2000, 2000, 2000, 2001,
             2001, 2001, 2001, 2001],
    'statistics': ['cigarette use', 'cellphone use', 'internet use',
                   'alcohol use', 'soda use',
                   'cigarette use', 'cellphone use', 'internet use',
                   'alcohol use', 'soda use',
                   'cigarette use', 'cellphone use', 'internet use',
                   'alcohol use', 'soda use'],
    's_values': [100, 310, 101, 100, 215, 315, 317, 325, 108, 200, 122, 311,
                 112, 144, 689]
}).set_index('year')
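
If you want the column names from the question ('Gen Avg', 'Gen Max', 'Gen Min') instead of the default mean/min/max, one option is to rename the aggregated columns before joining. This is only a sketch under assumptions: the six-row sample and the names stats and result are made up for illustration, not taken from the question.

```python
import pandas as pd

# Hypothetical reduced sample: two categories over three years.
df = pd.DataFrame({
    'year': [1999, 1999, 2000, 2000, 2001, 2001],
    'statistics': ['alcohol use', 'soda use'] * 3,
    's_values': [100, 215, 108, 200, 144, 689],
}).set_index('year')

# One row per category, one column per statistic, renamed to the
# column labels the question tried to insert.
stats = (
    df.groupby('statistics')['s_values']
      .agg(['mean', 'max', 'min'])
      .rename(columns={'mean': 'Gen Avg', 'max': 'Gen Max', 'min': 'Gen Min'})
)

# Join on the category column so every original row receives
# the statistics of its own category.
result = df.join(stats, on='statistics')
print(result)
```

The join keeps the original row order and the year index, so the new columns simply repeat for each category.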

User

Answer 2 · edited 2024-05-14 10:09:15

groupby(...).mean() returns a DataFrame whose rows correspond to the groups, not to the original rows. You need transform:

df['mean'] = df.groupby('statistics')['s_values'].transform('mean')
# min and max work the same way: transform('min') and transform('max')
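
To make the difference concrete, here is a minimal sketch that compares the aggregated result with the transformed one and fills in all three columns. The six-row sample and the name per_group are assumptions for illustration; only groupby, mean, and transform come from the answer above.

```python
import pandas as pd

# Hypothetical reduced sample: two categories over three years.
df = pd.DataFrame({
    'year': [1999, 1999, 2000, 2000, 2001, 2001],
    'statistics': ['alcohol use', 'soda use'] * 3,
    's_values': [100, 215, 108, 200, 144, 689],
}).set_index('year')

# Aggregating collapses each category to a single row (2 rows here),
# which is why it cannot be inserted directly as a 6-row column.
per_group = df.groupby('statistics')['s_values'].mean()
print(len(per_group), len(df))

# transform returns one value per original row, aligned with df's index,
# so the result can be assigned as a new column.
for name in ['mean', 'min', 'max']:
    df[name] = df.groupby('statistics')['s_values'].transform(name)

print(df)
```

Compared with the join approach, transform skips the intermediate table but runs one groupby pass per statistic.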
