Creating time-series bins and an index

   Studynumber  Time  Concentration
0            1    20             80
1            1    40             60
2            1    60             40
3            2    15             95
4            2    44             70
5            2    65             30

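2 answers

User

Answer 1 · Edited 2024-05-12 22:51:37

Use transform to add a column from a groupby aggregation. This creates a Series whose index is aligned with the original df, so you can assign it back correctly: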

In [4]:
df['meanconcentration'] = df.groupby('roundtime')['Concentration'].transform('mean')
df

Out[4]:
   Studynumber  Time  Concentration  roundtime  meanconcentration
0            1    20             80         20               87.5
1            1    40             60         40               65.0
2            1    60             40         60               35.0
3            2    15             95         20               87.5
4            2    44             70         40               65.0
5            2    65             30         60               35.0
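
For readers who want to reproduce this end to end, here is a minimal sketch. The excerpt does not show how roundtime was built; rounding Time to the nearest multiple of 20 is an assumption that happens to match the values in the output above. The column name difference is also only illustrative.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "Studynumber": [1, 1, 1, 2, 2, 2],
    "Time": [20, 40, 60, 15, 44, 65],
    "Concentration": [80, 60, 40, 95, 70, 30],
})

# Assumption: bins are 20 units wide and each Time is rounded to the nearest bin.
df["roundtime"] = (df["Time"] / 20).round().astype(int) * 20

# transform returns a Series aligned with df's index, so it can be assigned directly.
df["meanconcentration"] = df.groupby("roundtime")["Concentration"].transform("mean")

# Difference between each actual concentration and the mean of its bin.
df["difference"] = df["Concentration"] - df["meanconcentration"]
print(df)
```

Because transform keeps the original row index, no merge or reset_index step is needed before the subtraction.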

User

Answer 2 · Edited 2024-05-12 22:51:37

You wrote:

Then I want to get this back into the main frame to calculate the difference between each actual concentration and the mean concentration

Something very similar appears in the groupby-apply documentation in Data Wrangling in Pandas. Note that you can compute this directly:

>>> data.groupby('roundtime').apply(
    lambda g: g.Concentration - g.Concentration.mean())
roundtime   
20         0   -7.5
           3    7.5
40         1   -5.0
           4    5.0
60         2    5.0
           5   -5.0
Name: Concentration, dtype: float64

Note that you can easily apply .reset_index() to this result and, if needed, merge it back into the original DataFrame, and so on.
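
One way to get the per-group deviations back into the main frame is sketched below. Rather than calling .reset_index(), it passes group_keys=False so the result keeps the original row index; this is a swapped-in variant, chosen because the index layout returned by apply has differed between pandas versions. The frame literal and the column name deviation are illustrative assumptions.

```python
import pandas as pd

# The question's data, with roundtime already computed as in the output above.
data = pd.DataFrame({
    "Studynumber": [1, 1, 1, 2, 2, 2],
    "Time": [20, 40, 60, 15, 44, 65],
    "Concentration": [80, 60, 40, 95, 70, 30],
    "roundtime": [20, 40, 60, 20, 40, 60],
})

# group_keys=False asks pandas not to prepend the group label to the index,
# so the result stays indexed like `data`.
deviation = (
    data.groupby("roundtime", group_keys=False)["Concentration"]
    .apply(lambda s: s - s.mean())
)

# Column assignment aligns on the index, so the row order of `deviation` does not matter.
data["deviation"] = deviation
print(data)
```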

Another approach is to compute the means and then merge them in directly:

pd.merge(
    data.groupby('roundtime').mean(),
    data,
    left_index=True,
    right_on='roundtime',
    how='right')

(Note that this creates the columns 'Concentration_x' for the mean and 'Concentration_y' for the original column.)
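
If the _x/_y suffixes are unwanted, a variant of the merge can aggregate only the Concentration column and rename it before merging. This is a sketch under assumptions, not the answer's own code: the frame literal and the names meanconcentration and difference are illustrative.

```python
import pandas as pd

# The question's data, with roundtime already computed as in the output above.
data = pd.DataFrame({
    "Studynumber": [1, 1, 1, 2, 2, 2],
    "Time": [20, 40, 60, 15, 44, 65],
    "Concentration": [80, 60, 40, 95, 70, 30],
    "roundtime": [20, 40, 60, 20, 40, 60],
})

# Aggregate only the column of interest and give the result a distinct name,
# so the merge has no overlapping column names to suffix.
means = (
    data.groupby("roundtime", as_index=False)["Concentration"]
    .mean()
    .rename(columns={"Concentration": "meanconcentration"})
)

# A left merge keeps every row of `data`, in its original order.
merged = data.merge(means, on="roundtime", how="left")
merged["difference"] = merged["Concentration"] - merged["meanconcentration"]
print(merged)
```

Aggregating a single column also avoids averaging Studynumber and Time, which the full groupby(...).mean() would do.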
