Pandas TimeGrouper: Group Boundaries

2 Answers

User

#1 · edited 2024-05-16 14:27:35

It's not elegant, but I don't think groupby has a parameter for this:

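import pandas as pd
from numpy.random import randn

# 25 month-end timestamps, 2011-01-31 through 2013-01-31.
# ('M' is the month-end alias; pandas 2.2+ prefers 'ME'.)
rng = pd.date_range('1/1/2011', periods=25, freq='M')
ts = pd.Series(randn(len(rng)), index=rng)

def truncYears(ts, month):
    # First timestamp of each 12-month group.
    # Needs a fix if there are multiple entries per month.
    starts = ts[ts.index.month == month].index

    groups = {}
    # Anything before the first start month forms a partial leading group.
    if starts[0] > ts.index[0]:
        groups[ts.index[0]] = ts[ts.index < starts[0]]
    for start in starts:
        # Each group ends with the month before `month`. For month == 1 that
        # is December of the same year (month - 1 would otherwise be 0).
        if month == 1:
            end = '%d-12' % start.year
        else:
            end = '%d-%d' % (start.year + 1, month - 1)
        # Partial-string slicing includes the whole end month.
        groups[start] = ts.loc[start:end]

    return groups

groups = truncYears(ts, 3)
for k in groups:
    print(groups[k])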

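Result (note that the dict keys were not sorted in this run, so the years appear out of order; on Python 3.7 and later a dict keeps insertion order):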

2011-01-31   -1.719806
2011-02-28   -0.657064
Freq: M, dtype: float64
2012-03-31    1.200984
2012-04-30   -0.496715
2012-05-31   -0.998218
2012-06-30    1.711504
2012-07-31    0.304211
2012-08-31    1.091810
2012-09-30   -0.716785
2012-10-31   -0.996493
2012-11-30   -0.541812
2012-12-31    1.027787
2013-01-31    0.249775
Freq: M, dtype: float64
2011-03-31   -1.406736
2011-04-30    0.245077
2011-05-31   -0.010090
2011-06-30   -1.459824
2011-07-31    0.150871
2011-08-31   -1.223533
2011-09-30    0.859539
2011-10-31    0.623674
2011-11-30   -2.071204
2011-12-31    0.254750
2012-01-31    0.667076
2012-02-29    0.076249
Freq: M, dtype: float64
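
For comparison, pandas' anchored year-end offsets may cover this case without a manual loop. The sketch below is not from the answer above. It assumes month-end data like the example, uses deterministic values instead of random ones, and builds the frequency from `pd.offsets.YearEnd(month=2)` so that each bin closes at the end of February and therefore runs March through February.

```python
import numpy as np
import pandas as pd

# Month-end index built from an offset object, so the example does not
# depend on the 'M' vs 'ME' alias spelling of a particular pandas version.
rng = pd.date_range('2011-01-31', periods=25, freq=pd.offsets.MonthEnd())
ts = pd.Series(np.arange(len(rng), dtype=float), index=rng)

# Years that END in February, i.e. each group covers March..February.
year_ending_feb = pd.offsets.YearEnd(month=2)
grouped = ts.groupby(pd.Grouper(freq=year_ending_feb))

# One row per group, labelled by the closing date of the group.
sizes = grouped.size()
print(sizes)
```

Each group here is labelled by its closing date (the end of February) instead of by its first timestamp, so the keys differ from the dict above even though the group membership should be the same.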

User

#2 · edited 2024-05-16 14:27:35

Inspired by @cphlewis, here is my groupBy approach, which groups by year but starts each year at a given month:

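import numpy as np
import pandas as pd

# 'M' is the month-end alias; pandas 2.2+ prefers 'ME'.
rng = pd.date_range('1/1/2011', periods=25, freq='M')
ts = pd.DataFrame(np.random.randn(len(rng)), index=rng, columns=['ts'])

def groupByYearMonth(ts, month):
    # Note: this adds a 'group' column to ts in place.
    # Needs a fix if there are multiple entries per month.
    starts = ts[ts.index.month == month].index

    # Rows before the first start month belong to the previous year's group.
    if starts[0] > ts.index[0]:
        ts.loc[ts.index < starts[0], 'group'] = starts[0].year - 1
    for start in starts:
        # For month == 1 the group ends in December of the same year
        # (month - 1 would otherwise be 0).
        if month == 1:
            end = '%d-12' % start.year
        else:
            end = '%d-%d' % (start.year + 1, month - 1)
        ts.loc[start:end, 'group'] = start.year
    return ts.groupby('group')

groupBy = groupByYearMonth(ts, 3)
print(groupBy.mean())
print(groupBy.size())

Result:

             ts
group
2010   0.638609
2011  -0.124718
2012   0.385539
group
2010      2
2011     12
2012     11
dtype: int64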
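
The same labels can probably be computed without a loop and without adding a column to the frame. The following is a minimal sketch, assuming a DatetimeIndex as in the example; the helper name `group_by_shifted_year` is hypothetical, and the data is deterministic so the group sizes are easy to check.

```python
import numpy as np
import pandas as pd

def group_by_shifted_year(frame, month):
    """Group rows into 12-month blocks that start in `month` (hypothetical helper)."""
    idx = frame.index
    # Months before the start month count towards the previous year's group.
    labels = np.where(idx.month < month, idx.year - 1, idx.year)
    # Grouping by an array leaves the caller's frame unmodified.
    return frame.groupby(labels)

# Month-end index built from an offset object to avoid alias differences
# between pandas versions.
rng = pd.date_range('2011-01-31', periods=25, freq=pd.offsets.MonthEnd())
df = pd.DataFrame({'ts': np.arange(len(rng), dtype=float)}, index=rng)

sizes = group_by_shifted_year(df, 3).size()
print(sizes)
```

Because the label depends only on each row's own year and month, this version also tolerates several entries per month and a start month of January.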
