<p>Inspired by @cphlewis, here is my groupBy approach. It groups by year, but each year starts at a given month:</p>
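<pre><code>import numpy as np
import pandas as pd

# 25 month-end timestamps starting January 2011 ('M' is spelled 'ME' in pandas 2.2+).
rng = pd.date_range('1/1/2011', periods=25, freq='M')
ts = pd.DataFrame(np.random.randn(len(rng)), index=rng, columns=['ts'])

def groupByYearMonth(ts, month):
    # Each period starts at a row that falls in `month`.
    starts = ts[ts.index.month == month].index  # Fix if multiple entries per month.
    # Rows before the first start belong to the period that began the previous year.
    if starts[0] > ts.index[0]:
        ts.loc[ts.index < starts[0], 'group'] = starts[0].year - 1
    for start in starts:
        # The period ends 11 months after its start. DateOffset also handles
        # month == 1, where start.month - 1 would give the invalid month 0.
        end = (start + pd.DateOffset(months=11)).strftime('%Y-%m')
        ts.loc[start:end, 'group'] = start.year
    return ts.groupby('group')

groupBy = groupByYearMonth(ts, 3)
print(groupBy.mean())
print(groupBy.size())

# Output (the means vary, since the data is random):
             ts
group
2010   0.638609
2011  -0.124718
2012   0.385539
group
2010     2
2011    12
2012    11
dtype: int64
</code></pre>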
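<p>The "Fix if multiple entries per month" comment above hints at a limitation of the loop. As a rough alternative sketch (not part of the original answer), the group label can be computed arithmetically: subtract one from the calendar year whenever the month falls before the start month. The helper name <code>shifted_year_labels</code> and the deterministic sample data are assumptions made for illustration only.</p>

```python
import numpy as np
import pandas as pd


def shifted_year_labels(index, start_month):
    """Return, for each timestamp, the year in which its 12-month period began."""
    years = np.asarray(index.year)
    months = np.asarray(index.month)
    # Months before start_month belong to the period that began the previous year.
    return years - (months < start_month).astype(int)


# Same date range as above, built with an explicit MonthEnd offset
# so that no frequency alias is needed.
rng = pd.date_range('2011-01-31', periods=25, freq=pd.offsets.MonthEnd())
# Deterministic values (hypothetical sample data) instead of random ones.
ts = pd.DataFrame({'ts': np.arange(len(rng), dtype=float)}, index=rng)

grouped = ts.assign(group=shifted_year_labels(ts.index, 3)).groupby('group')
sizes = grouped.size()
print(sizes)
```

<p>With a start month of 3 this should give the same group sizes as above (2, 12 and 11). Because the label depends only on each row's own year and month, it does not matter how many rows fall in a month, and the original frame is not modified.</p>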