Answers to the question: Group by month and year

Group by month and year


1 answer

Anonymous, 1 day ago

You can use either resample or <code>Grouper</code> (which resamples under the hood). First make sure that the datetime column actually holds datetimes (hit it with <code>pd.to_datetime</code>). It is easier if it is a DatetimeIndex:

<pre><code>In [11]: df1
Out[11]:
            abc  xyz
Date
2013-06-01  100  200
2013-06-03  -20   50
2013-08-15   40   -5
2014-01-20   25   15
2014-02-21   60   80

In [12]: g = df1.groupby(pd.Grouper(freq="M"))  # DataFrameGroupBy (grouped by Month)

In [13]: g.sum()
Out[13]:
            abc  xyz
Date
2013-06-30   80  250
2013-07-31  NaN  NaN
2013-08-31   40   -5
2013-09-30  NaN  NaN
2013-10-31  NaN  NaN
2013-11-30  NaN  NaN
2013-12-31  NaN  NaN
2014-01-31   25   15
2014-02-28   60   80

In [14]: df1.resample("M", how='sum')  # the same
Out[14]:
            abc  xyz
Date
2013-06-30   80  250
2013-07-31  NaN  NaN
2013-08-31   40   -5
2013-09-30  NaN  NaN
2013-10-31  NaN  NaN
2013-11-30  NaN  NaN
2013-12-31  NaN  NaN
2014-01-31   25   15
2014-02-28   60   80
</code></pre>

Note: previously <code>pd.Grouper(freq="M")</code> was written as <code>pd.TimeGrouper("M")</code>. The latter has been deprecated since 0.21. Likewise, the <code>how=</code> argument of <code>resample</code> was deprecated in 0.18; the current spelling is <code>df1.resample("M").sum()</code>.

<hr/>

I had thought the following would work, but it doesn't (because <code>as_index</code> is not respected? I'm not sure). I'm including it for interest's sake.

If it's a column (it has to be a datetime64 column! As I said, hit it with <code>to_datetime</code>), you can use a PeriodIndex:

<pre><code>In [21]: df
Out[21]:
        Date  abc  xyz
0 2013-06-01  100  200
1 2013-06-03  -20   50
2 2013-08-15   40   -5
3 2014-01-20   25   15
4 2014-02-21   60   80

In [22]: pd.DatetimeIndex(df.Date).to_period("M")  # old way
Out[22]:
<class 'pandas.tseries.period.PeriodIndex'>
[2013-06, ..., 2014-02]
Length: 5, Freq: M

In [23]: per = df.Date.dt.to_period("M")  # new way to get the same

In [24]: g = df.groupby(per)

In [25]: g.sum()  # dang not quite what we want (doesn't fill in the gaps)
Out[25]:
         abc  xyz
2013-06   80  250
2013-08   40   -5
2014-01   25   15
2014-02   60   80
</code></pre>

To get the desired result we have to reindex.
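The reindexing step is not spelled out in the answer. A minimal sketch, assuming the goal is one row per calendar month between the first and last observation, might look like the following. The DataFrame is rebuilt from the sample data shown in the answer; the names `monthly`, `all_months`, and `filled` are illustrative and not part of the original. The numeric columns are selected explicitly so that the `Date` column itself is not passed to `sum()`.

```python
import pandas as pd

# Sample data from the answer, with Date starting out as plain strings.
df = pd.DataFrame({
    "Date": ["2013-06-01", "2013-06-03", "2013-08-15", "2014-01-20", "2014-02-21"],
    "abc": [100, -20, 40, 25, 60],
    "xyz": [200, 50, -5, 15, 80],
})

# Make sure the column really is datetime64 before grouping.
df["Date"] = pd.to_datetime(df["Date"])

# Group by monthly period; months without any rows are simply absent.
per = df["Date"].dt.to_period("M")
monthly = df.groupby(per)[["abc", "xyz"]].sum()

# Build every month between the first and last observed period, then
# reindex so that the missing months appear as NaN rows.
all_months = pd.period_range(monthly.index.min(), monthly.index.max(), freq="M")
filled = monthly.reindex(all_months)

print(filled)
```

If zeros are preferable to NaN for the empty months, `monthly.reindex(all_months, fill_value=0)` is one option.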
