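<p>First, make sure the column is <code>datetime64</code>. Built from strings, <code>date</code> starts out as <code>object</code>:</p>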
<pre><code>import pandas as pd

df = pd.DataFrame({
    "date": ["2020-12-01 08:18:29", "2020-12-01 14:53:17", "2020-12-01 17:29:13", "2020-12-02 17:00:01", "2020-12-02 18:09:52"],
    "new_sentiment": ["NEUTRAL", "NEUTRAL", "NEGATIVE", "NEUTRAL", "POSITIVE"],
    "unit": [1, 2, 1, 1, 1]
})
print(df.dtypes)
date             object
new_sentiment    object
unit              int64
dtype: object
</code></pre>
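<p>Convert the column and check the dtypes again. <code>pd.to_datetime</code> keeps the time of day, so <code>.dt.normalize()</code> resets it to midnight; otherwise the <code>groupby</code> calls below would group by full timestamp rather than by day:</p>
<pre><code>df["date"] = pd.to_datetime(df["date"]).dt.normalize()
print(df.dtypes)
date             datetime64[ns]
new_sentiment            object
unit                      int64
dtype: object
df
        date new_sentiment  unit
0 2020-12-01       NEUTRAL     1
1 2020-12-01       NEUTRAL     2
2 2020-12-01      NEGATIVE     1
3 2020-12-02       NEUTRAL     1
4 2020-12-02      POSITIVE     1
</code></pre>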
<p>So, if you need to count the <code>new_sentiment</code> values for each date:</p>
<pre><code>df.groupby("date")["new_sentiment"].value_counts()
date        new_sentiment
2020-12-01  NEUTRAL          2
            NEGATIVE         1
2020-12-02  NEUTRAL          1
            POSITIVE         1
Name: new_sentiment, dtype: int64
</code></pre>
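<p>If a day-by-sentiment table is easier to read than the stacked counts, <code>pd.crosstab</code> is one possible alternative. This is only a sketch on the same sample data; the variable name <code>counts</code> is my own, and the dates are normalized to midnight up front so that rows are per day:</p>

```python
import pandas as pd

# Same sample data as above; times are dropped so rows group by day.
df = pd.DataFrame({
    "date": pd.to_datetime([
        "2020-12-01 08:18:29", "2020-12-01 14:53:17", "2020-12-01 17:29:13",
        "2020-12-02 17:00:01", "2020-12-02 18:09:52",
    ]).normalize(),
    "new_sentiment": ["NEUTRAL", "NEUTRAL", "NEGATIVE", "NEUTRAL", "POSITIVE"],
})

# One row per day, one column per sentiment; combinations that never occur are 0.
counts = pd.crosstab(df["date"], df["new_sentiment"])
print(counts)
```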
<p>On the other hand, if you need to sum the <code>unit</code> column for each date and sentiment:</p>
<pre><code>df.groupby(["date", "new_sentiment"])["unit"].sum()
date        new_sentiment
2020-12-01  NEGATIVE         1
            NEUTRAL          3
2020-12-02  NEUTRAL          1
            POSITIVE         1
Name: unit, dtype: int64
</code></pre>
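<p>The summed result is a Series with a two-level index. If you would rather have one column per sentiment, <code>unstack</code> can pivot the inner level. A minimal sketch on the same sample data, where <code>totals</code> and <code>wide</code> are names I chose for illustration:</p>

```python
import pandas as pd

# Same sample data as above; times are dropped so rows group by day.
df = pd.DataFrame({
    "date": pd.to_datetime([
        "2020-12-01 08:18:29", "2020-12-01 14:53:17", "2020-12-01 17:29:13",
        "2020-12-02 17:00:01", "2020-12-02 18:09:52",
    ]).normalize(),
    "new_sentiment": ["NEUTRAL", "NEUTRAL", "NEGATIVE", "NEUTRAL", "POSITIVE"],
    "unit": [1, 2, 1, 1, 1],
})

# Sum units per (day, sentiment), then move the sentiment level into columns.
totals = df.groupby(["date", "new_sentiment"])["unit"].sum()
wide = totals.unstack(fill_value=0)  # missing combinations are filled with 0
print(wide)
```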