<p>Use <a href="https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Series.diff.html" rel="nofollow noreferrer"><code>Series.diff</code></a> + <a href="https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.core.groupby.GroupBy.cumsum.html" rel="nofollow noreferrer"><code>GroupBy.cumsum</code></a>, as follows:</p>
<p>Preparation:</p>
<pre><code># Convert the 'datetime' column to datetime dtype if it is not already
df['datetime'] = pd.to_datetime(df['datetime'])
# Sort the rows by visitor, then chronologically within each visitor
df = df.sort_values(['visitorId', 'datetime'])
</code></pre>
<p>Main logic:</p>
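<pre><code># Flag rows that come 2+ days after the same visitor's previous row, then count the flags per visitor
df['group label'] = df.groupby('visitorId')['datetime'].diff().ge(pd.Timedelta('2 days')).groupby(df['visitorId']).cumsum()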
</code></pre>
<p><strong>Result:</strong></p>
<pre><code>print(df)
   visitorId   datetime searchId  group label
0        123 2020-06-06      abd            0
1        123 2020-06-07      cde            0
2        123 2020-06-08      dgh            0
3        123 2020-06-18      sdw            1
4        123 2020-06-21      hkl            2
5        345 2020-06-21      dsu            0
6        456 2020-06-19      sdh            0
7        456 2020-06-20      ckb            0
8        456 2020-07-24      etw            1
9        456 2020-08-09      ekn            2
</code></pre>
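<p>For anyone who wants to run this end to end, here is a minimal self-contained sketch. The helper name <code>label_visit_groups</code> and its <code>gap</code> parameter are my own, introduced only for illustration, and the sample frame is rebuilt from the output shown above. The sketch takes the time difference within each visitor, so a visitor's first row is never compared with the previous visitor's last row.</p>

```python
import pandas as pd


def label_visit_groups(df, gap="2 days"):
    """Return a sorted copy of df with a per-visitor 'group label' column.

    Hypothetical helper for illustration: a new group starts whenever a row
    is at least `gap` after the same visitor's previous row.
    """
    out = df.copy()
    out["datetime"] = pd.to_datetime(out["datetime"])
    out = out.sort_values(["visitorId", "datetime"])

    # Time since the previous row of the same visitor (NaT on each first row).
    since_prev = out.groupby("visitorId")["datetime"].diff()

    # True where a new group starts; NaT compares as False, so the first
    # row of every visitor stays in group 0.
    starts_new_group = since_prev.ge(pd.Timedelta(gap))

    # A running count of the flags within each visitor gives the label.
    out["group label"] = (
        starts_new_group.groupby(out["visitorId"]).cumsum().astype(int)
    )
    return out


# Sample data rebuilt from the result table in this answer.
df = pd.DataFrame(
    {
        "visitorId": [123, 123, 123, 123, 123, 345, 456, 456, 456, 456],
        "datetime": [
            "2020-06-06", "2020-06-07", "2020-06-08", "2020-06-18",
            "2020-06-21", "2020-06-21", "2020-06-19", "2020-06-20",
            "2020-07-24", "2020-08-09",
        ],
        "searchId": ["abd", "cde", "dgh", "sdw", "hkl",
                     "dsu", "sdh", "ckb", "etw", "ekn"],
    }
)

labeled = label_visit_groups(df)
print(labeled)
```

<p>Passing a different <code>gap</code>, for example <code>label_visit_groups(df, gap="7 days")</code>, changes the threshold at which a new group begins.</p>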