<p>This is definitely (probably) not the way to do it, but let's take a look:</p>
<pre><code>### Get the rows with the max and min timestamp per (name, activity) into separate dataframes
df_max = df.loc[df.groupby(['name','activity',])['timestamp'].idxmax()].reset_index(drop=True)
df_min = df.loc[df.groupby(['name','activity',])['timestamp'].idxmin()].reset_index(drop=True)
### Merge those puppies on the index values
df_tot = df_max.merge(df_min, how='outer', left_index=True, right_index=True, suffixes= ('_max', '_min'))
### Subtract the min timestamp from the max timestamp
df_tot['net time'] = df_tot['timestamp_max'] - df_tot['timestamp_min']
### Drop unnecessary columns
df_tot.drop(['name_min','activity_min','timestamp_min','money_spent_min', 'money_spent_max','timestamp_max'], axis=1, inplace=True)
### Rename our columns
df_tot = df_tot.rename(columns={i:i.replace('_max', '') for i in df_tot.columns.values.tolist()})
### Set activity_number as the cumulative count of name
df_tot['activity_number'] = df_tot.groupby('name').cumcount() + 1
### For each name, keep the row with the largest net time
df_tot = df_tot.loc[df_tot.groupby(['name',])['net time'].idxmax()].reset_index(drop=True)
### Rearrange our results
df_tot = df_tot.reindex(columns=['name','activity_number', 'net time']).copy()
</code></pre>
<p>Output:</p>
<pre><code>             name  activity_number  net time
0   Chandler Bing                1  07:00:00
1      Harry Kane                2  04:00:00
2  Joey Tribbiani                2  01:00:00
3        John Doe                1  07:00:00
4   Monica Geller                1  02:00:00
5   Phoebe Buffey                2  02:00:00
6     Ross Geller                1  02:00:00
</code></pre>
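<p>For what it's worth, the same idea can probably be written more compactly with a single <code>groupby(...).agg(['min', 'max'])</code> instead of two dataframes and a merge. The following is only a rough sketch: the sample data is made up for illustration (names <code>A</code> and <code>B</code> are hypothetical, and <code>money_spent</code> is left out because the result never uses it), and it assumes <code>timestamp</code> is already a datetime column.</p>

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the code above.
df = pd.DataFrame({
    "name": ["A", "A", "A", "A", "B", "B"],
    "activity": ["run", "run", "swim", "swim", "run", "run"],
    "timestamp": pd.to_datetime([
        "2021-01-01 08:00", "2021-01-01 09:00",
        "2021-01-01 10:00", "2021-01-01 13:00",
        "2021-01-01 08:00", "2021-01-01 10:00",
    ]),
})

# Earliest and latest timestamp for every (name, activity) pair.
spans = (
    df.groupby(["name", "activity"])["timestamp"]
      .agg(["min", "max"])
      .reset_index()
)

# Net time is the latest timestamp minus the earliest one.
spans["net time"] = spans["max"] - spans["min"]

# Number the activities within each name, as the original code does.
spans["activity_number"] = spans.groupby("name").cumcount() + 1

# For each name, keep the activity with the largest net time.
result = (
    spans.loc[spans.groupby("name")["net time"].idxmax(),
              ["name", "activity_number", "net time"]]
         .reset_index(drop=True)
)

print(result)
```

<p>As in the code above, <code>activity_number</code> here follows the sorted order of the group keys, so it only matches the real activity order if the activities happen to sort that way.</p>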