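How do I compare values in the second column of a pandas DataFrame across rows that share the same value in the first column?

2 Answers

User

#1 · edited 2024-05-19 20:53:55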

I think this is straightforward with groupby, pd.Grouper, and ngroup, as follows:

df['Id'] = df.groupby([pd.Grouper(freq='30min', key='Datetime'), 'Name']).ngroup().add(1)


Out[423]:
     Name            Datetime  Id
0     Bob 2018-04-26 12:00:00   1
1  Claire 2018-04-26 12:00:00   2
2     Bob 2018-04-26 12:10:00   1
3     Bob 2018-04-26 12:20:00   1
4  Claire 2018-04-27 08:30:00   3
5     Bob 2018-04-27 09:30:00   4
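
For readers who want to reproduce the output above, here is a minimal self-contained sketch. The sample frame is rebuilt from the rows shown in the output and is an assumption about what the asker's data looked like. The frequency alias is written as '30min', since the older 'T' alias is deprecated in recent pandas versions.

```python
import pandas as pd

# Sample data reconstructed from the output above (an assumption about the input).
df = pd.DataFrame({
    "Name": ["Bob", "Claire", "Bob", "Bob", "Claire", "Bob"],
    "Datetime": pd.to_datetime([
        "2018-04-26 12:00:00",
        "2018-04-26 12:00:00",
        "2018-04-26 12:10:00",
        "2018-04-26 12:20:00",
        "2018-04-27 08:30:00",
        "2018-04-27 09:30:00",
    ]),
})

# Group by 30-minute time bin and Name; ngroup() numbers the groups from 0,
# so add(1) makes the Ids start at 1.
df["Id"] = (
    df.groupby([pd.Grouper(freq="30min", key="Datetime"), "Name"])
      .ngroup()
      .add(1)
)

print(df)
```

One thing to keep in mind: pd.Grouper(freq=...) places timestamps into fixed, clock-aligned 30-minute windows. Two rows for the same name that are less than 30 minutes apart, for example 12:20 and 12:40, would fall into different windows and therefore get different Ids.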

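User

#2 · edited 2024-05-19 20:53:55

I would sort the data frame by name and datetime to identify the distinct groups, then assign an Id value to each group in the order of the original data frame.

The code could be: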

import pandas as pd

# sort the data frame by Name, then Datetime (the original index labels are kept)
df.sort_values(['Name', 'Datetime'], inplace=True)
df1 = df.shift()  # previous row of the sorted frame
# mark the first row of each group: the Name changes or the gap exceeds 30 minutes
df.loc[(df1.Name!=df.Name)
       |(df.Datetime-df1.Datetime>pd.Timedelta(minutes=30)), 'tmp'] = 1
del df1   # no longer useful

# cumsum numbers the marked rows, ffill spreads each number over its group
df['tmp'] = df['tmp'].cumsum().ffill()

# compute Ids in original data frame order
ids = pd.DataFrame(df['tmp'].drop_duplicates().sort_index())
ids['Id'] = ids.reset_index(drop=True).index + 1

# and get the expected result
df = df.reset_index().merge(ids, on='tmp').set_index('index').sort_index()\
     .drop(columns='tmp').rename_axis(None)

The result is as expected.
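
The same gap-based idea can be sketched more compactly. This is not the answerer's code: it swaps the shift/merge steps for groupby().diff() and pd.factorize, and it assumes the sample rows shown in the first answer's output as input. A new group starts on the first row of each name or whenever more than 30 minutes have passed since that name's previous row.

```python
import pandas as pd

# Sample data taken from the rows shown in the first answer (an assumption).
df = pd.DataFrame({
    "Name": ["Bob", "Claire", "Bob", "Bob", "Claire", "Bob"],
    "Datetime": pd.to_datetime([
        "2018-04-26 12:00:00",
        "2018-04-26 12:00:00",
        "2018-04-26 12:10:00",
        "2018-04-26 12:20:00",
        "2018-04-27 08:30:00",
        "2018-04-27 09:30:00",
    ]),
})

# Sorted copy; the original index labels are preserved for realignment later.
s = df.sort_values(["Name", "Datetime"])

# Time elapsed since the previous row of the same Name (NaT on each Name's first row).
gap = s.groupby("Name")["Datetime"].diff()

# A group starts on a Name's first row or after a gap of more than 30 minutes.
new_group = gap.isna() | (gap > pd.Timedelta(minutes=30))

# Running count of group starts labels each group (in sorted order).
label = new_group.cumsum().sort_index()

# Renumber the labels by first appearance in the original row order, starting at 1.
df["Id"] = pd.factorize(label)[0] + 1

print(df)
```

Unlike the fixed 30-minute windows produced by pd.Grouper, this variant measures the gap between consecutive rows of the same name, so a chain of rows each less than 30 minutes apart stays in one group regardless of where the clock boundaries fall.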
