Removing duplicate rows in a groupby?

2 answers

User

#1 · edited 2024-05-14 04:09:38

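Your data probably contains duplicates because you are only including a subset of the columns. You need something in your data besides the price (for example, two different trading days can trade at the same price, but you would not add up the volume from those two days).

Assuming the price is unique for a given timestamp, market and company, and that you first sort on your timestamp column if there is one (not required if there is only one price per company and market):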

import pandas as pd

df = pd.DataFrame({
    'company': ['EK', 'SQ', 'EK', 'EK', 'EK', 'SQ', 'EK'],
    'date': ['2018-08-13'] * 3 + ['2018-08-14'] * 4,
    'market': ['LA'] * 7,
    'price': [206] * 3 + [36] * 4})

# Named aggregation: keep the last price per group and count the rows as volume.
>>> (df.groupby(['market', 'date', 'company'])['price']
...    .agg(price='last', volume='count')
...    .reset_index())

  market        date company  price  volume
0     LA  2018-08-13      EK    206       2
1     LA  2018-08-13      SQ    206       1
2     LA  2018-08-14      EK     36       3
3     LA  2018-08-14      SQ     36       1
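
The answer above mentions sorting on a timestamp column first so that 'last' really picks the latest price. A minimal sketch of that step follows, assuming a hypothetical `timestamp` column and made-up tick values (neither is in the original data):

```python
import pandas as pd

# Hypothetical tick data: the 'timestamp' column and all values are assumptions.
ticks = pd.DataFrame({
    'market': ['LA', 'LA', 'LA', 'LA'],
    'company': ['EK', 'EK', 'SQ', 'EK'],
    'timestamp': pd.to_datetime([
        '2018-08-13 10:05', '2018-08-13 09:30',
        '2018-08-13 09:45', '2018-08-14 09:30']),
    'price': [207, 206, 206, 36],
})

# Derive the trading day from the timestamp so it can serve as a group key.
ticks['date'] = ticks['timestamp'].dt.strftime('%Y-%m-%d')

# Sort chronologically first; groupby keeps the row order within each group,
# so 'last' then returns the most recent price of the day.
summary = (ticks.sort_values('timestamp')
                .groupby(['market', 'date', 'company'])['price']
                .agg(price='last', volume='count')
                .reset_index())
print(summary)
```

Without the sort, EK on 2018-08-13 would report 206 (the last row as stored) rather than 207 (the latest trade).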

User

#2 · edited 2024-05-14 04:09:38

Simply use drop_duplicates on the columns ['market', 'company', 'price'] (the output below is for a DataFrame that already has a volume column):

>>> df.drop_duplicates(['market', 'company', 'price'])
  market company  price  volume
0     LA      EK  206.0       2
1     LA      SQ  206.0       1
3     LA      EK   36.0       3
5     LA      SQ   36.0       1
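
To see which rows `drop_duplicates` keeps, here is a small sketch on the sample frame from the first answer (which has no `volume` column, so only the surviving row labels are compared). The `keep` parameter decides whether the first or the last row of each duplicate set survives:

```python
import pandas as pd

df = pd.DataFrame({
    'company': ['EK', 'SQ', 'EK', 'EK', 'EK', 'SQ', 'EK'],
    'date': ['2018-08-13'] * 3 + ['2018-08-14'] * 4,
    'market': ['LA'] * 7,
    'price': [206] * 3 + [36] * 4})

cols = ['market', 'company', 'price']

# Default keep='first': the first row of every (market, company, price) combination.
first_rows = df.drop_duplicates(subset=cols)

# keep='last': the last row of every combination instead.
last_rows = df.drop_duplicates(subset=cols, keep='last')

print(first_rows.index.tolist())  # [0, 1, 3, 5]
print(last_rows.index.tolist())   # [1, 2, 5, 6]
```

Because `date` is not part of the subset, the same price on two different days would collapse into a single row, which is the caveat raised in the first answer.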
