Merging rows and counting values

1 answer

User

#1 · Posted 2024-05-16 23:07:41

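It sounds like you want to find the association between items: if two (or more) orders contain "Candy", how many of each other product do those orders contain?

This is the best I could come up with. First, group by product to find all the orders that contain that product. Then take the matching subset of the original DataFrame and sum the quantity of each product in it.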

# df is assumed to be the DataFrame from the question, with
# "OrderNum", "Product" and "Quantity" columns

# group by the products
products = df.groupby("Product")

# iterating over a groupby yields (key, DataFrame) tuples:
# the key is the Product, the DataFrame holds that product's rows
for product, group in products:
    # all rows of every order that contains this product
    sub_select = df[df["OrderNum"].isin(group["OrderNum"])]
    # total quantity of each product across those orders
    quantities = sub_select.groupby("Product")["Quantity"].sum()

    # print the name of the product that we grouped by
    # and convert the sums to a dictionary for easier reading
    print(product, ":", quantities.to_dict())

# Output (dictionary key order may differ):
# Candy : {'Candy': 4, 'Gum': 2}
# Chocolate : {'Chocolate': 10}
# Gum : {'Candy': 4, 'Soda': 3, 'Gum': 7}
# Soda : {'Soda': 3, 'Gum': 5}

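sub_select is a subset of the original DataFrame. For the "Candy" group, for example, it contains every row of every order that includes Candy. quantities then groups those rows by product to get the total quantity of each product across all matching orders.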
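
To try the approach end to end, here is a minimal self-contained sketch. The sample DataFrame is made up for illustration (it is not the data from the question, only chosen so the sums line up with the output shown above), and the helper name co_purchase_totals is hypothetical.

```python
import pandas as pd

# Hypothetical sample data: three orders, one row per (order, product)
df = pd.DataFrame({
    "OrderNum": [1, 1, 2, 3, 3],
    "Product": ["Candy", "Gum", "Chocolate", "Soda", "Gum"],
    "Quantity": [4, 2, 10, 3, 5],
})


def co_purchase_totals(df):
    """For each product, sum the quantities of all products
    found in the orders that contain it."""
    totals = {}
    for product, group in df.groupby("Product"):
        # keep every row belonging to an order that contains this product
        matching = df[df["OrderNum"].isin(group["OrderNum"])]
        # total quantity per product within those orders
        totals[product] = matching.groupby("Product")["Quantity"].sum().to_dict()
    return totals


result = co_purchase_totals(df)
for product, quantities in result.items():
    print(product, ":", quantities)
```

Returning a dictionary of dictionaries instead of printing inside the loop makes the result easier to reuse, for example to build a product-by-product table with pd.DataFrame(result).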
