Python根据一列进行分组,计算其对应列的累计值

2024-04-27 05:14:11 发布

您现在位置:Python中文网/ 问答频道 /正文

我今天不得不用python处理一些问题

根据销售区域分组,累计相应的发货,并对发货结果进行四舍五入。大区域的值为NULL或空,希望跳过统计。输出csv文件需要有一个大区域+销售区域+发货量+消息

enter image description here

Id, Message, region, shipping volume, sales area 1, natural, AAA, 25.02, 111 2, Nature, BBB, 63.22, 222 3, China, CCC, 554.21, 333 4, yes, AAA, 63.22, NULL 5, large, NULL, 645.55, 222 6, DAD ,,, 333 7, large, BBB, 69.22, 222 8, NULL, DDD, NULL, 444

我确实试过:

enter image description here

结果:

enter image description here

import pandas as pd
import csv
import math
  
df = pd.read \ _csv (r'G: \\ 360MoveData \\ Users \\ Hasee \\ Desktop \\ Business List.csv ')
  
 #Discard missing values
 # df = df.dropna (axis = 0, subset = \ ["Sales Area", "erp average daily shipment amount" \])
  
for i in range (len (df)):
  if (df \ ['Sales Area' \] \ [i \] == '' or df \ ['Sales Area' \] \ [i \] == "NULL"):
   df = df.drop (i)
  
 #Rounded up  
  
 #df \ _sum = df.groupby ('Sales Area'). agg ({"Shipping Volume": sum}). reset \ _index ()
df \ _sum = df.groupby ("Sales Area"). agg ({"Shipping Volume": sum})
  
 #df \ _sum = df.insert (3, "Area name", df \ _sum ('Area'))
  
df \ _sum.to \ _csv (r'G: \\ 360MoveData \\ Users \\ Hasee \\ Desktop \\ Count1.csv ')

问题: 1.我不知道如何添加地区列表。 2.我想用math.ceil()来取整销售区域和清单的值,我不知道如何添加它。 3.第一行是0.0,应该是空值和空值引起的,如何避免


Tags: csvimport区域dfareamathnullusers
1条回答
网友
1楼 · 发布于 2024-04-27 05:14:11

使用groupby如下

1.通过修剪空白将csv读入数据帧。

df = pd.read_csv('/Users/prince/Downloads/test2.csv', sep=',', skipinitialspace=True)

输出为

Id  Message region  shipping volume  sales area
  1  natural    AAA            25.02       111.0
  2   Nature    BBB            63.22       222.0
  3    China    CCC           554.21       333.0
  4      yes    AAA            63.22         NaN
  5    large    NaN           645.55       222.0
  6     DAD     NaN              NaN       333.0
  7    large    BBB            69.22       222.0
  8      NaN    DDD              NaN       444.0

2.现在删除NaN值

df = df.dropna()

输出为

Id  Message region  shipping volume  sales area
  1  natural    AAA            25.02       111.0
  2   Nature    BBB            63.22       222.0
  3    China    CCC           554.21       333.0
  7    large    BBB            69.22       222.0

3.现在做groupby

在你的问题中,你在做groupby on Sales Area

df = df.groupby ("sales area", as_index=False). agg ({"shipping volume": 'sum'})

这样就可以输出

sales area  shipping volume
      111.0            25.02
      222.0           132.44
      333.0           554.21

4.如果要在aggregate中包含其他列,请执行以下操作

df.groupby ("sales area", as_index=False). agg ({"shipping volume": 'sum', 'Message' : 'first', 'region' : 'first'})

这将为您提供以下输出

sales area  shipping volume  Message region
      111.0            25.02  natural    AAA
      222.0           132.44   Nature    BBB
      333.0           554.21    China    CCC

相关问题 更多 >