pandas中的Cumsum:用变量合并两个数据帧

2024-05-26 20:45:12 发布

您现在位置:Python中文网/ 问答频道 /正文

我正在尝试创建一个从stock dataframe到order dataframe的stock赋值,当stock dataframe很简单时我可以这样做,但是当stock dataframe每个SKU有多个类别(类型)时,赋值就完成得很糟糕。 SKU是相同的,但现在它们有两种类型,因此,一个订单可以用两种类型的SKU来完成。 在带有类型的赋值数据框中可以看到,当有两种类型的库存时,订单的行是重复的

问题

import pandas as pd
import numpy as np

# initilize list
orders= {'order':[20001, 20002, 20003, 20004, 20005], 'SKU' :['A1', 'A2', 'A3','A4', 'A4'], 
           'Request':[20, 21, 19, 18, 20]} 
stock = {'SKU': ['A1', 'A2','A3', 'A4'], 
         'Available' : [10, 20, 5, 28],
         'Type' : ['Stock', 'Stock', 'Stock', 'Stock']} 
stock2 = {'SKU': ['A1', 'A2', 'A2', 'A3','A3', 'A4'], 
         'Type' : ['Stock', 'Stock', 'To make', 'Stock', 'To make', 'Stock'],
         'Available' : [10, 20, 5, 30, 5, 8]} 
# Create DataFrames
df_ped = pd.DataFrame(orders) 
df_st = pd.DataFrame(stock)
df_st2 = pd.DataFrame(stock2)

简单股票分配

#Merge
df_ped_st = pd.merge(df_ped, df_st, how='left', left_on='SKU', right_on='SKU')

#assignment
df_ped_st['difference'] = df_ped_st['Available'] - df_ped_st.groupby('SKU')['Request'].cumsum()
df_ped_st['balance'] = df_ped_st['difference'].clip(lower=0).shift(0).fillna(0)
df_ped_st['assig'] = np.where((df_ped_st['Request'] + df_ped_st['difference']) > 
                                   df_ped_st['Request'], df_ped_st['Request'], 
                                   (df_ped_st['Request'] + df_ped_st['difference']).clip(lower=0))
#Print
df_ped_st

结果正常

    order   SKU Request     Available   Type    difference  balance assig
0   20001   A1  20          10          Stock   -10         0       10
1   20002   A2  21          20          Stock   -1          0       20
2   20003   A3  19           5          Stock   -14         0       5
3   20004   A4  18          28          Stock   10          10      18
4   20005   A4  20          28          Stock   -10         0       10

带类型的库存分配

#merge
df_ped_st2 = pd.merge(df_ped, df_st2, how='left', left_on='SKU', right_on='SKU')

#assignment
df_ped_st2['difference'] = df_ped_st2['Available'] - df_ped_st2.groupby('SKU')['Request'].cumsum()
df_ped_st2['`balance'] = df_ped_st2['difference'].clip(lower=0).shift(0).fillna(0)
df_ped_st2['assig'] = np.where((df_ped_st2['Request'] + df_ped_st2['difference']) > 
                                   df_ped_st2['Request'], df_ped_st2['Request'], 
                                   (df_ped_st2['Request'] + df_ped_st2['difference']).clip(lower=0))
#Print
df_ped_st2

结果错误

Pedido      SKU Request   Type       Available   difference balance assign
0   20001   A1  20        Stock      10         -10         0       10
1   20002   A2  21        Stock      20         -1          0       20
2   20002   A2  21        To make     5         -37         0       0
3   20003   A3  19        Stock      30          11         11      19
4   20003   A3  19        To make     5         -33         0       0
5   20004   A4  18        Stock       8         -10         0       8
6   20005   A4  20        Stock       8         -30         0       0

预期结果

Pedido      SKU Request   Type       Available   difference balance assign
0   20001   A1  20        Stock      10         -10         0       10
1   20002   A2  21        Stock      20         -1          0       20
2   20002   A2  21        To make     5          4          4       1
3   20003   A3  19        Stock      30          11         11      19
4   20003   A3  19        To make     5          5          5       0
5   20004   A4  18        Stock       8         -10         0       8
6   20005   A4  20        Stock       8         -20         0       0

Tags: a2类型dfrequeststocka3a4available

热门问题