Pandas: unable to use a DataFrame to find which stores had a good quarterly growth rate in Q3

   Store        Date  Weekly_Sales
0      1  2012-03-31   18951097.69
1      1  2012-06-30   21036965.58
2      1  2012-09-30   18633209.98
3      1  2012-12-31    9580784.77
4      2  2012-03-31   22543946.63
5      2  2012-06-30   25085123.61
6      2  2012-09-30   22396867.61
7      2  2012-12-31   11470757.52

new_df = []
for index, row in monthly_sales.iterrows():
    if index == 1:  # ---- Not sure what condition to put here
        q2 = row['Weekly_Sales']
        q3 = row['Weekly_Sales']
        growth_rate = (q3 - q2)/(q2*100)
        new_df.append([row['Store'],growth_rate])
    #print(index, row['Store'],row['Date'], row['Weekly_Sales'])
    #exit;
new_df

2 answers

User

#1 · edited 2024-04-26 01:00:19

You can try:

df["Date"] = pd.to_datetime(df["Date"])
df["Weekly_Sales"] = pd.to_numeric(df["Weekly_Sales"])


out = df.sort_values(by=["Store", "Date"]) \
        .groupby(["Store"]) \
        .agg(growth_Q3=("Weekly_Sales", lambda x: (x.iloc[2] - x.iloc[1])/(x.iloc[1]) * 100))

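Explanation:

Convert the columns to the appropriate types (only needed if they are not already correct). To check the types, you can use df.dtypes
1. Convert Date to datetime objects with pd.to_datetime
2. Convert Weekly_Sales to numeric with pd.to_numeric
Sort the values by Store and Date so that the dates within each store are in chronological order. We can use sort_values
Group by Store so that growth_rate is computed for each store
For each group, aggregate the rows with a custom aggregation function using agg:
1. We first compute the growth rate with a lambda function. We use iloc to select the quarter 2 and quarter 3 values. The formula used is: (Q3-Q2)/Q2 * 100
2. We then use the named-aggregation notation of the agg function to name the result growth_Q3. Putting "Weekly_Sales" before the lambda means the lambda function is applied to the "Weekly_Sales" column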
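As a quick sanity check of the formula, the arithmetic can be worked by hand for Store 1 using the two values from the question's sample data. This is only a sketch; the variable names are illustrative. It also shows why the expression in the question, which divides by `q2*100`, gives a value 10,000 times too small.

```python
# Quarterly sales for Store 1, copied from the sample data in the question
q2 = 21036965.58  # 2012-06-30
q3 = 18633209.98  # 2012-09-30

# Percentage growth from Q2 to Q3: (Q3 - Q2) / Q2 * 100
growth_pct = (q3 - q2) / q2 * 100

# The expression used in the question divides by 100 instead of multiplying,
# so its result is 10,000 times smaller than the percentage above
question_expr = (q3 - q2) / (q2 * 100)

print(round(growth_pct, 6))
print(question_expr)
```

The first value agrees with the growth_Q3 figure reported for Store 1 in the output of this answer.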

Full code + illustration:

# Step 1 (Optional if types are already correct)
print(df.dtypes)
# Store                    int64
# Date                    object
# Weekly_Sales            object
# dtype: object

df["Date"] = pd.to_datetime(df["Date"])
df["Weekly_Sales"] = pd.to_numeric(df["Weekly_Sales"])
print(df.dtypes)
# Store                    int64
# Date            datetime64[ns]
# Weekly_Sales           float64
# dtype: object

# Step 2 (Optional if data already sorted)
print(df.sort_values(by=["Store", "Date"]))
#    Store       Date  Weekly_Sales
# 0      1 2012-03-31   18951097.69
# 1      1 2012-06-30   21036965.58
# 2      1 2012-09-30   18633209.98
# 3      1 2012-12-31    9580784.77
# 4      2 2012-03-31   22543946.63
# 5      2 2012-06-30   25085123.61
# 6      2 2012-09-30   22396867.61
# 7      2 2012-12-31   11470757.52

# Steps 3 and 4
print(df.sort_values(by=["Store", "Date"])
        .groupby(["Store"])
        .agg(growth_Q3=("Weekly_Sales", lambda x: (x.iloc[2] - x.iloc[1])/x.iloc[1] * 100)))
#        growth_Q3
# Store
# 1     -11.426342
# 2     -10.716535
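The question asks which stores had "good" growth, so one possible follow-up is to rank the stores and filter by a cut-off. The sketch below rebuilds the sample data from the question so that it runs on its own; the `threshold` value is purely hypothetical and should be replaced with whatever counts as good growth for your data.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    "Store": [1, 1, 1, 1, 2, 2, 2, 2],
    "Date": ["2012-03-31", "2012-06-30", "2012-09-30", "2012-12-31"] * 2,
    "Weekly_Sales": [18951097.69, 21036965.58, 18633209.98, 9580784.77,
                     22543946.63, 25085123.61, 22396867.61, 11470757.52],
})
df["Date"] = pd.to_datetime(df["Date"])

# Same pipeline as above: sort, group by store, compute Q2 -> Q3 growth in percent
out = (df.sort_values(by=["Store", "Date"])
         .groupby("Store")
         .agg(growth_Q3=("Weekly_Sales",
                         lambda x: (x.iloc[2] - x.iloc[1]) / x.iloc[1] * 100)))

# Rank stores from best to worst Q3 growth
ranked = out.sort_values("growth_Q3", ascending=False)

# Hypothetical cut-off for "good" growth, in percent (an assumption, not from the question)
threshold = -11.0
good_stores = ranked[ranked["growth_Q3"] > threshold]
print(good_stores)
```

With the sample data both stores shrank in Q3, so only Store 2 passes this illustrative cut-off.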

User

#2 · edited 2024-04-26 01:00:19

#get the quarters into a different column : 
df['Quarter'] = df.Date.dt.quarter
#get the groupings for the percent change from quarters 2 to 3 : 
pct_change = (df.query('Quarter in [2,3]')
              .groupby('Store')
              .Weekly_Sales
              .pct_change()
              .mul(100)
              .dropna()
             )
pct_change

2   -11.426342
6   -10.716535
Name: Weekly_Sales, dtype: float64

#get store number at third quarter:
store = df.loc[df['Quarter']==3,'Store']

2    1
6    2
Name: Store, dtype: int64

#merge the two objects
pd.concat([store,pct_change],axis=1)

    Store   Weekly_Sales
2   1   -11.426342
6   2   -10.716535

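Another approach:

We know the data is arranged per store, and each store has 4 rows, one per quarter. Quarter 2 and quarter 3 will therefore sit at positions 1 and 2 within each store group: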

filtered = (df
             #the nth function allows us to pick rows per group
            .groupby('Store').nth([1,2])
            .pivot(columns='Quarter',values='Weekly_Sales')
            .pct_change(axis=1)
            .mul(100)
            .dropna(axis=1)
            .rename(columns={3:'growth'})
           )

filtered

Quarter growth
Store   
1       -11.426342
2       -10.716535
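The positional approach relies on every store having its four quarters in order, and the index returned by `groupby().nth()` can differ between pandas versions. A variant that selects the quarters by label instead is sketched below. It assumes each store has exactly one row per quarter, and the names `wide` and `growth` are illustrative.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    "Store": [1, 1, 1, 1, 2, 2, 2, 2],
    "Date": ["2012-03-31", "2012-06-30", "2012-09-30", "2012-12-31"] * 2,
    "Weekly_Sales": [18951097.69, 21036965.58, 18633209.98, 9580784.77,
                     22543946.63, 25085123.61, 22396867.61, 11470757.52],
})
df["Date"] = pd.to_datetime(df["Date"])
df["Quarter"] = df["Date"].dt.quarter

# One row per store, one column per quarter (1, 2, 3, 4)
wide = df.pivot(index="Store", columns="Quarter", values="Weekly_Sales")

# Percentage growth from Q2 to Q3, selected by quarter label rather than position
growth = ((wide[3] - wide[2]) / wide[2] * 100).rename("growth")
print(growth)
```

Because the quarter columns are addressed by label, the result does not depend on row order within each store.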
