Add rows to a df if they do not exist, based on a condition on 2 columns

2 answers

User

#1 · edited 2024-06-16 10:00:44

Use:

print (df)
   Product_id    month  sales
0           1  01-2018     25
1           1  02-2018     34
2           1  06-2018     29 <- changed dates
3           1  04-2018     45
4           2  02-2018      3
5           2  04-2018      2

# Pass the format explicitly so '01-2018' is parsed as month-year
df['month'] = pd.to_datetime(df['month'], format='%m-%Y')

df = (df.set_index(['month','Product_id'])['sales']
        .unstack(fill_value=0)
        .asfreq('MS', fill_value=0)
        .unstack()
        .reset_index(name='value'))
print (df)
    Product_id      month  value
0            1 2018-01-01     25
1            1 2018-02-01     34
2            1 2018-03-01      0
3            1 2018-04-01     45
4            1 2018-05-01      0
5            1 2018-06-01     29
6            2 2018-01-01      0
7            2 2018-02-01      3
8            2 2018-03-01      0
9            2 2018-04-01      2
10           2 2018-05-01      0
11           2 2018-06-01      0
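
The same fill can also be sketched by reindexing against a full grid of (Product_id, month) pairs. This is a minimal, self-contained variant of the idea above, not the answerer's code: the names `all_months`, `full_index` and `filled` are made up for illustration, and it assumes each (Product_id, month) pair occurs at most once.

```python
import pandas as pd

df = pd.DataFrame({
    'Product_id': [1, 1, 1, 1, 2, 2],
    'month': ['01-2018', '02-2018', '06-2018', '04-2018', '02-2018', '04-2018'],
    'sales': [25, 34, 29, 45, 3, 2],
})
df['month'] = pd.to_datetime(df['month'], format='%m-%Y')

# Every month start between the earliest and latest observed month.
all_months = pd.date_range(df['month'].min(), df['month'].max(), freq='MS')

# Full grid of (Product_id, month) pairs.
full_index = pd.MultiIndex.from_product(
    [df['Product_id'].unique(), all_months],
    names=['Product_id', 'month'])

# Pairs missing from the data are added with sales=0.
filled = (df.set_index(['Product_id', 'month'])['sales']
            .reindex(full_index, fill_value=0)
            .reset_index())
print(filled)
```

Unlike the `unstack`/`asfreq` chain, this keeps the original column name `sales` and makes the target grid explicit, which may be easier to adapt if the set of products or the date range should come from somewhere else.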

User

#2 · edited 2024-06-16 10:00:44

I can't think of any direct solution. You can use the following snippet:

import pandas as pd

df = pd.DataFrame([{'Product_id': 1, 'month': '01-2018', 'sales': 25},
                   {'Product_id': 1, 'month': '02-2018', 'sales': 34},
                   {'Product_id': 1, 'month': '03-2018', 'sales': 29},
                   {'Product_id': 1, 'month': '04-2018', 'sales': 45},
                   {'Product_id': 2, 'month': '02-2018', 'sales': 3},
                   {'Product_id': 2, 'month': '04-2018', 'sales': 2}])


# Maintaining separate columns for month and year. Just easy to groupby.
# You can also convert 'month' column to date object
df[['month_no','year']] = df.month.str.split('-', expand=True)
df['month_no'] = df['month_no'].astype(int) 
df['year'] = df['year'].astype(int) 

unique_product_ids = df['Product_id'].unique()
unique_years = df['year'].unique()
grpby_df = df.groupby(by=['Product_id','year'])

new_rows = []
for unique_product_id in unique_product_ids:
    for unique_year in unique_years:
        try:
            subset_df = grpby_df.get_group((unique_product_id, unique_year))
        except KeyError:
            # No data for this product/year combination
            continue
        start_month = min(subset_df['month_no'])
        end_month = 12 # Assuming sales=0 for all subsequent months
        months_list = list(subset_df['month_no'])
        for i in range(start_month, end_month + 1):
            if i not in months_list:
                # Collect missing rows; DataFrame.append was removed in pandas 2.0
                new_rows.append({
                    'Product_id': unique_product_id,
                    'month_no': i,
                    'year': unique_year,
                    'sales': 0})

# Add all missing rows in one step
df = pd.concat([df, pd.DataFrame(new_rows)], ignore_index=True)

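The result has 23 rows in total: 12 for product 1 and 11 for product 2 (because the month before its first recorded month, January, is not filled in).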
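
For comparison, the per-group fill can be sketched with `reindex` instead of a membership test on every month. This is an illustrative variant under the same assumption (sales are 0 from each product's first recorded month through December); the names `parts` and `filled` are hypothetical, and the new rows carry no `month` string column.

```python
import pandas as pd

df = pd.DataFrame({
    'Product_id': [1, 1, 1, 1, 2, 2],
    'month': ['01-2018', '02-2018', '03-2018', '04-2018', '02-2018', '04-2018'],
    'sales': [25, 34, 29, 45, 3, 2],
})
df[['month_no', 'year']] = df['month'].str.split('-', expand=True).astype(int)

parts = []
for (product_id, year), group in df.groupby(['Product_id', 'year']):
    # Months from the group's first recorded month through December.
    months = pd.RangeIndex(group['month_no'].min(), 13, name='month_no')
    part = (group.set_index('month_no')['sales']
                 .reindex(months, fill_value=0)   # missing months get sales=0
                 .rename_axis('month_no')
                 .reset_index())
    part.insert(0, 'Product_id', product_id)
    part.insert(1, 'year', year)
    parts.append(part)

filled = pd.concat(parts, ignore_index=True)
print(len(filled))
```

On this sample data the length should match the row count stated above.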
