Python DataFrame filter conditions: is there a faster way?

Posted 2024-05-16 10:07:28


import datetime as DT
import pandas as pd

parts_list = imp_parts_df['Parts'].tolist()
sub_week_list = ['2016-12-11', '2016-12-04', '2016-11-27', '2016-11-20', '2016-11-13']
start = DT.datetime.now()
for p in parts_list:
    for thisdate in sub_week_list:
        thisweek_start = pd.to_datetime(thisdate, format='%Y-%m-%d')  # e.g. 2016-12-11
        thisweek_end = thisweek_start + DT.timedelta(days=7)  # exclusive end, 7 days after the week start

        # Count shipments of this part from USW1 that fall inside the week
        val_shipped = len(shipment_df[(shipment_df['loc'] == 'USW1') & (shipment_df['part'] == str(p)) & (shipment_df['shipped_date'] >= thisweek_start) & (shipment_df['shipped_date'] < thisweek_end)])

# Parenthesize the timedelta so total_seconds() is called on it, not on print()'s return value
print((DT.datetime.now() - start).total_seconds())

shipment_df has about 35,000 records.

parts_list has 436 parts.

sub_week_list has 5 dates.

Running this code took 438.13 seconds in total.

Is there a faster way?


1 answer
User
#1 · Posted 2024-05-16 10:07:28
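parts_list = imp_parts_df['Parts'].astype(str).tolist()
start = DT.datetime.now()
for p in parts_list:
    # query() evaluates the whole filter as one expression; @p refers to the loop variable
    q = 'loc == "xxx" & part == @p & "2016-11-20" <= shipped_date < "2016-11-27"'
    val_shipped = len(shipment_df.query(q))

# Parenthesize the timedelta so total_seconds() is called on it, not on print()'s return value
print((DT.datetime.now() - start).total_seconds())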
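
Both loops above still scan shipment_df once per part (and, in the question, once per part and week). As a rough sketch of another option, not something the answer itself proposes, the filter on loc and part can be applied once and the rows counted per part for each week with value_counts. The function name weekly_shipped_counts and the sample data are made up for illustration; the column names loc, part and shipped_date are taken from the question, and shipped_date is assumed to already be a datetime column.

```python
import pandas as pd

def weekly_shipped_counts(shipment_df, parts, week_starts, loc):
    """Hypothetical helper: shipment counts per part (rows) and week start (columns)."""
    # Normalize parts to unique strings, keeping their original order
    parts = list(dict.fromkeys(str(p) for p in parts))
    # Filter on location and part once instead of once per (part, week) pair
    sub = shipment_df[(shipment_df["loc"] == loc) & shipment_df["part"].isin(parts)]
    counts = {}
    for week in week_starts:
        start = pd.Timestamp(week)
        end = start + pd.Timedelta(days=7)  # exclusive end of the week
        in_week = sub[(sub["shipped_date"] >= start) & (sub["shipped_date"] < end)]
        # value_counts tallies rows per part; reindex adds 0 for parts with no shipments
        counts[week] = in_week["part"].value_counts().reindex(parts, fill_value=0)
    return pd.DataFrame(counts)

# Made-up sample data, only to show the shape of the result
shipment_df = pd.DataFrame({
    "loc": ["USW1", "USW1", "USW1", "USE1", "USW1"],
    "part": ["A1", "A1", "B2", "A1", "B2"],
    "shipped_date": pd.to_datetime(
        ["2016-12-11", "2016-12-17", "2016-12-12", "2016-12-13", "2016-12-04"]
    ),
})
result = weekly_shipped_counts(
    shipment_df, ["A1", "B2", "C3"], ["2016-12-11", "2016-12-04"], loc="USW1"
)
print(result)
```

With 5 weeks this makes 5 passes over the already filtered rows rather than one pass over the full frame for each of the 436 x 5 combinations. Whether it is faster on the real data would need to be timed.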
