Finding missing business dates from filenames

Posted 2024-05-15 20:05:04


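I have a list of filenames that carry business dates, starting from 20210212 (bank holidays and public holidays can also be business days), and some of the business dates are missing. Can someone tell me how to get the missing business dates? Business days run from Monday to Friday.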

list = ['20210212d_filename0','20210215d_filename1','20210218d_filename3','20210217d_filename4']
The output should be:
20210216

2 answers

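The logic is simple. First, extract the dates from the strings. Then convert them to date objects, which gives the list of dates that are present. Finally, find every date between the earliest and the latest of them that falls on Monday to Friday and is missing from that list.

With the help of pd.date_range() we can list every date from the start date to the end date. Once we have all the dates, we check whether each one falls on Monday to Friday. If it does, we append it to a final list, so that list contains only weekday dates.

Any date in that final list that is not in the initial list is appended to the list of missing dates.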

Full code:

import pandas as pd
from datetime import timedelta
import datetime

list1 = ['20210212d_filename0','20210215d_filename1','20210218d_filename3','20210217d_filename4']

list2 = [s[:8] for s in list1]  #taking the dates from strings

list3 = []
for e in list2:
  e = e[:4]+"-"+e[4:6]+"-"+e[6:8] #formatting the string to yyyy-mm-dd
  list3.append(e)

list4 = [datetime.datetime.strptime(e, '%Y-%m-%d').date() for e in list3] #converting strings to dates

#getting all the dates from the earliest date to the latest
#(min/max are used because the filenames are not in date order)
all_dates = pd.date_range(min(list4),max(list4),freq='d').date.tolist()

final_list=[]
days = ['Monday', 'Tuesday', 'Wednesday','Thursday', 'Friday']
for date in all_dates:
  #append only if they are between monday to friday
  if date.strftime('%A') in days:
    final_list.append(str(date))

final_absent_dates = []
for date in final_list:
  if date not in list3: #if dates are not present in list3
    final_absent_dates.append(date)

final_absent_dates= [e[:4]+e[5:7]+e[8:10] for e in final_absent_dates] #formatting date to string

print(final_absent_dates)

#>>>output = ['20210216']
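
For comparison, the steps above can also be sketched with only the standard library, using set operations instead of pandas. This is a minimal sketch, not the answer's own method: the function name missing_weekdays is illustrative, and it assumes, as the code above does, that the first eight characters of every filename are a YYYYMMDD date. No holiday calendar is applied, in line with the question's note that holidays can also be business days.

```python
from datetime import datetime, timedelta

filenames = ['20210212d_filename0', '20210215d_filename1',
             '20210218d_filename3', '20210217d_filename4']

def missing_weekdays(names):
    """Return the Monday-to-Friday dates (as YYYYMMDD strings) with no file."""
    # Assumption: the first 8 characters of each name are the date.
    present = {datetime.strptime(n[:8], '%Y%m%d').date() for n in names}
    # Use min/max so the result does not depend on the order of the list.
    first, last = min(present), max(present)
    span = (last - first).days + 1
    every_day = {first + timedelta(days=i) for i in range(span)}
    # weekday() returns 0 for Monday through 6 for Sunday.
    weekdays = {d for d in every_day if d.weekday() < 5}
    return [d.strftime('%Y%m%d') for d in sorted(weekdays - present)]

print(missing_weekdays(filenames))  # ['20210216']
```

Because the present dates are held in a set, the membership test stays cheap even for long file lists.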

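You need to import two modules, datetime and holidays. Then use Python date operations to check whether each date is a working day.

I have commented every line so that you can follow the code easily.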

import holidays
import datetime

filelist = ['20210212d_filename0','20210215d_filename1','20210218d_filename3','20210217d_filename4']

#sort the list in ascending order of filenames
#this gives the earliest date as first item
filelist.sort()

#get the earliest date and last date to iterate through
first_day = datetime.date(int(filelist[0][:4]),int(filelist[0][4:6]),int(filelist[0][6:8]))
last_day = datetime.date(int(filelist[-1][:4]),int(filelist[-1][4:6]),int(filelist[-1][6:8]))

#store all the US holidays in a variable to check
bizholidays = holidays.US()

#set next_day to first day
next_day = first_day
#iterate until the last date in the file list
while next_day <= last_day:
    #check if day is Mon thru Fri (0=Mon, 1=Tue... 4=Fri, 5=Sat, 6=Sun)
    #check if day is NOT part of business holidays
    #check if date string of YYYYMMDD is not part of the current list
    #if all of them satisfy, then date is missing in file list
    if next_day.weekday() < 5 and next_day not in bizholidays and \
    not any(next_day.strftime("%Y%m%d") in fname for fname in filelist):

        #date is not part of the file so print the date
        print (f'File for working day : {next_day} missing')

    #increment to next day using timedelta(1)
    next_day += datetime.timedelta(1)

The output is:

File for working day : 2021-02-16 missing

I don't know where you are located, so here are some options for you to consider.

Indian holidays:

bizholidays = holidays.IN() #India

print (bizholidays)

{datetime.date(2021, 1, 14): 'Makar Sankranti / Pongal', datetime.date(2021, 1, 26): 'Republic Day', datetime.date(2021, 8, 15): 'Independence Day', datetime.date(2021, 10, 2): 'Gandhi Jayanti', datetime.date(2021, 5, 1): 'Labour Day', datetime.date(2021, 12, 25): 'Christmas'}

UK holidays:

bizholidays = holidays.UK() #United Kingdom

print (bizholidays)

{datetime.date(2021, 1, 1): "New Year's Day", datetime.date(2021, 1, 2): 'New Year Holiday [Scotland]', datetime.date(2021, 1, 4): 'New Year Holiday [Scotland] (Observed)', datetime.date(2021, 3, 17): "St. Patrick's Day [Northern Ireland]", datetime.date(2021, 4, 2): 'Good Friday', datetime.date(2021, 4, 5): 'Easter Monday [England, Wales, Northern Ireland]', datetime.date(2021, 5, 3): 'May Day', datetime.date(2021, 5, 31): 'Spring Bank Holiday', datetime.date(2021, 7, 12): 'Battle of the Boyne [Northern Ireland]', datetime.date(2021, 8, 2): 'Summer Bank Holiday [Scotland]', datetime.date(2021, 8, 30): 'Late Summer Bank Holiday [England, Wales, Northern Ireland]', datetime.date(2021, 11, 30): "St. Andrew's Day [Scotland]", datetime.date(2021, 12, 25): 'Christmas Day', datetime.date(2021, 12, 27): 'Christmas Day (Observed)', datetime.date(2021, 12, 26): 'Boxing Day', datetime.date(2021, 12, 28): 'Boxing Day (Observed)'}

Canadian holidays:

bizholidays = holidays.CA() #Canada

print (bizholidays)

{datetime.date(2021, 1, 1): "New Year's Day", datetime.date(2021, 12, 31): "New Year's Day (Observed)", datetime.date(2021, 2, 15): 'Family Day', datetime.date(2021, 4, 2): 'Good Friday', datetime.date(2021, 5, 24): 'Victoria Day', datetime.date(2021, 7, 1): 'Canada Day', datetime.date(2021, 8, 2): 'Civic Holiday', datetime.date(2021, 9, 6): 'Labour Day', datetime.date(2021, 10, 11): 'Thanksgiving', datetime.date(2021, 12, 25): 'Christmas Day', datetime.date(2021, 12, 24): 'Christmas Day (Observed)', datetime.date(2021, 12, 27): 'Boxing Day (Observed)'}

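You can supply the two-letter country code to get that country's holidays and then check against them.

If the default calendar does not include your holidays and you want to add specific days, you can do that too.

Suppose you want to add these 4 days to your list. You can do it as follows: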

#You can also add custom holidays and check for them
custom_holidays = holidays.US()
custom_holidays.append({"2021-02-14": "Valentines Day"})
custom_holidays.append(['2021-07-01', '2021-02-16','09/30/2021'])

Then change the condition in the code to check the custom holidays, and it will work as expected:

if next_day.weekday() < 5 and next_day not in custom_holidays and \
not any(next_day.strftime("%Y%m%d") in fname for fname in filelist):
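
To avoid editing the condition each time the calendar changes, the loop could be wrapped in a function that takes the non-working days as an argument. The sketch below is one possible arrangement, not part of the original answer: find_missing and non_working are hypothetical names, and non_working is assumed to be anything that supports a `date in ...` test, such as a holidays calendar object or a plain set of dates.

```python
import datetime

filelist = ['20210212d_filename0', '20210215d_filename1',
            '20210218d_filename3', '20210217d_filename4']

def find_missing(filelist, non_working=()):
    """Return the weekdays between the first and last file date with no file.

    non_working (hypothetical parameter) is any container of dates to skip.
    """
    # Assumption: each filename starts with a YYYYMMDD date.
    present = {datetime.datetime.strptime(f[:8], "%Y%m%d").date()
               for f in filelist}
    day, last_day = min(present), max(present)
    missing = []
    while day <= last_day:
        # Mon..Fri only, not a listed non-working day, and no file present.
        if day.weekday() < 5 and day not in non_working and day not in present:
            missing.append(day)
        day += datetime.timedelta(days=1)
    return missing

# No calendar at all: holidays count as business days.
print(find_missing(filelist))         # [datetime.date(2021, 2, 16)]

# A plain set standing in for a calendar that lists 2021-02-16.
extra = {datetime.date(2021, 2, 16)}
print(find_missing(filelist, extra))  # []
```

Note that the custom_holidays example above adds 2021-02-16 as a holiday, so checking against it would be expected to report no missing file for the sample list, as in the second call here.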
