Why doesn't my Python/pandas code filter termination dates correctly?

Posted 2024-05-15 03:37:16


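My Python script imports an xlsx file, removes a few IDs, and then filters rows by termination date using my term_date variable. Since today is June 12, I don't want any termination dates before 3/14/2018 (90 days ago) in my output. However, I'm still seeing termination dates from February. Any idea why?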

import pandas as pd
from datetime import datetime, timedelta

TODAY = datetime.today().strftime("%d%m%Y")
term_date = (datetime.today() - timedelta(days=90))
#term_date = (pd.to_datetime('today') - pd.Timedelta(days=90))
remove_id = ['381998','201439']

df = pd.read_excel('Details.xlsx')
df = df[~df['Employee ID'].isin(remove_id)]
df['Termination Date'] = df['Termination Date'].astype(str)

df['Termination Date'] = df['Termination Date'].str.replace('nan', '1/1/2050')
df['Termination Date'] = pd.to_datetime(df['Termination Date'])
df['Hire Date'] = pd.to_datetime(df['Hire Date'])
df['Home Address Line 1'] = df['Home Address Line 1'].str.replace(',', '')
df['Home Address Line 2'] = df['Home Address Line 2'].str.replace(',', '')
df['Shipping Address Line 1'] = df['Shipping Address Line 1'].str.replace(',', '')
df['Shipping Address Line 2'] = df['Shipping Address Line 2'].str.replace(',', '')
df2 = df[df['Termination Date'] >= term_date]

df2.to_excel('roster_file2_' + TODAY + '.xlsx')

A sample of my DataFrame:

Employee ID Termination Date    Hire Date   Home Address Line 1
234254              2/1/2018    1/1/2015    20 Main St
675867              5/2/2018    1/1/2015    10 Elm St
345665              1/1/2050    1/1/2015    1 Chestnut St
974445              1/1/2050    1/1/2015    12 Cherry St
235465             11/3/2017    1/1/2015    9 Lucky St

1 answer
User
#1 · Posted 2024-05-15 03:37:16

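This looks like a date-format problem. By default pd.to_datetime reads an ambiguous string such as 5/2/2018 month first (May 2), so if your data is day first (5 February) that row passes the filter. Try passing dayfirst=True when converting to datetime: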

df.TerminationDate = pd.to_datetime(df.TerminationDate,dayfirst=True)

df[df.TerminationDate >= term_date]
Out[519]: 
EmployeeID TerminationDate    HireDate   HomeAddress
2      345665      2050-01-01  01/01/2015  1 ChestnutSt
3      974445      2050-01-01  01/01/2015   12 CherrySt
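
To make the difference concrete, here is a minimal sketch that parses the question's sample rows both ways. It assumes the dates are day/month/year text, uses the question's column names, and pins "today" to 2018-06-12 so the result is reproducible. The DataFrame is rebuilt by hand for illustration rather than read from Details.xlsx.

```python
import pandas as pd

# Hand-built copy of the question's sample rows (assumed to be day/month/year text).
df = pd.DataFrame({
    "Employee ID": [234254, 675867, 345665, 974445, 235465],
    "Termination Date": ["2/1/2018", "5/2/2018", "1/1/2050", "1/1/2050", "11/3/2017"],
})

# Fixed reference date for reproducibility; the question uses datetime.today().
today = pd.Timestamp("2018-06-12")
term_date = today - pd.Timedelta(days=90)  # 2018-03-14

# Default parsing reads ambiguous strings month first: 5/2/2018 becomes 2 May 2018.
month_first = pd.to_datetime(df["Termination Date"])

# dayfirst=True reads the same strings day first: 5/2/2018 becomes 5 February 2018.
day_first = pd.to_datetime(df["Termination Date"], dayfirst=True)

# Apply the question's filter to each interpretation.
keep_default = df[month_first >= term_date]
keep_dayfirst = df[day_first >= term_date]

print(keep_default["Employee ID"].tolist())   # 675867 slips through as "2 May"
print(keep_dayfirst["Employee ID"].tolist())  # only the 1/1/2050 placeholder rows remain
```

If the format is known to be fixed, passing format="%d/%m/%Y" instead of dayfirst=True is a stricter alternative, since values that do not match raise an error rather than being guessed.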
