Parsing SAS datetimes into a pandas DataFrame

Posted 2024-04-26 21:05:20


I am loading a CSV generated by SAS into a pandas DataFrame. To parse the SAS datetime values, I wrote the following parser function:

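import datetime as dt

def parse_date(d):
    # SAS datetimes are stored as seconds since 1960-01-01 00:00:00
    try:
        date = dt.timedelta(seconds=int(d)) + dt.datetime(1960, 1, 1)
        return date
    except ValueError:
        # Note: the function implicitly returns None on this path
        print("There was a problem parsing.")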
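
For reference, the function above relies on SAS datetime values counting seconds from 1960-01-01 00:00:00. The same conversion can be sketched for a whole column at once with `pd.to_datetime` and its `unit` and `origin` parameters. This is only a minimal sketch: `sas_seconds_to_datetime` is a hypothetical helper name (not part of pandas), and the sample values are made up for illustration.

```python
import pandas as pd

def sas_seconds_to_datetime(values):
    """Convert SAS datetime values (seconds since 1960-01-01) to pandas datetimes."""
    # Coerce anything non-numeric (blank strings, '.', stray text) to NaN
    # instead of raising, so one bad cell does not abort the conversion.
    seconds = pd.to_numeric(pd.Series(values), errors="coerce")
    # unit="s" with the SAS epoch as origin; NaN values become NaT.
    return pd.to_datetime(seconds, unit="s", origin="1960-01-01")

# Made-up sample values: two valid, two unparseable
converted = sas_seconds_to_datetime(["0", "86400", "", "."])
print(converted)
```

Unlike `int(d)`, this path tolerates values written as floats or in scientific notation, and it marks unparseable cells as `NaT` rather than raising.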

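Now, when I pass this function a single SAS datetime, such as the first value in one of my date columns, it gives me the output I want.

[example call and output missing from the source]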

However, when I pass the function to pd.read_csv() as the date parser, I get ValueErrors, as shown below:

import pandas as pd

def get_ods_reader():
    # parse_int is another converter of mine (definition not shown)
    ods_reader = pd.read_csv("mycsv.csv",
                             chunksize=200000, parse_dates=[6, 9, 10, 16],
                             dtype={"account_nbr": object, "REPOSSESSION_STATUS_CD": object},
                             converters={"repossession_ind": parse_int},
                             date_parser=parse_date)
    return ods_reader

# Getting the data types of all columns
chunk_dtypes = []
for chunk in get_ods_reader():
    print(chunk.head(5))
    chunk_dtypes.append(chunk.dtypes)

Out[10]:
There was a problem parsing.
There was a problem parsing.
There was a problem parsing.
There was a problem parsing.
There was a problem parsing.
There was a problem parsing.
There was a problem parsing.
...
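
One way to see which raw values trigger the message is to read the date columns as plain strings, convert them after loading, and collect whatever fails to convert. The sketch below rests on several assumptions: an in-memory CSV stands in for `mycsv.csv`, the column name `open_dt` and the sample rows are hypothetical, and it does not claim to identify the actual cause in the original data.

```python
import io
import pandas as pd

# Hypothetical stand-in for "mycsv.csv"; column names and rows are illustrative only.
csv_text = "account_nbr,open_dt\n001,1704067200\n002,\n003,1.7040672E9\n004,.\n"

date_cols = ["open_dt"]  # assumed names of the SAS datetime columns

# Read the date columns as strings so nothing is parsed (or fails) during loading.
reader = pd.read_csv(
    io.StringIO(csv_text),
    chunksize=2,
    dtype={"account_nbr": object, **{c: object for c in date_cols}},
)

bad_values = []
chunks = []
for chunk in reader:
    for col in date_cols:
        raw = chunk[col]
        seconds = pd.to_numeric(raw, errors="coerce")
        # Raw values that were present but could not be read as numbers
        failed = raw[seconds.isna() & raw.notna()]
        bad_values.extend(failed.tolist())
        # SAS epoch: seconds since 1960-01-01; NaN becomes NaT
        chunk[col] = pd.to_datetime(seconds, unit="s", origin="1960-01-01")
    chunks.append(chunk)

result = pd.concat(chunks)
print(result)
print("Unparseable raw values:", bad_values)
```

Converting after loading also avoids the `date_parser` argument, which has been deprecated since pandas 2.0. In this made-up sample, empty cells and SAS-style `.` placeholders end up as `NaT`, while a value in scientific notation still converts.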
