Splitting a text file into multiple files and loading them into DataFrames

Table 501
----------------------------------------------------------------
|Sale|Di|Dv|Cus |Mat |Valid From|Valid to |
----------------------------------------------------------------
|88|01|02|dd|20300 |24.05.2012|31.12.9999|
|889|01|02|dd|20300 |24.05.2012|31.12.9999|
|890|01|02|dd|20300 |24.05.2012|31.12.9999|
----------------------------------------------------------------
Table 55
---------------------------------------------------------
|Sale|Di|Dv|Cus |Grou|S|Valid From|Valid to |
---------------------------------------------------------
|4500|44|55|A|01560 | |11.02.2019|31.12.9999|
|4500|44|55|BBB|55070 | |30.04.2018|31.12.9999|
|4500|44|55|D|55080 | |30.04.2018|31.12.9999|
|4500|44|55|D|55420 | |30.04.2018|31.12.9999|
|4500|44|55|8834496 |55450 | |30.04.2018|31.12.9999|
---------------------------------------------------------
Table 065
----------------------------------------------------------------
|Sale|Di|Dv|Cus |Mat |Valid From|Valid to |
----------------------------------------------------------------
|4500|44|55|bbbb |01000 |29.05.2013|31.12.9999|
----------------------------------------------------------------

1 answer

User

#1 · Posted 2024-06-02 04:34:48

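Try something like the following, with error handling for the year 9999, which cannot be converted to a datetime object (pandas nanosecond timestamps end at 2262-04-11):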

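import pandas as pd

# Read the table, skipping blank lines and the dashed separator lines.
with open("0400.txt", "r") as f:
    lines = [
        # [1:-1] drops the empty strings produced by the outer pipes.
        [y.strip() for y in x.strip().split("|")[1:-1]]
        for x in f
        if x.strip() and not x.strip().startswith("-")
    ]

# The first remaining row is the header.
df = pd.DataFrame(lines[1:], columns=lines[0])

# 31.12.9999 is out of range, so it is coerced to NaT and then replaced
# with the largest date a pandas Timestamp can hold.
df["Valid to"] = pd.to_datetime(
    df["Valid to"], format="%d.%m.%Y", errors="coerce"
).fillna(pd.Timestamp.max.normalize())
df["Valid From"] = pd.to_datetime(df["Valid From"], format="%d.%m.%Y", errors="coerce")
print(df)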

     Sale  Di  Dv      Cus    Mat Valid From    Valid to  
0    0400  01  02  1327260  20300 2012-05-24  2262-04-11  
1    0400  01  02  1327260  20300 2012-05-24  2262-04-11  
2    0400  01  02  1327260  20300 2012-05-24  2262-04-11
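
The snippet above reads a file that already holds a single table. For a dump like the one in the question, where several tables sit in one file, the same idea can be applied per block. The following is only a rough sketch: the function name `split_tables` and the inline `sample` string are made up for illustration, and it assumes every block starts with a line beginning with "Table" and every data row starts and ends with a pipe.

```python
import pandas as pd


def split_tables(text):
    """Split a multi-table dump into {table title: DataFrame} (hypothetical helper)."""
    blocks = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("-"):
            continue  # skip blank lines and dashed separators
        if line.startswith("Table"):
            current = line  # e.g. "Table 501" opens a new block
            blocks[current] = []
        elif line.startswith("|") and current is not None:
            # [1:-1] drops the empty strings produced by the outer pipes.
            blocks[current].append([c.strip() for c in line.split("|")[1:-1]])

    frames = {}
    for name, rows in blocks.items():
        if not rows:
            continue
        df = pd.DataFrame(rows[1:], columns=rows[0])  # first row is the header
        for col in ("Valid From", "Valid to"):
            if col in df.columns:
                df[col] = pd.to_datetime(df[col], format="%d.%m.%Y", errors="coerce")
        if "Valid to" in df.columns:
            # Out-of-range dates such as 31.12.9999 became NaT above.
            df["Valid to"] = df["Valid to"].fillna(pd.Timestamp.max.normalize())
        frames[name] = df
    return frames


# Shortened sample in the same layout as the question's data.
sample = """\
Table 501
------
|Sale|Di|Dv|Cus |Mat |Valid From|Valid to |
------
|88|01|02|dd|20300 |24.05.2012|31.12.9999|
|889|01|02|dd|20300 |24.05.2012|31.12.9999|
Table 55
------
|Sale|Di|Dv|Cus |Grou|S|Valid From|Valid to |
------
|4500|44|55|A|01560 | |11.02.2019|31.12.9999|
"""

tables = split_tables(sample)
for name, df in tables.items():
    print(name)
    print(df)
```

If separate files are needed as well, each frame could be written out inside the loop, for example with `df.to_csv(name.replace(" ", "_") + ".csv", index=False)`.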
