I want to read two columns as dates, but only one of them is parsed as datetime.

import pandas as pd

CSV = 'text.csv'

df = pd.read_csv(CSV,
                 skiprows=0,
                 na_values=0,
                 parse_dates=['Date of Sign Up', 'Birth Date'],
                 usecols=['Date of Sign Up', 'A', 'B', 'C', 'D', 'Birth Date'])

df.info()  # Check info for column types and nan...

RangeIndex: 969 entries, 0 to 968
Data columns (total 6 columns):
Date of Sign Up    969 non-null datetime64[ns]
A                  969 non-null object
B                  969 non-null object
C                  969 non-null object
D                  969 non-null object
Birth Date         969 non-null object  ## <== Why doesn't this column read as datetime?
dtypes: datetime64[ns](1), object(5)
memory usage: 45.5+ KB

1 Answer

#1 · Posted on 2024-06-02 05:24:36

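The problem is that the Birth Date column contains at least one value that cannot be parsed as a datetime. When that happens, read_csv does not raise an error; it leaves the whole column unparsed.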
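To see this behaviour in isolation, here is a minimal sketch. The sample data and the names csv_text and birth_dtype are made up for illustration and are not from the question's file. A single bad value is enough to keep the column from becoming a datetime column; depending on the pandas version, a warning about the date format may also be printed.

```python
from io import StringIO

import pandas as pd

# Hypothetical sample: one value in 'Birth Date' is not a date.
csv_text = """Birth Date,a
1990-01-05,1
not a date,2
1985-07-23,3"""

df = pd.read_csv(StringIO(csv_text), parse_dates=['Birth Date'])

# The column is returned unparsed rather than as datetime64.
birth_dtype = df['Birth Date'].dtype
print(birth_dtype)
print(pd.api.types.is_datetime64_any_dtype(birth_dtype))
```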

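You can find the offending values like this:

# Values that cannot be parsed become NaT instead of raising an error
dates = pd.to_datetime(df['Birth Date'], errors='coerce')

# Show the original strings in the rows that failed to parse
print(df.loc[dates.isnull(), 'Birth Date'])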

Another solution is to convert the column explicitly, so that the problematic values are turned into NaT:

df['Birth Date'] = pd.to_datetime(df['Birth Date'], errors='coerce')
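The two steps can be combined in a small self-contained sketch. The Series below is invented sample data, and the names raw, parsed and bad are hypothetical. It also separates values that were already missing from values that were present but could not be parsed, which a plain isnull() check does not distinguish.

```python
import pandas as pd

# Hypothetical sample: one unparseable string and one genuinely missing value.
raw = pd.Series(['1990-01-05', 'not a date', '1985-07-23', None],
                name='Birth Date')

# errors='coerce' turns anything that cannot be parsed into NaT.
parsed = pd.to_datetime(raw, errors='coerce')

# Present in the raw data but NaT after parsing: the real offenders.
bad = raw[parsed.isna() & raw.notna()]
print(bad)
```

Inspecting bad before overwriting the column makes it easier to decide whether the values should be fixed in the source file or simply treated as missing.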

I tested whether 0 is correctly parsed as NaT:

from io import StringIO  # pd.compat.StringIO was removed in newer pandas versions

import pandas as pd

temp = """Date,a
2017-04-03,0
2017-04-04,1
0,2
2017-04-06,3
2017-04-07,4
2017-04-08,5"""
# after testing, replace StringIO(temp) with 'filename.csv'
df = pd.read_csv(StringIO(temp), na_values=0, parse_dates=['Date'])

print (df)
        Date    a
0 2017-04-03  NaN
1 2017-04-04  1.0
2        NaT  2.0
3 2017-04-06  3.0
4 2017-04-07  4.0
5 2017-04-08  5.0

print (df.dtypes)

Date    datetime64[ns]
a              float64
dtype: object

If there are other unparseable values, add them to na_values as well:

from io import StringIO  # pd.compat.StringIO was removed in newer pandas versions

import pandas as pd

temp = """Date,a
2017-04-03,0
string,1
0,2
2017-04-06,3
2017-04-07,4
2017-04-08,5"""
# after testing, replace StringIO(temp) with 'filename.csv'
df = pd.read_csv(StringIO(temp), na_values=[0, 'string'], parse_dates=['Date'])

print (df)
        Date    a
0 2017-04-03  NaN
1        NaT  1.0
2        NaT  2.0
3 2017-04-06  3.0
4 2017-04-07  4.0
5 2017-04-08  5.0

print (df.dtypes)
Date    datetime64[ns]
a              float64
dtype: object
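One side effect is visible in the outputs above: a scalar or list passed to na_values applies to every column, so the 0 in column a also becomes NaN and the column turns into float64. If that is not wanted, na_values also accepts a dict keyed by column name. The following is a minimal sketch on the same sample data, assuming only the Date column should treat 0 and 'string' as missing:

```python
from io import StringIO

import pandas as pd

temp = """Date,a
2017-04-03,0
string,1
0,2
2017-04-06,3
2017-04-07,4
2017-04-08,5"""

# Restrict the extra NA markers to the 'Date' column only.
df = pd.read_csv(StringIO(temp),
                 na_values={'Date': ['0', 'string']},
                 parse_dates=['Date'])

print(df)
print(df.dtypes)
```

With this variant the Date column still parses as datetime, while column a keeps its integer values, including the 0.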
