Pandas returns NaT when it shouldn't
[12 rows x 3 columns]
err [Timestamp('2013-05-30 23:00:00', tz=None), NaT, NaT]
err [Timestamp('2013-06-30 23:00:00', tz=None), 249.0, 60.0]
err [Timestamp('2013-07-31 23:00:00', tz=None), 161.0, 2.0]
err [Timestamp('2013-09-01 23:00:00', tz=None), 151.0, 11.0]
err [Timestamp('2013-09-04 23:00:00', tz=None), 14.0, 0.0]
err [Timestamp('2013-10-01 23:00:00', tz=None), 162.0, 64.0]
err [Timestamp('2013-11-01 00:00:00', tz=None), 281.0, 175.0]
err [Timestamp('2013-12-03 00:00:00', tz=None), 482.0, 168.0]
err [Timestamp('2014-01-02 00:00:00', tz=None), 378.0, nan]
err [Timestamp('2014-01-03 00:00:00', tz=None), NaT, NaT]
err [Timestamp('2014-02-03 00:00:00', tz=None), nan, 167.0]
err [Timestamp('2014-03-03 00:00:00', tz=None), 502.0, 167.0]
My data frame is:
time NTCS001G002 NTCS001W005
0 2013-05-30 23:00:00 NaN NaN
1 2013-06-30 23:00:00 249 60
2 2013-07-31 23:00:00 161 2
3 2013-09-01 23:00:00 151 11
4 2013-09-04 23:00:00 14 0
5 2013-10-01 23:00:00 162 64
6 2013-11-01 00:00:00 281 175
7 2013-12-03 00:00:00 482 168
8 2014-01-02 00:00:00 378 NaN
9 2014-01-03 00:00:00 NaN NaN
10 2014-02-03 00:00:00 NaN 167
11 2014-03-03 00:00:00 502 167
When I iterate over each row like this, I get the output shown above:
for index, row in diffs.iterrows():
    print("err", row.tolist())
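I'm not sure whether these NaT values are a bug. I think they should be NaN. Can Pandas be made not to return NaT? If not, how can I check for them, since I need to replace them in the list?
Thanks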
2 Answers
1
The value NaT means "not a time", just as nan does for numeric values.
Can you tell me the data types of the columns in your data frame? Try converting those columns to float.
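A minimal sketch of that check, assuming a small stand-in for the question's `diffs` frame (only three of its rows are reproduced here, and the `value_cols` name is just for illustration). Columns that contain NaN are typically already float64, so the conversion mostly serves to confirm the dtypes:

```python
import numpy as np
import pandas as pd

# Stand-in for the question's `diffs` frame (assumption: three sample rows only).
diffs = pd.DataFrame({
    "time": pd.to_datetime(["2013-05-30 23:00:00",
                            "2013-06-30 23:00:00",
                            "2014-01-02 00:00:00"]),
    "NTCS001G002": [np.nan, 249.0, 378.0],
    "NTCS001W005": [np.nan, 60.0, np.nan],
})

# Inspect the dtype of every column.
print(diffs.dtypes)

# Explicitly convert the two value columns to float.
value_cols = ["NTCS001G002", "NTCS001W005"]  # hypothetical helper name
diffs[value_cols] = diffs[value_cols].astype(float)
```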
2
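The reason is that iterrows converts each row into a Series object. When a row holds only a Timestamp and NaN values, the whole row is coerced to datetime64, so the NaN values become NaT: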
In [11]: pd.Series([pd.Timestamp('2014-01-03 00:00:00', tz=None), np.nan, np.nan])
Out[11]:
0 2014-01-03
1 NaT
2 NaT
dtype: datetime64[ns]
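To check for and replace these values in the list, one possible approach (a sketch, not the only way) is to test each element with `pd.isna`, which is true for both NaT and NaN. Iterating with `itertuples` instead of `iterrows` also avoids building a Series per row, so the floats stay floats. The frame below is a three-row stand-in for the question's `diffs`, and `replace_missing` is a hypothetical helper:

```python
import numpy as np
import pandas as pd

# Stand-in for the question's `diffs` frame (assumption: three sample rows only).
diffs = pd.DataFrame({
    "time": pd.to_datetime(["2013-05-30 23:00:00",
                            "2013-06-30 23:00:00",
                            "2014-01-02 00:00:00"]),
    "NTCS001G002": [np.nan, 249.0, 378.0],
    "NTCS001W005": [np.nan, 60.0, np.nan],
})

# itertuples() yields one value per column, each keeping its column's type,
# so missing numbers stay NaN rather than turning into NaT.
rows = [list(t) for t in diffs.itertuples(index=False)]

def replace_missing(values, fill=None):
    """Return a copy of `values` with every NaN/NaT replaced by `fill`."""
    return [fill if pd.isna(v) else v for v in values]

cleaned = [replace_missing(r) for r in rows]
for r in cleaned:
    print("err", r)
```

The same `replace_missing` helper also works on the output of `row.tolist()` from iterrows, since `pd.isna` does not care whether the missing marker is NaT or NaN.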