datetime64 error when adding to a Pandas DataFrame

import numpy
import pandas as pd

# We create a list of strings.
time_str_arr = ['2017-06-30T13:51:15.854', '2017-06-30T13:51:16.250',
                '2017-06-30T13:51:16.452', '2017-06-30T13:51:16.659']

# Then we create a time array, rounded to 10ms (actually floored,
# not rounded), everything seems to be fine here.
rounded_time = numpy.array(time_str_arr, dtype="datetime64[10ms]")
rounded_time

# Then we create a Pandas DataFrame and assign the time array as a
# column to it. The datetime64 is destroyed.
d = {'one' : pd.Series([1., 2., 3.], index=['a', 'b', 'c']),
     'two' : pd.Series([1., 2., 3., 4.], index=['a', 'b', 'c', 'd'])}
df = pd.DataFrame(d)
df = df.assign(wrong_time=rounded_time)
df

INSTALLED VERSIONS
commit: None
python: 3.6.1.final.0
python-bits: 64
OS: Windows
OS-release: 10
machine: AMD64
processor: Intel64 Family 6 Model 78 Stepping 3, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None

pandas: 0.20.1
pytest: 3.0.7
pip: 9.0.1
setuptools: 27.2.0
Cython: 0.25.2
numpy: 1.12.1
scipy: 0.19.0
xarray: None
IPython: 5.3.0
sphinx: 1.5.6
patsy: 0.4.1
dateutil: 2.6.0
pytz: 2017.2
blosc: None
bottleneck: 1.2.1
tables: 3.2.2
numexpr: 2.6.2
feather: None
matplotlib: 2.0.2
openpyxl: 2.4.7
xlrd: 1.0.0
xlwt: 1.2.0
xlsxwriter: 0.9.6
lxml: 3.7.3
bs4: 4.6.0
html5lib: 0.999
sqlalchemy: 1.1.9
pymysql: None
psycopg2: None
jinja2: 2.9.6
s3fs: None
pandas_gbq: None
pandas_datareader: None

2 Answers

User

#1 · edited 2024-04-23 21:43:07

This looks like a bug to me, because numpy.datetime64 is apparently cast to Timestamps internally.

For me, using pd.to_datetime works:

df = df.assign(wrong_time=pd.to_datetime(rounded_time))
print (df)
   one  two              wrong_time
a  1.0  1.0 2017-06-30 13:51:15.850
b  2.0  2.0 2017-06-30 13:51:16.250
c  3.0  3.0 2017-06-30 13:51:16.450
d  NaN  4.0 2017-06-30 13:51:16.650

Another solution is to cast the array to ns precision.
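A minimal sketch of the ns approach, assuming it means casting the array with NumPy's astype("datetime64[ns]") before assignment. The answer's own code for this step is not shown here, so the exact call is an assumption, and the column name fixed_time is illustrative:

```python
import numpy as np
import pandas as pd

time_str_arr = ['2017-06-30T13:51:15.854', '2017-06-30T13:51:16.250',
                '2017-06-30T13:51:16.452', '2017-06-30T13:51:16.659']

# Floor to 10 ms while parsing, as in the question.
rounded_time = np.array(time_str_arr, dtype="datetime64[10ms]")

# Assumption: convert the unusual 10 ms unit to plain nanoseconds in NumPy,
# so pandas receives a unit it handles natively.
rounded_ns = rounded_time.astype("datetime64[ns]")

df = pd.DataFrame({'two': [1., 2., 3., 4.]}, index=['a', 'b', 'c', 'd'])
df = df.assign(fixed_time=rounded_ns)
print(df)
```

The floored values are already fixed by the time of the cast, so converting to ns only changes the storage unit, not the timestamps.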

User

#2 · edited 2024-04-23 21:43:07

I opened an issue in the Pandas Git repository. Jeff Reback suggested a solution: instead of creating odd 10ms datetime64 objects, simply floor the timestamps with the floor() function:

In [16]: # We create a list of strings. 
...: time_str_arr = ['2017-06-30T13:51:15.854', '2017-06-30T13:51:16.250',
...:                 '2017-06-30T13:51:16.452', '2017-06-30T13:51:16.659']

In [17]: pd.to_datetime(time_str_arr).floor('10ms')
Out[17]: DatetimeIndex(['2017-06-30 13:51:15.850000', '2017-06-30 13:51:16.250000', '2017-06-30 13:51:16.450000', '2017-06-30 13:51:16.650000'], dtype='datetime64[ns]', freq=None)

Solution from https://github.com/pandas-dev/pandas/issues/17183
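To tie this back to the question, the floored values can then be assigned as a DataFrame column. This is a sketch using the question's DataFrame; the column name floored_time is illustrative and not part of the original suggestion:

```python
import pandas as pd

time_str_arr = ['2017-06-30T13:51:15.854', '2017-06-30T13:51:16.250',
                '2017-06-30T13:51:16.452', '2017-06-30T13:51:16.659']

# Parse with pandas and floor to 10 ms, avoiding datetime64[10ms] entirely.
floored = pd.to_datetime(time_str_arr).floor('10ms')

d = {'one': pd.Series([1., 2., 3.], index=['a', 'b', 'c']),
     'two': pd.Series([1., 2., 3., 4.], index=['a', 'b', 'c', 'd'])}
df = pd.DataFrame(d)

# A DatetimeIndex is assigned positionally, so its length must match the frame.
df = df.assign(floored_time=floored)
print(df)
```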
