Pandas: accessing rows by filtering on date

Posted 2024-05-23 18:54:23


When I run:

import pandas
from datetime import datetime
timestampparse = lambda t: datetime.fromtimestamp(float(t))
df = pandas.read_csv('blah.csv', delimiter=';', parse_dates=True, date_parser=timestampparse, index_col='DateTime', names=['DateTime', 'Sell'], header=None)
print(df.ix['2015-12-02 12:02:21.070':'2015-12-02 12:40:21.070'])

with this blah.csv file:

1449054136.83;1.05905
1449054139.25;1.05906
1449054139.86;1.05906
1449054140.47;1.05906

I get this error:

KeyError

How can I access a slice of the DataFrame filtered by date?

Why doesn't df.ix['2015-12-02 12:02:19.000':'2015-12-02 12:40:21.070'] work?


Tags: csv, lambda, from, import, pandas, df, read, datetime
3 answers

From the docs (Time/Date Components), I understand that you need to specify the number of microseconds (the same as for datetime objects):

In [103]: df.loc["2015-12-02 14:02:10":"2015-12-02 14:02:19.899999"]
Out[103]:
                               Sell
DateTime
2015-12-02 14:02:16.829999  1.05905
2015-12-02 14:02:19.250000  1.05906
2015-12-02 14:02:19.859999  1.05906

Or use datetime to specify the microseconds exactly:

In [104]: df.loc["2015-12-02 14:02:10":datetime(year=2015, month=12, day=2, hour=14, minute=2, second=20, microsecond=999999)]
Out[104]:
                               Sell
DateTime
2015-12-02 14:02:16.829999  1.05905
2015-12-02 14:02:19.250000  1.05906
2015-12-02 14:02:19.859999  1.05906
2015-12-02 14:02:20.470000  1.05906
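As a rough, self-contained sketch of the datetime-endpoint approach above: the DataFrame below is built by hand to mirror the rows in the output (it is an assumption, not the asker's actual blah.csv parse), and the variable names are illustrative. It shows that label slicing with .loc on a sorted DatetimeIndex includes both endpoints, so the microsecond value of the end bound decides whether the last row is returned.

```python
from datetime import datetime

import pandas as pd

# Hand-built data mirroring the rows shown in the output above (assumption).
index = pd.to_datetime([
    "2015-12-02 14:02:16.829999",
    "2015-12-02 14:02:19.250000",
    "2015-12-02 14:02:19.859999",
    "2015-12-02 14:02:20.470000",
])
df = pd.DataFrame({"Sell": [1.05905, 1.05906, 1.05906, 1.05906]}, index=index)
df.index.name = "DateTime"

# Label-based slicing on a sorted DatetimeIndex includes both endpoints.
start = datetime(2015, 12, 2, 14, 2, 10)
end = datetime(2015, 12, 2, 14, 2, 20, 999999)
subset = df.loc[start:end]

# An end bound of exactly 14:02:20 excludes the 14:02:20.470000 row.
shorter = df.loc[start:datetime(2015, 12, 2, 14, 2, 20)]

print(subset)
print(shorter)
```

With this hand-built index, subset holds all four rows and shorter holds the first three.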

Zero-pad the fractional seconds, e.g. '2015-12-02 12:02:16.0859':

>>> df['2015-12-02 12:02:16.0859':'2015-12-02 12:03:20']
                              Sell
DateTime                           
2015-12-02 12:02:16.829999  1.05905
2015-12-02 12:02:19.250000  1.05906
2015-12-02 12:02:19.859999  1.05906
2015-12-02 12:02:20.470000  1.05906

This works:

>>> df['2015-12-02 12:02:17':'2015-12-02 12:03:20']
                               Sell
DateTime                           
2015-12-02 12:02:19.250000  1.05906
2015-12-02 12:02:19.859999  1.05906
2015-12-02 12:02:20.470000  1.05906

This works in version 0.16.2:

>>> from datetime import datetime
>>> df[datetime(2015, 12, 2, 12, 2, 16):datetime(2015, 12, 2, 12, 2, 20)]

                               Sell
DateTime                           
2015-12-02 12:02:16.829999  1.05905
2015-12-02 12:02:19.250000  1.05906
2015-12-02 12:02:19.859999  1.05906

I think it doesn't work because there may be a precision problem between the DatetimeIndex and the float timestamps it was built from.
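To illustrate what such a precision problem could look like (an illustration only, not a confirmed diagnosis of the KeyError), the sketch below takes the first timestamp from blah.csv and compares the decimal text with the value a Python float actually stores. The variable names are made up for the example.

```python
from decimal import Decimal

raw = "1449054136.83"      # first timestamp in blah.csv
as_float = float(raw)      # what the lambda passes to datetime.fromtimestamp

# Decimal(float) exposes the exact binary value held by the float.
exact = Decimal(as_float)
error = abs(exact - Decimal(raw))

print(exact)   # the stored value, slightly different from 1449054136.83
print(error)   # the size of the representation error, in seconds
```

The stored value differs from the text by much less than a microsecond, but a conversion that truncates instead of rounding can still land on a neighbouring microsecond. That may be why some of the outputs in this thread show values such as 16.829999 rather than 16.830000.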

You can use partial string indexing, where I omit the trailing digits of the datetime and use only seconds:

print(df['2015-12-02 12:02:19':'2015-12-02 12:40:20'])

                            Sell
DateTime                        
2015-12-02 12:02:19.250  1.05906
2015-12-02 12:02:19.860  1.05906
2015-12-02 12:02:20.470  1.05906
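A minimal sketch of partial string indexing, assuming a hand-built DataFrame that mirrors the rows above plus one earlier row (the data and names are illustrative, not the asker's parsed file). A string given only to the second is treated as covering that whole second, so the fractional part of the index values no longer has to be matched exactly.

```python
import pandas as pd

# Hand-built data resembling the output above (assumption).
index = pd.to_datetime([
    "2015-12-02 12:02:16.830",
    "2015-12-02 12:02:19.250",
    "2015-12-02 12:02:19.860",
    "2015-12-02 12:02:20.470",
])
df = pd.DataFrame({"Sell": [1.05905, 1.05906, 1.05906, 1.05906]}, index=index)
df.index.name = "DateTime"

# Slice bounds given to the second: the start means 12:02:19.000 and the
# end covers everything up to the end of second 12:40:20.
subset = df.loc["2015-12-02 12:02:19":"2015-12-02 12:40:20"]

# A single partial string selects every row that falls within that second.
one_second = df.loc["2015-12-02 12:02:19"]

print(subset)
print(one_second)
```

Here subset contains the three rows from 12:02:19.250 onward, and one_second contains the two rows stamped within 12:02:19.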
