Unable to slice a pandas DataFrame (keyed by date) using a date string

3 answers

User

Answer 1 · edited 2024-05-19 13:50:31

First, I updated your test data (for reference only), because it raised an "invalid token" error. Here are the changes:

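import pandas as pd
# `period` is not a date_range parameter; with start and end given, freq='D' (daily) is enough
cbd = pd.date_range(start='2017-01-02', end='2017-01-30', freq='D')
df = pd.DataFrame(data=None, columns=['Test1', 'Test2'], index=cbd)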

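Now look at the first row, for example with df.head(1): every cell is NaN, because the frame was built with data=None.

Then trying the initial slicing approach produces the following error: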

In[2]:    

df['2017-01-02']

Out[2]:

KeyError: '2017-01-02'

Now try the same thing with the column names:

In[3]:    

df.columns

Out[3]:

Index(['Test1', 'Test2'], dtype='object')

Let's try 'Test1':

In[4]:

df['Test1']

and we get the NaN output for that column:

Out[4]:

2017-01-02    NaN
2017-01-03    NaN
2017-01-04    NaN
2017-01-05    NaN

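So the format you are using is meant for column names, unless you write it as a slice: df['2017-01-02':'2017-01-02'].

The pandas docs state: "The following selection will raise a KeyError; otherwise this selection methodology would be inconsistent with other selection methods in pandas (as this is not a slice, nor does it resolve to one)".

So, as you correctly identified, DataFrame.loc is a label-based indexer, and it produces the output you are looking for: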

In[5]:

df.loc['2017-01-02']

Out[5]:

Test1    NaN
Test2    NaN
Name: 2017-01-02 00:00:00, dtype: object
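
The three forms discussed in this answer can be compared side by side. This is a minimal sketch that rebuilds the same test frame; the helper variable names (column_lookup_failed, one_row_frame, one_row_series) are made up for illustration and are not part of the original question.

```python
import pandas as pd

# Same test data as above: daily dates, two empty columns.
cbd = pd.date_range(start="2017-01-02", end="2017-01-30", freq="D")
df = pd.DataFrame(data=None, columns=["Test1", "Test2"], index=cbd)

# A single string inside [] is a column lookup, so this raises KeyError.
try:
    df["2017-01-02"]
    column_lookup_failed = False
except KeyError:
    column_lookup_failed = True

# A slice inside [] selects rows; both endpoints are included.
one_row_frame = df["2017-01-02":"2017-01-02"]

# .loc looks the label up in the index and returns that row as a Series.
one_row_series = df.loc["2017-01-02"]

print(column_lookup_failed)
print(one_row_frame.shape)
print(type(one_row_series).__name__)
```

Note that the slice form returns a one-row DataFrame, while .loc with a single label returns a Series, so the two are not interchangeable downstream.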

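User

Answer 2 · edited 2024-05-19 13:50:31

`df[]`

If you do not use : inside [], the value inside is treated as a column.
If you do use : inside [], the values inside are treated as rows.

Why the duality?

Because most of the time people want to slice rows rather than columns. So it was decided that x and y in df[x:y] should correspond to rows, while x in df[x], or x and y in df[[x,y]], should correspond to columns.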

Example:

import pandas as pd
df = pd.DataFrame(data=[[1, 2, 3], [1, 2, 3], [1, 2, 3]],
                  index=['A', 'B', 'C'], columns=['A', 'B', 'C'])
print(df)

Output:

   A  B  C
A  1  2  3
B  1  2  3
C  1  2  3

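Now, when you write df['B'], it could mean two things:

take the index label B and give you the second row, 1 2 3
or
take the column B and give you the second column, 2 2 2.

So, to resolve this conflict and keep things unambiguous, df['B'] always means that you want the column 'B'; if there is no such column, it raises an error.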
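
To make the rule concrete, here is a small sketch on the same A/B/C frame, where row and column labels deliberately coincide. The variable names are illustrative only.

```python
import pandas as pd

df = pd.DataFrame(data=[[1, 2, 3], [1, 2, 3], [1, 2, 3]],
                  index=["A", "B", "C"], columns=["A", "B", "C"])

col_b = df["B"]             # single label in []: the column "B"
row_b = df.loc["B"]         # .loc with a single label: the row "B"
rows_a_to_b = df["A":"B"]   # slice in []: rows "A" through "B", inclusive
cols_a_b = df[["A", "B"]]   # list in []: the columns "A" and "B"

print(col_b.tolist())
print(row_b.tolist())
print(list(rows_a_to_b.index))
print(list(cols_a_b.columns))
```

Because the labels are identical on both axes, the only thing that decides between rows and columns here is the form of the key: a single label or list means columns, a slice means rows.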

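Why does `df['2017-01-02']` fail?

It searches for a column named '2017-01-02'; since there is no such column, it raises an error.

So why does `df.loc['2017-01-02']` work?

Because the syntax of .loc[] is df.loc[row, column], and you can omit the column if you wish, as in your case, where it simply means df.loc[row].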
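
A short sketch of the df.loc[row, column] form follows. The frame reuses the question's date index, but the integer values are filler invented here so the results are visible; they are not from the original data.

```python
import pandas as pd

cbd = pd.date_range(start="2017-01-02", end="2017-01-30", freq="D")
n = len(cbd)
# Filler values (assumption for illustration): 0, 1, 2, ... in both columns.
df = pd.DataFrame({"Test1": list(range(n)), "Test2": list(range(n))}, index=cbd)

row = df.loc["2017-01-03"]            # column omitted: the whole row as a Series
same_row = df.loc["2017-01-03", :]    # explicit "all columns" gives the same row
cell = df.loc["2017-01-03", "Test1"]  # row and column: a single value

print(row.tolist())
print(cell)
```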

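User

Answer 3 · edited 2024-05-19 13:50:31

There is a difference because the two calls use different approaches:

To select a single row, loc is required; this form raises a KeyError:

df['2017-01-02']

Docs - partial string indexing: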

Warning
The following selection will raise a KeyError; otherwise this selection methodology would be inconsistent with other selection methods in pandas (as this is not a slice, nor does it resolve to one):

dft['2013-1-15 12:30:00']

To select a single row, use .loc

dft.loc['2013-1-15 12:30:00']

df['2017-01-02':'2017-01-02']

This is pure partial string indexing:

This type of slicing will work on a DataFrame with a DatetimeIndex as well. Since the partial string selection is a form of label slicing, the endpoints will be included. This would include matching times on an included date.
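
The "matching times on an included date" part is easier to see with an index finer than one day. The sketch below assumes a made-up index with two timestamps per day and filler values; none of it comes from the original question.

```python
import pandas as pd

# Hypothetical index: midnight and noon on three consecutive days.
idx = pd.to_datetime([
    "2017-01-01 00:00", "2017-01-01 12:00",
    "2017-01-02 00:00", "2017-01-02 12:00",
    "2017-01-03 00:00", "2017-01-03 12:00",
])
df = pd.DataFrame({"Test1": list(range(6))}, index=idx)

# The end label "2017-01-02" covers the whole day, so the noon row is included.
sliced = df["2017-01-01":"2017-01-02"]

# With .loc, a date string coarser than the index selects every row on that date.
day_rows = df.loc["2017-01-02"]

print(len(sliced))
print(sliced.index[-1])
print(len(day_rows))
```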
