I have two pandas DataFrames, df1 and df2 (df1 is a vanilla DataFrame; df2 is indexed by 'STK_ID' and 'RPT_Date'):
>>> df1
STK_ID RPT_Date TClose sales discount
0 000568 20060331 3.69 5.975 NaN
1 000568 20060630 9.14 10.143 NaN
2 000568 20060930 9.49 13.854 NaN
3 000568 20061231 15.84 19.262 NaN
4 000568 20070331 17.00 6.803 NaN
5 000568 20070630 26.31 12.940 NaN
6 000568 20070930 39.12 19.977 NaN
7 000568 20071231 45.94 29.269 NaN
8 000568 20080331 38.75 12.668 NaN
9 000568 20080630 30.09 21.102 NaN
10 000568 20080930 26.00 30.769 NaN
>>> df2
TClose sales discount net_sales cogs
STK_ID RPT_Date
000568 20060331 3.69 5.975 NaN 5.975 2.591
20060630 9.14 10.143 NaN 10.143 4.363
20060930 9.49 13.854 NaN 13.854 5.901
20061231 15.84 19.262 NaN 19.262 8.407
20070331 17.00 6.803 NaN 6.803 2.815
20070630 26.31 12.940 NaN 12.940 5.418
20070930 39.12 19.977 NaN 19.977 8.452
20071231 45.94 29.269 NaN 29.269 12.606
20080331 38.75 12.668 NaN 12.668 3.958
20080630 30.09 21.102 NaN 21.102 7.431
I can get the last 3 rows of df2 with:
>>> df2.ix[-3:]
TClose sales discount net_sales cogs
STK_ID RPT_Date
000568 20071231 45.94 29.269 NaN 29.269 12.606
20080331 38.75 12.668 NaN 12.668 3.958
20080630 30.09 21.102 NaN 21.102 7.431
whereas df1.ix[-3:] gives all the rows:
>>> df1.ix[-3:]
STK_ID RPT_Date TClose sales discount
0 000568 20060331 3.69 5.975 NaN
1 000568 20060630 9.14 10.143 NaN
2 000568 20060930 9.49 13.854 NaN
3 000568 20061231 15.84 19.262 NaN
4 000568 20070331 17.00 6.803 NaN
5 000568 20070630 26.31 12.940 NaN
6 000568 20070930 39.12 19.977 NaN
7 000568 20071231 45.94 29.269 NaN
8 000568 20080331 38.75 12.668 NaN
9 000568 20080630 30.09 21.102 NaN
10 000568 20080930 26.00 30.769 NaN
Why? How do I get the last 3 rows of df1 (the DataFrame without a custom index)?
I am using pandas 0.10.1.
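This happens because df1 has an integer index: ix selects by label rather than by position, so -3: is read as "labels from -3 onward", which covers every row. This is by design; see "integer indexing" in the pandas "gotchas" documentation.
In newer versions of pandas, prefer loc or iloc, which remove the ambiguity of ix as to whether a value is a position or a label (for example, df1.iloc[-3:]). See the docs.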
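To make the label-versus-position distinction concrete, here is a minimal sketch. The small frame below is a hypothetical stand-in for df1, not the asker's data, and it uses loc to show the label-based behaviour because ix is not available in current pandas releases.

```python
import pandas as pd

# Hypothetical stand-in for df1: five rows with the default 0..4 integer index.
df = pd.DataFrame({"TClose": [3.69, 9.14, 9.49, 15.84, 17.00]})

# Position-based: the last three rows, whatever the index labels are.
last_three = df.iloc[-3:]

# Label-based: -3 is treated as a label, not an offset from the end.
# Every label in 0..4 is >= -3, so the slice returns all rows.
by_label = df.loc[-3:]

print(last_three)
print(len(by_label))
```

The second result mirrors what the question saw with df1.ix[-3:].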
As Wes points out, in this particular case you should just use tail!
Don't forget DataFrame.tail! For example, df1.tail(10).
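As a short sketch of the tail approach applied to the question's case (last 3 rows), assuming a small hypothetical frame shaped like df1:

```python
import pandas as pd

# Hypothetical frame shaped like df1, with a default integer index.
df = pd.DataFrame({
    "STK_ID": ["000568"] * 4,
    "RPT_Date": [20071231, 20080331, 20080630, 20080930],
    "TClose": [45.94, 38.75, 30.09, 26.00],
})

# tail(n) returns the last n rows by position, regardless of index type.
last_three = df.tail(3)
print(last_three)
```

tail counts rows by position, so it gives the same result whether the index is the default integer one or a MultiIndex like df2's.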
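If you are slicing by position, __getitem__ (that is, slicing with []) works well and is the most succinct solution I have found for this problem. It is the same as calling df.iloc[-3:], for instance (iloc internally delegates to __getitem__). As an aside, if you want the last N rows of each group, use ^{} and ^{}: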
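The method names in the last sentence did not survive in this copy. The sketch below shows plain [] slicing, plus one way to take the last N rows per group using groupby followed by GroupBy.tail; that pairing is an assumption about what was meant, not something confirmed by the text. The second stock ID and its prices are made up for illustration.

```python
import pandas as pd

# Hypothetical data: "000858" and its TClose values are invented for this example.
df = pd.DataFrame({
    "STK_ID": ["000568"] * 3 + ["000858"] * 3,
    "RPT_Date": [20080331, 20080630, 20080930] * 2,
    "TClose": [38.75, 30.09, 26.00, 1.0, 2.0, 3.0],
})

# Slicing with [] and an integer slice selects rows by position.
last_two = df[-2:]

# Assumed approach for "last N rows per group": group, then take each group's tail.
last_two_per_group = df.groupby("STK_ID").tail(2)

print(last_two)
print(last_two_per_group)
```

GroupBy.tail keeps the original row labels and order, so the per-group result can be matched back to the source frame.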