Get the rows of a DataFrame that satisfy certain conditions

Posted 2024-04-19 23:14:24

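I need to get rows by code (where serial is 2) such that the value of new (where serial is 2) < the value of sold (where serial is 2).

Formula: new[2] >= sold[1] and new[2] < sold[2], where [2] and [1] are serial numbers (that is why I tried to set the index to serial).

Sample DataFrame (data): data.set_index('serial')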

^{tb1}$

It gives an error:

  File "C:#########\Python39\lib\site-packages\pandas\core\indexes\base.py", line 3361, in get_loc
    return self._engine.get_loc(casted_key)
  File "pandas\_libs\index.pyx", line 76, in pandas._libs.index.IndexEngine.get_loc
  File "pandas\_libs\index.pyx", line 108, in pandas._libs.index.IndexEngine.get_loc
  File "pandas\_libs\hashtable_class_helper.pxi", line 5198, in pandas._libs.hashtable.PyObjectHashTable.get_item
  File "pandas\_libs\hashtable_class_helper.pxi", line 5206, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: True

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "F:\python_projects#########\testing.py", line 12, in
    status = globals()[f"strategy_{row[0]}"].pre_check_condition_panda(5,now)#(row[2],now)
  File "F:\python_projects#########\strategy.py", line 88, in pre_check_condition_panda
    data1 = data[(data.open[2] >= data.close[1]) & (data.open[2] > data.close[2])]
  File "C:#########\Python39\lib\site-packages\pandas\core\frame.py", line 3455, in __getitem__
    indexer = self.columns.get_loc(key)
  File "C:#########\Python39\lib\site-packages\pandas\core\indexes\base.py", line 3363, in get_loc
    raise KeyError(key) from err
KeyError: True

My code:

data1 = data[(data.new[2] >= data.sold[1]) & (data.new[2] < data.sold[2])]
print(data1)

Expected result:

^{tb2}$

1 answer
User
#1 · Posted 2024-04-19 23:14:24

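With stack/unstack you can get the "code" values that match your requirement, but it is unclear what the condition for returning only serial == 2 should be: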

df2 = df.set_index(['code', 'serial']).unstack()
(df2.loc[df2[('new', 2)].ge(df2[('sold', 1)])
        &df2[('new', 2)].lt(df2[('sold', 2)])
        ]
    .stack(level=1)
    .reset_index()
)

Output:

    code  serial        date     new    sold
0  20286       1  2019-01-30  590.55  590.15
1  20286       2  2019-02-30  590.15  590.55
2  20286       3  2019-03-30  590.40  590.15
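
For readers who want to run this end to end, here is a minimal sketch. The original sample table is not available, so the data below is an assumption: the rows for code 20286 are copied from the output above, and code 20287 is invented so that one group fails the condition. Instead of stacking back, this variation filters the original frame with isin, which should give the same rows for the matching codes.

```python
import pandas as pd

# Hypothetical sample data: code 20286 comes from the output above,
# code 20287 is made up for illustration and should NOT match.
df = pd.DataFrame({
    "code":   [20286, 20286, 20286, 20287, 20287, 20287],
    "serial": [1, 2, 3, 1, 2, 3],
    "date":   ["2019-01-30", "2019-02-30", "2019-03-30"] * 2,
    "new":    [590.55, 590.15, 590.40, 100.0, 90.0, 95.0],
    "sold":   [590.15, 590.55, 590.15, 95.0, 99.0, 98.0],
})

# One row per code; the columns become a (field, serial) MultiIndex.
df2 = df.set_index(["code", "serial"]).unstack()

# new[2] >= sold[1] and new[2] < sold[2], evaluated once per code.
mask = df2[("new", 2)].ge(df2[("sold", 1)]) & df2[("new", 2)].lt(df2[("sold", 2)])

# Codes whose serial-2 row satisfies the formula.
matching_codes = mask[mask].index.tolist()

# Keep every row of the matching codes from the original long-format frame.
result = df[df["code"].isin(matching_codes)]
print(result)
```

Here mask is a boolean Series indexed by code, so it can be reused for other filters on the same frame.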

If you only want serial == 2, you can add .query('serial == 2'):

df2 = df.set_index(['code', 'serial']).unstack()
(df2.loc[df2[('new', 2)].ge(df2[('sold', 1)])
        &df2[('new', 2)].lt(df2[('sold', 2)])
        ]
    .stack(level=1)
    .reset_index()
    .query('serial == 2')
)

Output:

    code  serial        date     new    sold
1  20286       2  2019-02-30  590.15  590.55
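
As for the KeyError: True in the question: data.new[2] and data.sold[1] are single values, so the whole condition collapses to one boolean, and data[True] is then treated as a lookup of a column named True rather than as a row mask. The sketch below reproduces this on a hypothetical single-code frame (the three rows from the output above) and shows one way to use the scalar condition instead. It also assigns the result of set_index back, since set_index returns a new DataFrame and does not modify data in place.

```python
import pandas as pd

# Hypothetical single-code frame built from the three rows shown above.
data = pd.DataFrame({
    "serial": [1, 2, 3],
    "new":    [590.55, 590.15, 590.40],
    "sold":   [590.15, 590.55, 590.15],
})
data = data.set_index("serial")  # assign back; set_index is not in place

# Each .loc lookup is a single number, so the condition is a single bool.
cond = bool(
    (data.loc[2, "new"] >= data.loc[1, "sold"])
    and (data.loc[2, "new"] < data.loc[2, "sold"])
)

# data[cond] is data[True]: pandas looks for a column labelled True.
try:
    data[cond]
    raised = False
except KeyError:
    raised = True

# Use the scalar as an ordinary condition: return the serial-2 row or nothing.
row = data.loc[[2]] if cond else data.iloc[0:0]
print(row)
```

This only covers a frame holding one code; with several codes in the same frame, the unstack approach above evaluates the formula per code.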
