Filtering a DataFrame on multiple date conditions

Posted 2024-06-11 14:56:42

I am working with the following DataFrame:

   id                slotTime    EDD         EDD-10M
0  1000000101068957  2021-05-12  2021-12-26  2021-02-26
1  1000000100849718  2021-03-20  2021-04-05  2020-06-05
2  1000000100849718  2021-03-20  2021-04-05  2020-06-05
3  1000000100849718  2021-03-20  2021-04-05  2020-06-05
4  1000000100849718  2021-03-20  2021-04-05  2020-06-05

I only want to keep the rows where slotTime falls between EDD-10M and EDD:

df['EDD-10M'] < df['slotTime'] < df['EDD']

I have tried the following:

df.loc[df[df['slotTime'] < df['EDD']] & df[df['EDD-10M'] < df['slotTime']]]

However, it raises the following error:

TypeError: ufunc 'bitwise_and' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''

Please advise.

To reproduce the DataFrame above, use the following snippet:

import pandas as pd
from pandas import Timestamp

df = { 
  'id': {0: 1000000101068957,
  1: 1000000100849718,
  2: 1000000100849718,
  3: 1000000100849718,
  4: 1000000100849718,
  5: 1000000100849718,
  6: 1000000100849718,
  7: 1000000100849718,
  8: 1000000100849718,
  9: 1000000100849718},
  'EDD': {0: Timestamp('2021-12-26 00:00:00'),
  1: Timestamp('2021-04-05 00:00:00'),
  2: Timestamp('2021-04-05 00:00:00'),
  3: Timestamp('2021-04-05 00:00:00'),
  4: Timestamp('2021-04-05 00:00:00'),
  5: Timestamp('2021-04-05 00:00:00'),
  6: Timestamp('2021-04-05 00:00:00'),
  7: Timestamp('2021-04-05 00:00:00'),
  8: Timestamp('2021-04-05 00:00:00'),
  9: Timestamp('2021-04-05 00:00:00')},
 'EDD-10M': {0: Timestamp('2021-02-26 00:00:00'),
  1: Timestamp('2020-06-05 00:00:00'),
  2: Timestamp('2020-06-05 00:00:00'),
  3: Timestamp('2020-06-05 00:00:00'),
  4: Timestamp('2020-06-05 00:00:00'),
  5: Timestamp('2020-06-05 00:00:00'),
  6: Timestamp('2020-06-05 00:00:00'),
  7: Timestamp('2020-06-05 00:00:00'),
  8: Timestamp('2020-06-05 00:00:00'),
  9: Timestamp('2020-06-05 00:00:00')},
 'slotTime': {0: Timestamp('2021-05-12 00:00:00'),
  1: Timestamp('2021-03-20 00:00:00'),
  2: Timestamp('2021-03-20 00:00:00'),
  3: Timestamp('2021-03-20 00:00:00'),
  4: Timestamp('2021-03-20 00:00:00'),
  5: Timestamp('2021-03-20 00:00:00'),
  6: Timestamp('2021-03-20 00:00:00'),
  7: Timestamp('2021-03-20 00:00:00'),
  8: Timestamp('2021-03-20 00:00:00'),
  9: Timestamp('2021-03-20 00:00:00')}}

df = pd.DataFrame(df)

3 Answers

You can use the between() method, or try it like this:

df.loc[(df['EDD-10M'] < df['slotTime']) & (df['slotTime'] < df['EDD'])]

When combining multiple conditions, wrap each one in ( and ).
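To make the parenthesised form a bit more concrete, here is a minimal sketch that builds the mask in two named steps. The three-row frame is made up for illustration (it reuses the question's column names but not its full data), and the variable names after_start, before_end and mask are my own.

```python
import pandas as pd

# Hypothetical three-row frame using the question's column names.
df = pd.DataFrame({
    "slotTime": pd.to_datetime(["2021-05-12", "2021-03-20", "2021-06-01"]),
    "EDD": pd.to_datetime(["2021-12-26", "2021-04-05", "2021-04-05"]),
    "EDD-10M": pd.to_datetime(["2021-02-26", "2020-06-05", "2020-06-05"]),
})

# Each comparison returns a boolean Series aligned on the index.
after_start = df["EDD-10M"] < df["slotTime"]
before_end = df["slotTime"] < df["EDD"]

# & combines the two boolean Series element-wise.
mask = after_start & before_end

# Pass the boolean mask itself to .loc, not a filtered DataFrame.
filtered = df.loc[mask]
print(filtered)
```

Naming the intermediate masks is optional; it just makes it easier to see that .loc receives a boolean Series rather than a DataFrame.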

You just need to wrap each condition in parentheses:

df[(df['slotTime'] < df['EDD']) & (df['EDD-10M'] < df['slotTime'])]

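Otherwise, operator precedence makes & bind more tightly than the comparisons, so it is evaluated first and the expression falls apart.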
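The precedence point can be checked on a toy example. The sketch below uses small integer Series (hypothetical names lo, x, hi) rather than the question's dates, so that the unparenthesised & itself is legal and the failure comes only from how Python groups the expression.

```python
import pandas as pd

lo = pd.Series([1, 2, 3])
x = pd.Series([2, 2, 5])
hi = pd.Series([4, 4, 4])

# Without parentheses Python reads this as lo < (x & x) < hi,
# a chained comparison that needs bool(Series) and therefore raises.
try:
    result = lo < x & x < hi
    raised = False
except ValueError:
    raised = True

# With parentheses each comparison is evaluated first, then combined.
mask = (lo < x) & (x < hi)
print(raised, mask.tolist())
```

A plain chained comparison such as a < b < c on Series fails for the same reason: Python expands it with `and`, which asks for a single truth value.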

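Alternatively, you may want to use the .between method (assuming you have datetime Series). Pass the lower bound first, and note that both endpoints are included by default:

df[df['slotTime'].between(df['EDD-10M'], df['EDD'])]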
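One thing to watch with between: it takes the lower bound first and is inclusive on both sides by default, whereas the question asks for strict inequalities. The sketch below assumes pandas 1.3 or later, where the inclusive argument accepts strings such as "neither"; the three-row frame is made up so that one row sits exactly on the upper bound.

```python
import pandas as pd

# Hypothetical data: the second row has slotTime equal to EDD.
df = pd.DataFrame({
    "slotTime": pd.to_datetime(["2021-03-20", "2021-04-05", "2021-05-01"]),
    "EDD-10M": pd.to_datetime(["2020-06-05", "2020-06-05", "2020-06-05"]),
    "EDD": pd.to_datetime(["2021-04-05", "2021-04-05", "2021-04-05"]),
})

# Default: EDD-10M <= slotTime <= EDD (both endpoints included).
inclusive_mask = df["slotTime"].between(df["EDD-10M"], df["EDD"])

# Strict: EDD-10M < slotTime < EDD (assumes pandas >= 1.3).
strict_mask = df["slotTime"].between(
    df["EDD-10M"], df["EDD"], inclusive="neither"
)

print(df[strict_mask])
```

If the boundary dates never coincide with slotTime in your data, the two masks select the same rows.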

You can use DataFrame.query to do this:

df.query("(slotTime < EDD) & (`EDD-10M` < slotTime)")
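For completeness, a small runnable sketch of the query approach. The backticks are needed because EDD-10M is not a valid Python identifier; as far as I know, backtick quoting for names containing characters like - needs pandas 1.0 or later, so treat that version as an assumption. The frame is again a made-up three-row sample.

```python
import pandas as pd

# Hypothetical three-row frame using the question's column names.
df = pd.DataFrame({
    "slotTime": pd.to_datetime(["2021-05-12", "2021-03-20", "2021-06-01"]),
    "EDD": pd.to_datetime(["2021-12-26", "2021-04-05", "2021-04-05"]),
    "EDD-10M": pd.to_datetime(["2021-02-26", "2020-06-05", "2020-06-05"]),
})

# Backticks quote the column whose name contains a hyphen.
result = df.query("(slotTime < EDD) & (`EDD-10M` < slotTime)")
print(result)
```

Renaming the column to something like EDD_10M would avoid the backticks, at the cost of changing the data's schema.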
