Accessing neighboring rows in a pandas DataFrame

2 votes
1 answer
1913 views
Asked 2025-04-18 09:31

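I'm trying to compute the local maxima and minima of a series of data: if the value in the current row is larger than both the previous and the next row's values, or smaller than both, set the result to the current value; otherwise set it to NaN (not a number). Is there a more elegant way to do this than the following?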

import pandas as pd
import numpy as np

rng = pd.date_range('1/1/2014', periods=10, freq='5min')
s = pd.Series([1, 2, 3, 2, 1, 2, 3, 5, 7, 4], index=rng)
df = pd.DataFrame(s, columns=['val'])
df.index.name = "dt"
df['minmax'] = np.nan  # np.NaN was removed in NumPy 2.0; np.nan works everywhere

# Skip the first and last rows: each has only one neighbor.
for i in range(1, len(df) - 1):
    prev = df['val'].iloc[i - 1]
    cur = df['val'].iloc[i]
    nxt = df['val'].iloc[i + 1]
    # Keep the value if it is a local maximum or a local minimum.
    if (cur >= prev and cur >= nxt) or (cur <= prev and cur <= nxt):
        # Assign through .loc to avoid chained-indexing problems.
        df.loc[df.index[i], 'minmax'] = cur

print(df)

The result is:

                     val  minmax
dt                              
2014-01-01 00:00:00    1     NaN
2014-01-01 00:05:00    2     NaN
2014-01-01 00:10:00    3       3
2014-01-01 00:15:00    2     NaN
2014-01-01 00:20:00    1       1
2014-01-01 00:25:00    2     NaN
2014-01-01 00:30:00    3     NaN
2014-01-01 00:35:00    5     NaN
2014-01-01 00:40:00    7       7
2014-01-01 00:45:00    4     NaN

1 Answer

1

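We can use shift and where to decide which values to keep. Importantly, when combining the comparisons we need the element-wise operators & and | rather than and/or. shift returns a Series or DataFrame moved down by 1 row (the default) or by the number of rows you pass in; a negative number such as -1 moves it up instead.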
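
To make the shift step concrete, here is a minimal sketch on a toy three-element Series (the data and the names prev, nxt and is_peak are illustrative, not part of the original answer). It shows which neighbor each shift lines up with the current row, and why each comparison is wrapped in parentheses before combining with &.

```python
import pandas as pd

s = pd.Series([1, 3, 2])

prev = s.shift(1)    # value from the previous row; the first row becomes NaN
nxt = s.shift(-1)    # value from the next row; the last row becomes NaN

# Comparisons are element-wise, and any comparison against NaN is False,
# so the first and last rows can never be flagged.
# Parentheses are needed because & binds more tightly than > and <.
is_peak = (s > prev) & (s > nxt)
print(is_peak.tolist())  # [False, True, False]
```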

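With where, we pass a boolean condition as the first argument; values are kept where the condition is True, and the second argument, NaN, is the value assigned where the condition is False.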
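
As a small illustration of where in isolation, the sketch below applies a hand-written boolean mask to a toy Series (both are made up for this example). NaN is also the default replacement value, so the second argument could be omitted.

```python
import numpy as np
import pandas as pd

s = pd.Series([1, 3, 2])
mask = pd.Series([False, True, False])

# Keep values where mask is True; replace the rest with NaN.
# The integer Series is upcast to float so that it can hold NaN.
kept = s.where(mask, np.nan)
print(kept.tolist())  # [nan, 3.0, nan]
```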

In [81]:

prev, nxt = df['val'].shift(1), df['val'].shift(-1)
is_min = (df['val'] < prev) & (df['val'] < nxt)
is_max = (df['val'] > prev) & (df['val'] > nxt)
df['minmax'] = df['val'].where(is_min | is_max, np.nan)
df
Out[81]:
                     val  minmax
dt                              
2014-01-01 00:00:00    1     NaN
2014-01-01 00:05:00    2     NaN
2014-01-01 00:10:00    3       3
2014-01-01 00:15:00    2     NaN
2014-01-01 00:20:00    1       1
2014-01-01 00:25:00    2     NaN
2014-01-01 00:30:00    3     NaN
2014-01-01 00:35:00    5     NaN
2014-01-01 00:40:00    7       7
2014-01-01 00:45:00    4     NaN
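
One detail worth noting: the loop in the question compares with >= and <=, while the one-liner above uses strict < and >. The two agree on the sample data because no two adjacent values are equal, but they differ on plateaus. The sketch below wraps both variants in a hypothetical helper, local_extrema, with a strict flag (the function name and flag are assumptions made for illustration only).

```python
import pandas as pd


def local_extrema(values: pd.Series, strict: bool = True) -> pd.Series:
    """Return values at local minima/maxima and NaN elsewhere (illustrative helper)."""
    prev, nxt = values.shift(1), values.shift(-1)
    if strict:
        is_max = (values > prev) & (values > nxt)
        is_min = (values < prev) & (values < nxt)
    else:
        # Matches the question's loop: ties with a neighbor still count.
        is_max = (values >= prev) & (values >= nxt)
        is_min = (values <= prev) & (values <= nxt)
    # Comparisons against the NaN introduced by shift are False,
    # so the first and last rows are always NaN in the result.
    return values.where(is_max | is_min)


plateau = pd.Series([1, 2, 2, 1])
print(local_extrema(plateau, strict=True).tolist())   # [nan, nan, nan, nan]
print(local_extrema(plateau, strict=False).tolist())  # [nan, 2.0, 2.0, nan]
```

Which variant is appropriate depends on whether flat stretches should count as extrema in your data.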
