如何在值的DataFrame列中找到顺序数（递增递减）的最高计数

h_diff l_diff monotonic timestamp 2000-01-18 NaN NaN NaN 2000-01-19 2.75 2.93 1.0 2000-01-20 12.75 10.13 1.0 2000-01-21 -7.25 -3.31 0.0 2000-01-24 -1.50 -5.07 0.0 2000-01-25 0.37 -2.75 1.0 2000-01-26 1.07 7.38 1.0 2000-01-27 -1.19 -2.75 0.0 2000-01-28 -2.13 -6.38 0.0 2000-01-31 -7.00 -6.12 0.0

h_diff l_diff monotonic timestamp 2000-01-18 NaN NaN NaN 2000-01-19 2.75 2.93 1.0 2000-01-20 12.75 10.13 2.0 2000-01-21 -7.25 -3.31 0.0 2000-01-24 -1.50 -5.07 0.0 2000-01-25 0.37 -2.75 1.0 2000-01-26 1.07 7.38 2.0 2000-01-27 1.19 -2.75 3.0 2000-01-28 2.13 -6.38 4.0 2000-01-31 -7.00 -6.12 0.0

2条回答

网友

1楼 · 编辑于 2024-06-16 08:29:06

使用^{}+^{}

初始数据帧

            h_diff  l_diff
timestamp                 
2000-01-18     NaN     NaN
2000-01-19    2.75    2.93
2000-01-20   12.75   10.13
2000-01-21   -7.25   -3.31
2000-01-24   -1.50   -5.07
2000-01-25    0.37   -2.75
2000-01-26    1.07    7.38
2000-01-27    1.19   -2.75
2000-01-28    2.13   -6.38
2000-01-31   -7.00   -6.12

h = df['h_diff'].gt(0)
#h = np.sign(df['h_diff'])
df['monotonic_h']=h.groupby(h.ne(h.shift()).cumsum()).cumcount().add(1).where(h,0)
print(df)
            h_diff  l_diff  monotonic_h
timestamp                             
2000-01-18     NaN     NaN            0
2000-01-19    2.75    2.93            1
2000-01-20   12.75   10.13            2
2000-01-21   -7.25   -3.31            0
2000-01-24   -1.50   -5.07            0
2000-01-25    0.37   -2.75            1
2000-01-26    1.07    7.38            2
2000-01-27    1.19   -2.75            3
2000-01-28    2.13   -6.38            4
2000-01-31   -7.00   -6.12            0

df['monotonic_h'].max()
#4

细节

h.ne(h.shift()).cumsum()

timestamp
2000-01-18    1
2000-01-19    2
2000-01-20    2
2000-01-21    3
2000-01-24    3
2000-01-25    4
2000-01-26    4
2000-01-27    4
2000-01-28    4
2000-01-31    5
Name: h_diff, dtype: int64

更新

df = df.join( h.groupby(h.ne(h.shift()).cumsum()).cumcount().add(1)
               .to_frame('values')
               .assign(monotic = np.where(h,'monotic_h_greater_0',
                                          'monotic_h_not_greater_0'),
                       index = lambda x: x.index)
               .where(df['h_diff'].notna())
               .pivot_table(columns = 'monotic',
                            index = 'index',
                            values = 'values',
                            fill_value=0) )

print(df)
            h_diff  l_diff  monotic_h_greater_0  monotic_h_not_greater_0
timestamp                                                               
2000-01-18     NaN     NaN                  NaN                      NaN
2000-01-19    2.75    2.93                  1.0                      0.0
2000-01-20   12.75   10.13                  2.0                      0.0
2000-01-21   -7.25   -3.31                  0.0                      1.0
2000-01-24   -1.50   -5.07                  0.0                      2.0
2000-01-25    0.37   -2.75                  1.0                      0.0
2000-01-26    1.07    7.38                  2.0                      0.0
2000-01-27    1.19   -2.75                  3.0                      0.0
2000-01-28    2.13   -6.38                  4.0                      0.0
2000-01-31   -7.00   -6.12                  0.0                      1.0

网友

2楼 · 编辑于 2024-06-16 08:29:06

下面的代码应该能够找到连续出现的正数或负数。下面的代码用于列h_diff

df1[df1.h_diff.gt(0)].index.to_series().diff().ne(1).cumsum().value_counts().max() #sequential occurrences greater than 0

df1[df1.h_diff.lt(0)].index.to_series().diff().ne(1).cumsum().value_counts().max() #sequential occurrences less than 0

相关问题更多 >

编程相关推荐

热门问题

热门文章