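I have a 60-year time series of monthly temperature anomalies. I only want to pass through the temperature values where the anomaly is greater than 0.5 for 6 or more consecutive months. Replacing the values < 0.5 with NaN is easy, but I'm not sure how to also replace values that are > 0.5 but belong to a run of only 2 or 3 consecutive values above 0.5. Snippet below: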
import pandas as pd

time = [1950.04167, 1950.125 , 1950.20833, 1950.29167, 1950.375 ,
1950.45833, 1950.54167, 1950.625 , 1950.70833, 1950.79167,
1950.875 , 1950.95833, 1951.04167, 1951.125 , 1951.20833,
1951.29167, 1951.375 , 1951.45833, 1951.54167, 1951.625 ,
1951.70833, 1951.79167, 1951.875 , 1951.95833, 1952.04167,
1952.125 , 1952.20833, 1952.29167, 1952.375 , 1952.45833,
1952.54167, 1952.625 , 1952.70833, 1952.79167, 1952.875 ,
1952.95833, 1953.04167, 1953.125 , 1953.20833, 1953.29167,
1953.375 , 1953.45833, 1953.54167, 1953.625 , 1953.70833,
1953.79167, 1953.875 , 1953.95833, 1954.04167, 1954.125 ,
1954.20833, 1954.29167, 1954.375 , 1954.45833, 1954.54167,
1954.625 , 1954.70833, 1954.79167, 1954.875 , 1954.95833]
sst = [-1.67623 , -1.685853, -1.69083 , -1.61898 , -1.40235 ,
-1.097773, -0.835867, -0.718727, -0.694087, -0.785423,
-0.9312 , -1.01925 , -0.8868 , -0.48022 , -0.007597,
0.448647, 0.66546 , 0.852427, 0.922443, 1.14481 ,
1.291153, 1.338903, 0.993053, 0.68006, 0.493597,
0.500197, 0.528363, 0.515583, 0.418493, 0.168387,
-0.003403, 0.033933, 0.15759 , 0.113847, 0.019967,
0.111413, 0.372967, 0.623067, 0.763903, 0.909743,
0.990287, 1.01288 , 0.969407, 0.985817, 0.982607,
1.01244 , 1.039917, 1.11755, 1.044333, 0.799593,
0.3769 , 0.105033, -0.070743, -0.281483, -0.59861,
-0.875743, -0.88768 , -0.642517, -0.548043, -0.547057]
# Build the series, then replace every value below the 0.5 threshold with NaN.
series = pd.Series(data=sst, index=time)
greater = series.where(series >= 0.5)
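For example, I would like to "pass" the SST values for the time spans 1951.375 to 1951.95833 and 1953.125 to 1954.125, where 8 and 13 consecutive SST values, respectively, are greater than 0.5, but replace with NaN the SST values for 1952.125 to 1952.29167, where only 3 consecutive values are > 0.5.
Any suggestions? TIA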
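You can use
series.groupby(series.le(0.5).cumsum())
to find the lengths of the runs of values > 0.5, and then use .apply() to replace the values in runs that are too short. Note that each group produced by .groupby also lumps in the last <= 0.5 value before the run as its first element, so we want to keep only the runs longer than 5 and replace that first value with np.nan.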
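A minimal sketch of how that groupby/apply idea could be written out, in case it helps. The names keep_long_runs and mask_group, the threshold and min_run parameters, and the short demo series are all made up for illustration and are not from the original post. The sketch also assumes the series has no missing values to begin with, and it passes group_keys=False so the result keeps the original index.

```python
import numpy as np
import pandas as pd


def keep_long_runs(series, threshold=0.5, min_run=6):
    """Keep values > threshold only inside runs of at least min_run values.

    Hypothetical helper; assumes the input contains no NaN values.
    """
    # Every value <= threshold bumps the counter, so each group is one
    # value <= threshold followed by the run of values > threshold after it
    # (the very first group has no leading value if the series starts high).
    group_ids = series.le(threshold).cumsum()

    def mask_group(group):
        above = group > threshold
        if above.sum() >= min_run:
            # Long enough: keep the run, blank out the leading low value.
            return group.where(above)
        # Too short: blank out the whole group.
        return pd.Series(np.nan, index=group.index)

    # group_keys=False keeps the original index instead of adding the group id.
    return series.groupby(group_ids, group_keys=False).apply(mask_group)


# Made-up demo data: a run of 2 values above 0.5, then a run of 6.
demo = pd.Series(
    [0.1, 0.6, 0.7, 0.2, 0.6, 0.7, 0.8, 0.9, 1.0, 1.1, 0.3],
    index=range(11),
)
result = keep_long_runs(demo)
print(result)
```

On the series from the question this would be called as keep_long_runs(series). Note that the question's own snippet uses >= 0.5 while this follows the answer's le(0.5), so a value exactly equal to 0.5 counts as below the threshold here.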