How can I extend values up to the next non-null value in pandas/numpy?

Posted 2024-04-18 04:30:24


I have a Series like this:

>>> import numpy as np
>>> import pandas as pd
>>> s = pd.Series([1,0,0,3,0,5,0,0,0], dtype=float)
>>> s[s==0] = np.nan
>>> s
0    1.0
1    NaN
2    NaN
3    3.0
4    NaN
5    5.0
6    NaN
7    NaN
8    NaN
dtype: float64

I want to "extend" each value forward over the NaNs that follow it, like this:

>>> t = s.shift()
>>> for _ in range(100000):
...     s[s.isnull()] = t
...     if not s.isnull().any():
...             break
...     t = t.shift()
...
>>> s
0    1.0
1    1.0
2    1.0
3    3.0
4    3.0
5    5.0
6    5.0
7    5.0
8    5.0
dtype: float64

But I would like something more vectorized and more efficient. How can I do that?


2 answers

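You are looking for a forward fill. In current pandas that is ffill(); the older spelling fillna(method='ffill') does the same thing, but the method argument is deprecated as of pandas 2.1.

>>> s.ffill()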
0    1.0
1    1.0
2    1.0
3    3.0
4    3.0
5    5.0
6    5.0
7    5.0
8    5.0
dtype: float64
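
As a quick check on the behaviour described above, here is a minimal sketch that rebuilds the Series from the question and forward-fills it. The names filled and lead are illustrative only. It also shows one edge case: a NaN at the very start has no earlier value to copy, so it stays NaN.

```python
import numpy as np
import pandas as pd

# Same data as the question, with the zeros already replaced by NaN.
s = pd.Series([1, np.nan, np.nan, 3, np.nan, 5, np.nan, np.nan, np.nan])

# Forward fill: each NaN takes the most recent non-NaN value before it.
filled = s.ffill()

# A leading NaN has nothing before it, so it is left as NaN.
lead = pd.Series([np.nan, 2.0, np.nan]).ffill()
```

If leading NaNs should also be filled, chaining a backward fill (ffill().bfill()) is one common option.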

A NumPy approach based on np.maximum.accumulate:

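import numpy as np
import pandas as pd

def numpy_ffill(s):
    arr = s.values
    mask = np.isnan(arr)
    # Valid positions keep their own index; NaN positions get 0 as a placeholder
    idx = np.where(~mask, np.arange(len(mask)), 0)
    # The running maximum is the index of the most recent valid entry
    out = arr[np.maximum.accumulate(idx)]
    return pd.Series(out, index=s.index)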

Sample run:

In [41]: s
Out[41]: 
0    1.0
1    NaN
2    NaN
3    3.0
4    NaN
5    5.0
6    NaN
7    NaN
8    NaN
dtype: float64

In [42]: numpy_ffill(s)
Out[42]: 
0    1.0
1    1.0
2    1.0
3    3.0
4    3.0
5    5.0
6    5.0
7    5.0
8    5.0
dtype: float64
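
To see why the running maximum works, it may help to look at the intermediate arrays for the same data. This is a sketch only, and the variable last_valid is an illustrative name rather than part of the answer.

```python
import numpy as np

arr = np.array([1, np.nan, np.nan, 3, np.nan, 5, np.nan, np.nan, np.nan])
mask = np.isnan(arr)

# Valid positions keep their own index; NaN positions become 0.
idx = np.where(~mask, np.arange(len(arr)), 0)
# idx        -> [0, 0, 0, 3, 0, 5, 0, 0, 0]

# The running maximum carries the last valid index forward.
last_valid = np.maximum.accumulate(idx)
# last_valid -> [0, 0, 0, 3, 3, 5, 5, 5, 5]

# Indexing with it copies each last valid value over the NaNs after it.
out = arr[last_valid]
# out        -> [1., 1., 1., 3., 3., 5., 5., 5., 5.]
```

Because NaN positions map to index 0, a NaN at the start of the array indexes itself and stays NaN, which matches what a pandas forward fill does.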
