Finding the last occurrence of a value before another specific value in numpy/pandas

Asked 2025-05-01 14:24

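I have a pandas Series, and I want to find the index or position (or a boolean mask) of the last occurrence of one value before each occurrence of another specific value.

For example, given: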

df = pd.DataFrame({'x': np.random.randint(10, size=1000000)})

I want to find the position of the last 0 that occurs before each 9. So if my array is

[9, 0, 3, 0, 1, 9, 4, 9, 0, 0, 9, 4, 0]

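then only the 0s at positions 3 and 9 are the ones I care about. Note that I don't much care what happens to the final 0 at position 12. I would prefer that it not appear in the result, but that is not critical.

My current approach looks like this: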

# Note: df.last resolves to the DataFrame.last method, not the column,
# so the column is accessed as df['last'] throughout.
df['last'] = np.nan
df.loc[df.x == 0, 'last'] = 0.0
df.loc[df.x == 9, 'last'] = 1.0
df['last'] = df['last'].bfill()
df.loc[df.x == 0, 'last'] = np.nan
df['last'] = df['last'].bfill()
df['last'] = df['last'].fillna(0.0)
df.loc[df.x != 0, 'last'] = 0.0

Does anyone have a faster or more concise way to do this?

3 Answers

I think this approach works for general input:

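import numpy as np

def find_last_a_before_b(arr, a, b):
    arr = np.asarray(arr)
    idx_a, = np.where(arr == a)
    idx_b, = np.where(arr == b)
    if idx_a.size == 0:
        return idx_a  # no occurrences of a, so nothing to report
    # For each a, the index into idx_b of the first b that follows it
    iss = idx_b.searchsorted(idx_a)
    # Keep an a only if the next a maps to a different b;
    # keep the final a only if some b actually follows it
    mask = np.concatenate((iss[1:] != iss[:-1], [iss[-1] < len(idx_b)]))
    return idx_a[mask]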

>>> find_last_a_before_b([9, 0, 3, 0, 1, 9, 4, 9, 0, 0, 9, 4, 0], 0, 9)
array([3, 9])
>>> find_last_a_before_b([9, 0, 3, 0, 1, 9, 4, 9, 0, 0, 9, 4, 0], 9, 0)
array([ 0,  7, 10])

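The key is np.searchsorted, which tells us, for each 0, which 9 comes next after it. Where several 0s share the same next 9, we keep only the last of them, and we also drop the final 0 if no 9 follows it.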
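
To make the role of np.searchsorted more concrete, here is a small sketch that prints the intermediate arrays for the example array from the question. The variable names mirror those in the function above; only the printouts are added for illustration.

```python
import numpy as np

arr = np.asarray([9, 0, 3, 0, 1, 9, 4, 9, 0, 0, 9, 4, 0])
idx_a, = np.where(arr == 0)   # positions of the 0s
idx_b, = np.where(arr == 9)   # positions of the 9s

# For each 0, the index into idx_b of the first 9 that comes after it.
# A value equal to len(idx_b) means that no 9 follows that 0.
iss = idx_b.searchsorted(idx_a)

print(idx_a)  # [ 1  3  8  9 12]
print(idx_b)  # [ 0  5  7 10]
print(iss)    # [1 1 3 3 4]
```

The 0s at positions 1 and 3 both point at the same 9 (entry 1 of idx_b), so only position 3 survives; likewise only position 9 survives from the pair 8 and 9, and position 12 is dropped because its value 4 equals len(idx_b).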

Answered 2025-05-01 by Python大师

You can use boolean indexing together with shift. For example:

>>> s = pd.Series([9, 0, 3, 0, 9, 4, 9, 0, 0, 9, 4, 0])
>>> s[(s == 0) & (s.shift(-1) == 9)]
3    0
8    0
dtype: int64

This selects the entries of s that equal 0 and are immediately followed by a 9.
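
Since the question asks for positions or a boolean mask rather than the selected values, the same expression can be kept as a mask and converted to integer positions. This is a minimal sketch; the names mask and positions are just illustrative.

```python
import numpy as np
import pandas as pd

s = pd.Series([9, 0, 3, 0, 9, 4, 9, 0, 0, 9, 4, 0])

# shift(-1) aligns each element with its successor; the last element
# is compared against NaN, which is never equal to 9.
mask = (s == 0) & (s.shift(-1) == 9)

# Integer positions of the True entries in the mask
positions = np.flatnonzero(mask.to_numpy())
print(positions)  # [3 8]
```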

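Edit: tweaked slightly so that other values are allowed between the 9 and the last 0 preceding it (see also @acushner's answer)...

Here is a slightly modified Series s; we still want the 0s at index 3 and 8: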

>>> s = pd.Series([9, 0, 3, 0, 9, 4, 9, 0, 0, 4, 9, 0])
>>> t = s[(s == 0) | (s == 9)]
>>> t
0     9
1     0
3     0
4     9
6     9
7     0
8     0
10    9
11    0
dtype: int64

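t is a Series containing only the 9s and 0s from s, with their original index labels. We can then pick out the relevant indices exactly as before: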

>>> t[(t == 0) & (t.shift(-1) == 9)]
3    0
8    0
dtype: int64

Answered 2025-05-01 by Python大师

In short, this is just a tweak of @ajcr's answer:

s = pd.Series([9, 0, 3, 0, 1, 9, 4, 9, 0, 0, 9, 4, 0])  # using your example array
s = s[s.isin([0, 9])]
s[(s == 0) & (s.shift(-1) == 9)]
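
Because the question also mentions a boolean mask, the filter-then-shift idea can be wrapped in a small helper that returns a mask aligned with the original Series. This is only a sketch: the name last_a_before_b_mask is hypothetical, and it assumes the Series has a unique index.

```python
import pandas as pd

def last_a_before_b_mask(s, a, b):
    """Boolean mask over s: True at each a that is the last a before a b.

    Hypothetical helper; assumes s has a unique index.
    """
    t = s[s.isin([a, b])]                    # keep only the a's and b's
    keep = (t == a) & (t.shift(-1) == b)     # an a directly followed by a b in t
    # Expand back to the full index; rows that were filtered out become False
    return keep.reindex(s.index, fill_value=False)

s = pd.Series([9, 0, 3, 0, 1, 9, 4, 9, 0, 0, 9, 4, 0])
mask = last_a_before_b_mask(s, 0, 9)
print(mask[mask].index.tolist())  # [3, 9]
```

The mask has the same index as s, so it can be assigned as a new column or passed to .loc on the original DataFrame.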

Answered 2025-05-01 by Python大师