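<p>Revisiting this, I thought I'd add this alternative approach, which includes the optimisations I mentioned in my <a href="https://stackoverflow.com/a/65025605/1431750">previous answer</a>:</p>
<p>This version is much longer and requires hand-coding the index tracking for each series of positive numbers, but it has the main performance advantages listed below.</p>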
<pre class="lang-py prettyprint-override"><code>import numpy as np

a = np.array([1.01, -1.58, 0.64, 1.38, 0.69, 0.91, 1.34, 1.03, 1.39, 0.94, -1.01, 0.16,])
b = np.zeros(len(a))

MIN_CONSECUTIVE = 6  # number of consecutive numbers wanted
first, last, c = -1, -1, 0  # index of first & last positive number, count of +ve nums
nums = []  # hold each series of consecutive positive numbers
for i, n in enumerate(a):
    if n > 0:
        nums.append(n)
        first = i if first == -1 else first
        last = i
        c += 1
    else:
        if c >= MIN_CONSECUTIVE:
            b[first:last+1] = nums
        first, last, c = -1, -1, 0  # reset
        nums = []  # reset
        if i > len(a) - MIN_CONSECUTIVE - 1:  # shortcut exit if there aren't
            break                             # enough elems left for consecutive
else:  # if the loop completes without `break`
    if c >= MIN_CONSECUTIVE:  # also check after the loop completes
        b[first:last+1] = nums

b  # -> array([0.  , 0.  , 0.64, 1.38, 0.69, 0.91, 1.34, 1.03, 1.39, 0.94, 0.  , 0.  ])
</code></pre>
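<p>If you want to reuse this, the same loop can be wrapped in a function. This is only a sketch: the name <code>keep_long_positive_runs</code> and its parameters are hypothetical, and the logic is the loop above repackaged. The extra calls exercise the two exit paths, the early <code>break</code> and the <code>for ... else</code> check for a series that runs to the end of the array.</p>

```python
import numpy as np


def keep_long_positive_runs(a, min_consecutive):
    """Copy series of at least `min_consecutive` positive values; zero elsewhere.

    Hypothetical helper: same single-pass logic as the loop above.
    """
    a = np.asarray(a, dtype=float)
    b = np.zeros(len(a))
    first, last, c = -1, -1, 0  # first/last index of current series, its length
    nums = []                   # values of the current series
    for i, n in enumerate(a):
        if n > 0:
            nums.append(n)
            first = i if first == -1 else first
            last = i
            c += 1
        else:
            if c >= min_consecutive:
                b[first:last + 1] = nums  # write a completed series once
            first, last, c = -1, -1, 0
            nums = []
            if i > len(a) - min_consecutive - 1:
                break  # too few elements left to form another series
    else:
        # loop finished without `break`: the final series may reach the end
        if c >= min_consecutive:
            b[first:last + 1] = nums
    return b


sample = [1.01, -1.58, 0.64, 1.38, 0.69, 0.91, 1.34, 1.03, 1.39, 0.94, -1.01, 0.16]
result = keep_long_positive_runs(sample, 6)

# A series that reaches the end of the array is handled by the for-else branch.
tail_run = keep_long_positive_runs([-1, 1, 2, 3], 3).tolist()   # [0.0, 1.0, 2.0, 3.0]

# No series is long enough here, so the loop breaks early and returns zeros.
no_run = keep_long_positive_runs([1, 2, -1, 3], 3).tolist()     # [0.0, 0.0, 0.0, 0.0]
```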
<p>Why this is better:</p>
<ol>
<li>This code visits each item in the numpy array (or Python list) only once.
<ul>
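<li>The approach you tried, as well as the ones offered in the comments and in my previous answer, look up <em>overlapping</em> sequences of elements in the array multiple times</li>
<li>The previous ones also re-write to <code>b</code> multiple times, repeating writes of the same values</li>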
</ul>
</li>
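<li>This only writes to <code>b</code> once each series has been found and completed</li>
<li>I'm using <a href="https://docs.python.org/3/library/functions.html#enumerate" rel="nofollow noreferrer"><code>enumerate</code></a> to track the index, but I don't access elements by index. Indexed access on Python lists is slower than simply iterating over the list's elements. <a href="https://stackoverflow.com/a/29311751/1431750">The same is true for numpy arrays.</a></li>
<li>For the given list, this runs in less than half the time of the previous answer: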
<ul>
<li>This one: 14.6 µs ± 584 ns</li>
<li>Previous: 31.9 µs ± 1.13 µs</li>
</ul>
</li>
</ol>
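<p>To make the first point concrete, here is a sketch of an overlapping-window approach with counters added. It is an assumption about the general shape of such approaches, not the code from the question, the comments or my previous answer; <code>windowed_keep</code> and its counters are hypothetical names.</p>

```python
import numpy as np


def windowed_keep(a, min_consecutive):
    """Check every window of length `min_consecutive`; count reads and writes.

    Hypothetical baseline, used only to illustrate overlapping lookups.
    """
    a = np.asarray(a, dtype=float)
    b = np.zeros(len(a))
    elements_read = 0
    elements_written = 0
    for start in range(len(a) - min_consecutive + 1):
        window = a[start:start + min_consecutive]
        elements_read += len(window)            # each window re-reads its neighbours
        if (window > 0).all():
            b[start:start + min_consecutive] = window
            elements_written += len(window)     # overlapping windows rewrite same values
    return b, elements_read, elements_written


sample = [1.01, -1.58, 0.64, 1.38, 0.69, 0.91, 1.34, 1.03, 1.39, 0.94, -1.01, 0.16]
b, reads, writes = windowed_keep(sample, 6)
print(reads, writes)  # 42 18
```

<p>On the sample array this reads 42 elements across 7 windows and writes 18 values into <code>b</code>, whereas the single-pass loop reads each of the 12 elements once and writes the 8 values of the one qualifying series a single time.</p>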