<p>I get better performance using <code>ne</code> instead of the actual <code>!=</code> comparison:</p>
<pre><code>df['changed'] = df['ColumnB'].ne(df['ColumnB'].shift().bfill()).astype(int)
</code></pre>
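<p>To make the behaviour concrete, here is a minimal sketch on a small frame. The sample values are hypothetical (the original data is not shown); only the column name <code>ColumnB</code> comes from the snippet above. It also shows what the <code>bfill()</code> step does to the first row.</p>

```python
import pandas as pd

# Hypothetical sample data; only the column name ColumnB is taken from the answer.
df = pd.DataFrame({"ColumnB": ["a", "a", "b", "b", "b", "c", "a"]})

# shift() moves every value down one row, leaving NaN in the first row;
# bfill() fills that NaN with the next value, so row 0 is compared with itself.
prev = df["ColumnB"].shift().bfill()
df["changed"] = df["ColumnB"].ne(prev).astype(int)
changed = df["changed"].tolist()

# Without bfill(), the first row is compared against NaN and counts as a change.
without_bfill = df["ColumnB"].ne(df["ColumnB"].shift()).astype(int).tolist()

print(changed)        # [0, 0, 1, 0, 0, 1, 1]
print(without_bfill)  # [1, 0, 1, 0, 0, 1, 1]
```

<p>The two variants differ only in the first row, so which one is appropriate depends on whether the first row should count as a change.</p>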
<p><strong>Timings</strong></p>
<p>Using the following setup to produce a larger DataFrame:</p>
<pre><code>df = pd.concat([df]*10**5, ignore_index=True)
</code></pre>
<p>I get the following timings:</p>
<pre><code>%timeit df['ColumnB'].ne(df['ColumnB'].shift().bfill()).astype(int)
10 loops, best of 3: 38.1 ms per loop
%timeit (df.ColumnB != df.ColumnB.shift()).astype(int)
10 loops, best of 3: 77.7 ms per loop
%timeit df['ColumnB'] == df['ColumnB'].shift(1).fillna(df['ColumnB'])
10 loops, best of 3: 99.6 ms per loop
%timeit (df.ColumnB.ne(df.ColumnB.shift())).astype(int)
10 loops, best of 3: 19.3 ms per loop
</code></pre>
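<p>Outside IPython, a comparison along these lines can be sketched with the standard <code>timeit</code> module. The base frame, the helper names <code>with_ne</code> and <code>with_operator</code>, and the repeat count are assumptions for illustration. The frame is also smaller than the one timed above, and absolute numbers will vary with the pandas version and hardware, so the gap between the two forms may be much smaller than in the figures shown.</p>

```python
import timeit

import pandas as pd

# Hypothetical base frame; the original data is not shown in the answer.
base = pd.DataFrame({"ColumnB": ["a", "a", "b", "b", "c"]})
# Repeated 10**4 times here to keep the sketch quick to run.
df = pd.concat([base] * 10**4, ignore_index=True)


def with_ne():
    # Method form of the comparison.
    return df["ColumnB"].ne(df["ColumnB"].shift()).astype(int)


def with_operator():
    # Operator form of the same comparison.
    return (df["ColumnB"] != df["ColumnB"].shift()).astype(int)


# Both forms should produce identical results before their speed is compared.
same_result = with_ne().equals(with_operator())

t_ne = timeit.timeit(with_ne, number=20)
t_op = timeit.timeit(with_operator, number=20)

print(f"identical results: {same_result}")
print(f"ne: {t_ne / 20 * 1e3:.2f} ms per call")
print(f"!=: {t_op / 20 * 1e3:.2f} ms per call")
```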