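<p>My approach to this problem is to generate two helper columns that hold the conditions to check (same <code>HeurePrevue</code> as the previous row, and a <code>NoDemande</code> that increases by exactly 1). I then iterate over the DataFrame and drop the unwanted pairs based on the <code>Fait</code> column.</p>
<p>The code is a bit hacky, but it seems to do the job:</p>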
<pre><code>import pandas as pd

# Recreate DataFrame
df = pd.DataFrame({
    'NoDemande': [23, 22, 22, 23, 34, 34, 61, 62, 62, 61],
    'HeurePrevue': [84, 84, 93, 93, 64, 73, 84, 84, 93, 93],
    'Fait': [1, 1, -99, 1, 1, 1, -1, 1, -99, -11]
}, columns=['NoDemande', 'Fait', 'HeurePrevue'])
# Make columns which contain the conditions to check.
# shift() compares each row with the previous one; comparing iloc[1:] with
# iloc[:-1] directly fails in current pandas because the indexes differ.
df['sameHeure'] = df.HeurePrevue == df.HeurePrevue.shift()
df['cont'] = df.NoDemande.diff()
# Cycle over consecutive (previous, current) rows
for prev_row, row in zip(df.iloc[:-1].itertuples(), df.iloc[1:].itertuples()):
    # If the rows are consecutive and have the same HeurePrevue, delete one pair
    if row.sameHeure and (row.cont == 1):
        pair_1 = df.loc[df.NoDemande == row.NoDemande]
        pair_2 = df.loc[df.NoDemande == prev_row.NoDemande]
        # Drop the pair with fewer positive Fait values
        if sum(pair_1.Fait > 0) < sum(pair_2.Fait > 0):
            df.drop(pair_1.index, inplace=True)
        else:
            df.drop(pair_2.index, inplace=True)
df.drop(columns=['cont', 'sameHeure'], inplace=True)  # Throw away the helper columns
</code></pre>
<p>Result:</p>
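<pre><code>   NoDemande  Fait  HeurePrevue
0         23     1           84
3         23     1           93
4         34     1           64
5         34     1           73
7         62     1           84
8         62   -99           93
</code></pre>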
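<p>If you would rather not mutate the DataFrame while looping over it, the same idea can be sketched as a small function. This is only a rough variant under my own assumptions: the name <code>drop_weaker_pairs</code> is made up for illustration, and the positive <code>Fait</code> counts are computed once up front, so it may behave differently from the loop above when the same <code>NoDemande</code> takes part in several matches.</p>

```python
import pandas as pd


def drop_weaker_pairs(df):
    """Hypothetical helper (not part of pandas): return a copy of df without
    the NoDemande groups that lose a comparison against their neighbour."""
    # Rows that share HeurePrevue with the previous row and whose
    # NoDemande is exactly one higher than the previous row's.
    same_heure = df["HeurePrevue"].eq(df["HeurePrevue"].shift())
    consecutive = df["NoDemande"].diff().eq(1)
    flagged = same_heure & consecutive

    # Number of positive Fait values per NoDemande, computed once (assumption).
    positives = df["Fait"].gt(0).groupby(df["NoDemande"]).sum()

    current = df.loc[flagged, "NoDemande"]
    previous = df["NoDemande"].shift()[flagged].astype(int)

    # For each match, mark the NoDemande with fewer positive Fait values;
    # on a tie the previous row's NoDemande is dropped, as in the loop above.
    losers = set()
    for cur, prev in zip(current, previous):
        losers.add(cur if positives[cur] < positives[prev] else prev)

    return df[~df["NoDemande"].isin(list(losers))].copy()


df = pd.DataFrame({
    "NoDemande": [23, 22, 22, 23, 34, 34, 61, 62, 62, 61],
    "HeurePrevue": [84, 84, 93, 93, 64, 73, 84, 84, 93, 93],
    "Fait": [1, 1, -99, 1, 1, 1, -1, 1, -99, -11],
}, columns=["NoDemande", "Fait", "HeurePrevue"])

result = drop_weaker_pairs(df)
print(result)  # remaining index labels: [0, 3, 4, 5, 7, 8]
```

<p>On the sample data this drops <code>NoDemande</code> 22 and 61, and the original <code>df</code> is left untouched.</p>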