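<p>Call <code>drop_duplicates</code> with a list of column names in <code>subset</code> so that only those columns are compared when checking for duplicates, and pass <code>keep='first'</code> to keep the first row of each duplicate group.</p>
<p>If the <code>dataframe</code> is:</p>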
<pre><code>import pandas as pd

# Rows 0 and 2 share the same Column1/Column2 values but differ in Column3
df = pd.DataFrame({'Column1': ["'cat'", "'toy'", "'cat'"],
                   'Column2': ["'bat'", "'flower'", "'bat'"],
                   'Column3': ["'xyz'", "'abc'", "'lmn'"]})
print(df)
</code></pre>
<p>Result:</p>
<pre><code>  Column1   Column2 Column3
0   'cat'     'bat'   'xyz'
1   'toy'  'flower'   'abc'
2   'cat'     'bat'   'lmn'
</code></pre>
<p>Then:</p>
<pre><code>result_df = df.drop_duplicates(subset=['Column1', 'Column2'], keep='first')
print(result_df)
</code></pre>
<p>Result:</p>
<pre><code>  Column1   Column2 Column3
0   'cat'     'bat'   'xyz'
1   'toy'  'flower'   'abc'
</code></pre>
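<p>As a rough sketch of the other <code>keep</code> options, the snippet below reuses the same frame and also calls <code>duplicated</code> to show which rows are flagged before anything is dropped. The names <code>key_columns</code>, <code>dup_mask</code>, <code>last_df</code> and <code>unique_df</code> are only illustrative and are not part of the original answer.</p>
<pre><code>import pandas as pd

df = pd.DataFrame({'Column1': ["'cat'", "'toy'", "'cat'"],
                   'Column2': ["'bat'", "'flower'", "'bat'"],
                   'Column3': ["'xyz'", "'abc'", "'lmn'"]})

# Hypothetical helper name: the columns that define a "duplicate"
key_columns = ['Column1', 'Column2']

# Boolean Series: True for each row that repeats an earlier Column1/Column2 pair
dup_mask = df.duplicated(subset=key_columns, keep='first')

# keep='last' keeps the final row of each duplicate group (index labels 1 and 2 here)
last_df = df.drop_duplicates(subset=key_columns, keep='last')

# keep=False drops every row whose pair occurs more than once (only index label 1 remains)
unique_df = df.drop_duplicates(subset=key_columns, keep=False)

print(dup_mask)
print(last_df)
print(unique_df)
</code></pre>
<p>Note that <code>drop_duplicates</code> returns a new frame and keeps the original index labels; chaining <code>reset_index(drop=True)</code> is one way to renumber the rows if that matters.</p>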