Answers to: How to drop Pandas DataFrame rows with the same value based on a condition in a different column


Anonymous, 1 day ago

<a href="http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.drop_duplicates.html" rel="nofollow"><code>drop_duplicates</code></a> has a <code>keep</code> parameter; set it to <code>last</code>:
<pre><code>In [188]: df.drop_duplicates(subset=['value'], keep='last')
Out[188]:
    id   name  value
0  345  name1    456
1   12  name2    220
5    2  name6    567
</code></pre>
Actually, I think what you want is to drop the rows that have a duplicated value and whose 'id' length is not 1. A breakdown of how the row labels to drop are selected:
<pre><code>In [198]: df['value'].duplicated()
Out[198]:
0    False
1    False
2    False
3     True
4     True
5     True
Name: value, dtype: bool

In [199]: df.loc[df['value'].duplicated(), 'value']
Out[199]:
3    567
4    567
5    567
Name: value, dtype: int64

In [200]: df['value'].isin(df.loc[df['value'].duplicated(), 'value'].unique())
Out[200]:
0    False
1    False
2     True
3     True
4     True
5     True
Name: value, dtype: bool

In [201]: (df['value'].isin(df.loc[df['value'].duplicated(), 'value'].unique())) & (df['id'].astype(str).str.len() != 1)
Out[201]:
0    False
1    False
2     True
3     True
4     True
5    False
dtype: bool

In [202]: df.index[(df['value'].isin(df.loc[df['value'].duplicated(), 'value'].unique())) & (df['id'].astype(str).str.len() != 1)]
Out[202]: Int64Index([2, 3, 4], dtype='int64')
</code></pre>
So the above uses <a href="http://pandas.pydata.org/pandas-docs/stable/generated/pandas.Series.duplicated.html#pandas.Series.duplicated" rel="nofollow"><code>duplicated</code></a> to flag the repeated values, <a href="http://pandas.pydata.org/pandas-docs/stable/generated/pandas.Series.unique.html" rel="nofollow"><code>unique</code></a> to reduce them to the distinct duplicated values, and <a href="http://pandas.pydata.org/pandas-docs/stable/generated/pandas.Series.isin.html" rel="nofollow"><code>isin</code></a> to test membership, so that the first occurrence of each duplicated value is matched as well. The 'id' column is cast to <code>str</code> so that its length can be tested with <a href="http://pandas.pydata.org/pandas-docs/stable/generated/pandas.Series.str.len.html" rel="nofollow"><code>str.len</code></a>, and the combined boolean mask is then used to select the index labels.
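The breakdown stops at the selected index labels. As a minimal sketch of the remaining step, assuming those labels are simply passed to `DataFrame.drop`, the whole thing might look like the code below. The frame is a reconstruction: rows 0, 1 and 5 match the output shown in the answer, while the `id` and `name` values for rows 2 to 4 are hypothetical and chosen only so that their `id` has more than one digit.

```python
import pandas as pd

# Reconstructed sample data. Rows 2-4 are assumptions for illustration;
# only rows 0, 1 and 5 appear in the answer's printed output.
df = pd.DataFrame({
    "id": [345, 12, 34, 123, 23, 2],
    "name": ["name1", "name2", "name3", "name4", "name5", "name6"],
    "value": [456, 220, 567, 567, 567, 567],
})

# Distinct values that occur more than once in the 'value' column.
dup_values = df.loc[df["value"].duplicated(), "value"].unique()

# True for rows whose value is duplicated AND whose id is not a single character.
mask = df["value"].isin(dup_values) & (df["id"].astype(str).str.len() != 1)

# Drop the selected row labels; df[~mask] would give the same result.
result = df.drop(df.index[mask])
print(result)
```

With this sample data the rows labelled 2, 3 and 4 are dropped, which leaves the same three rows as the `drop_duplicates(..., keep='last')` call. The two approaches would differ if the single-digit `id` were not the last row of its group.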
