Exclude pandas rows based on a string value

Asked 2025-04-17 20:47

I have a pandas DataFrame with a column of strings. I want to exclude the rows that contain the string "Not found" from the DataFrame. So far I have tried:

df[df.some_column != "Not found"], but this doesn't work.

Looking forward to your replies.

Sample data:

card_number effective_date  expiry_date grouping_name       Ac. Year code
0       1206090    28 Sep 2012  21 Aug 2013    Dummy no.1  201213
1       1206090    21 Feb 2013  21 Aug 2013   Dummy no.2   201213
2       1206090    28 Sep 2012  30 Nov 2012    Dummy no.3  201213
3       1206090    03 Dec 2012  21 Aug 2013    Dummy no.3  201213
4       1206090    23 Apr 2013  31 Aug 2013   Dummy no.4   201213
5       1206090    28 Sep 2012  21 Aug 2013    Dummy no.5  201213
6       1206090    28 Sep 2012  21 Aug 2013    Dummy no.6  201213
7       1206090    24 Oct 2012  07 Aug 2013     Not found  201213
8       1206090    08 Jan 2013  08 Jan 2013     Not found  201213
9       1206090    08 Jan 2013  31 Aug 2013     Not found  201213
10    Not found    03 Jul 2013  21 Aug 2013    Dummy no.1  201213
11    Not found    03 Jul 2013  21 Aug 2013   Dummy no.2   201213

Additional note: my string matching seems to behave strangely... When I run df['grouping_name'] != "Not found", rows 7, 8 and 9 return True... Does anyone know why?

1 Answer

Try this:

df[df['some_column'] != "Not found"]
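
One thing worth checking: the expression returns a new, filtered DataFrame and leaves `df` itself unchanged, so the result has to be assigned to a name. A minimal sketch with a made-up frame (the data and the `value` column are illustrative, not from the question):

```python
import pandas as pd

# Hypothetical toy data, only for illustration
df = pd.DataFrame({
    "some_column": ["a", "Not found", "b"],
    "value": [1, 2, 3],
})

# The comparison builds a boolean mask; indexing with it returns a NEW frame
mask = df["some_column"] != "Not found"
filtered = df[mask]

print(len(df))        # the original frame keeps all of its rows
print(len(filtered))  # the filtered frame has the "Not found" row dropped
```

If the original name should hold the filtered data, assign back with `df = df[mask]`.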

Here is the solution with the sample data:

import pandas as pd

df = pd.read_csv("data.csv")
df

    card_number effective_date  expiry_date grouping_name   Ac. Year code
0    1206090     28 Sep 2012     21 Aug 2013     Dummy no.1  201213
1    1206090     21 Feb 2013     21 Aug 2013     Dummy no.2  201213
2    1206090     28 Sep 2012     30 Nov 2012     Dummy no.3  201213
3    1206090     03 Dec 2012     21 Aug 2013     Dummy no.3  201213
4    1206090     23 Apr 2013     31 Aug 2013     Dummy no.4  201213
5    1206090     28 Sep 2012     21 Aug 2013     Dummy no.5  201213
6    1206090     28 Sep 2012     21 Aug 2013     Dummy no.6  201213
7    1206090     24 Oct 2012     07 Aug 2013     Not found   201213
8    1206090     08 Jan 2013     08 Jan 2013     Not found   201213
9    1206090     08 Jan 2013     31 Aug 2013     Not found   201213
10   Not found   03 Jul 2013     21 Aug 2013     Dummy no.1  201213
11   Not found   03 Jul 2013     21 Aug 2013     Dummy no.2  201213


df[df['grouping_name'] != 'Not found']

    card_number effective_date  expiry_date grouping_name   Ac. Year code
0    1206090     28 Sep 2012     21 Aug 2013     Dummy no.1  201213
1    1206090     21 Feb 2013     21 Aug 2013     Dummy no.2  201213
2    1206090     28 Sep 2012     30 Nov 2012     Dummy no.3  201213
3    1206090     03 Dec 2012     21 Aug 2013     Dummy no.3  201213
4    1206090     23 Apr 2013     31 Aug 2013     Dummy no.4  201213
5    1206090     28 Sep 2012     21 Aug 2013     Dummy no.5  201213
6    1206090     28 Sep 2012     21 Aug 2013     Dummy no.6  201213
10   Not found   03 Jul 2013     21 Aug 2013     Dummy no.1  201213
11   Not found   03 Jul 2013     21 Aug 2013     Dummy no.2  201213
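
As for the additional note in the question (rows 7 to 9 still comparing as unequal): a common cause is stray whitespace in the values, for example "Not found " with a trailing space. That is only a guess, since the raw file isn't shown. A sketch, assuming whitespace is the culprit, that strips the strings before comparing and can also drop a row when any column equals "Not found" (the small frame below is made up for the demonstration):

```python
import pandas as pd

# Hypothetical data mimicking the question; note the stray spaces
df = pd.DataFrame({
    "card_number": ["1206090", "1206090", "Not found"],
    "grouping_name": ["Dummy no.1", "Not found ", " Dummy no.2"],
})

# Listing the raw values makes hidden whitespace visible
print(df["grouping_name"].tolist())

# Single column: strip whitespace before comparing
single = df[df["grouping_name"].str.strip() != "Not found"]

# Any column: build a stripped text view only for the mask,
# so the original dtypes in df are left untouched
as_text = df.astype(str).apply(lambda s: s.str.strip())
has_marker = (as_text == "Not found").any(axis=1)
result = df[~has_marker]

print(single)
print(result)
```

`single` keeps rows 0 and 2, while `result` keeps only row 0 because row 2 has "Not found" in `card_number`. If stripping changes nothing on the real data, the mismatch has a different cause, such as different capitalisation or a non-breaking space.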
