Finding duplicates across multiple lists read from a CSV file (Python)

2 answers

User

Answer 1 · edited 2024-05-14 16:52:42

I would do it like this:

>>> from collections import defaultdict
>>> d = defaultdict(int)
>>> rows = [['Bob', 'Jane', 'Joe'],
... ['Megan', 'Tom', 'Jane'],
... ['Jane', 'Joe', 'Rob']]
...
>>> for row in rows:
...     for name in row:
...         d[name] += 1
... 
>>> [(name, count) for name, count in d.items() if count >= 3]
[('Jane', 3)]

This uses a dict with a default value of 0 to count how many times each name appears in the file, then filters the dict on the condition (count >= 3).
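The same count-then-filter idea can also be sketched with collections.Counter from the standard library. This is an alternative to the defaultdict version above, not the answerer's code; the names `threshold` and `frequent` are made up for illustration.

```python
from collections import Counter
from itertools import chain

rows = [['Bob', 'Jane', 'Joe'],
        ['Megan', 'Tom', 'Jane'],
        ['Jane', 'Joe', 'Rob']]

# Flatten the rows into one stream of names and count each name.
counts = Counter(chain.from_iterable(rows))

# Keep only the names seen at least `threshold` times (hypothetical cutoff).
threshold = 3
frequent = [(name, n) for name, n in counts.most_common() if n >= threshold]
print(frequent)  # [('Jane', 3)]
```

most_common() already returns the pairs sorted by descending count, so no separate sort step is needed.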

User

Answer 2 · edited 2024-05-14 16:52:42

Putting it all together (using csv.reader):

import csv
import collections

d = collections.defaultdict(int)
with open("names.csv", newline="") as f:  # Python 2: open with "rb" instead of newline=""
    reader = csv.reader(f)
    next(reader)  # skip the heading row
    for row in reader:
        for name in row:
            name = name.strip()
            if name:  # ignore empty cells
                d[name] += 1

# keep names seen at least 3 times, most frequent first
morethan3 = [(name, count) for name, count in d.items() if count >= 3]
morethan3.sort(key=lambda x: x[1], reverse=True)
for name, count in morethan3:
    print(name, count)

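Update in response to comments:

Whether or not you use the DictReader approach, you need to read through the whole CSV file. If you want to ignore the "name2" column (not row), then ignore it. You don't need to save all the data, as the variable name "rows" suggests. Here is code for a more general approach that doesn't depend on a specific order of column headings and allows selecting/rejecting specific columns.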
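A minimal sketch of what such a header-driven approach might look like, using csv.DictReader. This is not the answerer's original code: the function name `count_names`, its `skip_columns` and `min_count` parameters, and the sample data are all assumptions made for illustration.

```python
import csv
import io
from collections import Counter

def count_names(csv_file, skip_columns=(), min_count=3):
    """Count names in every column except those listed in skip_columns."""
    counts = Counter()
    reader = csv.DictReader(csv_file)  # the first row supplies the column headings
    for row in reader:
        for column, value in row.items():
            # Skip rejected columns, and skip the None / list values that
            # DictReader produces for short or over-long rows.
            if column in skip_columns or not isinstance(value, str):
                continue
            name = value.strip()
            if name:
                counts[name] += 1
    return [(name, n) for name, n in counts.most_common() if n >= min_count]

# Hypothetical sample data standing in for names.csv
sample = io.StringIO(
    "name1,name2,name3\n"
    "Bob,Jane,Joe\n"
    "Megan,Tom,Jane\n"
    "Jane,Joe,Rob\n"
)
print(count_names(sample, min_count=2))  # [('Jane', 3), ('Joe', 2)]
```

Because columns are looked up by heading rather than position, reordering the columns in the file does not change the result. With a real file, one would pass `open("names.csv", newline="")` in place of the StringIO object, and `skip_columns={"name2"}` to leave that column out of the count.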

