Matching columns across CSV files in Python

Posted 2024-04-26 00:19:44

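I am trying to read in a CSV file, base_list.csv, which has two columns.

I then read in file_1.csv, remove any values that match those in base_list.csv, and write what is left to a new CSV file named Dups.csv.

When I run the script, I get the following error: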

emails = set(emails) #"set" removes duplicates in a list
TypeError: unhashable type: 'list'

The sample code is below:

import csv
#gather emails from base_list:
with open("H:\\Python Backups\\DeDup\\ByCSV\\base_list.csv", "rU") as base_file:
    read_base_file = csv.reader(base_file, delimiter=",")
    duplicates_list = []
    rows = [row for row in read_base_file]
    for row in rows:
        duplicates_list.extend(row)
    #extract emails from other csv files (csv_files) from multiple
    #columns in those csv files (email_columns):
    emails = []
    with open("H:\\Python Backups\\DeDup\\ByCSV\\file_1.csv", "rU") as csvfile:
        read_csv = csv.reader(csvfile, delimiter=",")     
        email_rows = [r for r in read_csv]
        emails.extend(email_rows)
    #find duplicates from base_list and remove them:
    duplicates = [e for e in emails if e in duplicates_list]
    for dupe in duplicates:
        emails.remove(dupe)
    emails = set(emails) #"set" removes duplicates in a list
    #write the emails to a csv:
    writer = csv.writer(open("H:\\Python Backups\\DeDup\\ByCSV\\Dups.csv", "ab"))
    for email in zip(emails):
        writer.writerow(email)

1 answer
Anonymous user
Answer #1 · posted 2024-04-26 00:19:44

The error occurs because you are trying to store lists in a set. That is not possible: lists are mutable in Python and therefore unhashable.

>>> list_of_lists = [[1,2,3], ['a','b','c']]
>>> set(list_of_lists)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unhashable type: 'list'

Change email_rows to:

email_rows = (tuple(r) for r in read_csv)

This creates a generator of tuples, and tuples can be hashed.
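Converting rows to tuples removes the TypeError, but the question's code would then compare whole rows (tuples) against the individual cell strings collected in duplicates_list, so no match would ever be found. As a minimal sketch of one way to do the comparison cell by cell, the following assumes Python 3, that every non-empty cell in both files is a value to compare, and that the output should hold one value per row. The helper names read_cells and write_unmatched are hypothetical, not from the original post, and the output file is overwritten rather than appended to.

```python
import csv


def read_cells(path):
    """Return every non-empty cell of a CSV file as a flat list of strings."""
    with open(path, newline="") as f:
        return [cell.strip() for row in csv.reader(f) for cell in row if cell.strip()]


def write_unmatched(base_path, input_path, output_path):
    """Write cells of input_path that do not appear in base_path, one per row.

    Hypothetical helper: assumes cell-level matching is what is wanted.
    """
    known = set(read_cells(base_path))  # strings are hashable, so a set works
    seen = set()                        # values already kept, to drop repeats
    kept = []                           # preserves first-seen order
    for cell in read_cells(input_path):
        if cell not in known and cell not in seen:
            seen.add(cell)
            kept.append(cell)
    # "w" overwrites the output file; the original code appended instead
    with open(output_path, "w", newline="") as f:
        csv.writer(f).writerows([cell] for cell in kept)
    return kept
```

A call such as write_unmatched("base_list.csv", "file_1.csv", "Dups.csv") would then write only the values of file_1.csv that are absent from base_list.csv. Using a set for the lookup keeps each membership test cheap, while the separate list keeps the output in the order the values were first read.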
