Updating a CSV file by column name using Python

3 votes
2 answers
3524 views
Asked 2025-04-20 10:49

I have a CSV file like this:

product_name, product_id, category_id
book, , 3
shoe, 3, 1
lemon, 2, 4

I want to use Python's csv library to update the product_id of every row by supplying the column name.

For example, if I pass in:

update_data = {"product_id": [1,2,3]}

then the CSV file should become:

product_name, product_id, category_id
book, 1, 3
shoe, 2, 1
lemon, 3, 4

2 Answers

0

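(Assuming you are using Python 3.x.)

Python has a module called csv, part of the standard library, that helps you read and modify CSV files.

Using it, I would first find the index of the column you are after and store it in the dictionary you have made. Once the index is known, it is simply a matter of popping each item from the list into the corresponding row.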

import csv

update_data = {"product_id": [None, [1, 2, 3]]}
# The original list is nested inside another so that the column index can be held in the first position.

new_csv = []
# Holds the new rows for when we rewrite the file.

with open('test.csv', 'r', newline='') as csvfile:
    # skipinitialspace=True drops the space after each comma, so the header
    # reads "product_id" rather than " product_id" and line.index(key) finds it.
    filereader = csv.reader(csvfile, skipinitialspace=True)

    for line_no, line in enumerate(filereader):
        if line_no == 0:

            for key in update_data:
                update_data[key][0] = line.index(key)
                # This finds the column's index and stores it.

        else:

            for key in update_data:
                line[update_data[key][0]] = update_data[key][1].pop(0)
                # Using the column index, put the new value in the right place while removing it from the input list.

        new_csv.append(line)

# Note: the rewritten file no longer has a space after each comma.
with open('test.csv', 'w', newline='') as csvfile:
    filewriter = csv.writer(csvfile)

    for line in new_csv:
        filewriter.writerow(line)

1

You can use your existing dict together with `iter` to take items in order, e.g.:

import csv

update_data = {"product_id": [1,2,3]}
# Convert the values of your dict to be directly iterable so we can `next` them
to_update = {k: iter(v) for k, v in update_data.items()}

with open('input.csv', 'r', newline='') as fin, open('output.csv', 'w', newline='') as fout:
    # create in/out csv readers, skip initial space so it matches the update dict
    # and write the header out
    csvin = csv.DictReader(fin, skipinitialspace=True)
    csvout = csv.DictWriter(fout, csvin.fieldnames)
    csvout.writeheader()
    for row in csvin:
        # Update rows - if we have something left and it's in the update dictionary,
        # use that value, otherwise we use the value that's already in the column.
        row.update({k: next(to_update[k], row[k]) for k in row if k in to_update})
        csvout.writerow(row)

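Now, this assumes that each new column value goes to the corresponding row number, and that the existing values are used once the new ones run out. You could change that logic to only use a new value when the existing value is blank, for instance (or whatever other criteria you wish).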
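
As a rough sketch of that variation, the snippet below takes a new value only when the existing cell is blank. The helper name `fill_blank_values` and the in-memory sample data are hypothetical, introduced only for illustration. It also assumes a new value is consumed only when it is actually used, so filled rows do not use up entries from the list; that is one possible reading of "only when blank", not the only one.

```python
import csv
import io


def fill_blank_values(rows, update_data):
    """Yield rows, taking a new value only where the existing cell is blank.

    Hypothetical helper: `rows` is an iterable of dicts (e.g. a csv.DictReader)
    and `update_data` maps a column name to a list of replacement values.
    """
    to_update = {k: iter(v) for k, v in update_data.items()}
    for row in rows:
        for key, values in to_update.items():
            if key in row and not (row[key] or "").strip():
                # Blank cell: take the next new value, or leave it as-is if none remain.
                row[key] = next(values, row[key])
        yield row


# In-memory sample (an assumption for the demo) with two blank product_id cells.
sample = (
    "product_name, product_id, category_id\n"
    "book, , 3\n"
    "shoe, 3, 1\n"
    "lemon, , 4\n"
)

reader = csv.DictReader(io.StringIO(sample), skipinitialspace=True)
out = io.StringIO()
writer = csv.DictWriter(out, reader.fieldnames, lineterminator="\n")
writer.writeheader()
writer.writerows(fill_blank_values(reader, {"product_id": [1, 2, 3]}))
result = out.getvalue()
print(result)
```

Here "shoe" keeps its existing product_id of 3, while "book" and "lemon" receive the first two new values. To work on real files, the `io.StringIO` objects could be swapped for files opened with `newline=''`.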
