How to combine CSV row values that share the same ID in Python

Posted 2024-04-25 19:14:39


I have a CSV file that contains a list of transactions.

For example:

Year    Name       Amount
2010    John       10
2011    John       10
2012    John       10
2011    John       10

I want the rows grouped by year and then sorted by ID, with the amounts for each group summed. The expected output is:

Year    Name       Amount
2010    John       10
2011    John       20
2012    John       10

My current code looks something like this:

import csv

output = []
with open('user.csv', 'r', errors='ignore') as csvFile:
    reader = csv.reader(csvFile)
    for row in sorted(reader):
        output.append([row[0], row[1], row[3]])
        print("Year", row[0], "  Name:", row[1], "Amount:", row[3])

Thanks


2 answers

Pandas is a good choice for this use case. But if you can only use built-in modules, use:

import csv
from collections import defaultdict

filename_1 = "output.csv"  # hypothetical output path; the original answer leaves this name undefined

result = defaultdict(int)
with open('user.csv', newline='') as csvFile:
    reader = csv.reader(csvFile)     # Note: the delimiter is `,`
    header = next(reader)            # Read the header row
    for row in reader:
        result[(int(row[0]), row[1])] += int(row[2])  # key = (year, name), value = summed amount

with open(filename_1, "w", newline='') as csvFile:
    writer = csv.writer(csvFile)
    writer.writerow(header)
    for key, amount in sorted(result.items(), key=lambda x: x[0]):  # sort by year, then name
        writer.writerow([*key, amount])
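To check the grouping logic without reading or writing any files, the same `defaultdict` idea can be run on the sample data from the question held in memory. This is only a minimal sketch: it assumes the data is comma-separated, and the `SAMPLE` string and the `sum_by_year_and_name` helper are hypothetical names introduced for illustration.

```python
import csv
import io
from collections import defaultdict

# Sample data from the question, assumed here to be comma-separated.
SAMPLE = """Year,Name,Amount
2010,John,10
2011,John,10
2012,John,10
2011,John,10
"""


def sum_by_year_and_name(lines):
    """Return (header, rows) with Amount summed per (Year, Name) pair."""
    reader = csv.DictReader(lines)
    totals = defaultdict(int)
    for row in reader:
        totals[(int(row["Year"]), row["Name"])] += int(row["Amount"])
    # Keys are (year, name) tuples, so sorting orders by year, then by name.
    rows = [[year, name, amount] for (year, name), amount in sorted(totals.items())]
    return reader.fieldnames, rows


header, rows = sum_by_year_and_name(io.StringIO(SAMPLE))
print(header)
for row in rows:
    print(row)
```

Using `csv.DictReader` here means the columns are looked up by name rather than by position, which may be easier to maintain if the real file has extra columns.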

Using pandas:

import pandas as pd
# Read the CSV file
df = pd.read_csv("user.csv")

# Group by year and name, then sum the amounts
df_new = df.groupby(["Year", "Name"]).agg({"Amount": "sum"}).sort_values(["Year", "Name"]).reset_index()

print(df_new)

Output:

    Year    Name    Amount
0   2010    John    10
1   2011    John    20
2   2012    John    10
