How do I compare information in a CSV file with Python?

Posted 2024-03-28 15:21:26


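I am working with a CSV file. It has several columns, each holding one piece of information about my dataset. Suppose each row of the file contains:

  • name, info1, info2, info3

For rows that share the same name, info1, and info2, I need to compute the mean of info3.

This is the code where I got stuck: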

col_a=[row[1] for row in file]
for i in col_a:
    currentrow=col_a[1]
    nextrow=col_a[2]
for i in range(0,len(col_a)):
    if (currentrow)==set(nextrow):???

I started programming only a few months ago, so please bear with me.


1 answer
User
Answer 1 · posted 2024-03-28 15:21:26

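I am still not sure exactly what is needed, but here goes. The script below first groups consecutive rows whose first column matches into blocks, for example the 3 "aaa" rows.

Within each block, it finds the rows whose columns of interest match. If two or more such rows are found, it calculates the mean.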

import collections

file = [
    ["aaa", "3", "x", "g", "b", 4],
    ["aaa", "4", "e", "r", "t", 3],
    ["aaa", "3", "x", "g", "b", 7],
    ["vv1", "5", "w", "a", "s", 42],
    ["vv2", "5", "w", "a", "s", 10],
    ["vvv", "5", "w", "a", "s", 1],
    ["vvv", "5", "w", "a", "s", 4],
    ["vvv", "5", "w", "a", "s", 3]]

def calculate_stats(block):
    d = collections.defaultdict(list)
    # Build a dictionary of rows with matching columns
    for cols in block:
        key = (cols[2], cols[3])        # columns to match e.g. "x" and "g"
        d[key].append(cols)

    for key, rows in d.items():
        if len(rows) > 1:
            col_f = [cols[5] for cols in rows]      # calculate mean on col f
            print("Matched:", rows, "Mean:", sum(col_f) / len(col_f))

last_row = file[0]
block = []

for cols in file:
    if cols[0] == last_row[0]:
        block.append(cols)
    else:
        # First column changed: process the finished block, then start
        # a new block with the current row so that it is not dropped
        if len(block) > 1:
            calculate_stats(block)
        block = [cols]
    last_row = cols

# Deal with any remainder

if len(block) > 1:
    calculate_stats(block)

For the sample data I used, this displays the following result (Python 3):

Matched: [['aaa', '3', 'x', 'g', 'b', 4], ['aaa', '3', 'x', 'g', 'b', 7]] Mean: 5.5
Matched: [['vvv', '5', 'w', 'a', 's', 1], ['vvv', '5', 'w', 'a', 's', 4], ['vvv', '5', 'w', 'a', 's', 3]] Mean: 2.6666666666666665
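The script above works on a hard-coded list and only groups rows that are next to each other. If the data really comes from a CSV file with the four columns described in the question, a possible alternative is to key a dictionary directly on (name, info1, info2). The following is only a minimal sketch: the header names `name`, `info1`, `info2`, `info3`, the sample data, and the helper `mean_by_key` are assumptions made for illustration, not something given in the question.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample data; the header names are assumed for illustration.
SAMPLE_CSV = """name,info1,info2,info3
aaa,3,x,4
aaa,4,e,3
aaa,3,x,7
vvv,5,w,1
vvv,5,w,4
vvv,5,w,3
"""

def mean_by_key(lines):
    """Return {(name, info1, info2): mean of info3} for keys seen more than once."""
    groups = defaultdict(list)
    for row in csv.DictReader(lines):
        key = (row["name"], row["info1"], row["info2"])
        groups[key].append(float(row["info3"]))   # assumes info3 is numeric
    return {key: sum(vals) / len(vals)
            for key, vals in groups.items() if len(vals) > 1}

means = mean_by_key(io.StringIO(SAMPLE_CSV))
for key, mean in means.items():
    print(key, mean)
```

With a real file, the same function could be given an open file object instead of the `io.StringIO` wrapper, for example `open("data.csv", newline="")`, where `data.csv` is a placeholder name. Because the rows are collected in a dictionary, matching rows do not need to be adjacent in the file.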
