Python: DictReader returns a list of dictionaries?

-1 votes
3 answers
10001 views
Asked 2025-04-17 19:58

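I am using csv.DictReader() to read a file. It actually returns a list of dictionaries rather than a single dictionary. How can I force it to return a single dictionary, or how can I merge the list of dictionaries it returns?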

def agilent_e8361c_pna_read(file_loc):
    '''
    Load the '.s2p' file into a dictionary.
    '''

    with open(file_loc) as f:
        # define the fields in the Agilent '.s2p' file
        col_names = ["f","s11","arg_s11","s21","arg_s21","s12","arg_s12","s22","arg_s22"]

        # read the data into a dictionary
        s2p_dicts = csv.DictReader(itertools.ifilter(n_input.is_comment, f), fieldnames=col_names, delimiter=' ')

    return s2p_dicts

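Ideally, the data would be read into a single dictionary from the start, with no merging needed. The data is one set of related information: the columns are linked to each other, and they are meaningless without the complete data set or a coherent subset of it. If DictReader cannot do this, my only option is to merge the list of dictionaries. I would expect scientists and programmers working with data sets to want this, so it does not seem like an unusual requirement.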

3 Answers

1

OK, here is a very elegant solution for anyone who runs into this problem.

def agilent_e8361c_pna_read(file_loc):
    '''
    Load the '.s2p' file into a dictionary.
    '''

    with open(file_loc) as f:
        # read the data into a dictionary
        rows = csv.reader(itertools.ifilter(n_input.is_comment, f), delimiter=' ')

        # transpose data
        cols = transpose(rows)

        # create a dictionary with intuitive key names
        col_names = ["f","s11","arg_s11","s21","arg_s21","s12","arg_s12","s22","arg_s22"]
        s2p_dict = dict(zip(col_names,cols))

    return s2p_dict

def transpose(l):
    return map(list, zip(*l))
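
For readers on Python 3, where `map` returns a lazy iterator rather than a list, the transpose step might look like the sketch below. The sample rows and the shortened column list are made up for illustration, and note that `zip` silently truncates to the shortest row.

```python
def transpose(rows):
    """Turn a sequence of equal-length rows into a list of columns."""
    return [list(col) for col in zip(*rows)]

# Made-up sample rows: frequency, |s11|, arg(s11)
rows = [["1e9", "0.5", "10"], ["2e9", "0.6", "12"]]
col_names = ["f", "s11", "arg_s11"]

# Pair each column name with its column of values
s2p_dict = dict(zip(col_names, transpose(rows)))
```
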

3

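DictReader is a tool that converts each row returned by a plain csv.reader() into a dictionary. The keys of that dictionary come from the field names you supply, or from the first row of the file if you supply none. That is what it was designed to do.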
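
A minimal Python 3 sketch of that behaviour, using an in-memory file with made-up values and a shortened column list, might look like this:

```python
import csv
import io

# Made-up two-line sample standing in for the data part of the file
sample = io.StringIO("1e9 0.5 10\n2e9 0.6 12\n")

reader = csv.DictReader(sample, fieldnames=["f", "s11", "arg_s11"], delimiter=" ")
rows = list(reader)  # one dict per input line, values kept as strings
```

Here `rows` holds two dictionaries, one for each line of input.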

If your input file has only one row of data, you can get that row by calling next():

def agilent_e8361c_pna_read(file_loc):
    with open(file_loc) as f:
        col_names = ["f","s11","arg_s11","s21","arg_s21","s12","arg_s12","s22","arg_s22"]

        reader = csv.DictReader(itertools.ifilter(n_input.is_comment, f), fieldnames=col_names, delimiter=' ')
        return next(reader)

Note that the next() call has to be inside the with block; otherwise the file is closed before you can read from it.
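
To illustrate why the read has to happen while the file is still open, here is a small Python 3 sketch. The function name `read_after_close` is hypothetical, and an in-memory `io.StringIO` stands in for the real file.

```python
import csv
import io

def read_after_close(text):
    """Create the reader inside the with block but read from it afterwards."""
    with io.StringIO(text) as f:
        reader = csv.DictReader(f, fieldnames=["f", "s11"], delimiter=" ")
    # The with block has closed f, and DictReader reads lazily,
    # so the first read only happens here and fails.
    try:
        return next(reader)
    except ValueError as exc:
        return "failed: %s" % exc

outcome = read_after_close("1e9 0.5\n")
```
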

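If you want to merge multiple rows into one dictionary, you need to say how the data should be merged. It is easy to collect the values for each key into a list, row by row: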

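import csv
import itertools

import n_input  # the asker's own helper module (not shown in the question)

def agilent_e8361c_pna_read(file_loc):
    with open(file_loc) as f:
        col_names = ["f","s11","arg_s11","s21","arg_s21","s12","arg_s12","s22","arg_s22"]
        # start with one empty list per column
        result = {k: [] for k in col_names}

        # csv.reader() takes no fieldnames argument; that belongs to DictReader
        reader = csv.reader(itertools.ifilter(n_input.is_comment, f), delimiter=' ')
        for row in reader:
            # append each value to the list for its column
            for k, v in zip(col_names, row):
                result[k].append(v)

        return result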

In this case we do not need DictReader at all, because we are not building a dictionary for each row.
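
A self-contained Python 3 sketch of the same column-wise read follows, with the built-in `filter` replacing `itertools.ifilter`. The predicate `is_data_line` is a hypothetical stand-in for the asker's `n_input.is_comment`, which is not shown; it assumes that lines starting with `!` or `#` are not data. Converting values to `float` and passing `skipinitialspace=True` are additions for illustration, and the sample values are made up.

```python
import csv
import io

COL_NAMES = ["f", "s11", "arg_s11", "s21", "arg_s21",
             "s12", "arg_s12", "s22", "arg_s22"]

def is_data_line(line):
    """Hypothetical filter: keep non-empty lines that do not start with '!' or '#'."""
    stripped = line.strip()
    return bool(stripped) and not stripped.startswith(("!", "#"))

def read_columns(lines, col_names=COL_NAMES):
    """Return a dict mapping each column name to a list of floats."""
    result = {name: [] for name in col_names}
    reader = csv.reader(filter(is_data_line, lines),
                        delimiter=" ", skipinitialspace=True)
    for row in reader:
        for name, value in zip(col_names, row):
            result[name].append(float(value))
    return result

# Made-up sample: two header lines followed by two data lines
sample = io.StringIO(
    "! example comment\n"
    "# Hz S MA R 50\n"
    "1e9 0.5 10 0.9 -20 0.9 -20 0.4 15\n"
    "2e9 0.6 12 0.8 -25 0.8 -25 0.3 18\n"
)
columns = read_columns(sample)
```

With a real file, `read_columns` would be called inside the `with open(...)` block so that the rows are consumed before the file closes.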

4

If you want a dictionary of the form key: list of values, you can do this:

def transposeDict(listOfDicts):
    """Turn a list of dicts into a dict of lists.  Assumes all dicts in the list have the exact same keys."""

    keys = listOfDicts[0].iterkeys()
    return dict((key, [d[key] for d in listOfDicts]) for key in keys)

Alternatively, if you are using Python 2.7 or newer:

def transposeDict(listOfDicts):
    """Turn a list of dicts into a dict of lists.  Assumes all dicts in the list have the exact same keys."""

    keys = listOfDicts[0].iterkeys()
    return {key: [d[key] for d in listOfDicts] for key in keys}

Of course, this assumes that all the dictionaries in the list have exactly the same keys, which is the case when they come from DictReader.
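
On Python 3, `dict.iterkeys()` no longer exists, but iterating over a dictionary yields its keys directly. A sketch under that assumption, with the hypothetical name `transpose_dicts` and made-up sample data fed through DictReader:

```python
import csv
import io

def transpose_dicts(list_of_dicts):
    """Dict of lists from a non-empty list of dicts that all share the same keys."""
    keys = list(list_of_dicts[0])  # iterating a dict yields its keys
    return {key: [d[key] for d in list_of_dicts] for key in keys}

# Made-up sample with a shortened column list
sample = io.StringIO("1e9 0.5\n2e9 0.6\n")
reader = csv.DictReader(sample, fieldnames=["f", "s11"], delimiter=" ")
columns = transpose_dicts(list(reader))
```
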

In the general case, where that does not hold, you need to do something like:

from collections import defaultdict

def transposeListOfDicts(listOfDicts):
    """Turn a list of dict into a dict of lists"""

    result = defaultdict(list)

    for d in listOfDicts:
        for key, value in d.iteritems():
            result[key].append(value)

    return result
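
A Python 3 sketch of this approach (using `items()` in place of `iteritems()`) shows one consequence worth knowing: when a key is missing from some dictionaries, the resulting lists have different lengths, so values no longer line up by row. The function name and the sample data are made up for illustration.

```python
from collections import defaultdict

def transpose_ragged(list_of_dicts):
    """Dict of lists from dicts that may not share the same keys."""
    result = defaultdict(list)
    for d in list_of_dicts:
        for key, value in d.items():
            result[key].append(value)
    return result

# The second dict has no "s11" entry
ragged = [{"f": 1, "s11": 0.5}, {"f": 2}]
columns = transpose_ragged(ragged)
# columns["f"] has two entries but columns["s11"] has only one
```
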

If you want placeholders for missing values, you can write:

def transposeListOfDicts(listOfDicts):
    keys = set()
    for d in listOfDicts:
        keys.update(d.iterkeys())

    return {key: [d.get(key, None) for d in listOfDicts] for key in keys}
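
A Python 3 sketch of the placeholder variant follows. Collecting the keys in a list instead of a set keeps them in first-seen order, which is a design choice made here rather than something the answer requires, and the `placeholder` parameter is a hypothetical addition.

```python
def transpose_with_placeholders(list_of_dicts, placeholder=None):
    """Dict of equal-length lists, filling missing values with a placeholder."""
    keys = []
    for d in list_of_dicts:
        for key in d:
            if key not in keys:
                keys.append(key)  # remember keys in first-seen order
    return {key: [d.get(key, placeholder) for d in list_of_dicts] for key in keys}

# Made-up sample: the second dict has no "s11" entry
ragged = [{"f": 1, "s11": 0.5}, {"f": 2}]
columns = transpose_with_placeholders(ragged)
```

Every list now has one entry per input dictionary, so index `i` in each list refers to the same row.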
