Python: Parsing a structured CSV file into a dict

Posted 2024-06-16 12:22:25


Is there a way to parse a CSV file with a multi-line record structure like this:

H3|509596|OUT|1653128|06/11/2018|
D1|1653128|1|390|MXT586|EA|EA|55.600|219.99|Product 1
D2|1653128|1|900|390|
T1|1653128|999|1000.000|
H3|509597|OUT|1653128|06/11/2018|
D1|1653128|1|390|MXT586|EA|EA|55.600|219.99|Product 2
D2|1653128|1|900|390|
D2|1653128|2|600|430|
T1|1653128|999|2164.000|
  • H3 = header (1-n times per file)
  • D1 = line item (1-999 times)
  • D2 = sub-item of a D1 line (0-999 times)
  • T1 = trailer (1-n times per file)

I would like to read the content and parse it into a list of dicts, like this:

List of Dict: 
 [ (
     Header : (509596, 'OUT', 1653128, '06/11/2018')
     Items  : [ (1653128, 1, 390, 'MXT586', 'EA', 'EA', ....,
                  (1, 900, 390) ) ]
     Trailer: (1653128, 999, 1000)
    ), ...
 ]

1 answer
User
#1 · Posted 2024-06-16 12:22:25

Python's csv library can read this file if you specify | as the delimiter. Take care to drop the empty trailing entries, because some lines end with a |.
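To see why the trailing entries matter, here is a small sketch that feeds two of the question's sample lines to `csv.reader` from an in-memory string (the use of `io.StringIO` and the names `sample`, `rows`, and `cleaned` are only for illustration):

```python
import csv
import io

# Two sample lines from the question; both end with a trailing '|'.
sample = "H3|509596|OUT|1653128|06/11/2018|\nT1|1653128|999|1000.000|\n"

rows = list(csv.reader(io.StringIO(sample), delimiter='|'))

# The trailing '|' yields an empty string as the last field of each row.
print(rows[0])

# Keeping only non-empty fields removes it.
cleaned = [[field for field in row if field] for row in rows]
print(cleaned[0])
```

Note that filtering on truthiness also drops empty fields in the middle of a row. If field positions matter, it may be safer to strip only the final empty entry.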

import csv

def get_int(v):
    # Convert the value to an integer where possible
    try:
        return int(v)
    except ValueError:
        return v    # Not an integer: keep the original string

def filter_na(row):
    # Drop the record-type tag and any empty fields (e.g. from a trailing '|')
    return tuple(get_int(v) for v in row[1:] if v)

data = []

with open('input.csv', newline='') as f_input:
    csv_input = csv.reader(f_input, delimiter='|')
    block = {}

    for row in csv_input:
        if not row:
            continue    # Skip blank lines
        if row[0] == 'H3':
            # Start a fresh block for every header so blocks do not share state
            block = {'Header': filter_na(row), 'Items': []}
        elif row[0].startswith('D'):
            block.setdefault('Items', []).append(filter_na(row))
        elif row[0] == 'T1':
            block['Trailer'] = filter_na(row)
            data.append(block)

print(data)

This gives you a list of dictionaries like the following (reformatted here for readability):

[
    {
        'Header': (509596, 'OUT', 1653128, '06/11/2018'),
        'Items': [
            (1653128, 1, 390, 'MXT586', 'EA', 'EA', '55.600', '219.99', 'Product 1'),
            (1653128, 1, 900, 390)
        ],
        'Trailer': (1653128, 999, '1000.000')
    },
    {
        'Header': (509597, 'OUT', 1653128, '06/11/2018'),
        'Items': [
            (1653128, 1, 390, 'MXT586', 'EA', 'EA', '55.600', '219.99', 'Product 2'),
            (1653128, 1, 900, 390),
            (1653128, 2, 600, 430)
        ],
        'Trailer': (1653128, 999, '2164.000')
    }
]
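The question's desired output nests each D2 sub-item under its D1 line, whereas the result above keeps all D rows in one flat list. One possible way to get the nesting is sketched below. It is not part of the original answer: the function names `parse_blocks` and `fields`, the `'Item'` and `'SubItems'` keys, and the choice of dicts rather than nested tuples are all assumptions made for illustration, and the sample data is read from a string instead of `input.csv`.

```python
import csv
import io

SAMPLE = """\
H3|509596|OUT|1653128|06/11/2018|
D1|1653128|1|390|MXT586|EA|EA|55.600|219.99|Product 1
D2|1653128|1|900|390|
T1|1653128|999|1000.000|
H3|509597|OUT|1653128|06/11/2018|
D1|1653128|1|390|MXT586|EA|EA|55.600|219.99|Product 2
D2|1653128|1|900|390|
D2|1653128|2|600|430|
T1|1653128|999|2164.000|
"""

def get_int(v):
    # Convert to int where possible, otherwise keep the string
    try:
        return int(v)
    except ValueError:
        return v

def fields(row):
    # Drop the record-type tag and any empty fields
    return tuple(get_int(v) for v in row[1:] if v)

def parse_blocks(lines):
    """Group rows into blocks, attaching each D2 row to the preceding D1 row."""
    blocks = []
    block = None
    for row in csv.reader(lines, delimiter='|'):
        if not row:
            continue    # Skip blank lines
        tag = row[0]
        if tag == 'H3':
            block = {'Header': fields(row), 'Items': []}
        elif block is None:
            raise ValueError(f'{tag} row found before any H3 header')
        elif tag == 'D1':
            block['Items'].append({'Item': fields(row), 'SubItems': []})
        elif tag == 'D2':
            if not block['Items']:
                raise ValueError('D2 row found before any D1 row')
            block['Items'][-1]['SubItems'].append(fields(row))
        elif tag == 'T1':
            block['Trailer'] = fields(row)
            blocks.append(block)
            block = None    # The next block must start with a new H3
    return blocks

blocks = parse_blocks(io.StringIO(SAMPLE))
print(blocks[1]['Items'][0]['SubItems'])
```

Resetting `block` to `None` after each trailer means a stray D or T row outside a header/trailer pair raises an error instead of being silently merged into the previous block.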
