Parsing a txt file into blocks

Posted 2024-06-16 10:45:34


I have a txt file with the following structure:

start
id=1
date=21.05.2018
summ=500
end

start
id=7
date=23.05.2018
summ=500
owner=guest
end

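I need to parse it into a list of dictionaries (str: str, even for values that look like an int or a date: convert them to strings), i.e. split the file into blocks on start/end and then split each line on the = sign. The number of lines between start and end can vary. I tried something like this: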

d ={}
arr = []
ind = 0
for line in plines:
    ind = ind + 1
    if 'startpayment' in line:
        print('ind = ' + str(ind))
        for i in range(ind, len(plines)):
            print(i)
            key, value = plines[i].strip().split('=')
            if type(value) == 'str':
                d[key] = str(value)
            elif type(value) == 'int':
                 d[key] = int(value)
            arr.append(d)
            if 'endpayment' in line:
                break

Can anyone help me? Thanks.


3 answers

Use a regex.

import re

filename = "data.txt"   # placeholder path; point this at your own file

with open(filename, "r") as infile:
    data = infile.read()

# Capture everything between each "start" and the next "end"
blocks = re.findall(r"(?<=\bstart\b).*?(?=\bend\b)", data, flags=re.DOTALL)

r = []
for block in blocks:
    d = {}
    for line in filter(None, block.split("\n")):   # drop empty lines
        key, value = line.split("=", 1)            # split on the first "=" only
        d[key] = value
    r.append(d)                                    # one dict per block
print(r)

Output:

[{'date': '21.05.2018', 'summ': '500', 'id': '1'}, {'date': '23.05.2018', 'owner': 'guest', 'summ': '500', 'id': '7'}]
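
One caveat with the pattern above: `\bend\b` also matches the word `end` inside a value (for example a line such as `owner=end`), which would cut a block short. A minimal sketch that anchors `start` and `end` to whole lines with `re.MULTILINE`. The names `BLOCK_RE` and `parse_blocks` and the sample text are assumptions made for illustration, not part of the original answer:

```python
import re

# Hypothetical helper: "start" and "end" must each occupy a whole line,
# so a value such as "owner=end" does not terminate the block early.
BLOCK_RE = re.compile(r"^start[ \t]*\n(.*?)^end[ \t]*$",
                      flags=re.DOTALL | re.MULTILINE)

def parse_blocks(text):
    result = []
    for match in BLOCK_RE.finditer(text):
        block = {}
        for line in match.group(1).splitlines():
            line = line.strip()
            if not line:
                continue                     # skip blank lines
            key, value = line.split("=", 1)  # split on the first "=" only
            block[key] = value               # values stay strings
        result.append(block)
    return result

sample = "start\nid=1\ndate=21.05.2018\nsumm=500\nend\n\nstart\nid=7\nowner=end\nend\n"
print(parse_blocks(sample))
```

This sketch assumes every line inside a block contains an `=`; a line without one would raise a `ValueError` when unpacking.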

You can also do it this way:

from itertools import takewhile

with open('data.txt') as in_file:
    items = [line.strip() for line in in_file.read().split()]
    # ['start', 'id=1', 'date=21.05.2018', 'summ=500', 'end', 'start', 'id=7', 'date=23.05.2018', 'summ=500', 'owner=guest', 'end']

    pos = [i for i, item in enumerate(items) if item == 'start']
    # [0, 5]

    blocks = [list(takewhile(lambda x: x != 'end', items[i+1:])) for i in pos]
    # [['id=1', 'date=21.05.2018', 'summ=500'], ['id=7', 'date=23.05.2018', 'summ=500', 'owner=guest']]

    print([dict(x.split('=') for x in block) for block in blocks])

Output:

[{'id': '1', 'date': '21.05.2018', 'summ': '500'}, {'id': '7', 'date': '23.05.2018', 'summ': '500', 'owner': 'guest'}]

If I understood your question correctly, this is the simplest algorithm I can think of.

# plines: the list of lines read from the file, as in the question
arr = []
d = {}

for line in plines:
    line = line.strip()
    if not line:
        continue            # skip blank lines between blocks
    if line == 'start':
        d = {}              # start a fresh dict for each block
    elif line == 'end':
        arr.append(d)       # block finished, store it
    else:
        key, value = line.split('=', 1)
        d[key] = value      # values stay strings, as the question asks
print(arr)

Output: [{'id': '1', 'date': '21.05.2018', 'summ': '500'}, {'id': '7', 'date': '23.05.2018', 'summ': '500', 'owner': 'guest'}]
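
The loop above can also be wrapped in a generator that accepts any iterable of lines and reports malformed input instead of silently producing wrong dictionaries. A minimal sketch: the name `iter_blocks` and the choice to raise `ValueError` are assumptions, since the question does not say how bad input should be handled.

```python
def iter_blocks(lines):
    """Yield one dict per start/end block; keys and values stay strings."""
    current = None                      # None means "not inside a block"
    for number, raw in enumerate(lines, start=1):
        line = raw.strip()
        if not line:
            continue                    # ignore blank lines
        if line == "start":
            current = {}                # open a new block
        elif line == "end":
            if current is None:
                raise ValueError(f"line {number}: 'end' without 'start'")
            yield current
            current = None              # close the block
        elif current is not None and "=" in line:
            key, value = line.split("=", 1)
            current[key] = value
        else:
            raise ValueError(f"line {number}: unexpected content {line!r}")

text = """start
id=1
date=21.05.2018
summ=500
end

start
id=7
date=23.05.2018
summ=500
owner=guest
end
"""
print(list(iter_blocks(text.splitlines())))
```

Because a file object is itself an iterable of lines, the same function should work as `list(iter_blocks(in_file))` inside a `with open('data.txt') as in_file:` block.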
