Splitting a text file after a specific line in Python

with open('fort.16', 'r') as infile, open('output_fort.16', 'w') as outfile:
    copy = False
    for line in infile:
        if line.strip() == '# legend':
            copy = True
            continue
        elif line.strip() == 'End':
            copy = False
        elif copy:
            outfile.write(line)

2 answers

User

Answer 1 · edited 2024-04-20 04:31:18

temp = []

with open("random.txt") as fp:
    for i, line in enumerate(fp):
        if line.strip() == "END":
            # Flush the buffered lines to a file named after the index of this END line
            with open("file" + str(i) + ".txt", "w") as new_file:
                for buffered in temp:
                    new_file.write(buffered + "\n")
            temp = []
            continue
        temp.append(line.strip())

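This one creates a new file each time it reaches an "END" line. The file name is "file" followed by the index of the line where "END" was found. :)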
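If consecutive file numbers are preferable to line indexes, the same buffering idea can be wrapped in a function. This is only a sketch of a variation, not the answer's own code: the name `split_on_marker`, the `out_dir` parameter, and the sample lines are assumptions made for illustration.

```python
import os
import tempfile


def split_on_marker(lines, out_dir, marker="END"):
    """Write each run of lines that precedes `marker` to its own numbered file."""
    paths = []
    buffered = []
    for line in lines:
        if line.strip() == marker:
            # Number files by how many have been written so far: file0.txt, file1.txt, ...
            path = os.path.join(out_dir, "file{}.txt".format(len(paths)))
            with open(path, "w") as out:
                out.writelines(text + "\n" for text in buffered)
            paths.append(path)
            buffered = []
            continue
        buffered.append(line.strip())
    return paths


# Made-up sample input; any iterable of lines (such as an open file) works
sample = ["a\n", "b\n", "END\n", "c\n", "END\n"]
out_dir = tempfile.mkdtemp()
paths = split_on_marker(sample, out_dir)
print([os.path.basename(p) for p in paths])  # ['file0.txt', 'file1.txt']
```

As in the answer above, lines that follow the last "END" are never written, because nothing triggers a final flush.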

User

Answer 2 · edited 2024-04-20 04:31:18

I solved this with nested generators:

import re

SECTION_START = re.compile(r'^\s*theta\s+sigma\s*$')
SECTION_END = re.compile(r'^\s*END\s*$')

def fresco_iter(stream):
    def inner(stream):
        # Yields each line until an end marker is found (or EOF)
        for line in stream:
            if line and not SECTION_END.match(line):
                yield line
                continue
            break

    # Find a start marker, then break off into a nested iterator
    for line in stream:
        if line:
            if SECTION_START.match(line):
                yield inner(stream)
            continue
        break

The fresco_iter function returns a generator that you can loop over. It yields one nested generator for each section of theta sigma pairs:

>>> with open('fort.16', 'r') as fh:
...     print(list(fresco_iter(fh)))
[<generator object fresco_iter.<locals>.inner at 0x7fbc6da15678>,
 <generator object fresco_iter.<locals>.inner at 0x7fbc6da15570>]

To make use of this, you write your own nested loop to consume the nested generators:

filename = 'fort.16'

with open(filename, 'r') as fh:
    for nested_iter in fresco_iter(fh):
        print(' - start')
        for line in nested_iter:
            print(line.rstrip())
        print(' - end')

This prints:

 - start
1        0.1
2        0.1
3        0.2
 - end
 - start
1        0.3
2        0.2
 - end

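This strategy only ever holds one line of the input file in memory at a time, so it works for files of any size, even on the smallest devices... because generators are great.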
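The one-line-at-a-time behaviour comes from the outer and inner generators sharing a single stream, which also means each nested generator has to be consumed before asking for the next one. The sketch below shows both points on made-up data held in an `io.StringIO`. The sample text is an assumption, and `fresco_iter` is restated in a slightly condensed form so the snippet runs on its own.

```python
import io
import re

SECTION_START = re.compile(r'^\s*theta\s+sigma\s*$')
SECTION_END = re.compile(r'^\s*END\s*$')


def fresco_iter(stream):
    def inner(stream):
        # Yield lines from the shared stream until an end marker (or EOF)
        for line in stream:
            if SECTION_END.match(line):
                break
            yield line

    # Scan the same stream for start markers
    for line in stream:
        if SECTION_START.match(line):
            yield inner(stream)


# Made-up sample standing in for fort.16
sample = io.StringIO(
    "header\n"
    "theta sigma\n"
    "1        0.1\n"
    "2        0.1\n"
    "END\n"
    "noise\n"
    "theta sigma\n"
    "1        0.3\n"
    "END\n"
)

# Consuming each section as it arrives gives the expected rows
sections = [[line.split() for line in section] for section in fresco_iter(sample)]
print(sections)  # [[['1', '0.1'], ['2', '0.1']], [['1', '0.3']]]

# Collecting the outer generator first drains the stream, leaving every section empty
sample.seek(0)
drained = [list(section) for section in list(fresco_iter(sample))]
print(drained)  # [[], []]
```

The second result is why the nested loop pattern matters: the generators hold no data of their own, only a position in the shared file.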

So, going all the way... splitting the output into individual files:

with open(filename, 'r') as fh_in:
    for (i, nested_iter) in enumerate(fresco_iter(fh_in)):
        with open('{}.part-{:04d}'.format(filename, i), 'w') as fh_out:
            for line in nested_iter:
                fh_out.write(line)

This writes just the numbers to separate files named fort.16.part-0000 and fort.16.part-0001.

I hope this helps. Happy coding!
