Python: Splitting a single file into a separate file for each section

Posted 2024-04-24 22:10:33


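I have a .txt file containing 5 sections of data. Each section has a header line "Section X". I want to parse this file and write 5 separate files. Each section starts at its header and ends just before the next section's header. The code below creates 5 separate files; however, they are all empty.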

from itertools import cycle

filename = raw_input("Which file?: \n")

dimensionsList = ["Section 1", "Section 2",
    "Section 3", "Section 4", "Section 5"]

with open(filename+".txt", "rb") as oldfile:
    for i in dimensionsList:
        licycle = cycle(dimensionsList)
        nextelem = licycle.next()
        with open(i+".txt", "w") as newfile: 
            for line in oldfile:
                if line.strip() == i:
                    break
            for line in oldfile:
                if line.strip() == nextelem:
                    break
                newfile.write(line)

1 answer

Answer 1 · posted 2024-04-24 22:10:33

Problem

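When I tested the code, it only worked for Section 1 (the other files were empty for me too). I realized that the problem lies in the transition between sections (also, licycle is re-created on every iteration of the outer loop, so nextelem never advances).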

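The "Section 2" header line is consumed by the second for loop (at if line.strip() == nextelem). When the outer loop moves on to Section 2, the next line read from the file is already Section 2's data, not the text "Section 2", so the first for loop never finds the header it is looking for.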
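
The underlying behaviour can be seen in isolation with a small sketch. It assumes Python 3 (the code in this thread is Python 2) and uses io.StringIO as a stand-in for the opened file; the sample text is made up for illustration. A file object is a one-shot iterator, so a line read just before a break is gone for any later loop over the same object.

```python
import io

# Stand-in for the input file: io.StringIO iterates line by line
# in the same way an open text file does.
sample = io.StringIO("Section 1\naaaa\nSection 2\nccc\n")

consumed = []
for line in sample:
    consumed.append(line.strip())
    if line.strip() == "Section 2":
        break  # the header line has already been read at this point

# A second loop over the same object resumes AFTER the header, not at it.
remaining = [line.strip() for line in sample]

print(consumed)   # lines read by the first loop, header included
print(remaining)  # what is left for any later loop
```

The second list no longer contains "Section 2", which is why a later search for that header runs to the end of the file without a match.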

This is hard to put into words, so please test the following code:

from itertools import cycle

filename = raw_input("Which file?: \n")

dimensionsList = ["Section 1", "Section 2", "Section 3", "Section 4",
                  "Section 5"]

with open(filename + ".txt", "rb") as oldfile:
    licycle = cycle(dimensionsList)
    nextelem = licycle.next()
    for i in dimensionsList:
        print(nextelem)
        with open(i + ".txt", "w") as newfile:
            for line in oldfile:
                print("ignoring %s" % (line.strip()))
                if line.strip() == i:
                    nextelem = licycle.next()
                    break
            for line in oldfile:
                if line.strip() == nextelem:
                    # nextelem = licycle.next()
                    print("ignoring %s" % (line.strip()))
                    break
                print("printing %s" % (line.strip()))
                newfile.write(line)
            print('')

It prints:

Section 1
ignoring Section 1
printing aaaa
printing bbbb
ignoring Section 2

Section 2
ignoring ccc
ignoring ddd
ignoring Section 3
ignoring eee
ignoring fff
ignoring Section 4
ignoring ggg
ignoring hhh
ignoring Section 5
ignoring iii
ignoring jjj

Section 2

Section 2

Section 2

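It works for Section 1 and it detects the "Section 2" header, but after that it keeps ignoring lines, because the header has already been read and it can never find "Section 2" again.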

If reading restarted from line 1 of the file for each section, I think the program would work. But I wrote a simpler version that should work for you.
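
For reference, the "restart from line 1" idea could look roughly like the sketch below. It assumes Python 3 and text mode, and the function name split_with_rewind and its parameters are hypothetical, not part of the original code. Like the code in this thread, it does not copy the header line itself into the output file. It rescans the file once per section, so it does more reading than a single-pass approach.

```python
import os


def split_with_rewind(path, headers, out_dir="."):
    """Write one file per header, rescanning the input from the top each time."""
    header_set = set(headers)
    with open(path, "r") as oldfile:
        for header in headers:
            oldfile.seek(0)  # restart from line 1 for every section
            out_path = os.path.join(out_dir, header + ".txt")
            with open(out_path, "w") as newfile:
                # Skip everything up to and including this section's header.
                for line in oldfile:
                    if line.strip() == header:
                        break
                # Copy lines until any other section header or end of file.
                for line in oldfile:
                    if line.strip() in header_set:
                        break
                    newfile.write(line)
```

Because every section begins its search at the top of the file, it no longer matters that the previous section's loop consumed the next header.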

Solution

from itertools import cycle

filename = raw_input("Which file?: \n")

dimensionsList = ["Section 1", "Section 2", "Section 3", "Section 4",
                  "Section 5"]

with open(filename + ".txt", "rb") as oldfile:

    licycle = cycle(dimensionsList)
    nextelem = licycle.next()
    newfile = None
    line = oldfile.readline()

    while line:

        # Case 1: Found new section
        if line.strip() == nextelem:
            if newfile is not None:
                newfile.close()
            nextelem = licycle.next()
            newfile = open(line.strip() + '.txt', 'w')

        # Case 2: Print line to current section
        elif newfile is not None:
            newfile.write(line)

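        line = oldfile.readline()

    # Close the last section's file, which the loop above never closes
    if newfile is not None:
        newfile.close()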

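If the line is the next expected section header, the code closes the current output file and starts writing to a new one. Otherwise, it keeps writing to the current file.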
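
The code in this thread is Python 2 (raw_input, licycle.next(), and comparing lines read in "rb" mode against str). A possible Python 3 version of the same single-pass idea is sketched below; the function name split_sections and its parameters are hypothetical, and the file is opened in text mode so that lines compare equal to the header strings.

```python
from itertools import cycle
import os


def split_sections(path, headers, out_dir="."):
    """Single pass: switch output files whenever the next expected header appears."""
    expected = cycle(headers)
    nextelem = next(expected)  # Python 3 spelling of licycle.next()
    newfile = None
    try:
        with open(path, "r") as oldfile:
            for line in oldfile:
                if line.strip() == nextelem:
                    # Case 1: found the next section header
                    if newfile is not None:
                        newfile.close()
                    newfile = open(os.path.join(out_dir, nextelem + ".txt"), "w")
                    nextelem = next(expected)
                elif newfile is not None:
                    # Case 2: write the line to the current section's file
                    newfile.write(line)
    finally:
        # Make sure the last section's file is closed as well
        if newfile is not None:
            newfile.close()
```

It could be called as split_sections("data.txt", ["Section 1", "Section 2", "Section 3", "Section 4", "Section 5"]), where data.txt is a placeholder file name. As in the original, header lines are not copied into the output files.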

P.S.: Here is the sample file I used:

Section 1
aaaa
bbbb
Section 2
ccc
ddd
Section 3
eee
fff
Section 4
ggg
hhh
Section 5
iii
jjj
