Python: How do I split a .txt file into two or more files with the same number of lines in each?

# Does not load the file into memory
with open('bigdata.txt', 'r') as r:
    with open('fhalf', 'w') as f:
        for line in r:
            if line == 'pattern\n':
                # Splits the file when there is an occurrence of the pattern.
                # But the occurrence, as you may notice, won't be included in either
                # of the two files, which is not good since I need all the data.
                break
            f.write(line)
    with open('shalf.txt', 'w') as f:
        for line in r:
            f.write(line)

1 answer

User

#1 · Posted on 2024-06-10 09:25:46

Read all the lines into a list with .readlines(), then work out how many lines each file should get, and then start writing!

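num_files = 2
with open('bigdata.txt') as in_file:
    lines = in_file.readlines()
    # Integer division: if the line count is not a multiple of num_files,
    # the leftover lines at the end are not written to any file.
    lines_per_file = len(lines) // num_files
    for n in range(num_files):
        with open('file{}.txt'.format(n+1), 'w') as out_file:
            # Write this file's slice of the list.
            for i in range(n * lines_per_file, (n+1) * lines_per_file):
                out_file.write(lines[i])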

And a full test:

$ cat bigdata.txt 
line1
line2
line3
line4
line5
line6
$ python -q
>>> num_files = 2
>>> with open('bigdata.txt') as in_file:
...     lines = in_file.readlines()
...     lines_per_file = len(lines) // num_files
...     for n in range(num_files):
...         with open('file{}.txt'.format(n+1), 'w') as out_file:
...             for i in range(n * lines_per_file, (n+1) * lines_per_file):
...                 out_file.write(lines[i])
... 
>>> 
$ more file*
::::::::::::::
file1.txt
::::::::::::::
line1
line2
line3
::::::::::::::
file2.txt
::::::::::::::
line4
line5
line6
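
The test above uses six lines, which divide evenly between two files. When the line count is not a multiple of num_files, the integer division silently drops the trailing lines. Below is a minimal sketch of one way to keep every line, assuming it is acceptable for the first few files to be one line longer than the rest. The helper names split_lines_evenly and write_chunks are hypothetical and not part of the original answer.

```python
# Hypothetical helpers (not from the original answer): split a list of lines
# into num_files chunks whose sizes differ by at most one line.

def split_lines_evenly(lines, num_files):
    """Return num_files lists that together contain every line, in order."""
    base, extra = divmod(len(lines), num_files)
    chunks = []
    start = 0
    for n in range(num_files):
        # The first `extra` chunks each take one additional line.
        size = base + (1 if n < extra else 0)
        chunks.append(lines[start:start + size])
        start += size
    return chunks


def write_chunks(chunks, name_template='file{}.txt'):
    """Write each chunk to its own file: file1.txt, file2.txt, ..."""
    for n, chunk in enumerate(chunks, start=1):
        with open(name_template.format(n), 'w') as out_file:
            out_file.writelines(chunk)
```

With seven lines and num_files = 2, split_lines_evenly returns a chunk of four lines followed by a chunk of three, so nothing is lost. Passing the result of in_file.readlines() to split_lines_evenly and then to write_chunks mirrors the loop in the answer.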

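If bigdata.txt can't be read into memory, the .readlines() solution won't cut it.

You'll have to write the lines as you read them, which is no big deal.

As for working out the length in the first place, this question discusses a few methods; my favorite is Kyle's sum() method.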

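num_files = 2
# Count the lines by streaming through the file once, closing it afterwards.
with open('bigdata.txt') as in_file:
    num_lines = sum(1 for line in in_file)
lines_per_file = num_lines // num_files
with open('bigdata.txt') as in_file:
    for n in range(num_files):
        with open('file{}.txt'.format(n+1), 'w') as out_file:
            # Copy the next lines_per_file lines; any remainder lines are left unwritten.
            for _ in range(lines_per_file):
                out_file.write(in_file.readline())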
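
The read-as-you-write idea can also be sketched with itertools.islice, which takes a fixed number of lines from the open file without building a list. This is only an illustration under stated assumptions: the function names count_lines and split_file and the name_template parameter are hypothetical, and leftover lines are handed to the first files instead of being dropped.

```python
from itertools import islice


def count_lines(path):
    """Count lines by streaming through the file once."""
    with open(path) as f:
        return sum(1 for _ in f)


def split_file(path, num_files, name_template='file{}.txt'):
    """Hypothetical helper: split `path` into num_files files without
    loading it into memory. Returns the number of lines written to each."""
    base, extra = divmod(count_lines(path), num_files)
    # The first `extra` files get one additional line each.
    sizes = [base + (1 if n < extra else 0) for n in range(num_files)]
    with open(path) as in_file:
        for n, size in enumerate(sizes, start=1):
            with open(name_template.format(n), 'w') as out_file:
                # islice pulls exactly `size` lines from the shared file
                # iterator, so the next file continues where this one stopped.
                out_file.writelines(islice(in_file, size))
    return sizes
```

The file is read twice, once to count and once to copy, but only one line at a time is held in memory.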
