Python: splitting text into chunks of x characters

1 vote

3 answers

2964 views

Asked 2025-04-18 03:21

I am using this code to parse a text file and put every sentence on a new line:

import re

# open the file to be formatted
with open('inputfile.txt', 'r') as infile:
    f = infile.read()

# put every sentence on a new line
pat = r'(?<!Dr)(?<!Esq)\. +(?=[A-Z])'
lines = re.sub(pat, '.\n', f)
print(lines)

# write the formatted text
# into a new txt file
with open("outputfile.txt", "w") as outfile:
    outfile.write(lines)

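What I actually need, though, is to split each sentence after every 110 characters. If a sentence on a line is longer than 110 characters, it should be cut, with '...' added at the end; the next line should then start with '...' followed by the rest of the sentence, and so on.

Any suggestions on how to do this? I'm a bit lost.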

Tags: text processing, string manipulation, text parsing, character splitting, line formatting

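3 Answers

In Python you cannot insert content into a file in place. Instead, write the modified lines to a temporary file and then move it over the original, as in the code below.

Warning: back up the file first, because the original will be replaced.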

from shutil import move

insert = " #####blabla#### "
insert_position = 110

targetfile = "full/path/to/target_file"
tmpfile = "/full/path/to/tmp.txt"

# write the modified lines to a temporary file
with open(targetfile, "r") as myfile, open(tmpfile, "w") as output:
    for line in myfile:
        if len(line) > insert_position:
            # break the line once, after the first insert_position characters
            line = line[:insert_position] + insert + "\n" + line[insert_position:]
        output.write(line)

# replace the original file with the modified copy
move(tmpfile, targetfile)
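The same write-then-replace idea can be wrapped in a small helper. This is only a sketch: `rewrite_file` and its `transform` argument are hypothetical names, not part of the answer above, and it swaps in `tempfile.mkstemp` and `os.replace` from the standard library instead of a hard-coded temporary path.

```python
import os
import tempfile


def rewrite_file(path, transform):
    """Rewrite the file at `path`, passing every line through `transform`.

    `rewrite_file` and `transform` are illustrative names (an assumption).
    """
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file next to the target so the final
    # replace does not have to cross filesystems.
    fd, tmp_path = tempfile.mkstemp(dir=directory, text=True)
    try:
        with os.fdopen(fd, "w") as tmp, open(path) as src:
            for line in src:
                tmp.write(transform(line))
        # Swap the finished copy in place of the original.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temporary file if anything went wrong.
        os.remove(tmp_path)
        raise
```

For example, `rewrite_file("notes.txt", str.upper)` would upper-case every line of a (hypothetical) notes.txt. Note that the replaced file takes on the permissions of the temporary file, so a backup is still a good idea.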

Answered 2025-04-18 by Python大师

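I don't know exactly what "lines" contains, but if it is not a list with one entry per line, you need to split the text into lines and put them in a list.

Once you have a list of these strings (lines), you can check how many characters each one has. If a string is longer than 110 characters, take the first 107 characters and append '...'. Like this: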

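# assumes lines is a list of strings (call .splitlines() on a single string first)
for i, string_line in enumerate(lines):
    if len(string_line) > 110:
        # keep the first 107 characters and append '...' (110 in total)
        lines[i] = "{0}...".format(string_line[:107])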

To explain:

If you do this:

string = "Hello"
print(len(string))

the result is: 5

print(string[:3])

the result is: Hel
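Building on the slicing shown above, here is a minimal sketch of cutting a string into consecutive fixed-size pieces. The helper name `chunks` is an assumption made for illustration; it adds no '...' markers and simply shows how stepping through the string with `range` yields every piece rather than only the first.

```python
def chunks(text, size):
    """Return consecutive slices of `text`, each at most `size` characters long."""
    # range(0, len(text), size) gives the start index of every piece
    return [text[i:i + size] for i in range(0, len(text), size)]


print(chunks("Hello, world", 5))  # ['Hello', ', wor', 'ld']
```

The last piece may be shorter than `size`, and joining the pieces back together gives the original string.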

Answered 2025-04-18 by Python大师

# open the input file and read it; the with block closes it automatically
with open('inputfile.txt') as f:
    lines = f.read().splitlines()  # drop the trailing newline of every line

output = []

for line in lines:
    if len(line) > 110:
        prefix = ''  # the first chunk has no leading '...'
        while True:  # until break
            if len(prefix) + len(line) <= 110:  # the remainder fits on one line
                output.append(prefix + line)
                break
            cut = 110 - len(prefix) - 3  # leave room for the trailing '...'
            output.append(prefix + line[:cut] + '...')
            line = line[cut:]  # otherwise continue with the rest of the line
            prefix = '...'  # every following chunk starts with '...'
    else:
        output.append(line)

# open the output file and write; the with block closes it automatically
with open('outputfile.txt', 'w') as f:
    for line in output:
        f.write(line + '\n')
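The loop above cuts at a fixed character count, which can split a word in the middle. If breaking at word boundaries is acceptable, a different technique is to let the standard library's `textwrap.wrap` choose the break points. The sketch below is only an illustration: `wrap_with_ellipsis`, `width`, and `marker` are hypothetical names, and `textwrap.wrap` collapses runs of whitespace, so the spacing of the original sentence is not preserved exactly.

```python
import textwrap


def wrap_with_ellipsis(sentence, width=110, marker="..."):
    """Split `sentence` at word boundaries into lines of at most `width` characters."""
    if len(sentence) <= width:
        return [sentence]
    # Reserve room for a marker at both ends so every line fits within `width`
    parts = textwrap.wrap(sentence, width=width - 2 * len(marker))
    lines = []
    for i, part in enumerate(parts):
        prefix = marker if i > 0 else ""               # continuation lines start with the marker
        suffix = marker if i < len(parts) - 1 else ""  # all but the last line end with it
        lines.append(prefix + part + suffix)
    return lines
```

Each sentence produced by the regex step could be passed through this function, and the returned lines written to the output file one per line.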


Answered 2025-04-18 by Python大师
