Reformatting a string in Python

2 votes
2 answers
2898 views
Asked 2025-04-18 06:25

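I'm looking for the best way to reformat a string read from a text file so that each line is at most a given length, without breaking words. I have been using textwrap.wrap, which works in most cases, but it fails when the text contains newlines, that is, when it contains paragraphs. textwrap.wrap does not preserve those newlines, which is a problem. Here is my code: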

import textwrap

with open(inFile, 'r') as f:  # read in the whole text file
    lines = f.read()

# wrap() treats the whole text as one paragraph, so paragraph breaks are lost
paragraph = textwrap.wrap(lines, width=wid)

with open(outFile, 'w') as f:  # write the wrapped lines to file
    for i in paragraph:
        print(i, file=f)

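One approach I thought of is to write the formatted text to the output file one line at a time. The only problem is that I don't know how to tell whether a given line is just a newline.

Any suggestions are much appreciated.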

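Update: After using Ooga's suggestion, the newlines are now preserved correctly, but I have one last problem: the text placed on each output line is not quite right. Compare the input, expected output, and actual output to see what I mean.

Input: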

log2(N) is about the expected number of probes in an average
successful search, and the worst case is log2(N), just one more
probe. If the list is empty, no probes at all are made. Thus binary
search is a logarithmic algorithm and executes in O(logN) time. In
most cases it is considerably faster than a linear search. It can
be implemented using iteration, or recursion. In some languages it
is more elegantly expressed recursively; however, in some C-based
languages tail recursion is not eliminated and the recursive
version requires more stack space.

Expected output:

log2(N) is about the expected number of
probes in an average successful search,
and the worst case is log2(N), just one
more probe. If the list is empty, no 
probes at all are made. Thus binary 
search is a logarithmic algorithm and 
executes in O(logN) time. In most cases
it is considerably faster than a linear 
search. It can be implemented using 
iteration, or recursion. In some 
languages it is more elegantly expressed
recursively; however, in some C-based
languages tail recursion is not
eliminated and the recursive version
requires more stack space.

Actual output:

log2(N) is about the expected number of
probes in an average
successful search, and the worst case is
log2(N), just one more
probe. If the list is empty, no probes
at all are made. Thus binary
search is a logarithmic algorithm and
executes in O(logN) time. In
most cases it is considerably faster
than a linear search. It can
be implemented using iteration, or
recursion. In some languages it
is more elegantly expressed recursively;
however, in some C-based
languages tail recursion is not
eliminated and the recursive
version requires more stack space.

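To be clear, this is really a single paragraph; the extra breaks appear because the line breaks from the input are now being preserved. How can I make my output match the expected output?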

2 Answers

0

You can read the file one line at a time.

import textwrap

inFile = 'testIn.txt'
outFile = 'testOut.txt'
wid = 20

fin = open(inFile,'r')
fout = open(outFile, 'w')

for lineIn in fin:
  paragraph = textwrap.wrap(lineIn, width=wid)
  if paragraph:
    for lineOut in paragraph:
      print(lineOut, file=fout)
  else:
    print('', file=fout)

fout.close()
fin.close()
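
Wrapping each input line on its own keeps the original line breaks inside a paragraph, which is what produces the short trailing lines in the question's actual output. One possible fix, sketched here under the assumption that paragraphs are separated by blank lines, is to group consecutive non-blank lines into a paragraph before wrapping. The helper name rewrap_lines is hypothetical and not part of the original answer.

```python
import textwrap
from itertools import groupby


def rewrap_lines(lines, width):
    """Yield re-wrapped lines, keeping blank lines as paragraph breaks."""
    # groupby() splits the input into alternating runs of blank / non-blank lines
    for is_blank, group in groupby(lines, key=lambda s: not s.strip()):
        if is_blank:
            # emit one empty output line per blank input line
            for _ in group:
                yield ""
        else:
            # join the run into a single paragraph, then wrap it as a whole
            paragraph = " ".join(s.strip() for s in group)
            yield from textwrap.wrap(paragraph, width=width)
```

Because an open file is an iterable of lines, it could be used as `for lineOut in rewrap_lines(fin, wid): print(lineOut, file=fout)` in place of the loop above.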
1

from textwrap import wrap

with open(inFile) as inf:
    lines = [line for para in inf for line in wrap(para, wid)]

with open(outFile, "w") as outf:
    outf.write("\n".join(lines))
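
Note that wrap() returns an empty list for a blank line, so the comprehension above drops paragraph breaks, and each input line is still wrapped separately. A minimal sketch of an alternative, assuming the whole file fits in memory and paragraphs are separated by blank lines, splits the text into paragraphs first and re-wraps each one as a unit. The name wrap_paragraphs is hypothetical.

```python
import re
import textwrap


def wrap_paragraphs(text, width):
    """Re-wrap each blank-line-separated paragraph to at most `width` columns."""
    # split on blank lines (possibly containing only whitespace)
    paragraphs = re.split(r"\n\s*\n", text.strip())
    # fill() joins the wrapped lines of one paragraph with newlines;
    # line breaks inside a paragraph are treated as ordinary whitespace
    wrapped = [textwrap.fill(p, width=width) for p in paragraphs]
    # separate paragraphs with a single blank line
    return "\n\n".join(wrapped)
```

With the question's variables this might be called as `outf.write(wrap_paragraphs(inf.read(), wid))`. A width of 40 reproduces the line breaks shown in the expected output.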
