How do I join strings read from a file in Python?

Asked 2025-04-16 04:33

Emacs's auto-fill mode hard-wraps long lines so that the document looks tidier. I need to rejoin the strings that I read from such a document.

For example (CR marks a carriage return; it is not literal text):

  - Blah, Blah, and (CR)
    Blah, Blah, Blah, (CR)
    Blah, Blah (CR)
  - A, B, C (CR) 
    Blah, Blah, Blah, (CR)
    Blah, Blah (CR)

The lines are read into a list of strings with readlines(), and from that list I want to produce

["Blah, Blah, and Blah, Blah, Blah, Blah, Blah", "A, B, C Blah, Blah, Blah, Blah, Blah"]

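I thought about looping over the lines, checking for '-', and concatenating all the strings stored since the previous one, but I suspect Python has a more efficient way to do this.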
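For reference, the loop described above might look like the following sketch. The function name `join_wrapped` and the sample list are made up for illustration, and the code assumes that a stripped line starting with "-" always begins a new item, so a continuation line that happens to start with "-" would be misread.

```python
def join_wrapped(lines):
    """Merge hard-wrapped continuation lines into the bullet item above them."""
    items = []
    for raw in lines:
        text = raw.strip()
        if not text:
            continue  # ignore blank lines
        if text.startswith("-"):
            # A new bullet: drop the "-" marker and start a new item.
            items.append(text[1:].strip())
        elif items:
            # A continuation line: append it to the current item.
            items[-1] += " " + text
        else:
            # Text before the first bullet becomes its own item.
            items.append(text)
    return items


# Hypothetical readlines() output for the sample in the question.
sample = [
    "  - Blah, Blah, and\n",
    "    Blah, Blah, Blah,\n",
    "    Blah, Blah\n",
    "  - A, B, C\n",
    "    Blah, Blah, Blah,\n",
    "    Blah, Blah\n",
]
print(join_wrapped(sample))
```

This runs in a single pass over the lines, so the explicit loop is not inefficient; the other approaches mainly differ in how compact they are.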

Added:

Based on kindall's code, I can get the result I want, as shown below.

lines = ["- We shift our gears toward nextGen effort"," contribute the work with nextGen."]
out = [(" " if line.startswith(" ") else "\n") + line.strip() for line in lines]
print(out)
res = ''.join(out).split('\n')[1:]
print(res)

The result is as follows.

['\n- We shift our gears toward nextGen effort', ' contribute the work with nextGen.']
['- We shift our gears toward nextGen effort contribute the work with nextGen.']

3 Answers

0

You can use file.readlines(). It returns a list of strings, one for each line in the file:

readlines(...)
    readlines([size]) -> list of strings, each a line from the file.

    Call readline() repeatedly and return a list of the lines so read.
    The optional size argument, if given, is an approximate bound on the
    total number of bytes in the lines returned.

Edit: as a comment pointed out, readlines() is not really the best choice here. Ignore that suggestion and use the approach below instead.

If you want to pass the Emacs output as input to a Python function, I would suggest this (assuming the Emacs output is one long string):

[s.replace("\n", "") for s in emacsOutput.split('-')]

Hope this helps.
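One caveat with splitting on every '-': it also cuts hyphenated words, it leaves an empty or whitespace-only first element, and replace("\n", "") keeps the indentation of the wrapped lines. A possible variant, sketched here with the standard re module, splits only where a line begins with a bullet marker. The name `split_items` and the sample string are illustrative assumptions, not part of the original answer.

```python
import re


def split_items(text):
    """Split one long string into bullet items, joining wrapped lines."""
    # Split only where a line starts with optional indentation, "-", and a space.
    chunks = re.split(r"^[ \t]*-[ \t]+", text, flags=re.MULTILINE)
    # Collapse newlines and indentation inside each chunk to single spaces,
    # and drop the empty chunk that precedes the first bullet.
    return [" ".join(chunk.split()) for chunk in chunks if chunk.strip()]


# Hypothetical Emacs output; "well-known" shows that inner hyphens survive.
emacs_output = (
    "  - Blah, Blah, and\n"
    "    Blah, Blah, Blah,\n"
    "    Blah, Blah\n"
    "  - A, B, C\n"
    "    well-known, Blah\n"
)
print(split_items(emacs_output))
```

The trade-off is that a continuation line that itself starts with "- " would still be treated as a new item.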

3

I'm not sure whether you want this:

result = thefile.read()  

or this:

result = ''.join(line.strip() for line in thefile)  

or something else entirely...
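To make the difference between the two options concrete, here is a small sketch. It uses io.StringIO as a stand-in for an open file (an assumption made so the example is self-contained), and the `spaced` variant is an addition that is not part of the options above.

```python
import io

text = "- Blah, Blah, and\n  Blah, Blah, Blah,\n"

# Option 1: read() returns the whole file as one string, newlines included.
whole = io.StringIO(text).read()

# Option 2: strip each line and glue the pieces together with no separator.
glued = "".join(line.strip() for line in io.StringIO(text))

# Variant: join with a space so the words at a line break do not run together.
spaced = " ".join(line.strip() for line in io.StringIO(text))

print(repr(whole))
print(repr(glued))
print(repr(spaced))
```

Note that neither option separates the bullet items from each other; both produce a single string for the whole file.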

4

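As I understand it, you want to undo the hard wrapping and turn each group of indented lines back into a single soft-wrapped line. Here is one way to do it: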

# hard-coded input, could also readlines() from a file
lines = ["- Blah, Blah, and", 
         "  Blah, Blah, Blah,",
         "  Blah, Blah",
         "- Blah, Blah, and",
         "  Blah, Blah, Blah,",
         "  Blah, Blah"]

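# prefix indented (continuation) lines with a space and new items with a newline
out = [(" " if line.startswith(" ") else "\n") + line.strip() for line in lines]
# join everything, drop the leading newline, then split into one string per item
out = ''.join(out)[1:].split('\n')

print(out)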
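If the bullets themselves are indented, as in the question's sample, line.startswith(" ") is true for every line and everything would collapse into a single item. A possible adaptation, sketched below, tests the stripped line for a leading "-" instead. The name `unwrap` is hypothetical, and io.StringIO stands in for a real file object so that readlines() output (with trailing newlines) can be shown.

```python
import io


def unwrap(lines):
    """Rejoin hard-wrapped lines; a stripped line starting with "-" opens a new item."""
    marked = [("\n" if line.lstrip().startswith("-") else " ") + line.strip()
              for line in lines]
    # The first character is a separator added above, so drop it before splitting.
    return "".join(marked)[1:].split("\n")


# io.StringIO is used in place of an open file; a real file iterates the same way.
fake_file = io.StringIO(
    "  - Blah, Blah, and\n"
    "    Blah, Blah, Blah,\n"
    "    Blah, Blah\n"
    "  - A, B, C\n"
    "    Blah, Blah, Blah,\n"
)
print(unwrap(fake_file.readlines()))
```

Because strip() removes the trailing newline that readlines() keeps, the same function works on both hard-coded lists and lines read from a file.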
