Unwanted newlines when parsing HTML with Python Popen and pandoc?

from subprocess import Popen, PIPE, STDOUT
filedesc = open('myfile.tex','w')
args = ['pandoc', '-f', 'html', '-t', 'latex']
p = Popen(args, stdout=PIPE, stdin=PIPE, stderr=STDOUT)
outp, err = p.communicate(input=html)
filedesc.write(outp)

1 Answer

User

#1 · Posted 2024-04-20 16:04:19

Well, it seems to be a kind of "bug" (??) in Python's pipes.

I am running this code on Windows. That means newlines come through in CR+LF (\r\n) style rather than as the (cleaner) Unix-style LF (\n).
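To make the two newline styles concrete, and to show how they can interact badly with a text-mode file on Windows, here is a small sketch. It assumes Python 3, and the helper name `written_bytes` is made up for illustration. Passing `newline="\r\n"` to `open` mimics what a text-mode file does by default on Windows, so the effect can be reproduced on any platform.

```python
import os
import tempfile


def written_bytes(text, newline):
    """Write text to a temp file with the given newline mode; return the raw bytes."""
    fd, path = tempfile.mkstemp(suffix=".tex")
    os.close(fd)
    try:
        with open(path, "w", newline=newline) as f:
            f.write(text)
        with open(path, "rb") as f:
            return f.read()
    finally:
        os.remove(path)


windows_style = "first line\r\nsecond line\r\n"

# newline="\r\n" translates every "\n" on write, like Windows text mode,
# so an existing "\r\n" ends up as "\r\r\n".
doubled = written_bytes(windows_style, newline="\r\n")

# Normalizing to LF first leaves a single CR+LF per line in the file.
clean = written_bytes(windows_style.replace("\r\n", "\n"), newline="\r\n")
```

If this is what happens in the setup described here, it would explain why stripping the CR before writing makes the file look right again.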

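When I feed in a large HTML text for pandoc to convert, the output comes back through the pipe. Every time the standard column width is reached, an ugly "newline" character is inserted, which in my case is CR+LF. That makes my output look weird.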
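The breaks at a fixed column are most likely pandoc's own text wrapping rather than something the pipe adds. As a sketch, assuming a pandoc release that supports the `--wrap` option (older releases used `--no-wrap` instead), wrapping can be switched off in the argument list. The helper name `pandoc_args` is hypothetical.

```python
def pandoc_args(wrap=True):
    """Build the pandoc command line; wrap=False asks pandoc not to re-wrap text."""
    args = ["pandoc", "-f", "html", "-t", "latex"]
    if not wrap:
        # --wrap=none exists in recent pandoc versions (assumption: 1.16 or later)
        args.append("--wrap=none")
    return args


no_wrap_args = pandoc_args(wrap=False)
```

This only changes where pandoc breaks lines. Any newline that remains in the output is still written in the platform's style, so it does not replace the CR+LF handling.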

The dirty solution I implemented is to add a replace('\r\n','\n') before writing the output, but I'm not sure it is the most elegant one.

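from subprocess import Popen, PIPE, STDOUT

html = '<p><b>Some random html code</b> longer than 80 columns ... </p>'
args = ['pandoc', '-f', 'html', '-t', 'latex']
p = Popen(args, stdout=PIPE, stdin=PIPE, stderr=STDOUT)
# On Python 3 the pipes carry bytes, so encode the input and decode the output
outp, err = p.communicate(input=html.encode('utf-8'))
# Normalize CR+LF to LF before writing; the with block closes the file
with open('myfile.tex', 'w') as filedesc:
    filedesc.write(outp.decode('utf-8').replace('\r\n', '\n'))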
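A possibly cleaner alternative, sketched here under the assumption of Python 3.6 or later: run the child process in text mode so that `\r\n` in its output is translated to `\n` on read, and open the output file with `newline="\n"` so nothing is translated back on write. The function name `convert`, the stand-in child command, and the temp-directory path are illustrative only. For the real conversion, pass the pandoc argument list used in the answer instead of `fake_child`.

```python
import os
import subprocess
import sys
import tempfile


def convert(args, text_in):
    """Pipe text_in through the command in args and return its output as str."""
    result = subprocess.run(
        args,
        input=text_in,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        encoding="utf-8",  # giving an encoding enables text mode
        check=True,
    )
    # In text mode, line endings in the captured output arrive as "\n".
    return result.stdout


# Stand-in for pandoc so the sketch runs without it installed:
# a child process that echoes stdin and emits Windows-style CR+LF line endings.
fake_child = [
    sys.executable,
    "-c",
    "import sys; data = sys.stdin.read(); "
    "sys.stdout.buffer.write(data.encode('utf-8') + b'\\r\\nend\\r\\n')",
]
outp = convert(fake_child, "<p>hello</p>")

# newline="\n" stops Python from turning "\n" back into "\r\n" on Windows.
out_path = os.path.join(tempfile.gettempdir(), "myfile.tex")
with open(out_path, "w", encoding="utf-8", newline="\n") as filedesc:
    filedesc.write(outp)
```

Compared with the manual `replace`, this leaves the newline handling to the standard library on both the read and the write side. Whether that counts as more elegant is a matter of taste.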
