Python: UTF-16 decoding adds new blank lines on Windows

Asked 2025-04-15 23:05

I'm running into a problem with extra newlines between Windows and *nix platforms.

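import re

# Read the raw bytes and decode them from UTF-16
file = open('UTF16file.xml', 'rb')
html = file.read().decode('utf-16')
file.close()

# Replace every match of the original URL with the new one
regexp = re.compile(self.originalurl, re.S)
(html, changes) = regexp.subn(self.newurl, html)

# Encode back to UTF-16 and write the result
file = open('UTF16file-regexed.xml', 'w+')
file.write(html.encode('utf-16'))
file.close()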

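This code works fine on my Mac: the resulting file has no extra newlines. On Windows, though, the output contains extra blank lines. So far I've tried:

Encoding the regex to UTF-16 instead of decoding the file: breaks on both Windows and OS X.
Writing the file with mode 'wb' instead of 'w+': breaks on Windows.

Any ideas?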

Tags: regex, file encoding, windows, utf-16, file write mode, newlines, osx

1 Answer

C:\Documents and Settings\Nick>python
ActivePython 2.6.4.10 (ActiveState Software Inc.) based on
Python 2.6.4 (r264:75706, Jan 22 2010, 16:41:54) [MSC v.1500 32 bit (Intel)]...
Type "help", "copyright", "credits" or "license" for more information.
>>> txt = """here
... is all
... my text n stuff."""
>>> f = open('u16.txt','wb')
>>> f.write(txt.encode('utf-16'))
>>> f.close()
>>> exit()

C:\Documents and Settings\Nick>notepad u16.txt

In Notepad, the file looks like this:

here is allmy text n stuff.

(although when I copied and pasted it from Notepad into Firefox, the line breaks did show up) ... but this:

C:\Documents and Settings\Nick>
    "C:\Program Files\Windows NT\Accessories\wordpad.exe" u16.txt

In WordPad, it looks like this:

here 
is all
my text n stuff.

(on Windows XP SP3, 32-bit)
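The session above suggests that a file written in 'wb' mode contains LF-only line endings, which Notepad does not render as line breaks while WordPad does. A likely cause of the trouble in the question is the opposite case: in Python 2, a text-mode write on Windows turns every 0x0A byte into 0x0D 0x0A, and doing that to already-encoded UTF-16 bytes breaks the two-byte alignment. Below is a minimal sketch of one way to avoid any newline translation, assuming `io.open` with `newline=''` is available (Python 2.6+ or 3.x). The helper name `rewrite_utf16`, the temporary file layout, and the example URLs are hypothetical and only there for illustration.

```python
import io
import os
import re
import tempfile


def rewrite_utf16(src_path, dst_path, pattern, replacement):
    """Replace `pattern` in a UTF-16 file without touching its line endings."""
    # newline='' turns off newline translation on both read and write,
    # so '\r\n' in the source stays exactly '\r\n' in the output.
    with io.open(src_path, 'r', encoding='utf-16', newline='') as src:
        text = src.read()
    text, changes = re.compile(pattern, re.S).subn(replacement, text)
    with io.open(dst_path, 'w', encoding='utf-16', newline='') as dst:
        dst.write(text)
    return changes


# Demo on a throwaway file that uses Windows-style CRLF line endings.
tmp_dir = tempfile.mkdtemp()
src = os.path.join(tmp_dir, 'UTF16file.xml')
dst = os.path.join(tmp_dir, 'UTF16file-regexed.xml')
with io.open(src, 'wb') as f:
    f.write(u'<a href="http://old.example/">x</a>\r\n<b/>\r\n'.encode('utf-16'))

changes = rewrite_utf16(src, dst, u'http://old\\.example/', u'http://new.example/')

# Read the result back as raw bytes to check the line endings survived.
with io.open(dst, 'rb') as f:
    result = f.read().decode('utf-16')

# Simulation of a Windows text-mode write applied to encoded UTF-16 bytes:
# each LF byte gains a CR byte, leaving an odd number of bytes.
encoded = u'a\nb'.encode('utf-16-le')
translated = encoded.replace(b'\n', b'\r\n')
```

Here `len(encoded)` is 6 while `len(translated)` is 7, so the translated bytes can no longer be split into whole UTF-16 code units. Opening the output with 'wb' and writing `html.encode('utf-16')` should avoid the translation as well; the `io.open` variant just keeps the encoding and newline handling in one place.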

Answered 2025-04-15 by Python大师
