Writing Unicode text to a text file?

User

Reply #1 · edited 2024-04-26 05:44:04

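Work exclusively with unicode objects as much as possible: decode data to unicode objects as soon as you receive it, and encode it only as needed on the way out.

If your string is actually a unicode object, you need to convert it to an encoded byte string (UTF-8 here) before writing it to a file: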

foo = u'Δ, Й, ק, ‎ م, ๗, あ, 叶, 葉, and 말.'
# Binary mode, since the text is encoded to UTF-8 bytes by hand (runs on Python 2 and 3).
with open('test', 'wb') as f:
    f.write(foo.encode('utf8'))

When you read the file back, you get an encoded byte string, which you can decode into a unicode object:

# file() exists only on Python 2; open() in binary mode works on both 2 and 3.
with open('test', 'rb') as f:
    print(f.read().decode('utf8'))
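
The decode-early, encode-late advice at the top of this reply can be sketched as a small round trip. This is only an illustration, not the poster's code: the helper names `write_text` and `read_text` and the temporary file name are assumptions, and it targets Python 3 (it should also run on Python 2.7).

```python
import os
import tempfile


def write_text(path, text, encoding="utf-8"):
    # Encode at the boundary: text becomes bytes only when leaving the program.
    with open(path, "wb") as f:
        f.write(text.encode(encoding))


def read_text(path, encoding="utf-8"):
    # Decode at the boundary: bytes become text as soon as data comes in.
    with open(path, "rb") as f:
        return f.read().decode(encoding)


# Hypothetical file name inside a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), "roundtrip.txt")
original = u"Δ, Й, あ, 叶, and 말."
write_text(path, original)
restored = read_text(path)
print(restored == original)
```

Everything between the two helpers handles text only, so encoding decisions stay in one place.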

User

Reply #2 · edited 2024-04-26 05:44:04

In Python 2.6+, you can use io.open(), which is the default (the builtin open()) on Python 3:

import io

with io.open(filename, 'w', encoding=character_encoding) as file:
    file.write(unicode_text)

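This can be more convenient if you need to write the text incrementally, because you don't have to call unicode_text.encode(character_encoding) multiple times. Unlike the codecs module, the io module has proper universal newlines support.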
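
As a rough sketch of the incremental case, the snippet below writes several pieces of text through one file object and reads them back. The file name and the `chunks` list are hypothetical sample data, not part of the original answer.

```python
import io
import os
import tempfile

# Hypothetical output path and sample data.
path = os.path.join(tempfile.mkdtemp(), "incremental.txt")
chunks = [u"first line: Δ\n", u"second line: 叶\n", u"third line: 말\n"]

# The file object encodes each piece; no manual .encode() calls are needed.
with io.open(path, "w", encoding="utf-8") as f:
    for chunk in chunks:
        f.write(chunk)

# Reading in text mode decodes the bytes and normalizes line endings to "\n".
with io.open(path, "r", encoding="utf-8") as f:
    lines = f.read().splitlines()

print(len(lines))
```

On Python 3 the builtin open() accepts the same arguments, so `io.open` can be replaced with `open` there.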

User

Reply #3 · edited 2024-04-26 05:44:04

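Unicode string handling is standardized in Python 3: every str is a Unicode string.

Just open the file with the utf-8 encoding. The conversion from the in-memory Unicode string to variable-length UTF-8 bytes happens automatically when the text is written to the file.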

out1 = "(嘉南大圳 ㄐㄧㄚ　ㄋㄢˊ　ㄉㄚˋ　ㄗㄨㄣˋ )"
# The with block closes the file even if write() raises.
with open("t1.txt", "w", encoding="utf-8") as fobj:
    fobj.write(out1)
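
To see the variable-length encoding in action, one can compare the character count with the byte count on disk. This is a small sketch assuming Python 3; the sample string and file name are made up for illustration.

```python
import os
import tempfile

text = "aΔ叶"  # three characters, illustrative sample
path = os.path.join(tempfile.mkdtemp(), "t1_check.txt")

with open(path, "w", encoding="utf-8") as fobj:
    fobj.write(text)

# Read the raw bytes back to see what was actually stored.
with open(path, "rb") as fobj:
    raw = fobj.read()

print(len(text))  # 3 characters
print(len(raw))   # 6 bytes
# Number of UTF-8 bytes used by each character.
sizes = [len(ch.encode("utf-8")) for ch in text]
print(sizes)      # [1, 2, 3]
```

ASCII characters take one byte, while the Greek and CJK characters here take two and three bytes respectively.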