How to correctly format a table of unicode strings

# -*- coding: utf-8 -*-
import binascii

test_cases = [
    'aaaaa',    # Normal bytestring
    'ááááá',    # Normal bytestring, but with extended ascii. Since the file is utf-8 encoded, this is utf-8 encoded
    'ℕℤℚℝℂ',    # Encoded unicode. The editor has encoded this, and it is defined as string, so it is left encoded by python
    u'aaaaa',   # unicode object. The string itself is utf-8 encoded, as defined in the "coding" directive at the top of the file
    u'ááááá',   # unicode object. The string itself is utf-8 encoded, as defined in the "coding" directive at the top of the file
    u'ℕℤℚℝℂ',   # unicode object. The string itself is utf-8 encoded, as defined in the "coding" directive at the top of the file
]

FORMAT = '%-20s -> %2d %-20s %-30s %-30s'

for data in test_cases :
    try:
        hexlified = binascii.hexlify(data)
    except:
        hexlified = None
    print FORMAT % (data, len(data), type(data), hexlified, repr(data))

1 answer

User

#1 · Posted 2024-04-18 13:47:10

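When Python 2.7 sees 'ℕℤℚℝℂ', it reads "here are 15 arbitrary byte values". It does not know which characters they represent, nor which encoding they are in. You need to decode this byte string into a unicode string, specifying the encoding, before you can expect Python to count characters: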

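for data in test_cases:
    if isinstance(data, bytes):
        # In Python 2, bytes is an alias for str; decode it to get a unicode object
        data = data.decode('utf-8')
    # FORMAT has five placeholders, so the hex column must be supplied as well
    hexlified = binascii.hexlify(data.encode('utf-8'))
    print FORMAT % (data, len(data), type(data), hexlified, repr(data))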
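
The 15-versus-5 difference can be checked directly. The following is a minimal sketch, written so that it should run on both Python 2.7 and Python 3; the variable names are illustrative and not part of the original answer.

```python
# -*- coding: utf-8 -*-
text = u'ℕℤℚℝℂ'             # unicode object (Python 2) / str (Python 3)
raw = text.encode('utf-8')    # the bytes a plain Python 2 string literal would hold

byte_count = len(raw)                    # len() on bytes counts bytes
char_count = len(raw.decode('utf-8'))    # len() on decoded text counts code points

print('%d bytes, %d characters' % (byte_count, char_count))
```

Each of these five characters takes three bytes in UTF-8, which is why the undecoded string reports a length of 15.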

Note that in Python 3, all string literals are unicode objects by default.
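
A rough Python 3 counterpart of the loop above might look like the sketch below. The helper name `format_row`, the column widths, and the reduced list of test cases are assumptions made for illustration; they are not taken from the original question or answer.

```python
import binascii

# Column layout is an arbitrary choice for this sketch.
FORMAT = '{:<8} -> {:>2} {:<16} {:<32} {}'

def format_row(data):
    """Return one table row; bytes input is decoded as UTF-8 first."""
    text = data.decode('utf-8') if isinstance(data, bytes) else data
    # hexlify() returns bytes in Python 3, so decode the hex digits for display.
    hexlified = binascii.hexlify(text.encode('utf-8')).decode('ascii')
    return FORMAT.format(text, len(text), type(text).__name__, hexlified, ascii(text))

test_cases = ['aaaaa', 'ááááá', 'ℕℤℚℝℂ', 'ℕℤℚℝℂ'.encode('utf-8')]
rows = [format_row(item) for item in test_cases]
for row in rows:
    print(row)
```

Here the padding is applied to `str` values, so field widths are counted in code points rather than bytes. Characters that occupy two terminal columns or none (for example East Asian wide characters or combining marks) can still throw off the visual alignment.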
