Python codepoints package - PyPI

Convert sequences of code points to Unicode strings and back.

Detailed description of the codepoints Python project

Unicode code points for Python

Before Python 3.3, the Python runtime could be compiled in one of two Unicode modes:

sys.maxunicode == 0x10FFFF
In this mode, Python's Unicode strings support every Unicode code point from U+0000 to U+10FFFF. Each code point is represented by exactly one string element:
```
>>> import sys
>>> hex(sys.maxunicode)
'0x10ffff'
>>> len(u'\U0001F40D')
1
>>> [c for c in u'\U0001F40D']
[u'\U0001f40d']
```
This is the default for Python 2.7 on Linux, and the default for Python 3.3 and later on all operating systems.
sys.maxunicode == 0xFFFF
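In this mode, Python's Unicode strings only support the code point range U+0000 to U+FFFF directly. Any code point from U+10000 to U+10FFFF is represented by a pair of string elements, using the UTF-16 surrogate pair encoding: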
```
>>> import sys
>>> hex(sys.maxunicode)
'0xffff'
>>> len(u'\U0001F40D')
2
>>> [c for c in u'\U0001F40D']
[u'\ud83d', u'\udc0d']
```
This is the default for Python 2.7 on macOS and Windows.
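
The two string elements shown above are a UTF-16 surrogate pair. As a minimal sketch, assuming Python 3 and hypothetical helper names (`combine_surrogates` and `split_surrogates` are illustrative only and are not part of the codepoints package), the arithmetic that maps a pair to a single code point and back might look like this:

```python
# Hypothetical helpers for illustration; not part of the codepoints package.

def combine_surrogates(high, low):
    """Combine a UTF-16 surrogate pair into a single code point."""
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a valid surrogate pair")
    # Each surrogate carries 10 bits of the offset above U+10000.
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


def split_surrogates(code_point):
    """Split a code point above U+FFFF into a (high, low) surrogate pair."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("code point is outside U+10000..U+10FFFF")
    offset = code_point - 0x10000
    return 0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF)


print(hex(combine_surrogates(0xD83D, 0xDC0D)))  # 0x1f40d
```

Here `0xD83D` and `0xDC0D` are the two elements that the narrow build reports for U+1F40D.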

This runtime difference makes it very inconvenient to write Python modules that operate on Unicode strings as sequences of code points.

The codepoints module

This module solves the problem by exposing an API that converts Unicode strings to lists of code points and back, regardless of the value of sys.maxunicode:

```
>>> import sys
>>> import codepoints
>>> hex(sys.maxunicode)
'0xffff'
>>> snake = tuple(codepoints.from_unicode(u'\U0001F40D'))
>>> len(snake)
1
>>> snake[0]
128013
>>> hex(snake[0])
'0x1f40d'
>>> codepoints.to_unicode(snake)
u'\U0001f40d'
```
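
To illustrate how a conversion like this can be made independent of the string representation, here is a rough sketch, assuming Python 3. The names `iter_code_points` and `text_from_code_points` are hypothetical, and this is not the actual implementation behind `codepoints.from_unicode` or `codepoints.to_unicode`. It merges any surrogate pair it finds into a single code point and passes every other element through unchanged:

```python
# Hypothetical sketch; not the codepoints package's implementation.

def iter_code_points(text):
    """Yield code points from text, merging UTF-16 surrogate pairs."""
    i = 0
    n = len(text)
    while i < n:
        unit = ord(text[i])
        # A high surrogate followed by a low surrogate forms one code point.
        if 0xD800 <= unit <= 0xDBFF and i + 1 < n:
            nxt = ord(text[i + 1])
            if 0xDC00 <= nxt <= 0xDFFF:
                yield 0x10000 + ((unit - 0xD800) << 10) + (nxt - 0xDC00)
                i += 2
                continue
        # Everything else (including a lone surrogate) passes through as-is.
        yield unit
        i += 1


def text_from_code_points(code_points):
    """Build a string from an iterable of integer code points."""
    return "".join(chr(cp) for cp in code_points)


snake = tuple(iter_code_points("\U0001F40D"))
print(len(snake), hex(snake[0]))
```

With this approach, a string holding the two elements `\ud83d` and `\udc0d` and a string holding the single element `\U0001F40D` both produce the same one-item sequence.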


codepoints 1.0
