A Python implementation of operational transformation.
python-ottype
A Python library for Operational Transformation (OT). The basic idea follows the specification at https://github.com/ottypes/docs.
Installation
pip install ottype
OT operations
Skip
Object type: int
Skips n characters from the current position
Represented as n
Insert
Object type: str
Inserts a string s at the current position
Represented as s
assert apply('asdf', ['qwer']) == 'qwerasdf'
assert apply('asdf', [2, 'qwer']) == 'asqwerdf'
Delete
Object type: dict
Deletes the string s at the current position
Represented as {'d': s}
assert apply('asdf', [{'d': 'as'}]) == 'df'
assert apply('asdf', [1, {'d': 'sd'}]) == 'af'
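The three component kinds above can be told apart by their Python type alone. The following is a minimal sketch of that idea, not code from the library: the helper names `op_kind` and `length_change` are hypothetical and exist only for illustration.

```python
from typing import Dict, List, Union

# Same shape as the OT alias used by the library's signatures.
OT = Union[int, str, Dict[str, str]]


def op_kind(ot: OT) -> str:
    """Classify one component as 'skip', 'insert', or 'delete' (hypothetical helper)."""
    # bool is a subclass of int, so it is rejected explicitly.
    if isinstance(ot, int) and not isinstance(ot, bool):
        return "skip"
    if isinstance(ot, str):
        return "insert"
    if isinstance(ot, dict) and set(ot) == {"d"} and isinstance(ot["d"], str):
        return "delete"
    raise TypeError(f"not an OT component: {ot!r}")


def length_change(ots: List[OT]) -> int:
    """Net change in document length caused by a list of components."""
    delta = 0
    for ot in ots:
        kind = op_kind(ot)
        if kind == "insert":
            delta += len(ot)        # inserted text makes the document longer
        elif kind == "delete":
            delta -= len(ot["d"])   # deleted text makes it shorter
        # a skip leaves the length unchanged
    return delta
```

For example, `length_change([2, 'qwer'])` is 4, which matches 'asdf' growing to 'asqwerdf' in the Insert example.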
Supported functions
OT = Union[int, str, dict]
check(ots: List[OT], *, check_unoptimized: bool = True) -> bool
Checks whether a list contains only valid OTs. If check_unoptimized is True, the list must also be normalized.
assert check(['a', 4, 'b'])
assert not check(['a', 'b'])  # is not normalized
assert not check([3])  # is not normalized
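One way to read the three examples above is sketched below. This is an assumption-laden illustration, not the library's code: `check_sketch` and `_kind` are hypothetical names, and the rules used here (positive skips, non-empty strings, no two neighbouring components of the same kind, no trailing skip) are inferred from the examples and from the description of normalize.

```python
from typing import Dict, List, Optional, Union

OT = Union[int, str, Dict[str, str]]


def _kind(ot: object) -> Optional[str]:
    """Return the component kind, or None if the value is not a valid component."""
    if isinstance(ot, int) and not isinstance(ot, bool) and ot > 0:
        return "skip"
    if isinstance(ot, str) and ot:
        return "insert"
    if (isinstance(ot, dict) and set(ot) == {"d"}
            and isinstance(ot["d"], str) and ot["d"]):
        return "delete"
    return None


def check_sketch(ots: List[OT], *, check_unoptimized: bool = True) -> bool:
    """Assumed validity check: every component valid, optionally normalized too."""
    kinds = [_kind(ot) for ot in ots]
    if None in kinds:
        return False
    if not check_unoptimized:
        return True
    # Assumed meaning of "normalized": neighbours differ in kind ...
    if any(a == b for a, b in zip(kinds, kinds[1:])):
        return False
    # ... and the list does not end with a skip.
    return not kinds or kinds[-1] != "skip"
```

Under these assumptions, `['a', 'b']` fails only because of the normalization rule, so `check_sketch(['a', 'b'], check_unoptimized=False)` returns True.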
apply(doc: str, ots: List[OT]) -> str
Applies a list of OTs to a string.
assert apply('abcde', [2, 'qq', {'d': 'c'}, 1, 'w']) == 'abqqdwe'
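The behaviour shown in the example can be reproduced with a single left-to-right pass over the document. The sketch below is a plain-Python illustration under stated assumptions, not the library's implementation: `apply_sketch` is a hypothetical name, and raising ValueError when a delete does not match the document is an assumption about error handling.

```python
from typing import Dict, List, Union

OT = Union[int, str, Dict[str, str]]


def apply_sketch(doc: str, ots: List[OT]) -> str:
    """Apply skip/insert/delete components to doc with a moving cursor."""
    out = []   # pieces of the resulting document
    pos = 0    # cursor into the original document
    for ot in ots:
        if isinstance(ot, int):
            # Skip: keep the next `ot` characters unchanged.
            if pos + ot > len(doc):
                raise ValueError("skip runs past the end of the document")
            out.append(doc[pos:pos + ot])
            pos += ot
        elif isinstance(ot, str):
            # Insert: emit new text; the cursor does not move.
            out.append(ot)
        else:
            # Delete: the text at the cursor must equal ot['d'] (assumed check).
            deleted = ot["d"]
            if doc[pos:pos + len(deleted)] != deleted:
                raise ValueError("delete does not match the document")
            pos += len(deleted)
    # Anything after the last component is kept, like an implicit trailing skip.
    out.append(doc[pos:])
    return "".join(out)
```

Keeping the untouched tail at the end is what allows a normalized list to omit its final skip.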
inverse_apply(doc: str, ots: List[OT]) -> str
Applies a list of OTs to a string in reverse, undoing a previous apply.
assert inverse_apply(apply(doc, ots), ots) == doc
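Because a delete component stores the deleted text, a list of OTs carries enough information to be undone. A minimal sketch of that idea follows; `inverse_apply_sketch` is a hypothetical name, and the mismatch check is an assumption rather than documented library behaviour.

```python
from typing import Dict, List, Union

OT = Union[int, str, Dict[str, str]]


def inverse_apply_sketch(doc: str, ots: List[OT]) -> str:
    """Undo ots on doc, where doc is the result of applying ots earlier."""
    out = []   # pieces of the restored document
    pos = 0    # cursor into the already-edited document
    for ot in ots:
        if isinstance(ot, int):
            # Skipped text is identical before and after the edit.
            out.append(doc[pos:pos + ot])
            pos += ot
        elif isinstance(ot, str):
            # An insert is undone by dropping the inserted text.
            if doc[pos:pos + len(ot)] != ot:
                raise ValueError("insert does not match the document")
            pos += len(ot)
        else:
            # A delete is undone by putting the stored text back.
            out.append(ot["d"])
    out.append(doc[pos:])
    return "".join(out)
```

With the example from apply, `inverse_apply_sketch('abqqdwe', [2, 'qq', {'d': 'c'}, 1, 'w'])` restores 'abcde'.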
normalize(ots: List[OT]) -> List[OT]
Normalizes a list of OTs: merges consecutive OTs of the same type and trims a trailing skip.
assert normalize([1, 2, 'as', 'df', {'d': 'qw'}, {'d': 'er'}, 3]) \
    == [3, 'asdf', {'d': 'qwer'}]
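The merge-and-trim rule can be sketched in a few lines. This is an illustration of the rule as described, not the library's code: `normalize_sketch` is a hypothetical name, and it does not handle cases the text does not mention, such as zero-length skips or empty strings.

```python
from typing import Dict, List, Union

OT = Union[int, str, Dict[str, str]]


def normalize_sketch(ots: List[OT]) -> List[OT]:
    """Merge neighbouring components of the same type, then drop a trailing skip."""
    out: List[OT] = []
    for ot in ots:
        if out and isinstance(ot, int) and isinstance(out[-1], int):
            out[-1] += ot                               # skip + skip
        elif out and isinstance(ot, str) and isinstance(out[-1], str):
            out[-1] += ot                               # insert + insert
        elif out and isinstance(ot, dict) and isinstance(out[-1], dict):
            out[-1] = {"d": out[-1]["d"] + ot["d"]}     # delete + delete
        else:
            # Copy dicts so the caller's list is never mutated.
            out.append(dict(ot) if isinstance(ot, dict) else ot)
    # A trailing skip changes nothing, so it is removed.
    while out and isinstance(out[-1], int):
        out.pop()
    return out
```

A list that consists only of skips, such as `[3]`, therefore normalizes to the empty list.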
transform(ots1: List[OT], ots2: List[OT]) -> List[OT]
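Transforms a list of OTs so that the following property holds (the example passes a third argument, 'left' or 'right', that the signature above does not show):
assert apply(apply(doc, ots1), transform(ots2, ots1, 'left')) \
    == apply(apply(doc, ots2), transform(ots1, ots2, 'right'))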
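To make the convergence property concrete, here is a small worked case in plain Python. It does not call the library: `apply_sketch` is a hypothetical stand-in for apply, and the two transformed lists are written out by hand, so they show what a transform result has to achieve rather than what the library literally returns.

```python
from typing import Dict, List, Union

OT = Union[int, str, Dict[str, str]]


def apply_sketch(doc: str, ots: List[OT]) -> str:
    """Stand-in for apply: walk the components with a cursor."""
    out, pos = [], 0
    for ot in ots:
        if isinstance(ot, int):
            out.append(doc[pos:pos + ot])   # skip: keep text
            pos += ot
        elif isinstance(ot, str):
            out.append(ot)                  # insert: add text
        else:
            pos += len(ot["d"])             # delete: drop text
    out.append(doc[pos:])
    return "".join(out)


doc = "abc"
ots1 = [1, "X"]          # user 1 inserts 'X' after 'a'
ots2 = [2, {"d": "c"}]   # user 2 deletes the final 'c'

# Hand-transformed lists (assumed results, not library output):
# after ots1 the 'c' sits one position further right, so the skip grows to 3.
ots2_after_ots1 = [3, {"d": "c"}]
# ots1 edits text before the deletion, so it is unchanged after ots2.
ots1_after_ots2 = [1, "X"]

left = apply_sketch(apply_sketch(doc, ots1), ots2_after_ots1)
right = apply_sketch(apply_sketch(doc, ots2), ots1_after_ots2)
```

Both orders end in the same document, 'aXb', which is the outcome the property above requires.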
compose(ots1: List[OT], ots2: List[OT]) -> List[OT]
Composes two lists of OTs so that the following property holds:
assert apply(apply(doc, ots1), ots2) == apply(doc, compose(ots1, ots2))
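The composition property can be checked on a small hand-worked case. As before, this does not call the library: `apply_sketch` is a hypothetical stand-in for apply, and `composed` is written by hand as one list that satisfies the property. The library may order or merge components differently.

```python
from typing import Dict, List, Union

OT = Union[int, str, Dict[str, str]]


def apply_sketch(doc: str, ots: List[OT]) -> str:
    """Stand-in for apply: walk the components with a cursor."""
    out, pos = [], 0
    for ot in ots:
        if isinstance(ot, int):
            out.append(doc[pos:pos + ot])   # skip: keep text
            pos += ot
        elif isinstance(ot, str):
            out.append(ot)                  # insert: add text
        else:
            pos += len(ot["d"])             # delete: drop text
    out.append(doc[pos:])
    return "".join(out)


doc = "abc"
ots1 = [1, "X"]          # 'abc'  -> 'aXbc'
ots2 = [2, {"d": "b"}]   # 'aXbc' -> 'aXc'

# Hand-composed equivalent (an assumption, not library output):
# skip 'a', insert 'X', then delete the 'b' that follows.
composed = [1, "X", {"d": "b"}]

step_by_step = apply_sketch(apply_sketch(doc, ots1), ots2)
in_one_go = apply_sketch(doc, composed)
```

Both paths give 'aXc', so a single composed list can replace the two edits when they are stored or sent.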
Benchmarks (Python 3.7)
apply function
=== baseline (extra.old_ottype) ===
Doc Length : 100, OT Length : 5, Performance : 5.80 ms/loop ( 1.00x )
Doc Length : 100, OT Length : 10, Performance : 8.87 ms/loop ( 1.00x )
Doc Length : 100, OT Length : 20, Performance : 16.47 ms/loop ( 1.00x )
Doc Length : 100, OT Length : 50, Performance : 39.13 ms/loop ( 1.00x )
Doc Length : 100, OT Length : 100, Performance : 73.84 ms/loop ( 1.00x )
Doc Length : 1000, OT Length : 5, Performance : 4.70 ms/loop ( 1.00x )
Doc Length : 1000, OT Length : 10, Performance : 7.46 ms/loop ( 1.00x )
Doc Length : 1000, OT Length : 20, Performance : 14.55 ms/loop ( 1.00x )
Doc Length : 1000, OT Length : 50, Performance : 50.22 ms/loop ( 1.00x )
Doc Length : 1000, OT Length : 100, Performance : 83.51 ms/loop ( 1.00x )
Doc Length : 10000, OT Length : 5, Performance : 7.43 ms/loop ( 1.00x )
Doc Length : 10000, OT Length : 10, Performance : 13.05 ms/loop ( 1.00x )
Doc Length : 10000, OT Length : 20, Performance : 18.66 ms/loop ( 1.00x )
Doc Length : 10000, OT Length : 50, Performance : 41.28 ms/loop ( 1.00x )
Doc Length : 10000, OT Length : 100, Performance : 97.67 ms/loop ( 1.00x )
=== python ===
Doc Length : 100, OT Length : 5, Performance : 6.77 ms/loop ( 0.86x )
Doc Length : 100, OT Length : 10, Performance : 15.48 ms/loop ( 0.57x )
Doc Length : 100, OT Length : 20, Performance : 27.14 ms/loop ( 0.61x )
Doc Length : 100, OT Length : 50, Performance : 54.14 ms/loop ( 0.72x )
Doc Length : 100, OT Length : 100, Performance : 84.28 ms/loop ( 0.88x )
Doc Length : 1000, OT Length : 5, Performance : 4.23 ms/loop ( 1.11x )
Doc Length : 1000, OT Length : 10, Performance : 8.74 ms/loop ( 0.85x )
Doc Length : 1000, OT Length : 20, Performance : 18.65 ms/loop ( 0.78x )
Doc Length : 1000, OT Length : 50, Performance : 37.61 ms/loop ( 1.34x )
Doc Length : 1000, OT Length : 100, Performance : 82.60 ms/loop ( 1.01x )
Doc Length : 10000, OT Length : 5, Performance : 8.86 ms/loop ( 0.84x )
Doc Length : 10000, OT Length : 10, Performance : 13.19 ms/loop ( 0.99x )
Doc Length : 10000, OT Length : 20, Performance : 20.43 ms/loop ( 0.91x )
Doc Length : 10000, OT Length : 50, Performance : 48.91 ms/loop ( 0.84x )
Doc Length : 10000, OT Length : 100, Performance : 102.81 ms/loop ( 0.95x )
=== cython ===
Doc Length : 100, OT Length : 5, Performance : 0.77 ms/loop ( 7.55x )
Doc Length : 100, OT Length : 10, Performance : 1.36 ms/loop ( 6.53x )
Doc Length : 100, OT Length : 20, Performance : 2.34 ms/loop ( 7.04x )
Doc Length : 100, OT Length : 50, Performance : 4.74 ms/loop ( 8.25x )
Doc Length : 100, OT Length : 100, Performance : 9.73 ms/loop ( 7.59x )
Doc Length : 1000, OT Length : 5, Performance : 0.70 ms/loop ( 6.75x )
Doc Length : 1000, OT Length : 10, Performance : 1.61 ms/loop ( 4.64x )
Doc Length : 1000, OT Length : 20, Performance : 2.47 ms/loop ( 5.88x )
Doc Length : 1000, OT Length : 50, Performance : 5.52 ms/loop ( 9.10x )
Doc Length : 1000, OT Length : 100, Performance : 9.02 ms/loop ( 9.26x )
Doc Length : 10000, OT Length : 5, Performance : 2.20 ms/loop ( 3.38x )
Doc Length : 10000, OT Length : 10, Performance : 2.57 ms/loop ( 5.07x )
Doc Length : 10000, OT Length : 20, Performance : 2.95 ms/loop ( 6.33x )
Doc Length : 10000, OT Length : 50, Performance : 5.97 ms/loop ( 6.92x )
Doc Length : 10000, OT Length : 100, Performance : 10.92 ms/loop ( 8.94x )
inverse_apply function
=== baseline (extra.old_ottype) ===
Doc Length : 100, OT Length : 5, Performance : 8.26 ms/loop ( 1.00x )
Doc Length : 100, OT Length : 10, Performance : 15.00 ms/loop ( 1.00x )
Doc Length : 100, OT Length : 20, Performance : 27.50 ms/loop ( 1.00x )
Doc Length : 100, OT Length : 50, Performance : 56.86 ms/loop ( 1.00x )
Doc Length : 100, OT Length : 100, Performance : 129.10 ms/loop ( 1.00x )
Doc Length : 1000, OT Length : 5, Performance : 6.35 ms/loop ( 1.00x )
Doc Length : 1000, OT Length : 10, Performance : 16.14 ms/loop ( 1.00x )
Doc Length : 1000, OT Length : 20, Performance : 22.65 ms/loop ( 1.00x )
Doc Length : 1000, OT Length : 50, Performance : 67.26 ms/loop ( 1.00x )
Doc Length : 1000, OT Length : 100, Performance : 113.74 ms/loop ( 1.00x )
Doc Length : 10000, OT Length : 5, Performance : 8.79 ms/loop ( 1.00x )
Doc Length : 10000, OT Length : 10, Performance : 10.16 ms/loop ( 1.00x )
Doc Length : 10000, OT Length : 20, Performance : 22.46 ms/loop ( 1.00x )
Doc Length : 10000, OT Length : 50, Performance : 54.26 ms/loop ( 1.00x )
Doc Length : 10000, OT Length : 100, Performance : 129.20 ms/loop ( 1.00x )
=== python ===
Doc Length : 100, OT Length : 5, Performance : 6.05 ms/loop ( 1.37x )
Doc Length : 100, OT Length : 10, Performance : 11.30 ms/loop ( 1.33x )
Doc Length : 100, OT Length : 20, Performance : 21.16 ms/loop ( 1.30x )
Doc Length : 100, OT Length : 50, Performance : 44.43 ms/loop ( 1.28x )
Doc Length : 100, OT Length : 100, Performance : 101.05 ms/loop ( 1.28x )
Doc Length : 1000, OT Length : 5, Performance : 10.06 ms/loop ( 0.63x )
Doc Length : 1000, OT Length : 10, Performance : 12.38 ms/loop ( 1.30x )
Doc Length : 1000, OT Length : 20, Performance : 24.55 ms/loop ( 0.92x )
Doc Length : 1000, OT Length : 50, Performance : 42.64 ms/loop ( 1.58x )
Doc Length : 1000, OT Length : 100, Performance : 96.43 ms/loop ( 1.18x )
Doc Length : 10000, OT Length : 5, Performance : 9.42 ms/loop ( 0.93x )
Doc Length : 10000, OT Length : 10, Performance : 11.89 ms/loop ( 0.85x )
Doc Length : 10000, OT Length : 20, Performance : 25.74 ms/loop ( 0.87x )
Doc Length : 10000, OT Length : 50, Performance : 58.58 ms/loop ( 0.93x )
Doc Length : 10000, OT Length : 100, Performance : 97.37 ms/loop ( 1.33x )
=== cython ===
Doc Length : 100, OT Length : 5, Performance : 1.12 ms/loop ( 7.37x )
Doc Length : 100, OT Length : 10, Performance : 1.69 ms/loop ( 8.90x )
Doc Length : 100, OT Length : 20, Performance : 2.80 ms/loop ( 9.82x )
Doc Length : 100, OT Length : 50, Performance : 5.49 ms/loop ( 10.35x )
Doc Length : 100, OT Length : 100, Performance : 12.22 ms/loop ( 10.56x )
Doc Length : 1000, OT Length : 5, Performance : 1.36 ms/loop ( 4.68x )
Doc Length : 1000, OT Length : 10, Performance : 2.25 ms/loop ( 7.19x )
Doc Length : 1000, OT Length : 20, Performance : 2.98 ms/loop ( 7.61x )
Doc Length : 1000, OT Length : 50, Performance : 6.26 ms/loop ( 10.75x )
Doc Length : 1000, OT Length : 100, Performance : 11.14 ms/loop ( 10.21x )
Doc Length : 10000, OT Length : 5, Performance : 2.35 ms/loop ( 3.75x )
Doc Length : 10000, OT Length : 10, Performance : 3.62 ms/loop ( 2.81x )
Doc Length : 10000, OT Length : 20, Performance : 4.50 ms/loop ( 4.99x )
Doc Length : 10000, OT Length : 50, Performance : 7.07 ms/loop ( 7.68x )
Doc Length : 10000, OT Length : 100, Performance : 14.20 ms/loop ( 9.10x )