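Python demoji package - PyPI

Accurately remove and replace emojis in text strings.

Detailed description of the demoji Python project

demoji

Accurately find or remove emojis from a blob of text.

Basic Usage

demoji requires an initial data download from the Unicode Consortium's emoji code repository.

The first time you use the package, call download_codes():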

>>> import demoji
>>> demoji.download_codes()
Downloading emoji data ...
... OK (Got response in 0.14 seconds)
Writing emoji data to /Users/brad/.demoji/codes.json ...
... OK

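This stores the Unicode hex notations at ~/.demoji/codes.json for future use.

demoji exports two text-related functions, findall() and replace(), which behave like the re module's findall() and sub(), respectively. Unlike re.findall(), however, demoji's findall() returns a dict that maps each emoji found to its full name (description):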

>>> tweet = """\
... #startspreadingthenews yankees win great start by 🎅🏾 going 5strong innings with 5k’s🔥 🐂
... solo homerun 🌋🌋 with 2 solo homeruns and👹 3run homerun… 🤡 🚣🏼 👨🏽‍⚖️ with rbi’s … 🔥🔥
... 🇲🇽 and 🇳🇮 to close the game🔥🔥!!!….
... WHAT A GAME!!..
... """
>>> demoji.findall(tweet)
{
    "🔥": "fire",
    "🌋": "volcano",
    "👨🏽\u200d⚖️": "man judge: medium skin tone",
    "🎅🏾": "Santa Claus: medium-dark skin tone",
    "🇲🇽": "flag: Mexico",
    "👹": "ogre",
    "🤡": "clown face",
    "🇳🇮": "flag: Nicaragua",
    "🚣🏼": "person rowing boat: medium-light skin tone",
    "🐂": "ox",
}

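demoji requires a download rather than shipping with the Unicode emoji data because the emoji list itself is updated and changed frequently. You can refresh the local cache periodically by calling demoji.download_codes() again.

To get the date of the last download, use the last_downloaded_timestamp() helper: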

>>> demoji.last_downloaded_timestamp()
datetime.datetime(2019, 2, 9, 7, 42, 24, 433776, tzinfo=<demoji.UTC object at 0x101b9ecf8>)

If the codes have never been downloaded, the result is None.
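One way to act on that timestamp is a small staleness check. The sketch below is not part of demoji: the helper name needs_refresh and the 30-day default are assumptions chosen for illustration. It relies only on the behavior described above, namely that the value is either None or a timezone-aware datetime.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def needs_refresh(
    last_downloaded: Optional[datetime],
    max_age: timedelta = timedelta(days=30),  # assumed threshold, not a demoji setting
    now: Optional[datetime] = None,
) -> bool:
    """Return True when the cache is missing or older than max_age.

    Hypothetical helper: pass it the value returned by
    demoji.last_downloaded_timestamp() (None or an aware datetime).
    """
    if last_downloaded is None:
        # Nothing has been downloaded yet.
        return True
    if now is None:
        now = datetime.now(timezone.utc)
    return now - last_downloaded > max_age


# Example with the timestamp shown in the transcript above.
stamp = datetime(2019, 2, 9, 7, 42, 24, 433776, tzinfo=timezone.utc)
print(needs_refresh(None))
print(needs_refresh(stamp, now=stamp + timedelta(days=1)))
print(needs_refresh(stamp, now=stamp + timedelta(days=31)))
```

In real use, the idea would be to call demoji.download_codes() whenever needs_refresh(demoji.last_downloaded_timestamp()) is true.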

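Footnote: Emoji Sequences

Many emojis that look like a single Unicode character are actually multi-character sequences. Examples:

The keycap 2 is actually 3 characters: U+0032 (the ASCII digit 2), U+FE0F (variation selector), and U+20E3 (combining enclosing keycap).
The flag of Scotland has 7 component characters, b'\\U0001f3f4\\U000e0067\\U000e0062\\U000e0073\\U000e0063\\U000e0074\\U000e007f' in fully escaped notation.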

(You can see any of these with s.encode("unicode-escape").)
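The two examples can be checked with the standard library alone. In this sketch the variable names and the code_points helper are illustrative assumptions; the code points themselves are the ones listed above.

```python
# Build both sequences from the code points listed above.
keycap_2 = "\u0032\ufe0f\u20e3"
scotland = "\U0001f3f4\U000e0067\U000e0062\U000e0073\U000e0063\U000e0074\U000e007f"


def code_points(s):
    """List the code points of s in U+XXXX notation (hypothetical helper)."""
    return [f"U+{ord(ch):04X}" for ch in s]


# len() counts code points, not what the reader sees as one emoji.
print(len(keycap_2), code_points(keycap_2))
print(keycap_2.encode("unicode-escape"))

print(len(scotland), code_points(scotland))
print(scotland.encode("unicode-escape"))
```

Each string renders as one glyph in an emoji-capable font, yet len() reports 3 and 7.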

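demoji is careful to handle this: it should find the full sequences rather than their incomplete subcomponents.

It does this by sorting the emoji codes by length and then compiling a single concatenated regular expression that greedily searches for longer emojis first, falling back to shorter ones if none is found. This is by no means a super-optimized way of searching, since it has O(n²) properties, but the focus is on accuracy and completeness.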

>>> from pprint import pprint
>>> seq = """\
... I bet you didn't know that 🙋, 🙋‍♂️, and 🙋‍♀️ are three different emojis.
... """
>>> pprint(seq.encode('unicode-escape'))  # Python 3
(b"I bet you didn't know that \\U0001f64b, \\U0001f64b\\u200d\\u2642\\ufe0f,"
 b' and \\U0001f64b\\u200d\\u2640\\ufe0f are three different emojis.\\n')
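The longest-first strategy described in this footnote can be sketched with the re module. This is an illustration of the idea, not demoji's actual source: TOY_CODES is a hypothetical three-entry table standing in for the downloaded data, and toy_findall and toy_replace are made-up names.

```python
import re

# Hypothetical stand-in for the downloaded emoji table (assumption).
TOY_CODES = {
    "\U0001f64b": "person raising hand",
    "\U0001f64b\u200d\u2642\ufe0f": "man raising hand",
    "\U0001f64b\u200d\u2640\ufe0f": "woman raising hand",
}


def compile_longest_first(codes):
    """Join the codes into one alternation, longest sequence first."""
    ordered = sorted(codes, key=len, reverse=True)
    return re.compile("|".join(re.escape(code) for code in ordered))


PATTERN = compile_longest_first(TOY_CODES)


def toy_findall(text):
    """Map each emoji sequence found in text to its name."""
    return {match: TOY_CODES[match] for match in PATTERN.findall(text)}


def toy_replace(text, repl=""):
    """Replace every emoji sequence in text with repl."""
    return PATTERN.sub(repl, text)


seq = "that \U0001f64b, \U0001f64b\u200d\u2642\ufe0f, and \U0001f64b\u200d\u2640\ufe0f."
found = toy_findall(seq)
print(sorted(found.values()))
print(toy_replace(seq))

# For contrast: shortest-first ordering only ever matches the base emoji
# and leaves the rest of each longer sequence behind.
naive = re.compile("|".join(re.escape(c) for c in sorted(TOY_CODES, key=len)))
print(len(set(naive.findall(seq))))
```

Ordering matters because a regular-expression alternation takes the first alternative that matches at a given position, so the full sequences have to come before their prefixes.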


demoji 0.1.5
