regex4dummies Python package - PyPI

An NLP library that simplifies pattern finding in strings

regex4dummies project description

Simple pattern finding and natural language processing in strings. See the regex4dummies website at https://darkmattervale.github.io/regex4dummies/

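Features

Automatic pattern detection (semantic and literal)
Multiple parsers (implementations of nltk, pattern, and nlpnet)
Keyword searching to find specific phrases in text
Topic analysis and important information extraction
Tokenizer and sentence dependency identifier
Phrase extraction (noun, verb, prepositional)
String comparison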

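Roadmap

Some features I plan to implement in the future:

Machine learning. This will allow the parser to learn multiple grammar "styles" and successfully parse a wider range of strings
Additional parsers

If you have a feature request, feel free to add an issue to the GitHub issue tracker. All contributions and requests are appreciated!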

Usage

regex4dummies is very easy to use. Simply import the library, get some strings, and compare them!

    from regex4dummies import regex4dummies
    from regex4dummies import Toolkit

    # Creating strings
    strings = ["This is the first test string.", "This is the second test string."]
    regex = regex4dummies()

    # Identifying literal patterns in strings
    print(regex.compare_strings(parser='default', pattern_detection="literal", text=strings))

    # Identifying semantic patterns in strings using the nltk parser
    print(regex.compare_strings(parser='nltk', pattern_detection="semantic", text=strings))

That is regex4dummies in its simplest form. It also offers other features, including:

    # Display the version of regex4dummies you are using
    print(regex.__version__)

    # To use the other parsers, replace the above line of code with either of the following:
    # print(regex.compare_strings(parser='pattern', pattern_detection="semantic", text=strings))
    # print(regex.compare_strings(parser='nlpnet', pattern_detection="semantic", text=strings))

    # To call all of the parsers, replace the above line of code with the following:
    # print(regex.compare_strings(parser='', pattern_detection="semantic", text=strings))

    # To get the topics of the strings, call the get_topics function
    print(regex.get_topics(text=strings))

    # Printing pattern information
    pattern_information = regex.get_sentence_information()
    for objects in pattern_information:
        print("[ Pattern ]             : " + objects.pattern)
        print("[ Subject ]             : " + objects.subject)
        print("[ Verb ]                : " + objects.verb)
        print("[ Object ]              : " + objects.object[0])
        print("[ Prep Phrases ]        : " + str(objects.prepositional_phrases))
        print("[ Reliability Score ]   : " + str(objects.reliability_score))
        print("[ Applicability Score ] : " + str(objects.applicability_score))
        print("")

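A newly released set of features includes tokenizer and dependency-finder functions. To use them, pass a string as the first argument and the name of the parser to use as the second argument to both functions.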

    # Testing the toolkit functions
    tool_tester = Toolkit()

    # Testing the tokenizer functions
    print(tool_tester.tokenize(text="This is a test string.", parser=""))

    # Testing the dependency functions
    print(tool_tester.find_dependencies(text="This is a test string.", parser="pattern"))
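
To give a rough feel for what a tokenizer returns, here is a minimal sketch that uses only the Python standard library. It is not how regex4dummies or any of its parsers tokenize text; simple_tokenize is a hypothetical helper, introduced only for illustration, that separates words from punctuation with a regular expression.

```python
import re


def simple_tokenize(text):
    """Split text into word and punctuation tokens (hypothetical helper)."""
    # \w+ matches a run of letters, digits, or underscores.
    # [^\w\s] matches a single character that is neither of those nor whitespace.
    return re.findall(r"\w+|[^\w\s]", text)


tokens = simple_tokenize("This is a test string.")
print(tokens)  # ['This', 'is', 'a', 'test', 'string', '.']
```

Real parsers such as nltk also deal with contractions, abbreviations, and similar cases that this sketch ignores.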

Other included features are shown below.

    # Testing the information extraction functions
    regex.extract_important_information(text=["This is a test string."])

    # Testing the ability to extract phrases
    print("Noun Phrases: " + str(tool_tester.extract_noun_phrases(text="This is a test string.")))
    print("Verb Phrases(Pattern): " + str(tool_tester.extract_verb_phrases(text="This is a test string.", parser="pattern")))
    print("Verb Phrases(Nlpnet): " + str(tool_tester.extract_verb_phrases(text="This is a test string.", parser="nlpnet")))
    print("Prepositional Phrases: " + str(tool_tester.extract_prepositional_phrases(text="This is a test string in the house.")))
    print("String comparison: " + str(tool_tester.compare_strings(String1="This is a test string.", String2="This is a test string.")))
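
As a point of comparison for the string comparison and literal pattern features, the sketch below shows one way to score similarity and pull out shared literal fragments with the standard library's difflib module. This is an illustration under assumptions, not the algorithm regex4dummies uses; similarity, shared_fragments, and the min_length parameter are hypothetical names chosen for this example.

```python
from difflib import SequenceMatcher


def similarity(first, second):
    """Return a similarity ratio between 0.0 and 1.0 (hypothetical helper)."""
    return SequenceMatcher(None, first, second).ratio()


def shared_fragments(first, second, min_length=4):
    """Return literal substrings of at least min_length characters
    that appear in both strings, in left-to-right order."""
    matcher = SequenceMatcher(None, first, second, autojunk=False)
    return [
        first[block.a:block.a + block.size]
        for block in matcher.get_matching_blocks()
        if block.size >= min_length
    ]


a = "This is the first test string."
b = "This is the second test string."

print(similarity(a, a))        # identical strings score 1.0
print(similarity(a, b))        # similar but not identical strings score below 1.0
print(shared_fragments(a, b))  # the literal text the two strings have in common
```

A character-level comparison like this only finds literal overlap; semantic pattern detection needs a parser, which is what the nltk, pattern, and nlpnet options provide.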

Installation

To install this library, use pip.

$ pip install regex4dummies

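In addition to the library, wget is a command-line tool required to use the nlpnet parser. If you do not have wget or cannot get it, follow the instructions below to obtain the nlpnet parser's functionality.

Instructions for installing the dependencies required by nlpnet:

Download the nlpnet_dependency file from the latest release on GitHub (please note: when uncompressed, this file is over 350 MB).
Place this directory in the same directory where the nltk data is stored (if you do not have the nltk data installed, just run the library and go through the GUI download process)

That's it! The nlpnet parser should now be usable.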
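
To decide whether the manual steps are needed at all, it can help to check first whether wget is available. The following is a minimal sketch, assuming Python 3.3 or later for shutil.which; has_command is a hypothetical helper and is not part of regex4dummies.

```python
import shutil


def has_command(name):
    """Return True if an executable called `name` is found on PATH."""
    return shutil.which(name) is not None


if has_command("wget"):
    print("wget found: the automatic nlpnet dependency download should be possible.")
else:
    print("wget not found: follow the manual download steps instead.")
```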

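Patch notes

v1.4.6: Code refactoring, download system redone

Brought the code up to PEP8 standards
Redid the download system. It no longer uses a non-functional GUI; it uses an automatic installation instead. In addition, the size of the dependencies has been reduced from about 1.5 GB to about 700 MB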

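Contributing

Contributors are welcome and very much needed! regex4dummies is still under heavy development and needs all the help it can get. If you have any feature ideas, create an issue on the GitHub repository (https://github.com/darkmattervale/regex4dummies/issues), or fork the repository and create your addition.

Any help you can offer is greatly appreciated. The more help we get, the better regex4dummies will become. Thank you for contributing!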

License

See license.txt for information on the MIT license

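Citations

nlpnet:

Fonseca, E. R. and Rosa, J. L. G. Mac-Morpho Revisited: Towards Robust Part-of-Speech Tagging. Proceedings of the 9th Brazilian Symposium in Information and Human Language Technology, 2013. pp. 98-107.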
