Reverse engineer patterns for use with the SpaCy DependencyTreeMatcher

Detailed description of the spacy-pattern-builder Python project


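SpaCy Pattern Builder

Use training examples to build and refine patterns for use with SpaCy's DependencyMatcher.

Motivation

Generating patterns programmatically from training data is more efficient than creating them by hand.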

Installation

With pip:

pip install spacy-pattern-builder

Usage

# Import a SpaCy model, parse a string to create a Doc object
import en_core_web_sm

text = 'We introduce efficient methods for fitting Boolean models to molecular data.'
nlp = en_core_web_sm.load()
doc = nlp(text)

from spacy_pattern_builder import build_dependency_pattern

# Provide a list of tokens we want to match.
match_tokens = [doc[i] for i in [0, 1, 3]]  # [We, introduce, methods]

# Note that these tokens must be fully connected. That is,
# all tokens must have a path to all other tokens in the list,
# without needing to traverse tokens outside of the list.
# Otherwise, spacy-pattern-builder will raise a TokensNotFullyConnectedError.
# You can get a connected set that includes your tokens with the following:
from spacy_pattern_builder import util

connected_tokens = util.smallest_connected_subgraph(match_tokens, doc)
# In this case, the tokens we provided are already fully connected
assert match_tokens == connected_tokens

# Specify the token attributes / features to use
feature_dict = {  # This is equal to the default feature_dict
    'DEP': 'dep_',
    'TAG': 'tag_'
}

# Build the pattern
pattern = build_dependency_pattern(doc, match_tokens, feature_dict=feature_dict)

from pprint import pprint

pprint(pattern)  # In the format consumed by SpaCy's DependencyMatcher:
# [{'PATTERN': {'DEP': 'ROOT', 'TAG': 'VBP'}, 'SPEC': {'NODE_NAME': 'node1'}},
#  {'PATTERN': {'DEP': 'nsubj', 'TAG': 'PRP'},
#   'SPEC': {'NBOR_NAME': 'node1', 'NBOR_RELOP': '>', 'NODE_NAME': 'node0'}},
#  {'PATTERN': {'DEP': 'dobj', 'TAG': 'NNS'},
#   'SPEC': {'NBOR_NAME': 'node1', 'NBOR_RELOP': '>', 'NODE_NAME': 'node3'}}]

# Create a matcher and add the newly generated pattern
from spacy.matcher import DependencyMatcher

matcher = DependencyMatcher(doc.vocab)
matcher.add('pattern', None, pattern)

# And get matches
matches = matcher(doc)
for match_id, token_idxs in matches:
    tokens = [doc[i] for i in token_idxs]
    tokens = sorted(tokens, key=lambda w: w.i)  # Make sure tokens are in their original order
    print(tokens)  # [We, introduce, methods]
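
The "fully connected" requirement noted in the usage example can be illustrated without SpaCy. The following is a minimal sketch, not the library's implementation: the function name is_fully_connected and the heads list are hypothetical, introduced only for illustration. The head indices are hand-written for the first four tokens of the example sentence; the attachments of "We" and "methods" to "introduce" follow the pattern output shown above, while attaching "efficient" to "methods" is an assumption.

```python
def is_fully_connected(token_idxs, heads):
    """Return True if the selected tokens form a single connected piece of
    the dependency tree using only edges between selected tokens."""
    selected = set(token_idxs)
    if not selected:
        return True

    # Build undirected adjacency restricted to the selected tokens.
    neighbours = {i: set() for i in selected}
    for i in selected:
        head = heads[i]
        # Assumed convention: the root token is its own head.
        if head != i and head in selected:
            neighbours[i].add(head)
            neighbours[head].add(i)

    # Depth-first search from an arbitrary selected token.
    start = next(iter(selected))
    seen = {start}
    stack = [start]
    while stack:
        current = stack.pop()
        for nxt in neighbours[current]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)

    return seen == selected


# Hypothetical head indices for: We(0) introduce(1) efficient(2) methods(3)
heads = [1, 1, 3, 1]

print(is_fully_connected([0, 1, 3], heads))  # True: We - introduce - methods
print(is_fully_connected([0, 2], heads))     # False: no edge joins We and efficient
```

In the second call the two tokens are only linked through "introduce" and "methods", which lie outside the list. That is the situation in which the usage example says spacy-pattern-builder raises a TokensNotFullyConnectedError.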

Acknowledgements

Uses:
