A fast and lightweight Python RDF parser that wraps bindings to Rust's Rio using PyO3.

LightRDF

Features

  • Supports N-Triples, Turtle, and RDF/XML
  • Handles large RDF documents
  • Provides an HDT-like interface

Installation

pip install lightrdf

Usage

Iterate over all triples (Parser)

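import lightrdf

parser = lightrdf.Parser()  # ...or lightrdf.xml.Parser() for xml

# Assumes parse() accepts a file path, as RDFDocument does
for triple in parser.parse("./go.owl", base_iri=None):
    print(triple)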

Iterate over all triples (HDT-like)

import lightrdf

doc = lightrdf.RDFDocument("./go.owl")
# ...or lightrdf.RDFDocument("./go.owl", base_iri="", parser=lightrdf.xml.PatternParser) for xml

# `None` matches arbitrary term
for triple in doc.search_triples(None, None, None):
    print(triple)
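The `None` wildcard can be pictured with a small pure-Python sketch. This is not LightRDF's implementation of `search_triples`. The `matches` and `search` helpers and the sample triples are hypothetical, and are there only to illustrate the matching rule (a `None` slot accepts any term, any other slot must be equal).

```python
from typing import Iterable, Iterator, Optional, Tuple

Triple = Tuple[str, str, str]
Pattern = Tuple[Optional[str], Optional[str], Optional[str]]


def matches(triple: Triple, pattern: Pattern) -> bool:
    """Return True if every non-None pattern slot equals the triple's term."""
    return all(p is None or p == t for t, p in zip(triple, pattern))


def search(triples: Iterable[Triple], s=None, p=None, o=None) -> Iterator[Triple]:
    """Yield the triples matching the (s, p, o) pattern; None is a wildcard."""
    pattern = (s, p, o)
    return (t for t in triples if matches(t, pattern))


# Hypothetical sample data, not taken from go.owl
sample = [
    ("http://example.org/a", "http://example.org/knows", "http://example.org/b"),
    ("http://example.org/a", "http://example.org/name", '"Alice"'),
    ("http://example.org/b", "http://example.org/name", '"Bob"'),
]

all_triples = list(search(sample))                              # no constraint
about_a = list(search(sample, s="http://example.org/a"))        # fixed subject
names = list(search(sample, p="http://example.org/name"))       # fixed predicate
```

Calling `search(sample)` with no arguments corresponds to the all-`None` pattern and returns every triple.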

Search triples with a triple pattern (HDT-like)

import lightrdf

doc = lightrdf.RDFDocument("./go.owl")

for triple in doc.search_triples("http://purl.obolibrary.org/obo/GO_0005840", None, None):
    print(triple)

# Output:
# ('http://purl.obolibrary.org/obo/GO_0005840', 'http://www.w3.org/1999/02/22-rdf-syntax-ns#type', 'http://www.w3.org/2002/07/owl#Class')
# ('http://purl.obolibrary.org/obo/GO_0005840', 'http://www.w3.org/2000/01/rdf-schema#subClassOf', 'http://purl.obolibrary.org/obo/GO_0043232')
# ...
# ('http://purl.obolibrary.org/obo/GO_0005840', 'http://www.geneontology.org/formats/oboInOwl#inSubset', 'http://purl.obolibrary.org/obo/go#goslim_yeast')
# ('http://purl.obolibrary.org/obo/GO_0005840', 'http://www.w3.org/2000/01/rdf-schema#label', '"ribosome"^^<http://www.w3.org/2001/XMLSchema#string>')
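In the output above, IRIs are returned as bare strings while the typed literal keeps its N-Triples-style form, `"ribosome"^^<...>`. A minimal sketch for splitting such a term into its lexical form and datatype IRI follows. It assumes terms look exactly as in that output; `split_typed_literal` is a hypothetical helper, not part of LightRDF, and it does not unescape the lexical form or handle language-tagged literals.

```python
import re
from typing import Optional, Tuple

# Matches terms of the form "lexical"^^<datatype-iri>
_TYPED_LITERAL = re.compile(
    r'^"(?P<lexical>.*)"\^\^<(?P<datatype>[^<>]+)>$', re.DOTALL
)


def split_typed_literal(term: str) -> Optional[Tuple[str, str]]:
    """Return (lexical form, datatype IRI), or None if term is not a typed literal."""
    m = _TYPED_LITERAL.match(term)
    if m is None:
        return None
    return m.group("lexical"), m.group("datatype")


label = '"ribosome"^^<http://www.w3.org/2001/XMLSchema#string>'
iri = "http://purl.obolibrary.org/obo/GO_0043232"

label_parts = split_typed_literal(label)  # typed literal: split into two parts
iri_parts = split_typed_literal(iri)      # plain IRI: not a typed literal
```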

Tip: Open a file with Python (Parser)

import lightrdf

parser = lightrdf.Parser()

with open("./go.owl", "rb") as f:
    for triple in parser.parse(f, format="owl", base_iri=None):
        print(triple)

# Or, with the XML-specific parser (no format argument needed)
import lightrdf

parser = lightrdf.xml.Parser()

with open("./go.owl", "rb") as f:
    for triple in parser.parse(f, base_iri=None):
        print(triple)

Tip: Open a file with Python (HDT-like)

import lightrdf

with open("./go.owl", "rb") as f:
    doc = lightrdf.RDFDocument(f, parser=lightrdf.xml.PatternParser)

    for triple in doc.search_triples("http://purl.obolibrary.org/obo/GO_0005840", None, None):
        print(triple)

Tip: Parse from a string

import io
import lightrdf

data = """<http://one.example/subject1> <http://one.example/predicate1> <http://one.example/object1> . # comments here
# or on a line by themselves
_:subject1 <http://an.example/predicate1> "object1" .
_:subject2 <http://an.example/predicate2> "object2" .
"""

doc = lightrdf.RDFDocument(io.BytesIO(data.encode()), parser=lightrdf.turtle.PatternParser)

for triple in doc.search_triples("http://one.example/subject1", None, None):
    print(triple)

Benchmark (WIP)

On MacBook Air (13-inch, 2017), 1.8 GHz Intel Core i5, 8 GB 1600 MHz DDR3

https://gist.github.com/ozekik/b2ae3be0fcaa59670d4dd4759cdffbed

$ wget -q http://purl.obolibrary.org/obo/go.owl
$ gtime python3 count_triples_rdflib_graph.py ./go.owl  # RDFLib 4.2.2
1436427
235.29user 2.30system 3:59.56elapsed 99%CPU (0avgtext+0avgdata 1055816maxresident)k
0inputs+0outputs (283major+347896minor)pagefaults 0swaps
$ gtime python3 count_triples_lightrdf_rdfdocument.py ./go.owl  # LightRDF 0.1.1
1436427
7.90user 0.22system 0:08.27elapsed 98%CPU (0avgtext+0avgdata 163760maxresident)k
0inputs+0outputs (106major+41389minor)pagefaults 0swaps
$ gtime python3 count_triples_lightrdf_parser.py ./go.owl  # LightRDF 0.1.1
1436427
8.00user 0.24system 0:08.47elapsed 97%CPU (0avgtext+0avgdata 163748maxresident)k
0inputs+0outputs (106major+41388minor)pagefaults 0swaps

https://gist.github.com/ozekik/636a8fb521401070e02e010ce591fa92

$ wget -q http://downloads.dbpedia.org/2016-10/dbpedia_2016-10.nt
$ gtime python3 count_triples_rdflib_ntparser.py dbpedia_2016-10.nt  # RDFLib 4.2.2
31050
1.63user 0.23system 0:02.47elapsed 75%CPU (0avgtext+0avgdata 26568maxresident)k
0inputs+0outputs (1140major+6118minor)pagefaults 0swaps
$ gtime python3 count_triples_lightrdf_ntparser.py dbpedia_2016-10.nt  # LightRDF 0.1.1
31050
0.21user 0.04system 0:00.36elapsed 71%CPU (0avgtext+0avgdata 7628maxresident)k
0inputs+0outputs (534major+1925minor)pagefaults 0swaps
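The elapsed times and peak memory figures reported above can be turned into ratios with a few lines of Python. This is a rough sketch that uses only the numbers from the gtime output; `elapsed_to_seconds` and `ratios` are illustrative helper names. The figures come from single runs on one machine, so the ratios are indicative only.

```python
def elapsed_to_seconds(elapsed: str) -> float:
    """Convert GNU time's [h:]m:ss.ss elapsed format to seconds."""
    seconds = 0.0
    for part in elapsed.split(":"):
        seconds = seconds * 60 + float(part)
    return seconds


# (elapsed wall-clock time, max resident set size in KB), copied from the output above
RUNS = {
    "go.owl": {
        "rdflib": ("3:59.56", 1055816),
        "lightrdf": ("0:08.27", 163760),  # RDFDocument run
    },
    "dbpedia_2016-10.nt": {
        "rdflib": ("0:02.47", 26568),
        "lightrdf": ("0:00.36", 7628),
    },
}


def ratios(dataset: str):
    """Return (speedup, peak-memory ratio) of LightRDF relative to RDFLib."""
    base_time, base_mem = RUNS[dataset]["rdflib"]
    new_time, new_mem = RUNS[dataset]["lightrdf"]
    speedup = elapsed_to_seconds(base_time) / elapsed_to_seconds(new_time)
    return speedup, base_mem / new_mem


for dataset in RUNS:
    speedup, memory_ratio = ratios(dataset)
    print(f"{dataset}: {speedup:.1f}x faster, {memory_ratio:.1f}x less peak memory")
```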

Alternatives

  • RDFLib: (+) pure Python, mature, feature-rich / (-) takes some time to load triples
  • pyHDT: (+) extremely fast and efficient / (-) requires pre-conversion into HDT

Todo

  • [x] Push to PyPI
  • [x] Adopt CI
  • [x] Handle Base IRI
  • [x] Add basic tests
  • [ ] Support NQuads and TriG
  • [ ] Add docs
  • [ ] Add tests for w3c/rdf-tests
  • [ ] Resume on error
  • [x] Allow opening fp

License

Rio and PyO3 are licensed under the Apache-2.0 license.

Copyright 2020 Kentaro Ozeki

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
