Converting a Python list of dicts to RDF for Apache Jena Fuseki

Posted 2024-05-16 16:08:36

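To store some data in Apache Jena from Python, I would like a generic conversion from a list of dicts to RDF, and possibly back again via a query.

For the list-of-dicts-to-RDF direction I implemented "insertListOfDicts" (see below) and tested it with "testListOfDictInsert" (see below). The generated update is shown below; sending it to an Apache Jena Fuseki server fails with 400: Bad Request.

What needs to be fixed to make this work for simple string types, and for the other basic Python types?

The source code can be found at: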

@prefix foaf: <http://xmlns.com/foaf/0.1/>
INSERT DATA {
foaf:Person/Elizabeth+Alexandra+Mary+Windsor foaf:Person#name "Elizabeth Alexandra Mary Windsor".
foaf:Person/Elizabeth+Alexandra+Mary+Windsor foaf:Person#born "1926-04-21".
foaf:Person/Elizabeth+Alexandra+Mary+Windsor foaf:Person#wikidataurl "https://www.wikidata.org/wiki/Q9682".
foaf:Person/George+of+Cambridge foaf:Person#name "George of Cambridge".
foaf:Person/George+of+Cambridge foaf:Person#born "2013-07-22".
foaf:Person/George+of+Cambridge foaf:Person#wikidataurl "https://www.wikidata.org/wiki/Q1359041".
foaf:Person/Harry+Duke+of+Sussex foaf:Person#name "Harry Duke of Sussex".
foaf:Person/Harry+Duke+of+Sussex foaf:Person#born "1984-09-15".
foaf:Person/Harry+Duke+of+Sussex foaf:Person#wikidataurl "https://www.wikidata.org/wiki/Q152316".

}

testListOfDictInsert

def testListOfDictInsert(self):
        '''
        test inserting a list of Dicts using FOAF example
        https://en.wikipedia.org/wiki/FOAF_(ontology)
        '''
        listofDicts=[
            {'name': 'Elizabeth Alexandra Mary Windsor', 'born': '1926-04-21', 'age': 94, 'ofAge': True , 'wikidataurl': 'https://www.wikidata.org/wiki/Q9682' },
            {'name': 'George of Cambridge',              'born': '2013-07-22', 'age':  7, 'ofAge': False, 'wikidataurl': 'https://www.wikidata.org/wiki/Q1359041'},
            {'name': 'Harry Duke of Sussex',             'born': '1984-09-15', 'age': 36, 'ofAge': True , 'wikidataurl': 'https://www.wikidata.org/wiki/Q152316'}
        ]
        jena=self.getJena(mode='update',debug=True)
        jena.insertListOfDicts(listofDicts,'foaf:Person','name','@prefix foaf: <http://xmlns.com/foaf/0.1/>')

insertListOfDicts

def insertListOfDicts(self,listOfDicts,entityType,primaryKey,prefixes):
        '''
        insert the given list of dicts mapping datatypes according to
        https://www.w3.org/TR/xmlschema-2/#built-in-datatypes
        
        mapped from 
        https://docs.python.org/3/library/stdtypes.html
        
        compare to
        https://www.w3.org/2001/sw/rdb2rdf/directGraph/
        http://www.bobdc.com/blog/json2rdf/
        https://www.w3.org/TR/json-ld11-api/#data-round-tripping
        https://stackoverflow.com/questions/29030231/json-to-rdf-xml-file-in-python
        '''
        errors=[]
        insertCommand='%s\nINSERT DATA {\n' % prefixes
        for index,record in enumerate(listOfDicts):
            if not primaryKey in record:
                errors.append("missing primary key %s in record %d" % (primaryKey,index))
            else:    
                primaryValue=record[primaryKey]
                encodedPrimaryValue=urllib.parse.quote_plus(primaryValue)
                tSubject="%s/%s" %(entityType,encodedPrimaryValue)
                for keyValue in record.items():
                    key,value=keyValue
                    valueType=type(value)
                    if self.debug:
                        print("%s(%s)=%s" % (key,valueType,value))
                    tPredicate="%s#%s" % (entityType,key)
                    tObject=value    
                    if valueType == str:   
                        insertCommand+='  %s %s "%s".\n' % (tSubject,tPredicate,tObject)
        insertCommand+="\n}"
        if self.debug:
            print (insertCommand)
        self.insert(insertCommand)
        return errors

2 Answers

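The code below at least works and has correct "round-trip" behavior: the data inserted from the list of dicts can be retrieved with a corresponding query. Please comment with further improvements or add a better answer.

If you always want typed literals, you can now request this in the constructor of the Jena wrapper class.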

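In typed-literal mode, the unit test generates the insert shown below, with the types

  • integer
  • decimal

used for the numeric literals to get correct "round-trip" behavior.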

PREFIX foafo: <http://foafo.bitplan.com/foafo/0.1/>
INSERT DATA {
  foafo:Person_ElizabethAlexandraMaryWindsor foafo:Person_name "Elizabeth Alexandra Mary Windsor".
  foafo:Person_ElizabethAlexandraMaryWindsor foafo:Person_born "1926-04-21"^^<http://www.w3.org/2001/XMLSchema#date>.
  foafo:Person_ElizabethAlexandraMaryWindsor foafo:Person_numberInLine "0"^^<http://www.w3.org/2001/XMLSchema#integer>.
  foafo:Person_ElizabethAlexandraMaryWindsor foafo:Person_wikidataurl "https://www.wikidata.org/wiki/Q9682".
  foafo:Person_ElizabethAlexandraMaryWindsor foafo:Person_age "94.32637220476806"^^<http://www.w3.org/2001/XMLSchema#decimal>.
  foafo:Person_ElizabethAlexandraMaryWindsor foafo:Person_ofAge True.
  foafo:Person_CharlesPrinceofWales foafo:Person_name "Charles, Prince of Wales".
  foafo:Person_CharlesPrinceofWales foafo:Person_born "1948-11-14"^^<http://www.w3.org/2001/XMLSchema#date>.
  foafo:Person_CharlesPrinceofWales foafo:Person_numberInLine "1"^^<http://www.w3.org/2001/XMLSchema#integer>.
  foafo:Person_CharlesPrinceofWales foafo:Person_wikidataurl "https://www.wikidata.org/wiki/Q43274".
  foafo:Person_CharlesPrinceofWales foafo:Person_age "71.7578047461618"^^<http://www.w3.org/2001/XMLSchema#decimal>.
  foafo:Person_CharlesPrinceofWales foafo:Person_ofAge True.
  foafo:Person_GeorgeofCambridge foafo:Person_name "George of Cambridge".
  foafo:Person_GeorgeofCambridge foafo:Person_born "2013-07-22"^^<http://www.w3.org/2001/XMLSchema#date>.
  foafo:Person_GeorgeofCambridge foafo:Person_numberInLine "3"^^<http://www.w3.org/2001/XMLSchema#integer>.
  foafo:Person_GeorgeofCambridge foafo:Person_wikidataurl "https://www.wikidata.org/wiki/Q1359041".
  foafo:Person_GeorgeofCambridge foafo:Person_age "7.072013799051315"^^<http://www.w3.org/2001/XMLSchema#decimal>.
  foafo:Person_GeorgeofCambridge foafo:Person_ofAge False.
  foafo:Person_HarryDukeofSussex foafo:Person_name "Harry Duke of Sussex".
  foafo:Person_HarryDukeofSussex foafo:Person_born "1984-09-15"^^<http://www.w3.org/2001/XMLSchema#date>.
  foafo:Person_HarryDukeofSussex foafo:Person_numberInLine "5"^^<http://www.w3.org/2001/XMLSchema#integer>.
  foafo:Person_HarryDukeofSussex foafo:Person_wikidataurl "https://www.wikidata.org/wiki/Q152316".
  foafo:Person_HarryDukeofSussex foafo:Person_age "35.92133993168922"^^<http://www.w3.org/2001/XMLSchema#decimal>.
  foafo:Person_HarryDukeofSussex foafo:Person_ofAge True.
}

With typed-literal mode off, typed literals are used only for dates:

PREFIX foafo: <http://foafo.bitplan.com/foafo/0.1/>
INSERT DATA {
  foafo:Person_ElizabethAlexandraMaryWindsor foafo:Person_name "Elizabeth Alexandra Mary Windsor".
  foafo:Person_ElizabethAlexandraMaryWindsor foafo:Person_born "1926-04-21"^^<http://www.w3.org/2001/XMLSchema#date>.
  foafo:Person_ElizabethAlexandraMaryWindsor foafo:Person_numberInLine 0.
  foafo:Person_ElizabethAlexandraMaryWindsor foafo:Person_wikidataurl "https://www.wikidata.org/wiki/Q9682".
  foafo:Person_ElizabethAlexandraMaryWindsor foafo:Person_age 94.32637220476806.
  foafo:Person_ElizabethAlexandraMaryWindsor foafo:Person_ofAge True.
  foafo:Person_CharlesPrinceofWales foafo:Person_name "Charles, Prince of Wales".
  foafo:Person_CharlesPrinceofWales foafo:Person_born "1948-11-14"^^<http://www.w3.org/2001/XMLSchema#date>.
  foafo:Person_CharlesPrinceofWales foafo:Person_numberInLine 1.
  foafo:Person_CharlesPrinceofWales foafo:Person_wikidataurl "https://www.wikidata.org/wiki/Q43274".
  foafo:Person_CharlesPrinceofWales foafo:Person_age 71.7578047461618.
  foafo:Person_CharlesPrinceofWales foafo:Person_ofAge True.
  foafo:Person_GeorgeofCambridge foafo:Person_name "George of Cambridge".
  foafo:Person_GeorgeofCambridge foafo:Person_born "2013-07-22"^^<http://www.w3.org/2001/XMLSchema#date>.
  foafo:Person_GeorgeofCambridge foafo:Person_numberInLine 3.
  foafo:Person_GeorgeofCambridge foafo:Person_wikidataurl "https://www.wikidata.org/wiki/Q1359041".
  foafo:Person_GeorgeofCambridge foafo:Person_age 7.072013799051315.
  foafo:Person_GeorgeofCambridge foafo:Person_ofAge False.
  foafo:Person_HarryDukeofSussex foafo:Person_name "Harry Duke of Sussex".
  foafo:Person_HarryDukeofSussex foafo:Person_born "1984-09-15"^^<http://www.w3.org/2001/XMLSchema#date>.
  foafo:Person_HarryDukeofSussex foafo:Person_numberInLine 5.
  foafo:Person_HarryDukeofSussex foafo:Person_wikidataurl "https://www.wikidata.org/wiki/Q152316".
  foafo:Person_HarryDukeofSussex foafo:Person_age 35.92133993168922.
  foafo:Person_HarryDukeofSussex foafo:Person_ofAge True.

}
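The two outputs above differ only in how each Python value is rendered as a SPARQL term. A minimal sketch of that mapping as a standalone function follows. The names `to_sparql_literal`, `escape_string` and the `typed_literals` parameter are assumptions made for illustration and are not part of the wrapper class shown in this answer. Unlike the answer's code, this sketch also escapes quotes inside strings and writes booleans in lowercase.

```python
import datetime

XSD = "http://www.w3.org/2001/XMLSchema#"


def escape_string(text):
    """Escape backslashes, double quotes and line breaks for a SPARQL string literal."""
    return (text.replace("\\", "\\\\")
                .replace('"', '\\"')
                .replace("\n", "\\n")
                .replace("\r", "\\r"))


def to_sparql_literal(value, typed_literals=False):
    """Render a basic Python value as a SPARQL literal (hypothetical helper)."""
    # bool has to be tested before int because bool is a subclass of int
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, str):
        return '"%s"' % escape_string(value)
    if isinstance(value, int):
        if typed_literals:
            return '"%d"^^<%sinteger>' % (value, XSD)
        return str(value)
    if isinstance(value, float):
        # follows the answer in using xsd:decimal; note that xsd:decimal has
        # no exponent notation, so very small or very large floats need care
        if typed_literals:
            return '"%s"^^<%sdecimal>' % (value, XSD)
        return str(value)
    if type(value) is datetime.date:
        # dates are always typed, in both modes
        return '"%s"^^<%sdate>' % (value.isoformat(), XSD)
    raise TypeError("can't handle type %s" % type(value))


print(to_sparql_literal("Harry Duke of Sussex"))
print(to_sparql_literal(5, typed_literals=True))
print(to_sparql_literal(datetime.date(1984, 9, 15)))
```

Raising `TypeError` for unsupported types is a design choice of this sketch; the answer's method collects such problems in its `errors` list instead.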

testListOfDictInsert

 def testListOfDictInsert(self):
        '''
        test inserting a list of Dicts and retrieving the values again
        using a person based example
        instead of
        https://en.wikipedia.org/wiki/FOAF_(ontology)
        
        we use an object oriented derivate of FOAF with a focus on datatypes
        '''
        listofDicts=[
            {'name': 'Elizabeth Alexandra Mary Windsor', 'born': self.dob('1926-04-21'), 'numberInLine': 0, 'wikidataurl': 'https://www.wikidata.org/wiki/Q9682' },
            {'name': 'Charles, Prince of Wales',         'born': self.dob('1948-11-14'), 'numberInLine': 1, 'wikidataurl': 'https://www.wikidata.org/wiki/Q43274' },
            {'name': 'George of Cambridge',              'born': self.dob('2013-07-22'), 'numberInLine': 3, 'wikidataurl': 'https://www.wikidata.org/wiki/Q1359041'},
            {'name': 'Harry Duke of Sussex',             'born': self.dob('1984-09-15'), 'numberInLine': 5, 'wikidataurl': 'https://www.wikidata.org/wiki/Q152316'}
        ]
        today=date.today()
        for person in listofDicts:
            born=person['born']
            age=(today - born).days / 365.2425
            person['age']=age
            person['ofAge']=age>=18
        typedLiteralModes=[True,False]
        entityType='foafo:Person'
        primaryKey='name'
        prefixes='PREFIX foafo: <http://foafo.bitplan.com/foafo/0.1/>'
        for typedLiteralMode in typedLiteralModes:
            jena=self.getJena(mode='update',typedLiterals=typedLiteralMode,debug=True)
            errors=jena.insertListOfDicts(listofDicts,entityType,primaryKey,prefixes)
            self.checkErrors(errors)
            
        jena=self.getJena(mode="query")    
        queryString = """
        PREFIX foafo: <http://foafo.bitplan.com/foafo/0.1/>
        SELECT ?name ?born ?numberInLine ?wikidataurl ?ofAge ?age WHERE { 
            ?person foafo:Person_name ?name.
            ?person foafo:Person_born ?born.
            ?person foafo:Person_numberInLine ?numberInLine.
            ?person foafo:Person_wikidataurl ?wikidataurl.
            ?person foafo:Person_ofAge ?ofAge.
            ?person foafo:Person_age ?age. 
        }"""
        personResults=jena.query(queryString)
        self.assertEqual(len(listofDicts),len(personResults))
        personList=jena.asListOfDicts(personResults)   
        for index,person in enumerate(personList):
            print("%d: %s" %(index,person))
        # check the correct round-trip behavior
        self.assertEqual(listofDicts,personList)

insertListOfDicts

def insertListOfDicts(self,listOfDicts,entityType,primaryKey,prefixes):
        '''
        insert the given list of dicts mapping datatypes according to
        https://www.w3.org/TR/xmlschema-2/#built-in-datatypes
        
        mapped from 
        https://docs.python.org/3/library/stdtypes.html
        
        compare to
        https://www.w3.org/2001/sw/rdb2rdf/directGraph/
        http://www.bobdc.com/blog/json2rdf/
        https://www.w3.org/TR/json-ld11-api/#data-round-tripping
        https://stackoverflow.com/questions/29030231/json-to-rdf-xml-file-in-python
        '''
        errors=[]
        insertCommand='%s\nINSERT DATA {\n' % prefixes
        for index,record in enumerate(listOfDicts):
            if not primaryKey in record:
                errors.append("missing primary key %s in record %d" % (primaryKey,index))
            else:    
                primaryValue=record[primaryKey]
                encodedPrimaryValue=self.getLocalName(primaryValue)
                tSubject="%s_%s" %(entityType,encodedPrimaryValue)
                for keyValue in record.items():
                    key,value=keyValue
                    valueType=type(value)
                    if self.debug:
                        print("%s(%s)=%s" % (key,valueType,value))
                    tPredicate="%s_%s" % (entityType,key)
                    tObject=value    
                    if valueType == str:   
                        tObject='"%s"' % value
                    elif valueType==int:
                        if self.typedLiterals:
                            tObject='"%d"^^<http://www.w3.org/2001/XMLSchema#integer>' %value
                        pass
                    elif valueType==float:
                        if self.typedLiterals:
                            tObject='"%s"^^<http://www.w3.org/2001/XMLSchema#decimal>' %value
                        pass
                    elif valueType==bool:
                        pass
                    elif valueType==datetime.date:
                        #if self.typedLiterals:
                        tObject='"%s"^^<http://www.w3.org/2001/XMLSchema#date>' %value
                        pass
                    else:
                        errors.append("can't handle type %s in record %d" % (valueType,index))
                        tObject=None
                    if tObject is not None:    
                        insertCommand+='  %s %s %s.\n' % (tSubject,tPredicate,tObject)
        insertCommand+="\n}"
        if self.debug:
            print (insertCommand)
        self.insert(insertCommand)
        return errors
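The unit test relies on `jena.asListOfDicts` for the way back, but that method is not shown here. As a rough sketch of what such a conversion could look like, the code below assumes the query results are available as a dict in the SPARQL 1.1 Query Results JSON format. The function names `binding_to_python` and `as_list_of_dicts` are hypothetical and not taken from the wrapper class.

```python
import datetime

XSD = "http://www.w3.org/2001/XMLSchema#"


def binding_to_python(binding):
    """Convert one RDF term of a SPARQL JSON result into a Python value."""
    value = binding["value"]
    datatype = binding.get("datatype", "")
    if datatype == XSD + "integer":
        return int(value)
    if datatype in (XSD + "decimal", XSD + "double"):
        return float(value)
    if datatype == XSD + "boolean":
        return value in ("true", "1")
    if datatype == XSD + "date":
        # keep only YYYY-MM-DD and ignore an optional timezone suffix
        return datetime.date.fromisoformat(value[:10])
    # plain literals and URIs are returned as strings
    return value


def as_list_of_dicts(results):
    """Turn the bindings of a SPARQL JSON result into a list of dicts."""
    return [
        {var: binding_to_python(term) for var, term in row.items()}
        for row in results["results"]["bindings"]
    ]


# hand-written sample result, shaped like the JSON results format
sample = {
    "head": {"vars": ["name", "born", "numberInLine"]},
    "results": {"bindings": [{
        "name": {"type": "literal", "value": "George of Cambridge"},
        "born": {"type": "literal", "datatype": XSD + "date", "value": "2013-07-22"},
        "numberInLine": {"type": "literal", "datatype": XSD + "integer", "value": "3"},
    }]},
}
print(as_list_of_dicts(sample))
```

Because the datatype travels with every typed literal, the values come back as `int`, `float`, `bool` and `datetime.date` rather than as strings, which is what makes a comparison like the final `assertEqual` in the test possible.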

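+ is the special character for a space in HTTP form encoding, but it should only be used in application/x-www-form-urlencoded data.

For URIs, use %20, or pick a replacement character such as _ for the space, since it looks a bit like one.

In all of these cases there is no space character in the URI: there is +, %20 (three characters), or _. This is an encoding, not an escape mechanism.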
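The three options can be compared directly with the standard library. The snippet is only an illustration: `local_name` is a hypothetical helper, and the regular expression it uses is one possible choice of replacement rule, not something prescribed by either answer.

```python
import re
import urllib.parse

name = "Harry Duke of Sussex"

# form encoding: spaces become "+", as produced by the question's code
form_encoded = urllib.parse.quote_plus(name)

# percent encoding: spaces become "%20", suitable inside a URI
percent_encoded = urllib.parse.quote(name)


def local_name(text):
    """Replace every run of non-alphanumeric characters with one underscore."""
    return re.sub(r"[^A-Za-z0-9]+", "_", text).strip("_")


print(form_encoded)      # Harry+Duke+of+Sussex
print(percent_encoded)   # Harry%20Duke%20of%20Sussex
print(local_name(name))  # Harry_Duke_of_Sussex
```

The underscore variant has the practical advantage that the result can be used unchanged as the local part of a prefixed name in a SPARQL update.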
