Accessing !ENTITY declarations and references

Posted 2024-05-15 00:01:13


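I have some XML files that contain !ENTITY definitions and file references.

I can process these files successfully.

However, I would like to preprocess the files and access the !ENTITY definitions to extract the entity names and the files they refer to, along with the section of the XML in which each entity is referenced.

A sample XML file looks like this: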

<?xml version="1.0" encoding="UTF-8"?>

<!DOCTYPE gdml [
    <!ENTITY materials SYSTEM "materialsOptical.xml"> 
    <!ENTITY solids_Mainz_v2 SYSTEM "solids_Mainz_v2.xml"> 
    <!ENTITY matrices_Mainz_v2 SYSTEM "matrices_Mainz_v2.xml">
]> 

<gdml xmlns:gdml="http://cern.ch/2001/Schemas/GDML"    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"    xsi:noNamespaceSchemaLocation="schema/gdml.xsd">


<define>
<constant name="PI" value="1.*pi"/>
&matrices_Mainz_v2;
</define>
&materials; 
&solids_Mainz_v2;

<structure>
.... continued...

My attempt at the code looks like this:

import io
from html.parser import HTMLParser

class MyHTMLParser(HTMLParser):

    def handle_starttag(self, tag, attrs):
        #print "Encountered a start tag:", tag
        # Entity may not be in the latest tag, so handle it ourselves
        global currentSection
        if tag in ["define","materials","solids","structure"] :
           print("Section : "+tag)
           currentSection = tag

    def handle_decl(self, decl):
        # This gets called when the entity is declared
        print ("Encountered an Entity declaration ", decl)
        words = decl.split()
        wlen  = len(words)
        print (words)
        if words[3] == "<!ENTITY" and wlen == 7:
           # const that refers to a file
           word = words[6].split('"')[1]
           print("Entity "+words[4]+" : "+word)
           filesDict[words[4]] = word


    def handle_entityref(self, name):
        # This gets called when the entity is referenced
        # starttag may not be a section
        print ("Entity reference : "+ name)
        #tag = self.get_starttag_text()
        print ("Current Section  : "+ currentSection)
        FilesEntity = True
        sectionDict[currentSection] = filesDict[name]


#    def handle_endtag(self, tag):
#        print "Encountered an end tag :", tag

#    def handle_data(self, data):
#        print "Encountered some data  :", data

    def unknown_decl(self, data):
        print ("Encountered unknown data  :", data)

def preprocessHTML(doc,filename):
    # Add files object so user can change to organise files
    # from GDMLObjects import GDMLFiles, ViewProvider
    print ("Preprocessing file for Entities File Definitions")
    global FilesEntity, filesDict, sectionDict
    FilesEntity = False
    sectionDict = {} # Empty Dict
    filesDict = {}
    # Read the whole file and close it before parsing
    with io.open(filename) as fp:
        text = fp.read()
    parser = MyHTMLParser()
    parser.feed(text)
    # myfiles = doc.addObject("App::FeaturePython","Export_Files")
    # GDMLFiles(myfiles,FilesEntity,sectionDict)
    print("End of Preprocessing")

When I run it, it only works on the first entity:

Preprocessing file for Entities File Definitions
Encountered an Entity declaration  DOCTYPE gdml [
    <!ENTITY materials SYSTEM "materialsOptical.xml"
['DOCTYPE', 'gdml', '[', '<!ENTITY', 'materials', 'SYSTEM', '"materialsOptical.xml"']
Entity materials : materialsOptical.xml
Section : define
Section : structure
End of Preprocessing
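
A possible explanation for the output above, offered as a reading of how html.parser behaves rather than a confirmed diagnosis: HTMLParser is an HTML tokenizer, so it ends the DOCTYPE declaration at the first ">" (which would be why handle_decl sees only the materials entity), and with the default convert_charrefs=True in Python 3.5+ it does not call handle_entityref at all. The sketch below sidesteps the HTML parser and scans the raw text with regular expressions instead. The names scan_entities, ENTITY_DECL and TOKEN are hypothetical, the "section" is taken to be the innermost open element, and the patterns assume simple SYSTEM "file" declarations with no comments, CDATA, or ">" inside attribute values.

```python
import re

# Matches declarations of the form: <!ENTITY name SYSTEM "file">
ENTITY_DECL = re.compile(r'<!ENTITY\s+([\w.-]+)\s+SYSTEM\s+"([^"]*)"\s*>')
# Matches a start, end, or empty-element tag, or an entity reference like &name;
TOKEN = re.compile(r'<(/?)([A-Za-z_][\w:.-]*)[^<>]*?(/?)>|&([\w.-]+);')

def scan_entities(text):
    """Return (files, sections).

    files    maps entity name -> referenced file name
    sections maps enclosing element name -> list of files referenced inside it
    """
    files = dict(ENTITY_DECL.findall(text))
    stack = []      # names of currently open elements, innermost last
    sections = {}
    for closing, tag, empty, ref in TOKEN.findall(text):
        if ref:
            # Entity reference: attribute it to the innermost open element.
            if ref in files and stack:
                sections.setdefault(stack[-1], []).append(files[ref])
        elif closing:
            if stack and stack[-1] == tag:
                stack.pop()
        elif not empty:
            stack.append(tag)
    return files, sections

# Shortened, well-formed variant of the sample file (an assumption for the demo).
sample = """<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE gdml [
    <!ENTITY materials SYSTEM "materialsOptical.xml">
    <!ENTITY matrices_Mainz_v2 SYSTEM "matrices_Mainz_v2.xml">
]>
<gdml>
<define>
<constant name="PI" value="1.*pi"/>
&matrices_Mainz_v2;
</define>
&materials;
<structure>
</structure>
</gdml>"""

files, sections = scan_entities(sample)
print(files)
print(sections)
```

Note that a reference placed after the closing define tag, such as &materials; in the sample, is attributed to the enclosing gdml element here, whereas the currentSection global in the original attempt would still hold the last section that was opened.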
