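Advanced SAX parser in C#
Below is the structure of my XML. I want to display it as rows and columns.
What I need is to convert this XML file into a hash table like this: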
{"form": {"attrs": {"string": "Partners"},
          "child1": {"group": {"attrs": {"col": "6", "colspan": "1"},
                               "child1": {"field": {"attrs": {"name": "name"}}},
                               "child2": {"field": {"attrs": {"name": "ref"}}}}},
          "child2": {"notebook": {"attrs": {"colspan": 4}}}}}
<?xml version="1.0" encoding="utf-8"?>
<form string="Partners">
<group col="6" colspan="4">
<field name="name" select="1"/>
<field name="ref" select="1"/>
<field name="customer" select="1"/>
<field domain="[('domain', '=', 'partner')]" name="title"/>
<field name="lang" select="2"/>
<field name="supplier" select="2"/>
</group>
<notebook colspan="4">
<page string="General">
<field colspan="4" mode="form,tree" name="address" nolabel="1" select="1">
</field>
<separator colspan="4" string="Categories"/>
<field colspan="4" name="category_id" nolabel="1" select="2"/>
</page>
<page string="Sales &amp; Purchases">
<separator colspan="4" string="General Information"/>
<field name="user_id" select="2"/>
<field name="active" select="2"/>
<field name="website" widget="url"/>
<field name="date" select="2"/>
<field name="parent_id"/>
<newline/>
<newline/><group col="2" colspan="2" name="sale_list">
<separator colspan="2" string="Sales Properties"/>
<field name="property_product_pricelist"/>
</group><group col="2" colspan="2">
<separator colspan="2" string="Purchases Properties"/>
<field name="property_product_pricelist_purchase"/>
</group><group col="2" colspan="2">
<separator colspan="2" string="Stock Properties"/>
<field name="property_stock_customer"/>
<field name="property_stock_supplier"/>
</group></page>
<page string="History">
<field colspan="4" name="events" nolabel="1" widget="one2many_list"/>
</page>
<page string="Notes">
<field colspan="4" name="comment" nolabel="1"/>
</page>
<page position="inside" string="Accounting">
<group col="2" colspan="2">
<separator colspan="2" string="Customer Accounting Properties"/>
<field name="property_account_receivable"/>
<field name="property_account_position"/><field name="vat" on_change="vat_change(vat)" select="2"/><field name="vat_subjected"/>
<field name="property_payment_term"/>
</group>
<group col="2" colspan="2">
<separator colspan="2" string="Supplier Accounting Properties"/>
<field name="property_account_payable"/>
</group>
<group col="2" colspan="2">
<separator colspan="2" string="Customer Credit"/>
<field name="credit" select="2"/>
<field name="credit_limit" select="2"/>
</group>
<group col="2" colspan="2">
<separator colspan="2" string="Supplier Debit"/>
<field name="debit" select="2"/>
</group>
<field colspan="4" context="address=address" name="bank_ids" nolabel="1" select="2">
</field>
</page>
</notebook>
</form>
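One way to read the "rows and columns" requirement is one row per element, with its nesting depth, tag name and attributes as columns. The following is only a minimal sketch under that assumption, written in Python with the standard xml.sax module; the class name RowCollector and the shortened SNIPPET are made up for illustration and are not part of the original question.

```python
import xml.sax
from xml.sax.handler import ContentHandler


class RowCollector(ContentHandler):
    """Hypothetical handler: records one (depth, tag, attributes) row per element."""

    def startDocument(self):
        self.rows = []
        self.depth = 0

    def startElement(self, name, attrs):
        # Store the row before descending into the element's children.
        self.rows.append((self.depth, name, dict(attrs.items())))
        self.depth += 1

    def endElement(self, name):
        self.depth -= 1


# Shortened stand-in for the form definition shown above (an assumption).
SNIPPET = b"""<form string="Partners">
  <group col="6" colspan="4">
    <field name="name" select="1"/>
  </group>
</form>"""

collector = RowCollector()
xml.sax.parseString(SNIPPET, collector)
for depth, tag, attrs in collector.rows:
    print("%d\t%s\t%s" % (depth, tag, attrs))
```

Each tuple in collector.rows can then be handed to whatever grid or table widget does the actual display.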
1 Answer
from xml.sax.handler import ContentHandler
import xml.sax


class my_handler(ContentHandler):

    def get_attr_dict(self, attrs):
        # Copy the SAX attributes object into a plain dict.
        ret_dict = {}
        for name in attrs.getNames():
            ret_dict[name] = attrs.getValue(name)
        return ret_dict

    def setDocumentLocator(self, locator):
        print("DOCUMENT LOCATOR")

    def startDocument(self):
        self.my_data = {}    # nested result
        self.my_stack = []   # key path down to the open element's 'childs' list
        print("START DOCUMENT")

    def startElement(self, name, attrs):
        attr_dict = self.get_attr_dict(attrs)
        # A <field> is keyed by its name attribute, every other element by its tag.
        myname = attr_dict['name'] if name == 'field' else name
        append_dict = {'attrs': attr_dict, 'childs': []}
        if not self.my_data:
            self.my_data[name] = append_dict
        else:
            # Follow the key path from the root to the open element's 'childs' list.
            last_dict = {}
            for x in self.my_stack:
                if last_dict:
                    if isinstance(last_dict, list):
                        last_dict = last_dict[-1][x]
                    else:
                        last_dict = last_dict[x]
                else:
                    last_dict = self.my_data[x]
            last_dict.append({myname: append_dict})
        self.my_stack.extend([myname, 'childs'])

    def endElement(self, name):
        # Drop this element's name and its 'childs' marker from the path.
        self.my_stack = self.my_stack[:-2]
        print("END ELEMENT:", name)

    def endDocument(self):
        print("Result:", self.my_data)
        print("END DOCUMENT")


if __name__ == '__main__':
    # Binary mode lets the parser honour the encoding declared in the file.
    with open('Form.xml', 'rb') as fp:
        xml.sax.parse(fp, my_handler())
In the end I solved this with a Python script rather than in C#, so I am sharing the script here. Thanks, everyone.
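The handler in this answer re-walks the whole key path from the root on every start tag. A possible variant, shown here only as a sketch, keeps a stack of the open 'childs' lists instead, so each new element is appended directly to its parent's list. The class name TreeHandler and the SAMPLE document are assumptions made for illustration; the 'attrs'/'childs' layout and the rule that a field is keyed by its name attribute follow the answer.

```python
import xml.sax
from xml.sax.handler import ContentHandler


class TreeHandler(ContentHandler):
    """Hypothetical variant: keeps a stack of open 'childs' lists."""

    def startDocument(self):
        self.root = []            # will end up holding the single root node
        self.stack = [self.root]  # stack[-1] is the child list of the open element

    def startElement(self, name, attrs):
        attr_dict = dict(attrs.items())
        # Same naming rule as the answer; falls back to the tag if 'name' is missing.
        key = attr_dict.get('name', name) if name == 'field' else name
        node = {'attrs': attr_dict, 'childs': []}
        self.stack[-1].append({key: node})
        self.stack.append(node['childs'])

    def endElement(self, name):
        self.stack.pop()


# Shortened stand-in for Form.xml (an assumption, not the full file).
SAMPLE = b"""<form string="Partners">
  <group col="6" colspan="4">
    <field name="name" select="1"/>
    <field name="ref" select="1"/>
  </group>
  <notebook colspan="4"/>
</form>"""

handler = TreeHandler()
xml.sax.parseString(SAMPLE, handler)
tree = handler.root[0]
print(tree['form']['attrs'])
print([list(child)[0] for child in tree['form']['childs']])
```

Because the stack holds references to the lists themselves, no lookup by key is needed on the way down, and sibling elements with the same tag name do not interfere with each other.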