哪个语言处理XML内容最简单和最快?
我们有一些开发人员懂得这些编程语言 - Ruby、Python、.Net 或 Java。我们正在开发一个主要处理 XML 文档的应用程序。大部分工作是把预先定义好的 XML 文件转换成数据库表格,通过数据库建立 XML 文档之间的关系,生成数据库报告等等。请问哪种语言使用起来最简单、最快呢?(这是一个网页应用)
9 个回答
8
在.NET中,C# 3.0和VB9对使用LINQ来处理XML提供了很好的支持:
8
20
动态语言在这方面有它的规则。为什么呢?因为这些映射关系很容易编写和修改。你不需要重新编译和重建整个程序。
其实,只要稍微动动脑筋,你就可以把“XML XPATH到标签再到数据库字段”的映射关系写成独立的Python代码块,然后在你的主应用程序中引入这些代码块。
这段Python代码就是你的配置文件。它不是一个描述配置的.ini
文件或.properties
文件,而就是配置本身。
我们使用Python、xml.etree和SQLAlchemy(把SQL从你的程序中分离出来)来实现这一点,因为这样做既省力又灵活。
source.py
"""A particular XML parser. Formats change, so sometimes this changes, too."""
import xml.etree.ElementTree as xml
class SSXML_Source( object ):
ns0= "urn:schemas-microsoft-com:office:spreadsheet"
ns1= "urn:schemas-microsoft-com:office:excel"
def __init__( self, aFileName, *sheets ):
"""Initialize a XML source.
XXX - Create better sheet filtering here, in the constructor.
@param aFileName: the file name.
"""
super( SSXML_Source, self ).__init__( aFileName )
self.log= logging.getLogger( "source.PCIX_XLS" )
self.dom= etree.parse( aFileName ).getroot()
def sheets( self ):
for wb in self.dom.getiterator("{%s}Workbook" % ( self.ns0, ) ):
for ws in wb.getiterator( "{%s}Worksheet" % ( self.ns0, ) ):
yield ws
def rows( self ):
for s in self.sheets():
print s.attrib["{%s}Name" % ( self.ns0, ) ]
for t in s.getiterator( "{%s}Table" % ( self.ns0, ) ):
for r in t.getiterator( "{%s}Row" % ( self.ns0, ) ):
# The XML may not be really useful.
# In some cases, you may have to convert to something useful
yield r
model.py
"""This is your target object.
It's part of the problem domain; it rarely changes.
"""
class MyTargetObject( object ):
def __init__( self ):
self.someAttr= ""
self.anotherAttr= ""
self.this= 0
self.that= 3.14159
def aMethod( self ):
"""etc."""
pass
builder_today.py 这是众多映射配置中的一个
"""One of many builders. This changes all the time to fit
specific needs and situations. The goal is to keep this
short and to-the-point so that it has the mapping and nothing
but the mapping.
"""
import model
class MyTargetBuilder( object ):
def makeFromXML( self, element ):
result= model.MyTargetObject()
result.someAttr= element.findtext( "Some" )
result.anotherAttr= element.findtext( "Another" )
result.this= int( element.findtext( "This" ) )
result.that= float( element.findtext( "that" ) )
return result
loader.py
"""An application that maps from XML to the domain object
using a configurable "builder".
"""
import model
import source
import builder_1
import builder_2
import builder_today
# Configure this: pick a builder is appropriate for the data:
b= builder_today.MyTargetBuilder()
s= source.SSXML_Source( sys.argv[1] )
for r in s.rows():
data= b.makeFromXML( r )
# ... persist data with a DB save or file write
要进行更改,你可以修正一个构建器或者创建一个新的构建器。你只需调整加载器的源代码,以确定使用哪个构建器。你甚至可以轻松地把构建器的选择作为命令行参数。虽然在动态语言中动态导入看起来有点多余,但它们确实很方便。