Extract/Transform Light - a simple library for reading delimited files.
etlite
Examples
Given a CSV file:
Area id,Male,Female,Area
A12345,34,45,0.25
A12346,108,99,0.32
Define a list of transformations:
transformations = [
    # Map existing fields into dictionary.
    # For nested dictionaries use dot.delimited.keys.
    # Optional "via" parameter takes a callable returning transformed value.
    {"from": "Area id", "to": "id"},
    {"from": "Male", "to": "population.male", "via": int},
    {"from": "Female", "to": "population.female", "via": int},
    {"from": "Area", "to": "area", "via": float},
    # You can also add computed values, not present in the original data source.
    # Computed values take transformed dictionary as argument
    # and they do not require "from" parameter:
    {"to": "population.total",
     "via": lambda x: x['population']['male'] + x['population']['female']},
    # Note that transformations are executed in the order they were defined.
    # This transformation uses population.total value computed in the previous step:
    {"to": 'population.density',
     "via": lambda x: round(x['population']['total'] / x['area'])},
]
Read the file:
from etlite import delim_reader

with open("mydatafile.csv") as csvfile:
    reader = delim_reader(csvfile, transformations)
    data = [row for row in reader]
This produces a list of dictionaries:
[
    {'id': 'A12345', 'area': 0.25,
     'population': {'male': 34, 'female': 45, 'total': 79, 'density': 316}},
    {'id': 'A12346', 'area': 0.32,
     'population': {'male': 108, 'female': 99, 'total': 207, 'density': 647}},
]
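To make the "from" / "to" / "via" rules concrete, here is a minimal sketch of how such a transformation list could be applied to a single raw row. The helpers `set_nested` and `apply_transformations` are hypothetical and written only for illustration; they are not etlite's implementation, which may differ. The sketch also assumes the raw row is a dictionary keyed by column name.

```python
def set_nested(target, dotted_key, value):
    """Store value under a dot-delimited key, creating nested dicts as needed."""
    *parents, leaf = dotted_key.split(".")
    node = target
    for key in parents:
        node = node.setdefault(key, {})
    node[leaf] = value


def apply_transformations(raw, transformations):
    """Build one output record from a raw row (assumed: dict of column -> string)."""
    result = {}
    # Transformations run in the order they were defined.
    for t in transformations:
        via = t.get("via")
        if "from" in t:
            # Mapped field: read the raw value, then convert it if "via" is given.
            value = raw[t["from"]]
            if via is not None:
                value = via(value)
        else:
            # Computed field: the callable receives the record built so far.
            value = via(result)
        set_nested(result, t["to"], value)
    return result


transformations = [
    {"from": "Area id", "to": "id"},
    {"from": "Male", "to": "population.male", "via": int},
    {"from": "Female", "to": "population.female", "via": int},
    {"from": "Area", "to": "area", "via": float},
    {"to": "population.total",
     "via": lambda x: x["population"]["male"] + x["population"]["female"]},
    {"to": "population.density",
     "via": lambda x: round(x["population"]["total"] / x["area"])},
]

raw = {"Area id": "A12345", "Male": "34", "Female": "45", "Area": "0.25"}
record = apply_transformations(raw, transformations)
print(record)
```

The resulting `record` holds the same values as the first dictionary in the output above. Because `population.density` reads `population.total`, swapping those two entries in this sketch would raise a `KeyError`, which is why the order of definition matters.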
delim_reader options
etlite is just a thin wrapper on top of Python's built-in csv module. You can therefore pass delim_reader the same options you would pass to csv.reader. For example:
reader = delim_reader(csvfile, transformations, delimiter="\t")
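Since these options belong to `csv.reader`, it can help to see what `delimiter` does using the standard library alone. The sketch below calls only Python's `csv` module on in-memory tab-separated data; it does not call etlite, and the sample data is made up for the demonstration.

```python
import csv
import io

# Tab-separated sample data held in memory instead of a file on disk.
tsv_data = "Area id\tMale\tFemale\tArea\nA12345\t34\t45\t0.25\n"

# delimiter is one of csv.reader's standard formatting parameters.
rows = list(csv.reader(io.StringIO(tsv_data), delimiter="\t"))
print(rows)
```

Each row comes back as a list of strings, which is why conversions such as `int` and `float` are supplied through "via".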
Exception handling
If a requested transformation cannot be performed, etlite raises TransformationError. If you do not want to abort data loading, you can pass an error handler to delim_reader.
The error handler must be a function. It will be passed a TransformationError instance. Note: on_error must be passed as a keyword argument.
from etlite import delim_reader

transformations = [
    # ...
]

def error_handler(err):
    # err is an instance of TransformationError
    print(err)  # prints error message
    print(err.record)  # prints raw record, prior to transformation

with open('my-data.csv') as stream:
    reader = delim_reader(stream, transformations, on_error=error_handler)
    for row in reader:
        do_something(row)
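Printing is one option; another is to collect failures for later review. The following is a minimal sketch of such a handler. `make_collecting_handler` is a hypothetical helper, and `FakeTransformationError` is a stand-in defined here only so the example runs without etlite installed. The sketch assumes nothing about the real exception beyond the message and the `record` attribute described above, and the dictionary used as the raw record is an assumption too.

```python
def make_collecting_handler(failures):
    """Return a handler that appends (message, raw record) pairs to failures."""
    def handler(err):
        # getattr guards against errors that lack a .record attribute.
        failures.append((str(err), getattr(err, "record", None)))
    return handler


class FakeTransformationError(Exception):
    """Stand-in for etlite's TransformationError, used only in this demo."""
    def __init__(self, message, record):
        super().__init__(message)
        self.record = record


failures = []
on_error = make_collecting_handler(failures)

# Simulate the handler being called with a failed record.
on_error(FakeTransformationError("cannot convert 'n/a' to int", {"Male": "n/a"}))
print(failures)
```

With etlite, the same handler would be supplied as `on_error=make_collecting_handler(failures)` when calling `delim_reader`.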