Python package for automated quality assessment of WOUDC data.
WOUDC Quality Assessment library
Python package for automated quality assessment of WOUDC data based on defined rules.
Installation
Requirements
woudc-qa requires Python 2.7.
Dependencies
See requirements.txt.
Installing the Package
# via distutils
pip install -r requirements.txt
python setup.py install
Usage
Command Line Interface
usage: woudc-qa.py [-h] --file FILE

Execute Qa.

optional arguments:
  -h, --help   show this help message and exit
  --file FILE  Path to extended CSV file to be quality assessed.
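For readers who want to see how a usage message like the one above comes about, here is a minimal sketch of an equivalent interface declared with the standard-library argparse module. It is a reconstruction based only on the usage text and is not taken from the project's source; the helper name build_parser and the file name example.csv are hypothetical.

```python
from __future__ import print_function

import argparse


def build_parser():
    """Build a parser mirroring the usage text shown above (hypothetical helper)."""
    parser = argparse.ArgumentParser(prog='woudc-qa.py',
                                     description='Execute Qa.')
    # --file is mandatory, matching "usage: woudc-qa.py [-h] --file FILE".
    parser.add_argument('--file', required=True,
                        help='Path to extended CSV file to be quality assessed.')
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
# 'example.csv' is a placeholder file name; the file is never opened here.
args = build_parser().parse_args(['--file', 'example.csv'])
print(args.file)
```

Because --file is declared with required=True, calling the parser without it exits with an error message, which is consistent with the flag appearing outside the square brackets in the usage line.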
Examples
from woudc_qa import qa

file_s = open(<path to your extended CSV file.>).read()
qa_results = qa(file_s)

# qa_results is a dictionary as such:
# qa_results: {
#   filename: {
#     test_id: {
#       row: {
#         result: result of this test, pass/fail/None/NR,
#         table: table name,
#         table_index: table_index,
#         element: element name,
#         related_test_id: test_id,
#         related_test_result: related tests result, pass/fail/None/NR,
#         precond: precondition result, pass/fail/None/NR
#       }
#     }
#   }
# }
# where,
# 'filename' is the name of the file; defaults to 'file1'
# 'test_id' is the test identifier from the test definition
# 'row' is the row number of the element under assessment. Always 1 for non profile/payload elements
# 'result' is the result of the assessment for the element at the indicated row for the given test
# 'table' is the name of the table where the element under assessment is found
# 'table_index' is the index of the above table. Defaults to 1; the index is incremented by 1 to handle multiplicity
# 'element' is the element under assessment
# 'related_test_id' is a listing of any tests related to this test
# 'related_test_result' is an aggregated result of all tests related to this test
# 'precond' is the aggregated result of any precondition checks
#
# from collections import OrderedDict
# test_result = qa_results[<filename>][<test_id>]
# iterate over test results by row:
# for row, result in test_result.iteritems():
#     print row, result
# get result of assessment at a specific row
# row_result = qa_results[<filename>][<test_id>][<row number>]['result']
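The nested layout described above (filename, then test_id, then row) can be walked with three loops. The following is a minimal sketch that collects every failed check from such a dictionary. It does not call woudc_qa. The sample data, the helper name collect_failures, the test ids, and the table and element names are all made up for illustration, and it assumes that results are stored as the strings 'pass', 'fail' and 'NR' or the value None.

```python
from __future__ import print_function


def collect_failures(qa_results):
    """Return (filename, test_id, row, table, element) for every failed check."""
    failures = []
    for filename, tests in sorted(qa_results.items()):
        for test_id, rows in sorted(tests.items()):
            for row, detail in sorted(rows.items()):
                # Assumption: a failed check is recorded as the string 'fail'.
                if detail.get('result') == 'fail':
                    failures.append((filename, test_id, row,
                                     detail.get('table'),
                                     detail.get('element')))
    return failures


# Hypothetical sample shaped like the structure documented above.
sample_results = {
    'file1': {
        'T1': {
            1: {'result': 'pass', 'table': 'EXAMPLE_TABLE', 'table_index': 1,
                'element': 'example_element_a', 'related_test_id': None,
                'related_test_result': None, 'precond': 'pass'},
        },
        'T2': {
            1: {'result': 'pass', 'table': 'EXAMPLE_PROFILE', 'table_index': 1,
                'element': 'example_element_b', 'related_test_id': None,
                'related_test_result': None, 'precond': 'pass'},
            2: {'result': 'fail', 'table': 'EXAMPLE_PROFILE', 'table_index': 1,
                'element': 'example_element_b', 'related_test_id': None,
                'related_test_result': None, 'precond': 'pass'},
        },
    },
}

failures = collect_failures(sample_results)
for failure in failures:
    print(failure)
```

The sketch uses items() rather than iteritems() so that it runs on both Python 2.7 and Python 3. Sorting at each level only makes the output order predictable; it is not required by the data structure.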
Development
For a development environment, install in a Python virtualenv:
virtualenv foo
cd foo
. bin/activate
# fork master
# fork http://github.com/woudc/woudc-qa on GitHub
# clone your fork to create a branch
git clone https://github.com/{your GitHub username}/woudc-qa.git
cd woudc-qa
# install dev packages
pip install -r requirements.txt
python setup.py install
# create upstream remote
git remote add upstream https://github.com/woudc/woudc-qa.git
git pull upstream master
git branch my-cool-feature
git checkout my-cool-feature
# start dev
git commit -m 'implement cool feature'
# push to your fork
git push origin my-cool-feature
# issue Pull Request on GitHub
git checkout master
# cleanup/update once your branch is merged on GitHub
# remove branch
git branch -D my-cool-feature
# update your fork
git pull upstream master
git push origin master
Running Tests
# via distutils
python setup.py test
# manually
python run_tests.py
# report test coverage
coverage run --source woudc_qa setup.py test
coverage report -m
Code Conventions
woudc-qa code conventions are as per PEP8.
# code should always pass the following
find . -type f -name "*.py" | xargs flake8
Issues
All bugs, enhancements and issues are managed on GitHub.