Routines for ingesting metadata into a CanDIG repository

candig-ingest: Python project description


Routines for ingesting metadata into a CanDIG 1.0 server. Requires [candig-server](https://github.com/candig/candig-server), [docopt](http://docopt.readthedocs.io/en/latest/), and [pandas](https://github.com/pandas-dev/pandas).

  • Free software: GNU General Public License v3

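You can run the ingest and test a server with the resulting repo as follows. (candig-server <1.0.0 requires Python 2.7, and candig-server >=1.0.0 requires Python 3.6. Note that Python 3.7 is not currently supported.)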

# Install
virtualenv test_server # If you are running Python 2
python3 -m venv test_server # If you are running Python 3.6
cd test_server
source bin/activate
pip install --upgrade pip setuptools
pip install candig-server # Specify anything <1.0.0 for Python 2.7, or >=1.0.0 for Python 3.6.
pip install candig-ingest

# ingest data and make the repo
mkdir ga4gh-example-data
ingest ga4gh-example-data/registry.db <path to example data, like: mock_data/clinical_metadata_tier1.json>

# optional
# add peer site addresses
candig_repo add-peer ga4gh-example-data/registry.db <peer site IP address, like: http://127.0.0.1:8001>

# optional
# create dataset for reads and variants
candig_repo add-dataset --description "Reads and variants dataset" ga4gh-example-data/registry.db read_and_variats_dataset

# optional
# add reference set, data source: https://www.ncbi.nlm.nih.gov/grc/human/ or http://genome.wustl.edu/pub/reference/
candig_repo add-referenceset ga4gh-example-data/registry.db <path to downloaded reference set, like GRCh37-lite.fa> -d "GRCh37-lite human reference genome" --name GRCh37-lite --sourceUri "http://genome.wustl.edu/pub/reference/GRCh37-lite/GRCh37-lite.fa.gz"

# optional
# add reads
candig_repo add-readgroupset -r -I <path to bam index file> -R GRCh37-lite ga4gh-example-data/registry.db read_and_variats_dataset <path to bam file>

# optional
# add variants
candig_repo add-variantset -I <path to variants index file> -R GRCh37-lite ga4gh-example-data/registry.db read_and_variats_dataset <path to vcf file>

# optional
# add sequence ontology set
# wget https://raw.githubusercontent.com/The-Sequence-Ontology/SO-Ontologies/master/so.obo
candig_repo add-ontology ga4gh-example-data/registry.db <path to sequence ontology set, like: so.obo> -n so-xp

# optional
# add features/annotations
#
## get the following scripts
# https://github.com/ga4gh/ga4gh-server/blob/master/scripts/glue.py
# https://github.com/ga4gh/ga4gh-server/blob/master/scripts/generate_gff3_db.py
#
## download the relevant annotation release from Gencode
# https://www.gencodegenes.org/releases/current.html
#
## decompress
# gunzip gencode.v27.annotation.gff3.gz
#
## build the annotation database
# python generate_gff3_db.py -i gencode.v27.annotation.gff3 -o gencode.v27.annotation.db -v
#
# add featureset
candig_repo add-featureset ga4gh-example-data/registry.db read_and_variats_dataset <path to the annotation.db> -R GRCh37-lite -O so-xp

# optional
# add phenotype association set from Monarch Initiative
# wget http://nif-crawler.neuinfo.org/monarch/ttl/cgd.ttl
candig_repo add-phenotypeassociationset ga4gh-example-data/registry.db read_and_variats_dataset <path to the folder containing cgd.ttl>

# optional
# add disease ontology set, like: NCIT
# wget http://purl.obolibrary.org/obo/ncit.obo
candig_repo add-ontology -n NCIT ga4gh-example-data/registry.db ncit.obo

# launch the server
# to serve at a different IP and/or port, change the --host and --port values
candig_server --host 127.0.0.1 --port 8000 -c NoAuth


https://127.0.0.1:8000/

Then, from another terminal:

curl -X POST --header 'Content-Type: application/json' --header 'Accept: application/json' \
    http://127.0.0.1:8000/datasets/search \
| jq '.'

gives:

{"datasets":[{"description":"PROFYLE test metadata","id":"WyJQUk9GWUxFIl0","name":"PROFYLE"}]}
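The same query can be sketched in Python using only the standard library. This is a minimal illustration, not part of candig-ingest: the helper names `search_datasets`, `dataset_names`, and `decode_id` are hypothetical, and sending an empty JSON object as the request body is an assumption (the curl call above sends no body). The `id` in the sample response appears to be unpadded URL-safe base64 of a JSON array, so the sketch also decodes it for inspection.

```python
import base64
import json
import urllib.request


def search_datasets(base_url="http://127.0.0.1:8000"):
    """POST to /datasets/search and return the parsed JSON response.

    Assumption: an empty JSON object is an acceptable request body.
    """
    request = urllib.request.Request(
        base_url.rstrip("/") + "/datasets/search",
        data=b"{}",
        headers={
            "Content-Type": "application/json",
            "Accept": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def dataset_names(payload):
    """Return the name of every dataset in a /datasets/search response."""
    return [dataset["name"] for dataset in payload.get("datasets", [])]


def decode_id(obfuscated):
    """Decode an unpadded URL-safe base64 id into its JSON components."""
    # Restore the '=' padding that was stripped from the encoded id.
    padded = obfuscated + "=" * (-len(obfuscated) % 4)
    return json.loads(base64.urlsafe_b64decode(padded).decode("utf-8"))


# Parse the sample response shown above without contacting a server.
sample = json.loads(
    '{"datasets":[{"description":"PROFYLE test metadata",'
    '"id":"WyJQUk9GWUxFIl0","name":"PROFYLE"}]}'
)
print(dataset_names(sample))                   # ['PROFYLE']
print(decode_id(sample["datasets"][0]["id"]))  # ['PROFYLE']
```

With the server from the previous step running, `dataset_names(search_datasets())` should list the ingested datasets.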
