GTF parsing
Detailed description of the gtfparse Python project
gtfparse
Parsing tools for GTF (gene transfer format) files.
Example usage
Parsing all rows of a GTF file into a Pandas DataFrame
    from gtfparse import read_gtf

    # returns GTF with essential columns such as "feature", "seqname", "start", "end"
    # alongside the names of any optional keys which appeared in the attribute column
    df = read_gtf("gene_annotations.gtf")

    # filter DataFrame to gene entries on chrY
    df_genes = df[df["feature"] == "gene"]
    df_genes_chrY = df_genes[df_genes["seqname"] == "Y"]
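To make the "essential columns" and "optional keys" in the example above more concrete, here is a minimal standard-library sketch of how a single GTF line breaks down into its nine tab-separated fields and the key/value pairs of the attribute column. It is not gtfparse's implementation: the helper name `parse_gtf_line`, the example line, and its values are all made up for illustration, and the attribute handling is simplified (it assumes no semicolons inside quoted values).

```python
# Illustrative only: a stdlib sketch of the GTF line layout, not gtfparse's code.
GTF_COLUMNS = ["seqname", "source", "feature", "start", "end",
               "score", "strand", "frame", "attribute"]


def parse_gtf_line(line):
    """Split one GTF line into its nine fixed fields plus attribute keys."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != len(GTF_COLUMNS):
        raise ValueError(f"expected 9 tab-separated fields, got {len(fields)}")
    record = dict(zip(GTF_COLUMNS, fields))
    record["start"] = int(record["start"])
    record["end"] = int(record["end"])
    # The attribute column holds `key "value";` pairs separated by semicolons.
    # Simplification: assumes quoted values contain no semicolons.
    for pair in record.pop("attribute").split(";"):
        pair = pair.strip()
        if not pair:
            continue
        key, _, value = pair.partition(" ")
        record[key] = value.strip('"')
    return record


# Hypothetical example line (a made-up gene on chromosome Y).
example_line = (
    'Y\texample_source\tgene\t1000\t2000\t.\t+\t.\t'
    'gene_id "G1"; gene_name "GENE1";'
)
record = parse_gtf_line(example_line)
print(record["seqname"], record["feature"], record["gene_name"])
```

Each attribute key becomes its own field in the record, which mirrors why optional keys such as "gene_name" show up as columns next to "feature" and "seqname" in the parsed table.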
Getting gene FPKM values from a StringTie GTF file
    from gtfparse import read_gtf

    df = read_gtf(
        "stringtie-output.gtf",
        column_converters={"FPKM": float})

    gene_fpkms = {
        gene_name: fpkm
        for (gene_name, fpkm, feature)
        in zip(df["gene_name"], df["FPKM"], df["feature"])
        if feature == "gene"
    }
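The idea behind the converter-and-filter step above can be sketched in plain Python, without gtfparse or a real file. Attribute values start out as strings, a per-column converter turns "FPKM" into a float, and only rows whose feature is "gene" end up in the dictionary. The records, the `apply_converters` helper, and the gene names below are hypothetical and are not real StringTie output.

```python
# Illustrative only: a plain-Python stand-in for the converter-and-filter step.
# These records are made-up values, not real StringTie output.
records = [
    {"feature": "gene", "gene_name": "GENE1", "FPKM": "12.5"},
    {"feature": "transcript", "gene_name": "GENE1", "FPKM": "7.25"},
    {"feature": "gene", "gene_name": "GENE2", "FPKM": "0.0"},
]

# Map column name to a conversion function; other columns stay as strings.
converters = {"FPKM": float}


def apply_converters(record, converters):
    """Return a copy of the record with each listed column converted."""
    return {
        key: converters.get(key, str)(value)
        for key, value in record.items()
    }


converted = [apply_converters(r, converters) for r in records]

# Keep only gene-level rows, keyed by gene name.
gene_fpkms = {
    r["gene_name"]: r["FPKM"]
    for r in converted
    if r["feature"] == "gene"
}
print(gene_fpkms)
```

Without the converter the FPKM values would stay as strings, so numeric comparisons or sorting on them would not behave as expected.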