从pubmed中搜索作者从属关系以获取pubmed id和doi的列表

pubmed-author-affiliation的Python项目详细描述


命令行工具获取一个或一个pubmed id或doi列表, 在PubMed中搜索相应的作者从属关系和 将信息输出到文件

命令行选项

-h, --helpShow help text
-i PUBMEDID, --pubmedid PUBMEDID
Search for author affiliations for single Pubmed ID
-d doi, --doi doi
Search for author affiliations for a single DOI
-f file, --infile file
File with a list of Pubmed IDs and DOIs (they can be mixed). One entry per line.
-x format, --format format
Output format. Choices=[‘json’ (default),’text’]. ‘text’ option produces tab separated table, denormalised in the sense that the pubmed ID/DOI is repeated on multiple rows if there are multiple authors with related affiliations.

示例运行

pubmed输入和json输出:

python pubmedAuthorAffiliation.py -i 27863242

输出:

{'articleTitle': 'Decoding Mammalian Ribosome-mRNA States by Translational GTPase Complexes.', 'journalTitle': 'Cell', 'authorList': [{'firstName': 'n/a', 'institute': 'MRC-LMB', 'lastName': 'Shao', 'affiliation': 'MRC-LMB, Francis Crick Avenue, Cambridge CB2 0QH, UK.', 'country': 'UK', 'initials': 'S'}, {'firstName': 'n/a', 'institute': 'MRC-LMB', 'lastName': 'Murray', 'affiliation': 'MRC-LMB, Francis Crick Avenue, Cambridge CB2 0QH, UK.', 'country': 'UK', 'initials': 'J'}, {'firstName': 'n/a', 'institute': 'MRC-LMB', 'lastName': 'Brown', 'affiliation': 'MRC-LMB, Francis Crick Avenue, Cambridge CB2 0QH, UK.', 'country': 'UK', 'initials': 'A'}, {'firstName': 'n/a', 'institute': 'University of California', 'lastName': 'Taunton', 'affiliation': 'Department of Cellular and Molecular Pharmacology, University of California, San Francisco, San Francisco, CA 94158, USA.', 'country': 'USA', 'initials': 'J'}, {'firstName': 'n/a', 'institute': 'MRC-LMB', 'lastName': 'Ramakrishnan', 'affiliation': 'MRC-LMB, Francis Crick Avenue, Cambridge CB2 0QH, UK. Electronic address: ramak@mrc-lmb.cam.ac.uk.', 'country': 'UK', 'initials': 'V'}, {'firstName': 'n/a', 'institute': 'MRC-LMB', 'lastName': 'Hegde', 'affiliation': 'MRC-LMB, Francis Crick Avenue, Cambridge CB2 0QH, UK. Electronic address: rhegde@mrc-lmb.cam.ac.uk.', 'country': 'UK', 'initials': 'RS'}], 'pubmedId': '27863242', 'error': False}
{'articleTitle': 'Decoding Mammalian Ribosome-mRNA States by Translational GTPase Complexes.', 'journalTitle': 'Cell', 'authorList': [{'firstName': 'n/a', 'institute': 'MRC-LMB', 'lastName': 'Shao', 'affiliation': 'MRC-LMB, Francis Crick Avenue, Cambridge CB2 0QH, UK.', 'country': 'UK', 'initials': 'S'}, {'firstName': 'n/a', 'institute': 'MRC-LMB', 'lastName': 'Murray', 'affiliation': 'MRC-LMB, Francis Crick Avenue, Cambridge CB2 0QH, UK.', 'country': 'UK', 'initials': 'J'}, {'firstName': 'n/a', 'institute': 'MRC-LMB', 'lastName': 'Brown', 'affiliation': 'MRC-LMB, Francis Crick Avenue, Cambridge CB2 0QH, UK.', 'country': 'UK', 'initials': 'A'}, {'firstName': 'n/a', 'institute': 'University of California', 'lastName': 'Taunton', 'affiliation': 'Department of Cellular and Molecular Pharmacology, University of California, San Francisco, San Francisco, CA 94158, USA.', 'country': 'USA', 'initials': 'J'}, {'firstName': 'n/a', 'institute': 'MRC-LMB', 'lastName': 'Ramakrishnan', 'affiliation': 'MRC-LMB, Francis Crick Avenue, Cambridge CB2 0QH, UK. Electronic address: ramak@mrc-lmb.cam.ac.uk.', 'country': 'UK', 'initials': 'V'}, {'firstName': 'n/a', 'institute': 'MRC-LMB', 'lastName': 'Hegde', 'affiliation': 'MRC-LMB, Francis Crick Avenue, Cambridge CB2 0QH, UK. Electronic address: rhegde@mrc-lmb.cam.ac.uk.', 'country': 'UK', 'initials': 'RS'}], 'pubmedId': '27863242', 'error': False}

DOI输入和文本输出:

python pubmedAuthorAffiliation.py -d 10.1016/j.molcel.2016.11.013 -x text

输出:

27867008    Molecular cell  Molecular Structures of Transcribing RNA Polymerase I.  n/a     L       Tafur   European Molecular Biology Laboratory (EMBL), Structural and Computational Biology Unit, Meyerhofstrasse 1, 69117 Heidelberg, Germany.  Germany European Molecular Biology Laboratory (EMBL)
27867008    Molecular cell  Molecular Structures of Transcribing RNA Polymerase I.  n/a     Y       Sadian  European Molecular Biology Laboratory (EMBL), Structural and Computational Biology Unit, Meyerhofstrasse 1, 69117 Heidelberg, Germany.  Germany European Molecular Biology Laboratory (EMBL)
27867008    Molecular cell  Molecular Structures of Transcribing RNA Polymerase I.  n/a     NA      Hoffmann        European Molecular Biology Laboratory (EMBL), Structural and Computational Biology Unit, Meyerhofstrasse 1, 69117 Heidelberg, Germany.  Germany European Molecular Biology Laboratory (EMBL)
27867008    Molecular cell  Molecular Structures of Transcribing RNA Polymerase I.  n/a     AJ      Jakobi  European Molecular Biology Laboratory (EMBL), Structural and Computational Biology Unit, Meyerhofstrasse 1, 69117 Heidelberg, Germany; European Molecular Biology Laboratory (EMBL), Hamburg Unit, Notkestrasse 85, 22607 Hamburg, Germany.     Germany European Molecular Biology Laboratory (EMBL)
27867008    Molecular cell  Molecular Structures of Transcribing RNA Polymerase I.  n/a     R       Wetzel  European Molecular Biology Laboratory (EMBL), Structural and Computational Biology Unit, Meyerhofstrasse 1, 69117 Heidelberg, Germany.  Germany European Molecular Biology Laboratory (EMBL)
27867008    Molecular cell  Molecular Structures of Transcribing RNA Polymerase I.  n/a     WJH     Hagen   European Molecular Biology Laboratory (EMBL), Structural and Computational Biology Unit, Meyerhofstrasse 1, 69117 Heidelberg, Germany.  Germany European Molecular Biology Laboratory (EMBL)
27867008    Molecular cell  Molecular Structures of Transcribing RNA Polymerase I.  n/a     C       Sachse  European Molecular Biology Laboratory (EMBL), Structural and Computational Biology Unit, Meyerhofstrasse 1, 69117 Heidelberg, Germany.  Germany European Molecular Biology Laboratory (EMBL)
27867008    Molecular cell  Molecular Structures of Transcribing RNA Polymerase I.  n/a     CW      Müller  European Molecular Biology Laboratory (EMBL), Structural and Computational Biology Unit, Meyerhofstrasse 1, 69117 Heidelberg, Germany. Electronic address: cmueller@embl.de.    Germany European Molecular Biology Laboratory (EMBL)

混合了doi和pubmed的输入文件。写入文件的文本输出:

python pubmedAuthorAffiliation.py -f emdb-2010.txt -x text > /tmp/out.txt

在这种情况下,将忽略无法识别的行,例如:

WARNING:root:processList: id not recognized: id

代码测试

这将遍历选定的pubmed和已知工作的doi列表:

python test_pubmedAuthorAffiliation.py

欢迎加入QQ群-->: 979659372 Python中文网_新手群

推荐PyPI第三方库


热门话题
java JPA。Eclipselink没有为mySQL提供密码,但它应该提供   我的Servlet和@FormDataParam存在java问题   java将什么作为上下文参数传递到文件I/O方法中?   如果两个值相同,java无法找到其中一个单选按钮   java在变量和方法名中使用下划线   JavaSpringMVC单线程安全?   klazz类的java Arraylist(反射Api)   java如何在数字字符串中查找最频繁的数字?   JavaAPI设计:使数据更易于阅读与强制更多API调用   JavaHadoopMapReduceforGoogleWebGraph   java无法启动gauge API:Runner意外退出   java如何在bluemix上使用ibm工作负载调度器?   拉取一年中某一周特定日期的所有日期   java为什么是我的角节点。js应用程序将图像上传到S3� 邮递员正确上传时的符号?   在不使用任何第三方jar的情况下将文件从本地传输到linux系统(java代码)   java将现有文件夹复制到Eclipse工作区中新创建的项目中   Java中的regex RegExp帮助   当使用“系统”外观时,Java组合框setSelectedItem会出现故障   JavaASM:在类的方法中获取局部变量名和值