Getting affiliations from PubMed with Python

Posted 2024-06-06 09:16:53


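I am writing a Python script (modified from here and shown below) that searches PubMed for the number of papers from a given university and downloads the co-authors' affiliations. When I run the code, instead of the affiliation I get <Element 'Affiliation' at 0x106ea7e50>. Do you know how to fix this? And what should I do to download the affiliations of all authors? Thanks!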

import urllib, urllib2, sys
import xml.etree.ElementTree as ET

def chunker(seq, size):
    return (seq[pos:pos + size] for pos in xrange(0, len(seq), size))

query = '(("University of Copenhagen"[Affiliation]))# AND ("1920"[Publication Date] : "1930"[Publication Date]))'

esearch = 'http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?db=pubmed&mindate=2001&maxdate=2010&retmode=xml&retmax=10000000&term=%s' % (query)
handle = urllib.urlopen(esearch)
data = handle.read()

root = ET.fromstring(data)
ids = [x.text for x in root.findall("IdList/Id")]
print 'Got %d articles' % (len(ids))

for group in chunker(ids, 100):
    efetch = "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?&db=pubmed&retmode=xml&id=%s" % (','.join(group))
    handle = urllib.urlopen(efetch)
    data = handle.read()

    root = ET.fromstring(data)
    for article in root.findall("PubmedArticle"):
        pmid = article.find("MedlineCitation/PMID").text
        year = article.find("MedlineCitation/Article/Journal/JournalIssue/PubDate/Year")
        if year is None: year = 'NA'
        else: year = year.text
        aulist = article.findall("MedlineCitation/Article/AuthorList/Author")
        affiliation = article.find("MedlineCitation/Article/AuthorList/Author/Affiliation")
        print pmid, year, len(aulist), affiliation


1 Answer

User

#1 · Posted 2024-06-06 09:16:53

This happens because the affiliation object refers to an XML element, not to a piece of text. If the string you want is the element's content, like this:

<affiliation>
    your_affiliation_text
</affiliation>

you need to print affiliation.text.
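A minimal sketch of the difference, written for Python 3 (the question's script is Python 2) and run against a small hand-written XML fragment rather than a live PubMed response. The fragment, the author names, and the helper affiliations_per_author are illustrative assumptions. The element paths mirror the ones used in the question, and iter("Affiliation") is used so that affiliations nested deeper under an author are found as well, which also covers the "all authors" part of the question.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment shaped like the paths used in the question.
SAMPLE = """
<PubmedArticle>
  <MedlineCitation>
    <PMID>12345</PMID>
    <Article>
      <AuthorList>
        <Author>
          <LastName>Hansen</LastName>
          <Affiliation>University of Copenhagen</Affiliation>
        </Author>
        <Author>
          <LastName>Jensen</LastName>
          <AffiliationInfo>
            <Affiliation>Aarhus University</Affiliation>
          </AffiliationInfo>
        </Author>
        <Author>
          <LastName>Nielsen</LastName>
        </Author>
      </AuthorList>
    </Article>
  </MedlineCitation>
</PubmedArticle>
"""


def affiliations_per_author(article):
    """Return one list of affiliation strings per Author element."""
    result = []
    for author in article.findall("MedlineCitation/Article/AuthorList/Author"):
        # iter() walks all descendants, so nested Affiliation elements count too.
        texts = [aff.text.strip() for aff in author.iter("Affiliation") if aff.text]
        result.append(texts)
    return result


article = ET.fromstring(SAMPLE)
first = article.find("MedlineCitation/Article/AuthorList/Author/Affiliation")

print(first)       # the Element object itself, as in the question
print(first.text)  # the string stored inside the element
print(affiliations_per_author(article))
```

An author without any affiliation simply gets an empty list, so the result stays aligned with the author list.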

If the string you want is instead stored in an attribute of the element, you should use affiliation.attrib[name], where name is the attribute's name.
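For the attribute case, a short Python 3 sketch. The element and its name attribute below are made up for illustration and are not taken from PubMed's schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical element whose value lives in an attribute, not in its text.
elem = ET.fromstring('<affiliation name="your_affiliation_text" />')

print(elem.text)            # None: there is no text inside the element
print(elem.attrib["name"])  # the attribute value
print(elem.get("name"))     # same value; get() returns None if the attribute is missing
```

Using get() rather than attrib[...] avoids a KeyError when an element lacks the attribute.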
