Parsing a span class with BeautifulSoup (or, more precisely, XPath)

Posted 2024-04-16 22:40:32


I have:

try:
    page = requests.get(Scrape.site_to_scrape['git']+gitUser)
    tree = urllib.urlopen(page).read()
    soup = BS(response)
    parse_git_full_name = soup.find("span", {"class":"vcard-fullname"}).get_text()
    return parse_git_full_name

except:
    print "Syntax: python site_scrape.py -g <git user name here>"

However, it always ends up in the except: branch.

I am trying to parse an element like this:

<span class="vcard-fullname" itemprop="name">The name</span>

I want to get the value between the <span> tags.


1 answer
User
#1 · Posted 2024-04-16 22:40:32

I solved this by using XPath with a single selector instead. Hopefully this helps anyone else who is pulling their hair out over BeautifulSoup selectors.

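import requests
from lxml import html

try:
    # Scrape.site_to_scrape and gitUser come from the surrounding script
    page = requests.get(Scrape.site_to_scrape['git'] + gitUser)
    tree = html.fromstring(page.text)

    # xpath() returns a list of matching text nodes, not a single string
    full_name = tree.xpath('//span[@class="vcard-fullname"]/text()')

    print('Full Name: {}'.format(full_name))

except Exception:
    # Note: this also hides network and parsing errors behind the usage message
    print("Syntax: python site_scrape.py -g <git user name here>")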
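
To check the selector without touching the network, here is a minimal sketch that runs the same kind of attribute predicate against the sample element from the question. It is an illustration only, not the code from the answer: it swaps in the standard library's xml.etree.ElementTree for lxml so that nothing needs to be installed, and the names SAMPLE and extract_full_name are made up for this example. ElementTree supports only a subset of XPath and requires well-formed markup, so a real page would most likely still need lxml's html.fromstring as in the answer.

```python
import xml.etree.ElementTree as ET

# Sample element from the question, wrapped in a <div> so the snippet has one root.
# SAMPLE and extract_full_name are hypothetical names used only for this sketch.
SAMPLE = '<div><span class="vcard-fullname" itemprop="name">The name</span></div>'


def extract_full_name(markup):
    """Return the text of the first span whose class is exactly 'vcard-fullname', or None."""
    root = ET.fromstring(markup)
    # ElementTree's limited XPath subset includes [@attrib='value'] predicates.
    span = root.find(".//span[@class='vcard-fullname']")
    if span is None:
        return None
    return span.text


print(extract_full_name(SAMPLE))
```

Returning None when nothing matches keeps a missing element distinguishable from a failed request, which a blanket except: block would otherwise hide.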
