BeautifulSoup: 'NoneType' error that is only raised some of the time

Posted 2024-06-16 09:34:16


import requests
from bs4 import BeautifulSoup as bs


tagsnumber = 0
even = 0
descdict_counter = -1
linkdict = {}

w3schools = requests.get('http://www.w3schools.com/tags/default.asp').text
table = bs(w3schools, "lxml").tbody
tdlist = table('td')  # to find the descriptions
alist = table('a')    # to get all the links
for link in alist:
    descdict_counter += 2  # step to the second td of each row, which holds the description
    fulllink = str('http://www.w3schools.com/tags/' + link.get('href'))
    shortdesc = str(tdlist[descdict_counter].string)
    key_iter = {str(link.string): fulllink}
    linkdict.update(key_iter)
    tagsnumber += 1
print('Total tags imported: ' + str(tagsnumber))
print(linkdict)

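Guys, help. I really don't get it. The problem is that I get the fairly common 'NoneType' error, but not always. Sometimes this code actually produces a result. How is that possible? The error: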

Traceback (most recent call last):
  File "main.py", line 21, in <module>
    tdlist = table('td')
TypeError: 'NoneType' object is not callable

If you can, please also comment on any silly or redundant code.


1 answer
User
#1 · Posted 2024-06-16 09:34:16

I agree with what @DYZ said. Beyond that, you might be interested in an alternative to BeautifulSoup that sometimes gives you simpler solutions through XPath expressions: lxml.

>>> import requests
>>> w3schools = requests.get('http://www.w3schools.com/tags/default.asp').text
>>> from lxml import html
>>> tree = html.fromstring(w3schools)
>>> links = tree.xpath('//table[@class="w3-table-all notranslate"]//a')
>>> len(links)
119
>>> descrips = tree.xpath('//table[@class="w3-table-all notranslate"]//td[2]')
>>> len(descrips)
119
>>> links[0].attrib
{'href': 'tag_comment.asp'}
>>> descrips[0].text
'Defines a comment'

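Edit: I almost forgot. Your code depends on the presence of a tbody tag. Browsers will happily display a page whose tables lack this tag, so it is often omitted, even today when there is little excuse for doing so. If I'm not mistaken, its absence will make your code fail: bs(...).tbody returns None when no tbody is found, and calling table('td') on None raises exactly the TypeError you see.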
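
To make the tbody point concrete, here is a minimal sketch of a more defensive version. It assumes the intermittent failure comes from a response in which no tbody (or no table at all) is found, which I have not verified against the live site. The helper name extract_links and the sample HTML strings are made up for illustration. It looks up the table itself instead of tbody, checks for None before using the result, and walks the rows so that each link is read from its own row.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup; the real page will differ.
WITH_TBODY = (
    "<table><tbody><tr><td><a href='tag_a.asp'>&lt;a&gt;</a></td>"
    "<td>Defines a hyperlink</td></tr></tbody></table>"
)
WITHOUT_TBODY = (
    "<table><tr><td><a href='tag_a.asp'>&lt;a&gt;</a></td>"
    "<td>Defines a hyperlink</td></tr></table>"
)
NO_TABLE = "<p>Service unavailable</p>"


def extract_links(html, base_url="http://www.w3schools.com/tags/"):
    """Return {link text: absolute URL}, or {} if the page has no table."""
    soup = BeautifulSoup(html, "html.parser")
    # Search for the table rather than tbody, since tbody may be missing.
    table = soup.find("table")
    if table is None:
        # Nothing to parse, e.g. an error page came back instead.
        return {}
    links = {}
    for row in table.find_all("tr"):
        anchor = row.find("a")
        if anchor is None or anchor.get("href") is None:
            continue  # header rows or rows without a link
        links[anchor.get_text(strip=True)] = base_url + anchor["href"]
    return links


if __name__ == "__main__":
    # soup.tbody is None when the markup has no tbody element.
    print(BeautifulSoup(WITHOUT_TBODY, "html.parser").tbody)
    print(extract_links(WITH_TBODY))
    print(extract_links(WITHOUT_TBODY))
    print(extract_links(NO_TABLE))
```

Returning an empty dict when there is no table is just one choice; raising an exception with a clear message may suit you better if a missing table should stop the script.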
