NameError: 'htmltext' is not defined

0 votes

1 answer

2141 views

Asked 2025-04-29 18:56

I get an error when I run this script:

import urllib.request
import urllib.parse
from bs4 import BeautifulSoup

url = "http://nytimes.com,http://nytimes.com"

urls = [url]  # stack of urls to scrape
visited = [url]  # historic record of urls

while len(urls) > 0:
    try:
        htmltext = urllib.request.urlopen(urls[0]).read()
    except:
        print(htmltext)

Original script:

import urllib.request
import urllib.parse
from bs4 import BeautifulSoup

url = "http://nytimes.com,http://nytimes.com"

urls = [url]  # stack of urls to scrape
visited = [url]  # historic record of urls

while len(urls) > 0:
    try:
        htmltext = urllib.request.urlopen(urls[0]).read()
    except:
        print(urls[0])
    soup = BeautifulSoup(htmltext)

    urls.pop(0)

    print(soup.findAll('a', href=True))

Error message:

socket.gaierror: [Errno -2] Name or service not known

urllib.error.URLError: <urlopen error [Errno -2] Name or service not known>

Traceback (most recent call last):

NameError: name 'htmltext' is not defined


1 Answer

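If urllib.request.urlopen() raises an exception, the assignment to htmltext never happens. Any later reference to htmltext, whether print(htmltext) in the except block or BeautifulSoup(htmltext) after it, then raises NameError.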
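One way to avoid touching htmltext when the download failed is to skip the rest of the loop body with continue. The following is only a sketch of that idea, not the asker's final crawler: the function name fetch_pages and its fetch parameter are hypothetical, added here so the loop can be tried without network access.

```python
import urllib.request


def fetch_pages(urls, fetch=None):
    """Return {url: bytes} for every URL that could be downloaded."""
    if fetch is None:
        # Default fetcher: the same urlopen().read() call as in the question.
        def fetch(url):
            with urllib.request.urlopen(url, timeout=10) as response:
                return response.read()

    pages = {}
    queue = list(urls)
    while queue:
        current = queue.pop(0)  # remove the URL first so the loop always ends
        try:
            htmltext = fetch(current)
        except (OSError, ValueError) as exc:
            # urllib.error.URLError is a subclass of OSError.
            # htmltext was never assigned here, so report the URL instead.
            print(f"could not fetch {current!r}: {exc}")
            continue  # skip the code below that needs htmltext
        pages[current] = htmltext
    return pages
```

Catching specific exception types instead of a bare except also keeps unrelated bugs, such as a NameError, from being silently swallowed.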

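As for why urlopen() fails: make sure you pass it a single valid URL. The string "http://nytimes.com,http://nytimes.com" is two URLs joined by a comma, so the host name cannot be resolved, which is what the socket.gaierror reports.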
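If the comma-separated string was meant to hold several start URLs, it can be split into a list before the loop. This is a minimal sketch under that assumption; split_urls is a hypothetical helper, and splitting on commas would break any URL that legitimately contains one.

```python
from urllib.parse import urlparse


def split_urls(raw):
    """Split a comma-separated string into a list of well-formed http(s) URLs."""
    valid = []
    for candidate in raw.split(","):
        candidate = candidate.strip()
        parts = urlparse(candidate)
        # Keep only entries with an http(s) scheme and a non-empty host.
        if parts.scheme in ("http", "https") and parts.netloc:
            valid.append(candidate)
    return valid


urls = split_urls("http://nytimes.com,http://nytimes.com")
print(urls)  # ['http://nytimes.com', 'http://nytimes.com']
```

The resulting list can then replace urls = [url] in the script, so that each urlopen() call receives exactly one URL.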

Answered 2025-04-29 by Python大师
