Hi, my code doesn't work when it runs live on the server: when I use find, it returns None.
How can I fix this?
Here is my code:
import time
import sys
import urllib
import re
from bs4 import BeautifulSoup, NavigableString
print "Initializing Python Script"
print "The passed arguments are "
urls = ["http://tweakers.net/pricewatch/355474/gigabyte-gv-n78toc-3g/specificaties/", "http://tweakers.net/pricewatch/328943/sapphire-radeon-hd-7950-3gb-gddr5-with-boosts/specificaties/", "https://www.alternate.nl/GIGABYTE/GV-N78TOC-3GD-grafische-kaart/html/product/1115798", "http://tweakers.net/pricewatch/320116/raspberry-pi-model-b-(512mb)/specificaties/"]
i =0
regex = '<title>(.+?)</title>'
pattern = re.compile(regex)
word = "tweakers"
alternate = "alternate"
while i<len(urls):
    dataraw = urllib.urlopen(urls[i])
    data = dataraw.read()
    soup = BeautifulSoup(data)
    table = soup.find("table", {"class" : "spec-detail"})
    print table
    i+=1
The result is:
Initializing Python Script
The passed arguments are
None
None
None
None
Script finalized
I tried findAll and other methods, but I can't figure out why it works on my command line and not on the server itself. Any help?
Edit:
Traceback (most recent call last):
  File "python_script.py", line 35, in <module>
    soup = BeautifulSoup(urllib2.urlopen(url), 'html.parser')
  File "/usr/lib/python2.7/urllib2.py", line 126, in urlopen
    return _opener.open(url, data, timeout)
  File "/usr/lib/python2.7/urllib2.py", line 406, in open
    response = meth(req, response)
  File "/usr/lib/python2.7/urllib2.py", line 519, in http_response
    'http', request, response, code, msg, hdrs)
  File "/usr/lib/python2.7/urllib2.py", line 444, in error
    return self._call_chain(*args)
  File "/usr/lib/python2.7/urllib2.py", line 378, in _call_chain
    result = func(*args)
  File "/usr/lib/python2.7/urllib2.py", line 527, in http_error_default
    raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
urllib2.HTTPError: HTTP Error 403: Forbidden
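I suspect you are running into the differences between parsers.
Specifying the parser explicitly worked for me. I used html.parser, but you can experiment and specify lxml or html5lib instead.
Note that the third URL does not contain a table with class="spec-detail", so None is printed for it.
I also made a few improvements, including replacing urllib with urllib2.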
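A minimal sketch of naming the parser explicitly, so the result does not depend on which parser libraries happen to be installed on a given machine. It is written for Python 3 and assumes bs4 is installed. SAMPLE_HTML and find_spec_table are hypothetical names used only for illustration, and the inline HTML stands in for a downloaded page:

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for a downloaded page; a real script would pass
# the bytes returned by urlopen(...).read() instead.
SAMPLE_HTML = """
<html><body>
  <table class="spec-detail"><tr><td>Memory</td><td>3 GB</td></tr></table>
</body></html>
"""


def find_spec_table(html, parser="html.parser"):
    """Parse html with an explicitly named parser; return the spec table or None."""
    soup = BeautifulSoup(html, parser)
    return soup.find("table", {"class": "spec-detail"})


table = find_spec_table(SAMPLE_HTML)
if table is None:
    # find() returns None when no matching tag exists, as for the third URL.
    print("No table with class 'spec-detail' on this page")
else:
    print(table.get_text(" ", strip=True))
```

Passing "lxml" or "html5lib" as the second argument should work the same way, provided that parser is installed on the machine running the script.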
You can also use a different HTTP module and set an appropriate User-Agent header to pretend to be a real browser.
Hope that helps.
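A rough sketch of sending a browser-like User-Agent header. The module the answer refers to is not named in the text, so this uses the Python 3 standard library's urllib.request (the successor to urllib2) as an assumption. USER_AGENT, build_request and fetch are hypothetical names, and there is no guarantee that a particular site accepts this header value:

```python
import urllib.error
import urllib.request

# Illustrative browser-like value; any realistic User-Agent string could be used.
USER_AGENT = "Mozilla/5.0 (X11; Linux x86_64)"


def build_request(url):
    """Return a Request that carries a browser-like User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": USER_AGENT})


def fetch(url, timeout=10):
    """Return the response body as bytes, or None on an HTTP error status."""
    try:
        with urllib.request.urlopen(build_request(url), timeout=timeout) as response:
            return response.read()
    except urllib.error.HTTPError as err:
        # For example 403 Forbidden, as in the traceback in the question.
        print("HTTP error %d for %s" % (err.code, url))
        return None
```

The bytes returned by fetch(url) can then be passed to BeautifulSoup together with an explicit parser name. A None return value means the server rejected the request, so it should be checked before parsing.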