Python: Mechanize and BeautifulSoup not working on a shared host
I am writing a small tool whose goal is to make my local airport's website available as standard HTML.
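On my own computer, I use the Python packages mechanize and BeautifulSoup to fetch and parse the site's content, and everything seems to work fine. I installed these packages with apt-get.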
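On my shared host (at DreamHost), I downloaded the .tar.gz files, extracted the packages, renamed the folders (for example, from BeautifulSoup-3.1.0.tar.gz to BeautifulSoup), and then tried to run the commands.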
However, I get a strange error related to BeautifulSoup. I don't know whether it is caused by the old Python version on DreamHost, by the folder names, or by something else.
[sanjose]$ python
Python 2.4.4 (#2, Jan 24 2010, 11:50:13)
[GCC 4.1.2 20061115 (prerelease) (Debian 4.1.1-21)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from BeautifulSoup import BeautifulSoup
>>> import mechanize
>>> url='http://www.iaa.gov.il/Rashat/he-IL/Airports/BenGurion/informationForTravelers/OnlineFlights.aspx?flightsType=arr'
>>> br=mechanize.Browser()
>>> br.addheaders = [('User-agent', 'Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)')]
>>> r=br.open(url)
>>> html=r.read()
>>> type(html)
<type 'str'>
I did this to show that the input really is a string. Now let's run the command that works fine on my local computer:
>>> soup = BeautifulSoup.BeautifulSoup(html)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "/home/adamatan/matan.name/natbug/BeautifulSoup/BeautifulSoup.py", line 1493, in __init__
    BeautifulStoneSoup.__init__(self, *args, **kwargs)
  File "/home/adamatan/matan.name/natbug/BeautifulSoup/BeautifulSoup.py", line 1224, in __init__
    self._feed(isHTML=isHTML)
  File "/home/adamatan/matan.name/natbug/BeautifulSoup/BeautifulSoup.py", line 1257, in _feed
    self.builder.feed(markup)
  File "/usr/lib/python2.4/HTMLParser.py", line 108, in feed
    self.goahead(0)
  File "/usr/lib/python2.4/HTMLParser.py", line 148, in goahead
    k = self.parse_starttag(i)
  File "/usr/lib/python2.4/HTMLParser.py", line 268, in parse_starttag
    self.handle_starttag(tag, attrs)
  File "/home/adamatan/matan.name/natbug/BeautifulSoup/BeautifulSoup.py", line 1011, in handle_starttag
    self.soup.unknown_starttag(name, attrs)
  File "/home/adamatan/matan.name/natbug/BeautifulSoup/BeautifulSoup.py", line 1408, in unknown_starttag
    tag = Tag(self, name, attrs, self.currentTag, self.previous)
  File "/home/adamatan/matan.name/natbug/BeautifulSoup/BeautifulSoup.py", line 525, in __init__
    self.attrs = map(convert, self.attrs)
  File "/home/adamatan/matan.name/natbug/BeautifulSoup/BeautifulSoup.py", line 524, in <lambda>
    val))
  File "/usr/lib/python2.4/sre.py", line 142, in sub
    return _compile(pattern, 0).sub(repl, string, count)
TypeError: expected string or buffer
Any ideas?
Adam
1 Answer
You are using BeautifulSoup 3.1.0, which is meant for Python 3.x. If you are on Python 2.x, use a 3.0 version of BeautifulSoup instead.
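The pairing rule in this answer can be written down as a small check. This is only a rough sketch: the helper names recommended_series and is_mismatch are made up for illustration and are not part of BeautifulSoup, and the version string "3.0.8" is just an example of a 3.0-series release. It avoids newer syntax so it should also run on an old interpreter like the 2.4 one in the question.

```python
def recommended_series(python_version):
    """Map a (major, minor) Python version to the BeautifulSoup 3 series
    suggested in the answer: 3.1 for Python 3.x, 3.0 for Python 2.x.
    (Hypothetical helper, not a BeautifulSoup API.)"""
    if python_version[0] >= 3:
        return "3.1"
    return "3.0"


def is_mismatch(bs_version, python_version):
    """Return True if bs_version (for example "3.1.0") falls outside the
    series suggested for the given Python version."""
    series = ".".join(bs_version.split(".")[:2])
    return series != recommended_series(python_version)


# The combination from the question: BeautifulSoup 3.1.0 on Python 2.4.
print(is_mismatch("3.1.0", (2, 4)))
# A 3.0-series release (example version string) on the same interpreter.
print(is_mismatch("3.0.8", (2, 4)))
```

To check the interpreter you are actually running, pass sys.version_info[:2] as the second argument.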