Python urllib2+Beautifulsoup

Posted 2024-04-18 14:36:20

I'm trying to add BeautifulSoup to my current Python project. To keep things simple and clear, I've stripped my current script down to the essentials.

The script without BeautifulSoup:

import urllib2

    def check(self, name, proxy):
        urllib2.install_opener(
            urllib2.build_opener(
                urllib2.ProxyHandler({'http': 'http://%s' % proxy}),
                urllib2.HTTPHandler()
                )
            )

        req = urllib2.Request('http://example.com' ,"param=1")
        try:
            resp = urllib2.urlopen(req) 
        except:
            self.insert()
        try:
            if 'example text' in resp.read()
               print 'success'

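Of course the indentation is off; this is only a rough sketch of what I have. Put simply, I send a POST request to example.com, and if the response contains "example text", the script prints success.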

But what I actually want is to check for

^{pr2}$

and then print the text inside the right-aligned td from the example.com response, using

soup.find_all('td', {'align':'right'})[4]

The way I've added BeautifulSoup so far doesn't work. Here is an example:

import urllib2
from bs4 import BeautifulSoup as soup

main_div = soup.find_all('td', {'align':'right'})[4]

    def check(self, name, proxy):
        urllib2.install_opener(
            urllib2.build_opener(
                urllib2.ProxyHandler({'http': 'http://%s' % proxy}),
                urllib2.HTTPHandler()
                )
            )

        req = urllib2.Request('http://example.com' ,"param=1")
        try:
            resp = urllib2.urlopen(req) 
            web_soup = soup(urllib2.urlopen(req), 'html.parser')
        except:
            self.insert()
        try:
            if 'example text' in resp.read()
               print 'success' + main_div

As you can see, I added 4 new lines/tweaks:

from bs4 import BeautifulSoup as soup

web_soup = soup(urllib2.urlopen(url), 'html.parser')

main_div = soup.find_all('td', {'align':'right'})[4]

as well as " + main_div " on the print statement.

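But it doesn't seem to work. While tweaking it I ran into several errors, including "local variable referenced before assignment" and "unbound method find_all must be called with BeautifulSoup instance as first argument".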

1 answer

#1 · Posted 2024-04-18 14:36:20

Regarding the last code snippet:

from bs4 import BeautifulSoup as soup

web_soup = soup(urllib2.urlopen(url), 'html.parser')
main_div = soup.find_all('td', {'align':'right'})[4]

You should call find_all on the web_soup instance, not on the soup class. Also make sure the url variable is defined before you use it:

^{pr2}$
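
To illustrate calling find_all on the instance, here is a minimal sketch. It assumes Python 3 with bs4 installed, and it parses an inline HTML string instead of fetching a page, so it runs without a network call or proxy. SAMPLE_HTML, extract_cell, marker and index are hypothetical names made up for this example; they are not from the original script.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the page body. In the real script this would be
# the text read once from the response object.
SAMPLE_HTML = """
<html><body>
<p>example text</p>
<table><tr>
  <td align="right">a</td>
  <td align="right">b</td>
  <td align="left">skip me</td>
  <td align="right">c</td>
  <td align="right">d</td>
  <td align="right">e</td>
</tr></table>
</body></html>
"""


def extract_cell(html, marker="example text", index=4):
    """Return the text of the index-th right-aligned <td>, or None.

    None is returned when the marker text is missing from the page or when
    the page has fewer matching cells than expected.
    """
    if marker not in html:
        return None
    # Build the soup *instance* first, then call find_all on that instance,
    # not on the BeautifulSoup class itself.
    web_soup = BeautifulSoup(html, "html.parser")
    cells = web_soup.find_all("td", {"align": "right"})
    if len(cells) <= index:
        return None
    return cells[index].get_text(strip=True)


if __name__ == "__main__":
    print(extract_cell(SAMPLE_HTML))
```

Two details may matter in the original script. First, main_div has to be computed inside check, after web_soup exists; at module level there is nothing to search yet. Second, a response body can generally be read only once, so it is probably safer to read it into a variable and pass that same string to both the text check and the parser, rather than calling urlopen twice. Assigning resp inside a try block with a bare except and then using it afterwards is also a likely source of the "referenced before assignment" error, since resp is never set when the request fails.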
