Referer header not working in Python 2

Posted 2024-04-26 19:05:29


I want to add a Referer header when retrieving data from the web, but in my Python 2 code the call request.add_header('Referer', 'https://www.python.org') has no effect.

Contents of my Url.txt file:

https://www.python.org/about/
https://stackoverflow.com/questions
https://docs.python.org/2.7/

Here is my code:

#!/usr/bin/env python
# -*- coding: utf-8 -*-

import re
import urllib2
import threading
import time
import requests

max_thread = 5
urllist = open("Url.txt").readlines()

def url_connect(url):
    try :
        request = urllib2.Request(url)
        request.add_header('Referer', 'https://www.python.org')
        request.add_header('User-agent', 'Mozilla/5.0')  
        goo = re.findall('<title>(.*?)</title>', urllib2.urlopen(url.replace(' ','')).read())[0]
        print '\n' + goo.decode("utf-8")
        with open('SaveMyDataFile.txt', 'ab') as f:
            f.write(goo + "\n")

    except Exception as Errors:
        pass

for i in urllist:
    i = i.strip()    

    if i.startswith("http"):        

        while threading.activeCount() >= max_thread:
            time.sleep(0.1)

        threading.Thread(target=url_connect, args=(i,)).start()

2 answers

https://docs.python.org/2/library/urllib2.html#urllib2.urlopen

Open the URL url, which can be either a string or a Request object.

You need to pass the Request object you just built to urllib2.urlopen(). At the moment you build it, add the headers to it, and then never use it.

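It looks to me like the problem is in your call to urlopen. You are calling it with url (the plain string), not with request, so the headers you added are never sent.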
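
Putting both answers together, a minimal sketch of the fix might look like the following. The helper names build_request and fetch_title and the opener parameter are made up for illustration and are not in the original code. The try/except import is also an assumption, added only so the same sketch runs on Python 3, where urllib2 became urllib.request. Threading and file writing are left out to keep the focus on the urlopen call.

```python
import re

try:
    import urllib2  # Python 2, as in the question
except ImportError:
    # Assumption: fall back to the Python 3 equivalent so the sketch still runs.
    import urllib.request as urllib2


def build_request(url):
    """Return a Request for url carrying the Referer and User-agent headers."""
    request = urllib2.Request(url.replace(' ', ''))
    request.add_header('Referer', 'https://www.python.org')
    request.add_header('User-agent', 'Mozilla/5.0')
    return request


def fetch_title(url, opener=urllib2.urlopen):
    """Fetch url and return the text of its <title>, or None if there is none.

    opener is a hypothetical hook so the function can be tried without network.
    """
    request = build_request(url)
    # The actual fix: pass the Request object, not the bare URL string,
    # so the headers added above are sent with the request.
    body = opener(request).read()
    titles = re.findall(b'<title>(.*?)</title>', body)
    return titles[0].decode('utf-8') if titles else None
```

In the original url_connect, the equivalent change is to call urllib2.urlopen(request) instead of urllib2.urlopen(url.replace(' ','')), moving the replace call into the Request constructor.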
