I want to add a Referer header when retrieving data from the web, but in my Python 2 code request.add_header('Referer', 'https://www.python.org') does not work.
Contents of my Url.txt file:
https://www.python.org/about/
https://stackoverflow.com/questions
https://docs.python.org/2.7/
Here is my code:
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import re
import urllib2
import threading
import time
import requests

max_thread = 5
urllist = open("Url.txt").readlines()

def url_connect(url):
    try:
        request = urllib2.Request(url)
        request.add_header('Referer', 'https://www.python.org')
        request.add_header('User-agent', 'Mozilla/5.0')
        goo = re.findall('<title>(.*?)</title>', urllib2.urlopen(url.replace(' ','')).read())[0]
        print '\n' + goo.decode("utf-8")
        with open('SaveMyDataFile.txt', 'ab') as f:
            f.write(goo + "\n")
    except Exception as Errors:
        pass

for i in urllist:
    i = i.strip()
    if i.startswith("http"):
        while threading.activeCount() >= max_thread:
            time.sleep(0.1)
        threading.Thread(target=url_connect, args=(i,)).start()
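From https://docs.python.org/2/library/urllib2.html#urllib2.urlopen: you need to pass the Request object you just built to urllib2.urlopen(). At the moment you are not doing anything with it.

In my view the problem is in your call to urlopen. You call it with the URL, not the request, so the headers you added are never sent.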
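
A minimal sketch of the fix both answers describe: build the Request, add the headers, and hand that same object to urlopen. The helper names build_request and fetch_title are made up for illustration and are not from the question. The try/except import is only there so the sketch also runs on Python 3, where urllib2 became urllib.request.

```python
import re

try:
    import urllib2 as urlrequest  # Python 2, as used in the question
except ImportError:
    import urllib.request as urlrequest  # Python 3 name for the same module


def build_request(url):
    """Hypothetical helper: build a Request carrying the two headers."""
    request = urlrequest.Request(url.replace(' ', ''))
    request.add_header('Referer', 'https://www.python.org')
    request.add_header('User-agent', 'Mozilla/5.0')
    return request


def fetch_title(url):
    """Hypothetical helper: open the Request object, not the bare URL."""
    # Passing the Request (instead of the url string) is what sends the headers.
    response = urlrequest.urlopen(build_request(url))
    html = response.read().decode('utf-8', 'replace')
    titles = re.findall('<title>(.*?)</title>', html)
    return titles[0] if titles else None
```

Inside url_connect you could then call fetch_title(url) in place of the re.findall(...) line. It would also help to print the exception in the except block instead of using pass, so that a failed request does not go unnoticed.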