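I want to check Troy Hunt's website "https://haveibeenpwned.com/Passwords" to see whether he has published a new password file. To do that, I download the page and want to search it for a string that tells me the current version of the file. The files are always named following the pattern …v5.7z, where the number after the v is the version.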
# -*- coding: utf-8 -*-
# Python 2 script: fetch the page and keep its HTML in the_page.
from urllib2 import Request, urlopen, URLError, HTTPError

someurl = 'https://haveibeenpwned.com/Passwords'
req = Request(someurl, headers={'User-Agent': 'Mozilla/5.0'})
the_page = None  # stays None if the request fails

try:
    response = urlopen(req)
except HTTPError as e:
    # The server answered, but with an error status.
    print 'The server couldn\'t fulfill the request.'
    print 'Error code: ', e.code
except URLError as e:
    # The server could not be reached at all.
    print 'We failed to reach a server.'
    print 'Reason: ', e.reason
else:
    print "everything is fine"
    # Reuse the response opened above instead of requesting the page a second time.
    the_page = response.read()
    print(the_page)
The variable the_page now holds the entire HTML code of the page. How can I search it?
I am not allowed to use BeautifulSoup or any other parser.
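One possible approach that avoids any parser is a regular expression from the standard `re` module. The sketch below is only an illustration under stated assumptions: `sample_page`, the `example.invalid` URLs, the file names in it, and the helper names `find_versions` and `latest_version` are all hypothetical, and the only thing taken from the question is that the archive names end in `v<number>.7z`.

```python
import re

# Hypothetical stand-in for the_page. The real file names and URLs on the
# site are not known here, so these links are made up for illustration.
sample_page = '''
<a href="https://example.invalid/files/some-list-v4.7z">old</a>
<a href="https://example.invalid/files/some-list-v5.7z">new</a>
<a href="https://example.invalid/files/other-list-v5.7z">new</a>
'''

# A run of characters that cannot end an attribute value or tag,
# followed by "v", one or more digits, and the ".7z" extension.
ARCHIVE_RE = re.compile(r'[^\s"\'<>]*v(\d+)\.7z')


def find_versions(html):
    """Return the distinct version numbers found in archive names, ascending."""
    return sorted({int(m.group(1)) for m in ARCHIVE_RE.finditer(html)})


def latest_version(html):
    """Return the highest version number found, or None if nothing matched."""
    versions = find_versions(html)
    return versions[-1] if versions else None


print(find_versions(sample_page))
print(latest_version(sample_page))
```

In the Python 2 script above, `the_page` is a plain `str`, so it could be passed to `latest_version(the_page)` directly. Under Python 3 the response body is `bytes` and would need decoding first. Comparing the result with the last version you have seen would then tell you whether a new file has been published.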