from bs4 import BeautifulSoup
from urllib2 import urlopen
import urllib

# Run this image scraper from the directory that
# you want the scraped images saved to.

def make_soup(url):
    html = urlopen(url).read()
    return BeautifulSoup(html)

def get_images(url):
    soup = make_soup(url)
    # this makes a list of bs4 element tags
    images = [img for img in soup.findAll('img')]
    print str(len(images)) + " images found."
    print 'Downloading images to current working directory.'
    # compile our unicode list of image links
    image_links = [each.get('src') for each in images]
    for each in image_links:
        filename = each.split('/')[-1]
        urllib.urlretrieve(each, filename)
    return image_links

# a standard call looks like this:
# get_images('http://www.wookmark.com')
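The heart of the scraper above is the parsing step, findAll('img'). For illustration only (this is not part of the original answer), the same extraction can be sketched in Python 3 with nothing but the standard library's html.parser, which is handy when BeautifulSoup is not installed:

```python
from html.parser import HTMLParser

class ImgSrcParser(HTMLParser):
    """Collects the src attribute of every <img> tag seen."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag
        if tag == 'img':
            src = dict(attrs).get('src')
            if src:
                self.links.append(src)

parser = ImgSrcParser()
parser.feed('<html><body><img src="a.jpg"><img src="b.png" alt="x"></body></html>')
print(parser.links)  # ['a.jpg', 'b.png']
```

BeautifulSoup remains the more forgiving choice on real-world, malformed HTML; the sketch above only handles what html.parser can tokenize.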
Python 2
If you only want to save the image as a file, here is a simpler method:

The second argument is the local path where the file will be saved.
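The snippet this sentence refers to did not survive extraction; it was presumably Python 2's urllib.urlretrieve. A minimal sketch, with the URL and filename as placeholders, and a file:// URL standing in for a real http:// image link so the call can run without network access:

```python
try:
    from urllib import urlretrieve          # Python 2 location
except ImportError:
    from urllib.request import urlretrieve  # Python 3 moved it here

import tempfile

# Stand-in for a real image on the web:
src = tempfile.NamedTemporaryFile(suffix=".jpg", delete=False)
src.write(b"fake image bytes")
src.close()

# urlretrieve(url, local_path): the second argument is where the file is saved.
urlretrieve("file://" + src.name, "local-filename.jpg")
print(open("local-filename.jpg", "rb").read())  # b'fake image bytes'
```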
Python 3
As SergO suggested, the code below should work with Python 3.
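SergO's snippet itself was lost in extraction; it presumably called urllib.request.urlretrieve and saved the result as file01.jpg. A hedged sketch, again using a file:// URL in place of the original answer's real image URL so it runs offline:

```python
import tempfile
import urllib.request

# Stand-in for the image URL used in the original answer:
src = tempfile.NamedTemporaryFile(suffix=".jpg", delete=False)
src.write(b"fake jpeg bytes")
src.close()

# Download the URL and save it to file01.jpg in the working directory.
urllib.request.urlretrieve("file://" + src.name, "file01.jpg")
print(open("file01.jpg", "rb").read())  # b'fake jpeg bytes'
```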
file01.jpg will contain your image.

I wrote a script that does just this, and it is available on my github for your use.
I utilized BeautifulSoup to allow me to parse any website for images. If you will be doing a lot of web scraping (or intend to use my tool), I suggest you

sudo pip install BeautifulSoup

Information about BeautifulSoup is available from here. For convenience, my code is above.