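Hello community, I have a problem that I don't know how to solve. I wrote a script to scrape the images from a web page with BeautifulSoup4, but I get this error: AttributeError: 'NoneType' object has no attribute 'group'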
import re
import requests
from bs4 import BeautifulSoup

site = 'https://www.fotocommunity.de/natur/wolken/3144?sort=new'
response = requests.get(site)
soup = BeautifulSoup(response.text, 'html.parser')
img_tags = soup.find_all('img', {"src": True})
urls = [img["src"] for img in img_tags]

for url in urls:
    filename = re.search(r'([\w_-]+[.](jpg|png))$', url)
    with open(filename.group(1), 'wb') as f:
        if 'http' not in url:
            # sometimes an image source can be relative
            # if it is, provide the base url, which also happens
            # to be the site variable atm.
            url = '{}{}'.format(site, url)
        response = requests.get(url)
        f.write(response.content)
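Your regular expression is wrong. It only matches URLs that end in .jpg or .png, so for any other image URL (for example one with a query string after the file name) re.search returns None, and filename.group(1) then raises the AttributeError. If you are not comfortable with regular expressions, you can let Python's built-in urllib do the heavy lifting instead of writing one yourself: urllib.parse can split the URL so that the file name can be taken from its path (untested).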
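A minimal sketch of the urllib idea, using only the standard library. The helper names filename_from_url and absolute_url and the IMAGE_EXTENSIONS set are hypothetical and not part of the original script, and the list of accepted extensions is an assumption you may want to adjust.

```python
import posixpath
from urllib.parse import unquote, urljoin, urlparse

# Assumed set of extensions worth saving; not taken from the original script.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".gif"}


def absolute_url(page_url, src):
    """Resolve a possibly relative img src against the URL of the page."""
    return urljoin(page_url, src)


def filename_from_url(url):
    """Return the image file name found in the URL path, or None if there is none."""
    # urlparse separates the path from the query string and fragment,
    # so 'cloud.jpg?width=200' yields the path ending in 'cloud.jpg'.
    path = unquote(urlparse(url).path)
    name = posixpath.basename(path)
    extension = posixpath.splitext(name)[1].lower()
    return name if extension in IMAGE_EXTENSIONS else None
```

In the loop from the question, these could replace the re.search call: compute url = absolute_url(site, url) first, then filename = filename_from_url(url), and skip the URL with continue when the result is None instead of calling open. Resolving with urljoin also avoids appending a relative path to the full page URL, query string included.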