I want to scrape a page from Rotten Tomatoes.
Screenshot of the page:
As the screenshot shows, the div with class "info director" is the parent of both the span with class "descriptor" (which holds the text "Directed By") and the a tags.
I want to scrape the directors' names.
import requests
from bs4 import BeautifulSoup

headers = {"User-Agent":"Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.182 Safari/537.36", "Accept-Encoding":"gzip, deflate", "Accept":"text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8", "DNT":"1","Connection":"close", "Upgrade-Insecure-Requests":"1"}
url = 'https://editorial.rottentomatoes.com/guide/best-sci-fi-movies-of-all-time/'
r = requests.get(url, headers=headers)  # , proxies=proxies)
content = r.content
soup = BeautifulSoup(content, 'html.parser')
director = []
people1 = soup.find_all('div', {'class': 'info director'})
for d in people1:
    Dir = d.find('a').text  # d.find('a') returns None when the div has no <a> tag
    director.append(Dir)
I get this error:
AttributeError: 'NoneType' object has no attribute 'text'
Target the divs with the class "info director" and use a one-liner to dump the text of all the links inside them into a list.
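A minimal sketch of that one-liner, assuming the markup matches the structure described in the question. The sample HTML, the director names, and the helper name extract_directors are hypothetical and only there so the snippet runs without a network request. The CSS selector "div.info.director a" only yields links that actually exist, so a div without an a tag contributes nothing instead of raising AttributeError.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup that mirrors the structure described in the
# question; the real page may differ.
SAMPLE_HTML = """
<div class="info director"><span class="descriptor">Directed By:</span>
  <a href="/celebrity/a">Director One</a></div>
<div class="info director"><span class="descriptor">Directed By:</span>
  <a href="/celebrity/b">Director Two</a>, <a href="/celebrity/c">Director Three</a></div>
<div class="info director"><span class="descriptor">Directed By:</span></div>
"""


def extract_directors(html):
    """Return the text of every link inside a div with classes info and director."""
    soup = BeautifulSoup(html, "html.parser")
    # The one-liner: select all <a> descendants of the matching divs.
    return [a.get_text(strip=True) for a in soup.select("div.info.director a")]


directors = extract_directors(SAMPLE_HTML)
print(directors)  # ['Director One', 'Director Two', 'Director Three']
```

To run it against the live page, pass r.content from the request in the question instead of SAMPLE_HTML. Note that this flattens everything into one list, so a film with several directors adds several entries.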