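What is wrong here, and what should I do about it? I am trying to build a web crawler, and I am using Beautiful Soup to collect the links.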
def get_page(url):
    try:
        import requests
        import bs4
        import lxml
        res = requests.get(url)
        soup = bs4.BeautifulSoup(res.content, "lxml")
        return soup
    except:
        return ""

def get_all_target(page):
    list = []
    for elem in get_page(page).select("a"):
        list.append(elem.get("href"))
    return list

def union(p, q):
    for e in q:
        if e not in p:
            p.append(e)

def crawl_web(seed):
    tocrawl = [seed]
    crawled = []
    while tocrawl:
        page = tocrawl.pop()
        if page not in crawled:
            union(tocrawl, get_all_target(get_page(page)))
            crawled.append(page)
    return crawled
Here is the error I get:
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 9, in crawl_web
  File "<stdin>", line 3, in get_all_target
AttributeError: 'str' object has no attribute 'select'
Is there something wrong with how I am using the "select" method in BeautifulSoup?
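
A possible reading of the traceback: "select" itself looks fine, but it is being called on the empty string that get_page returns from its except branch. The most likely trigger is that crawl_web calls get_all_target(get_page(page)), so get_all_target receives a soup object and passes it to get_page a second time; requests.get then fails on something that is not a URL, and the bare except hides that failure. Any other error inside the try block (for example a missing lxml install or a network problem) would produce the same symptom. The sketch below is one way to restructure the code under those assumptions, not a definitive fix. The get_links and max_pages parameters and the 10-second timeout are additions of mine for illustration; they are not in the original code.

```python
def get_page(url):
    """Fetch url and return a BeautifulSoup tree, or None if anything fails."""
    try:
        import requests
        import bs4
        res = requests.get(url, timeout=10)  # timeout value is an assumption
        res.raise_for_status()
        return bs4.BeautifulSoup(res.content, "lxml")
    except Exception:
        # Return None rather than "" so callers can test for failure explicitly.
        return None


def get_all_target(url):
    """Return the href of every <a> element on the page at url."""
    soup = get_page(url)
    if soup is None:
        return []
    return [a.get("href") for a in soup.select("a") if a.get("href")]


def crawl_web(seed, get_links=get_all_target, max_pages=50):
    """Crawl from seed; get_links and max_pages are hypothetical extras."""
    tocrawl = [seed]
    crawled = []
    while tocrawl and len(crawled) < max_pages:
        page = tocrawl.pop()
        if page not in crawled:
            # Pass the URL itself, not an already-parsed soup object.
            for link in get_links(page):
                if link not in tocrawl:
                    tocrawl.append(link)
            crawled.append(page)
    return crawled
```

Calling crawl_web("https://example.com") would then fetch each page once. Note that href values can be relative, so a real crawler would probably need urllib.parse.urljoin to resolve them against the current page before fetching.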