I am trying to detect the Google reCAPTCHA page while scraping Google search results with Selenium. Here is part of the scraping code I wrote:
def spider(search_term, intext_term, include_term, target_site):
    driver = open_webdriver()
    driver.implicitly_wait(10)
    num_records_scraped = 0
    for page in range(0, Max_Page, 10):
        search_url = target_url(search_term, intext_term, include_term, target_site, page)
        driver.get(search_url)
        items = select_wholePage(driver)
        for item in items:
            record = get_result(item)
            if record:
                records.append(record)
                num_records_scraped += 1
        time_interval()
    driver.quit()
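The page offset starts at 0 and increases by 10 on each iteration (for page in range(0, Max_Page, 10)). After moving on through the next 10 pages, Google usually redirects to the reCAPTCHA page. I checked that the reCAPTCHA page contains an element whose ID is "recaptcha-token", so I decided to use this: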
recaptcha = WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.ID, "recaptcha-token")))
and tried it like this:
for page in range(0, Max_Page, 10):
    search_url = target_url(search_term, intext_term, include_term, target_site, page)
    driver.get(search_url)
    recaptcha = WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.ID, "recaptcha-token")))
    if recaptcha :
        print('This is recaptcha')
    else:
        items = select_wholePage(driver)
        for item in items:
            record = get_result(item)
            if record:
                records.append(record)
                num_records_scraped += 1
    time_interval()
driver.quit()
But it fails with a timeout error:
    recaptcha = WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.ID, "recaptcha-token")))
  in until
    raise TimeoutException(message, screen, stacktrace)
selenium.common.exceptions.TimeoutException: Message:
I think something is wrong with my logic for detecting the reCAPTCHA ID, or with something else. Please help me.
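One possible reading of the traceback: `WebDriverWait(...).until(...)` raises `TimeoutException` when the element never appears instead of returning a falsy value, so on a normal results page the `else` branch is never reached. The following is a minimal sketch of a non-raising check, not a confirmed fix. It assumes the `recaptcha-token` ID from the question really identifies the CAPTCHA page and is reachable from the top-level document (if it sits inside an iframe, this check would miss it). `is_recaptcha_page` and `RECAPTCHA_ID` are hypothetical names introduced for illustration.

```python
RECAPTCHA_ID = "recaptcha-token"  # ID taken from the question; assumed to be correct


def is_recaptcha_page(driver):
    """Return True if the current page contains the reCAPTCHA token element.

    `driver` is expected to be a Selenium WebDriver (anything exposing
    `find_elements(by, value)` works). Unlike `find_element` or
    `WebDriverWait.until`, `find_elements` returns an empty list when
    nothing matches, so no exception has to be handled.
    """
    # "id" is the string value of selenium.webdriver.common.by.By.ID
    matches = driver.find_elements("id", RECAPTCHA_ID)
    return len(matches) > 0
```

Inside the loop this could be used as `if is_recaptcha_page(driver): print('This is recaptcha')` followed by the existing `else` branch. Note that with `driver.implicitly_wait(10)` in effect, `find_elements` may still wait up to 10 seconds before returning an empty list. An alternative is to keep `WebDriverWait` and wrap it in `try` / `except TimeoutException`, treating the exception as "no CAPTCHA on this page".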