How do I detect Google reCAPTCHA with Python?

Posted 2024-04-18 13:23:05

I am trying to detect the Google reCAPTCHA page while scraping Google search results with Selenium. Here is part of my scraping code:

def spider(search_term, intext_term, include_term, target_site):
    driver = open_webdriver()
    driver.implicitly_wait(10)
    num_records_scraped = 0

    for page in range(0, Max_Page, 10):
        search_url = target_url(search_term, intext_term, include_term, target_site, page)
        driver.get(search_url)
        items = select_wholePage(driver)
        for item in items:
            record = get_result(item)
            if record:
                records.append(record)
                num_records_scraped += 1
        time_interval()

    driver.quit()

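The page offset starts at 0 and increases by 10 (for page in range(0, Max_Page, 10)). Moving on to the next 10 results often brings up the CAPTCHA page instead of the results. I checked that the CAPTCHA page contains an element with the ID "recaptcha-token", so I want to use this: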

recaptcha = WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.ID, "recaptcha-token")))

So I tried this:

    for page in range(0, Max_Page, 10):
        search_url = target_url(search_term, intext_term, include_term, target_site, page)
        driver.get(search_url)
        recaptcha = WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.ID, "recaptcha-token")))
        if recaptcha:
            print('This is recaptcha')
        else:
            items = select_wholePage(driver)
            for item in items:
                record = get_result(item)
                if record:
                    records.append(record)
                    num_records_scraped += 1
            time_interval()

    driver.quit()

But it raises a timeout error:

    recaptcha = WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.ID, "recaptcha-token")))
  in until
    raise TimeoutException(message, screen, stacktrace)
selenium.common.exceptions.TimeoutException: Message:

I think something is wrong with my logic, either in how I detect the CAPTCHA ID or somewhere else. Please help.
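One possible reading of the error: WebDriverWait(...).until(...) does not return a falsy value when the element is missing. It raises TimeoutException, so on an ordinary results page the else branch is never reached. A minimal sketch of a check that does not raise is shown below. It assumes the "recaptcha-token" ID from the question is correct. The helper names is_recaptcha_page, scrape_pages and handle_page are hypothetical, not part of Selenium. The driver is any object exposing Selenium's get and find_elements methods.

```python
RECAPTCHA_ID = "recaptcha-token"  # element ID reported in the question (assumed correct)


def is_recaptcha_page(driver):
    """Return True if the current page contains the reCAPTCHA token element."""
    # "id" is the string value of Selenium's By.ID locator strategy.
    # find_elements returns an empty list when nothing matches,
    # whereas WebDriverWait(...).until(...) raises TimeoutException.
    return len(driver.find_elements("id", RECAPTCHA_ID)) > 0


def scrape_pages(driver, urls, handle_page):
    """Visit each URL in order and stop at the first CAPTCHA page.

    handle_page is a hypothetical callback that extracts records from a
    normal results page. Returns the number of pages handled.
    """
    handled = 0
    for url in urls:
        driver.get(url)
        if is_recaptcha_page(driver):
            print("This is recaptcha")
            break
        handle_page(driver)
        handled += 1
    return handled
```

Because the spider calls driver.implicitly_wait(10), the find_elements lookup can itself take up to 10 seconds on pages without the element. Also, if the token element sits inside an iframe on the CAPTCHA page, a lookup from the top-level document will not find it until the driver switches into that frame. Whether that applies here cannot be told from the question.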

