How do I click the Save or Print button in Google Chrome's print window with Selenium when the HTML page has a shadow root?

0 votes
1 answer
60 views
Asked 2025-04-12 07:00

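I want to scrape data with Selenium and Python. My Chrome version is 123.0.6312.87, and my WebDriver is compatible with it. I want to scrape data from the page "https://web.bcpa.net/BcpaClient/#/Record-Search". On this page, when I enter the address "2216 NW 6 PL FORT LAUDERDALE" with Selenium, it shows the details for that property. There is a print button; when I click it with Selenium, it redirects me to a new page, "https://web.bcpa.net/BcpaClient/recinfoprint.html". On this HTML page there is a dropdown with the class "md-select", and I want to choose "Save as PDF", whose value is "Save as PDF/local/". But this HTML page has a shadow root, so Selenium cannot locate the "md-select" class. I also want to click the "Save" button at the bottom of this page, which has the class "action-button", but the shadow root makes this very difficult. I tried extracting information from "print-preview-app", which sits above the shadow root, but that did not work either.

Code: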

import sys
import time

from selenium import webdriver
from selenium.webdriver import ActionChains
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By

try:
    # # Path to your Chrome WebDriver executable

    
    webdriver_path = "D:/Grass_Image_Classification/chromedriver-win64/chromedriver.exe"

    # Create a Chrome WebDriver instance
    service = Service(webdriver_path)
    driver = webdriver.Chrome(service=service)
    print('Successfully received the chromedriver path')

    driver.maximize_window()
    actions = ActionChains(driver)

    driver.get("https://web.bcpa.net/BcpaClient/#/Record-Search")


    driver.implicitly_wait(10)
    text_input = driver.find_element(By.XPATH, '//input[@class="form-control"]').send_keys("2216 NW 6 PL FORT LAUDERDALE, FL 33311")
    driver.implicitly_wait(10)
    search_button = driver.find_element(By.XPATH, '//span[@class="input-group-addon"]/span[@class="glyphicon glyphicon-search"]').click()
    
    driver.implicitly_wait(10)
    printer_click = driver.find_element(By.XPATH, '//div[@class="col-sm-1  btn-printrecinfo"]').click()
    driver.implicitly_wait(15)
  
    # Switch to the new tab

    handles = driver.window_handles
   

    print(handles)
    print(handles[-1])
    driver.switch_to.window(handles[-1])
    print(driver.current_url)
    
    # shadow_root =  driver.find_element(By.ID,"sidebar").shadow_root
    shadow_root =  driver.find_element(By.CSS_SELECTOR,"body > print-preview-app").shadow_root
    # shadow_root =  driver.find_element(By.XPATH,"/html/body/print-preview-app").shadow_root
    # shadow_root =  driver.find_element(By.XPATH,'//*[@id="sidebar"]').shadow_root
    shadow_text = shadow_root.find_element(By.CSS_SELECTOR,"print-preview-settings-section > div > select").text
    print(shadow_text)
    time.sleep(10)
    
except Exception as e:
    print(e)
    sys.exit(1)

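I want to select "Save as PDF" in the "md-select" dropdown and then click the "Save" button at the bottom of this HTML page, which has the class "action-button".

As soon as I apply the WebDriver shadow root method, the code stops working.

1 Answer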

2

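You don't need to access the shadow root in the newly opened tab. Saving the PDF is simpler with Chrome options: pass a few preferences to the Chrome driver and, when the page prints, Chrome saves the PDF to the folder you specify automatically. The preferences below preselect "Save as PDF" as the print destination and set the output folder, and the --kiosk-printing flag confirms the print dialog for you.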

import json
import sys
import time

from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

try:
    # Preselect "Save as PDF" as the print destination.
    print_settings = {
        "recentDestinations": [{
            "id": "Save as PDF",
            "origin": "local",
            "account": "",
        }],
        "selectedDestinationId": "Save as PDF",
        "version": 2,
        "isHeaderFooterEnabled": False,
        "isLandscapeEnabled": True,
    }

    prefs = {
        "printing.print_preview_sticky_settings.appState": json.dumps(print_settings),
        "download.prompt_for_download": False,
        "profile.default_content_setting_values.automatic_downloads": 1,
        "download.directory_upgrade": True,
        # Directory where the PDF should be saved; change it to your own path.
        "savefile.default_directory": "/Users/a1/PycharmProjects/PythonProject",
        "safebrowsing.enabled": True,
    }

    options = webdriver.ChromeOptions()
    options.add_experimental_option("prefs", prefs)
    # Print to the preselected destination without waiting on the print dialog.
    options.add_argument("--kiosk-printing")

    # Pass both arguments by keyword so the options are actually applied.
    service = Service()
    driver = webdriver.Chrome(service=service, options=options)

    driver.maximize_window()
    wait = WebDriverWait(driver, 20)

    driver.get("https://web.bcpa.net/BcpaClient/#/Record-Search")

    # Type the address into the search box, then run the search.
    wait.until(EC.visibility_of_element_located(
        (By.XPATH, '//input[@class="form-control"]')
    )).send_keys("2216 NW 6 PL FORT LAUDERDALE, FL 33311")
    driver.find_element(
        By.XPATH,
        '//span[@class="input-group-addon"]/span[@class="glyphicon glyphicon-search"]',
    ).click()

    # Click the print button; Chrome writes the PDF to the configured directory.
    wait.until(EC.visibility_of_element_located(
        (By.XPATH, '//div[@class="col-sm-1  btn-printrecinfo"]')
    )).click()

    # Give Chrome a few seconds to finish writing the PDF.
    time.sleep(5)
except Exception as e:
    print(e)
    sys.exit(1)
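
The fixed time.sleep(5) at the end is only a guess at how long Chrome needs to write the file. As a rough sketch of an alternative, the script could poll the save directory until a new PDF shows up. The helper name wait_for_new_pdf and its parameters are hypothetical (they are not part of Selenium), and the sketch assumes Chrome writes the file with a .pdf extension directly into the configured directory.

```python
import time
from pathlib import Path


def wait_for_new_pdf(directory, known_files, timeout=30.0, poll_interval=0.5):
    """Return the first non-empty PDF in `directory` that is not in `known_files`.

    `known_files` is a snapshot of the PDFs that existed before printing.
    Raises TimeoutError if no new PDF appears within `timeout` seconds.
    """
    directory = Path(directory)
    # Compare by file name so str and Path snapshots both work.
    known_names = {Path(p).name for p in known_files}
    deadline = time.monotonic() + timeout
    while True:
        candidates = sorted(
            p for p in directory.glob("*.pdf") if p.name not in known_names
        )
        # Ignore empty files: the browser may have created the file
        # but not written any content yet.
        finished = [p for p in candidates if p.stat().st_size > 0]
        if finished:
            return finished[0]
        if time.monotonic() >= deadline:
            raise TimeoutError(f"No new PDF appeared in {directory}")
        time.sleep(poll_interval)


# Hypothetical usage around the print click:
#   save_dir = Path("/Users/a1/PycharmProjects/PythonProject")
#   before = set(save_dir.glob("*.pdf"))
#   ... click the print button ...
#   pdf_path = wait_for_new_pdf(save_dir, before)
```

Taking the snapshot before the click matters: without it, a PDF left over from an earlier run would be returned immediately.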
