Selenium: finding options by tag name

Posted 2024-04-29 19:54:30

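I am trying to scrape all the data from a website called Correios. The site has several dropdown lists that I need to work with, and I have run into a problem: my code returns a list that contains a bunch of empty strings.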

from selenium import webdriver

chrome_path = r"C:\Users\Gustavo\Desktop\geckodriver\chromedriver.exe"
driver = webdriver.Chrome(chrome_path)
lista_x = []
driver.get("http://www2.correios.com.br/sistemas/agencias/")
driver.maximize_window()

dropdownEstados = driver.find_elements_by_xpath("""//*[@id="estadoAgencia"]""")

optEstados = driver.find_elements_by_tag_name("option")

for valores in optEstados:
    print(valores.text.encode())

This is what I get:

b''
b'ACRE'
b'ALAGOAS'
b'AMAP\xc3\x81'
b'AMAZONAS'
b'BAHIA'
b'CEAR\xc3\x81'
b'DISTRITO FEDERAL'
b'ESP\xc3\x8dRITO SANTO'
b'GOI\xc3\x81S'
b'MARANH\xc3\x83O'
b'MINAS GERAIS'
b'MATO GROSSO DO SUL'
b'MATO GROSSO'
b'PAR\xc3\x81'
b'PARA\xc3\x8dBA'
b'PERNAMBUCO'
b'PIAU\xc3\x8d'
b'PARAN\xc3\x81'
b'RIO DE JANEIRO'
b'RIO GRANDE DO NORTE'
b'ROND\xc3\x94NIA'
b'RORAIMA'
b'RIO GRANDE DO SUL'
b'SANTA CATARINA'
b'SERGIPE'
b'S\xc3\x83O PAULO'
b'TOCANTINS'
b''
b''
b''
b''
b''
b''
b''
b''
b''
b''
b''
b''
b''
b''
b''
b''
b''
b''
b''
b''
b''
b''
b''
b''
b''
b''
b''

How can I get rid of the empty b'' entries?


3 Answers

If I understand correctly, you want to find all the options of that dropdown.

Try using the following XPath to find the dropdown's elements:

//*[@id="estadoAgencia"]/option

Code example:

from selenium import webdriver

chrome_path = r"C:\Users\Gustavo\Desktop\geckodriver\chromedriver.exe"
driver = webdriver.Chrome(chrome_path)
lista_x = []
driver.get("http://www2.correios.com.br/sistemas/agencias/")
driver.maximize_window()

dropdownEstados = driver.find_elements_by_xpath("//*[@id='estadoAgencia']")

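# Find only the <option> elements inside the estadoAgencia dropdown.
# (find_elements_by_xpath was removed in Selenium 4.3; there, use
# driver.find_elements(By.XPATH, ...) instead.)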
optEstados = driver.find_elements_by_xpath("//*[@id='estadoAgencia']/option")

for valores in optEstados:
    print(valores.text.encode())

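With this XPath you get every element of the dropdown and no empty strings, apart from the single empty entry that belongs to the dropdown itself. Output: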

b''
b'ACRE'
b'ALAGOAS'
b'AMAP\xc3\x81'
b'AMAZONAS'
b'BAHIA'
b'CEAR\xc3\x81'
b'DISTRITO FEDERAL'
b'ESP\xc3\x8dRITO SANTO'
b'GOI\xc3\x81S'
b'MARANH\xc3\x83O'
b'MINAS GERAIS'
b'MATO GROSSO DO SUL'
b'MATO GROSSO'
b'PAR\xc3\x81'
b'PARA\xc3\x8dBA'
b'PERNAMBUCO'
b'PIAU\xc3\x8d'
b'PARAN\xc3\x81'
b'RIO DE JANEIRO'
b'RIO GRANDE DO NORTE'
b'ROND\xc3\x94NIA'
b'RORAIMA'
b'RIO GRANDE DO SUL'
b'SANTA CATARINA'
b'SERGIPE'
b'S\xc3\x83O PAULO'
b'TOCANTINS'

Note: the first element is an empty string because the first option in this dropdown has no text:

[image: img2]
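
If that remaining blank entry should go as well, one possible approach is to filter on the text after collecting the elements. The sketch below is only an illustration: `non_empty_texts` is a hypothetical helper, and `SimpleNamespace` objects stand in for the real `WebElement`s so that it runs without a browser.

```python
from types import SimpleNamespace


def non_empty_texts(elements):
    """Return the stripped .text of each element, skipping blank ones."""
    texts = (el.text.strip() for el in elements)
    return [t for t in texts if t]


# Stand-ins for the WebElements in optEstados (not real Selenium objects).
fake_options = [
    SimpleNamespace(text=t) for t in ["", "ACRE", "ALAGOAS", "  ", "TOCANTINS"]
]

estados = non_empty_texts(fake_options)
print(estados)
```

With a real driver, passing `optEstados` to the helper should behave the same way, since each `WebElement` exposes a `.text` property.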

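To retrieve the text of every <option> in the dropdown whose id is estadoAgencia: because the element is a <select> tag, it is easier and more efficient to use the methods associated with <select>. You can use the following solution: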

  • Code block:

    from selenium.webdriver.support.ui import Select

    estado_select = Select(driver.find_element_by_id('estadoAgencia'))
    for opt in estado_select.options:
        print(opt.get_attribute('innerHTML'))
    
  • Console output:

    ACRE
    ALAGOAS
    AMAPÁ
    AMAZONAS
    BAHIA
    CEARÁ
    DISTRITO FEDERAL
    ESPÍRITO SANTO
    GOIÁS
    MARANHÃO
    MINAS GERAIS
    MATO GROSSO DO SUL
    MATO GROSSO
    PARÁ
    PARAÍBA
    PERNAMBUCO
    PIAUÍ
    PARANÁ
    RIO DE JANEIRO
    RIO GRANDE DO NORTE
    RONDÔNIA
    RORAIMA
    RIO GRANDE DO SUL
    SANTA CATARINA
    SERGIPE
    SÃO PAULO
    TOCANTINS
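
The readable names in this output differ from the b'AMAP\xc3\x81' style entries earlier in the thread only because `.encode()` is not called before printing. As a small check that needs no browser, the sketch below decodes a few of those byte strings back to text, assuming UTF-8, which is the default for `str.encode()` in Python 3.

```python
# A few byte strings copied from the output earlier in the thread.
encoded = [b"AMAP\xc3\x81", b"ESP\xc3\x8dRITO SANTO", b"S\xc3\x83O PAULO"]

# Decoding as UTF-8 recovers the accented names.
decoded = [raw.decode("utf-8") for raw in encoded]
print(decoded)

# Encoding again gives back exactly the original bytes.
round_trip = [name.encode("utf-8") for name in decoded]
```

So printing `valores.text` directly, without `.encode()`, should already give the accented names.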
    

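Your code needs one small change: get the dropdown as a single element, then search for the options inside it rather than across the whole page: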

dropdownEstados = driver.find_element_by_xpath("""//*[@id="estadoAgencia"]""")
optEstados = dropdownEstados.find_elements_by_tag_name("option")

for valores in optEstados:
    print(valores.text.encode())
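
To illustrate why the scope of the search matters, here is a browser-free sketch that uses the standard library's `xml.etree.ElementTree` in place of Selenium. The page fragment is made up: the second `<select>` and its id `outroSelect` are assumptions for the example, not taken from the Correios page.

```python
import xml.etree.ElementTree as ET

# Hypothetical page fragment; only the estadoAgencia id comes from the thread.
PAGE = """
<form>
  <select id="estadoAgencia">
    <option value=""></option>
    <option value="AC">ACRE</option>
    <option value="AL">ALAGOAS</option>
  </select>
  <select id="outroSelect">
    <option value=""></option>
    <option value=""></option>
  </select>
</form>
"""

root = ET.fromstring(PAGE)


def option_texts(elements):
    """Return each element's text, using '' when the element has none."""
    return [(el.text or "").strip() for el in elements]


# Page-wide search: every <option>, whichever <select> it belongs to.
page_wide = option_texts(root.iter("option"))

# Scoped search: only the <option> children of the estadoAgencia <select>.
scoped = option_texts(root.findall(".//*[@id='estadoAgencia']/option"))

print(page_wide)
print(scoped)
```

The page-wide list picks up the blank options of the other `<select>`, while the scoped list keeps only the entries of the target dropdown, including its own empty first option.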
