Extracting text from <li> items with Selenium in Python

Posted 2024-03-28 20:29:25


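I am trying to get the text inside the <a></a> tags in a nested ul/li structure. I can find all the "li" elements, but I cannot get the text inside the <a>.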

I am using Python 3.7 and Selenium WebDriver with the Firefox driver.

The corresponding HTML is:

[some HTML]

<ul class="dropdown-menu inner">
<!---->
    <li nya-bs-option="curso in ctrl.cursos group by curso.grupo" class="nya-bs-option first-in-group group-item">
        <span class="dropdown-header">Cursos em Destaque</span>
        <a tabindex="0">Important TEXT 1</a>
    </li>
    <!-- end nyaBsOption: curso in ctrl.cursos group by curso.grupo -->
    <li nya-bs-option="curso in ctrl.cursos group by curso.grupo" class="nya-bs-option group-item">
        <span class="dropdown-header">Cursos em Destaque</span>
        <a tabindex="0">Important TEXT 2</a>
    </li>
    <!-- end nyaBsOption: curso in ctrl.cursos group by curso.grupo -->
    <li nya-bs-option="curso in ctrl.cursos group by curso.grupo" class="nya-bs-option group-item">
        <span class="dropdown-header">Cursos em Destaque</span>
        <a tabindex="0">Important TEXT 3</a>
    </li>
    <!-- end nyaBsOption: curso in ctrl.cursos group by curso.grupo -->
    <li nya-bs-option="curso in ctrl.cursos group by curso.grupo" class="nya-bs-option group-item">
        <span class="dropdown-header">Cursos em Destaque</span>
        <a tabindex="0">Important TEXT4</a>
    </li>
                            [another 100 <li></li> similar blocks]                  .
                                                .
    <li class="no-search-result" placeholder="Curso">
        <span>Unimportant TEXT</span>
    </li>
</ul>

[more HTML]

I tried the following code:

cursos = browser.find_elements_by_xpath('//li[@nya-bs-option="curso in ctrl.cursos group by curso.grupo"]')
nome_curso = [curso.find_element_by_tag_name('a').text for curso in cursos]

I get a list with the correct number of items, but every item is "". Can anyone help me? Thx.


1 answer
User
#1 · posted 2024-03-28 20:29:25

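It seems you were close. To extract the texts, e.g. Important TEXT 1, Important TEXT 2, Important TEXT 3, Important TEXT4, etc., you have to induce WebDriverWait for visibility_of_all_elements_located(), and you can use either of the following Locator Strategies: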

  • Using CSS_SELECTOR and the get_attribute() method:

    print([my_elem.get_attribute("innerHTML") for my_elem in WebDriverWait(driver, 5).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "ul.dropdown-menu.inner li.nya-bs-option a")))])
    
  • Using XPATH and the text attribute:

    print([my_elem.text for my_elem in WebDriverWait(driver, 5).until(EC.visibility_of_all_elements_located((By.XPATH, "//ul[@class='dropdown-menu inner']//li[contains(@class, 'nya-bs-option')]//a")))])
    
  • Note: you have to add the following imports:

    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    

You can find a relevant discussion in How to retrieve the title attribute through Selenium using Python?


Outro

As per the documentation:
