Not all elements are clicked, even though all of them are detected

Posted 2024-04-26 07:22:05


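I have a script that clicks "show more" at the bottom of this page four times to expose additional comment threads.

Even though my XPath selects all of the "See 1 more reply…/See N more replies…" elements, the script does not end up clicking all of them. (At the time of writing, it clicks only 7 of the 13 elements.)

XPath selector: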

//ui-view//a[contains(@class, "commentAction")]

Part of the script (it is long; let me know if you want or need to see more):

tab_comments = browser.find_elements_by_xpath('//a[@gogo-test="comments_tab"]')

if len(tab_comments) > 0:

    browser.implicitly_wait(5)

    try:
        comments_count = int(''.join(filter(str.isdigit, str(tab_comments[0].text))))
    except ValueError:
        comments_count = 0

    if comments_count > 0:
        # 1. Switch to Comments Tab
        tab_comments[0].click()

        # 2. Expose Additional Threads
        show_more_comments = WebDriverWait(browser, 10).until(
            EC.element_to_be_clickable((By.XPATH, '//ui-view//a[text()="show more"]'))
        )

        clicks = 0
        while clicks <= 3:
            try:
                clicks += 1
                show_more_comments.click()
            except Exception:
                break

        # 3. Expand All Threads
        see_n_more_replies = browser.find_elements_by_xpath('//ui-view//a[contains(@class, "commentAction")]')
        for idx, see_replies in enumerate(see_n_more_replies):
            print('\n\n\n\nidx: ' + str(idx) + '\n\n\n\n')
            see_replies.click()
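
The digit filtering that the script uses to read the comment count can be checked in isolation. This is a minimal sketch, assuming the tab label looks something like `Comments (13)`; that label format and the helper name `parse_comment_count` are illustrative and are not taken from the page.

```python
def parse_comment_count(label):
    """Return the number embedded in a tab label, or 0 if there is none.

    Mirrors the expression in the script: keep only digit characters,
    join them, and convert the result to int.
    """
    digits = ''.join(filter(str.isdigit, label))
    try:
        return int(digits)
    except ValueError:
        # Raised when the label contains no digits (int('') fails).
        return 0


# Hypothetical labels, used only for illustration.
print(parse_comment_count('Comments (13)'))
print(parse_comment_count('Comments'))
```

Note that every digit in the label is concatenated, so a label holding two separate numbers would be read as one larger number.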

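Do the buttons need to be in view in order to be clicked? (That does not seem to be the case for the other ones, but at this point I am grasping at straws.)

The problem is that I parse the comments in step # 4. ..., and because the script does not expand every thread that has more than one reply (which it should), those fields show up empty in the log.

No errors or exceptions are raised.

I am using Firefox/geckodriver.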
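
One way to test the viewport question is to scroll each link into view before clicking it and to record anything that fails. This is a rough sketch, assuming `driver` is a Selenium WebDriver and `elements` is the list returned by the XPath lookup. `click_each` is a hypothetical helper name, and scrolling is only one possible explanation for the missed clicks.

```python
def click_each(driver, elements):
    """Scroll every element into view, click it, and report the outcome.

    Returns (clicked, failed): the number of successful clicks and a list
    of (index, error description) pairs for the clicks that raised.
    """
    clicked = 0
    failed = []
    for index, element in enumerate(elements):
        try:
            # Bring the element into the viewport before clicking it.
            driver.execute_script('arguments[0].scrollIntoView(true);', element)
            element.click()
            clicked += 1
        except Exception as exc:
            # Keep going so one bad element does not hide the others.
            failed.append((index, repr(exc)))
    return clicked, failed
```

Comparing `clicked` with `len(elements)` shows whether the clicks are being issued at all, or whether they are issued but the threads still fail to expand.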


1 Answer
Answer 1 · posted 2024-04-26 07:22:05

Run the following snippet to load all of the comments on the page. It keeps clicking "show more" until the link disappears:

from selenium.common.exceptions import TimeoutException
from selenium.webdriver.support.ui import WebDriverWait

comment_pages = 0
no_of_comments = len(driver.find_elements_by_tag_name('desktop-comment'))
while True:
    show_more_link = driver.find_elements_by_partial_link_text('show more')
    if len(show_more_link) == 0:  # the 'show more' link no longer exists on the page
        break
    # Bring the link into the viewport before clicking it; otherwise an `ElementNotVisible` exception is encountered
    driver.execute_script('arguments[0].scrollIntoView(true);', show_more_link[0])
    show_more_link[0].click()
    try:
        # Wait for more comments to load: the comment count must exceed the count from before the click
        WebDriverWait(driver, 10, poll_frequency=2).until(lambda d: len(d.find_elements_by_tag_name('desktop-comment')) > no_of_comments)
    except TimeoutException:
        # No new comments appeared within 10 seconds, so stop clicking
        break
    no_of_comments = len(driver.find_elements_by_tag_name('desktop-comment'))
    comment_pages += 1
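
The `WebDriverWait(...).until(...)` call in the loop above polls a condition until it becomes true or a timeout expires. The same idea can be sketched without any library. The `wait_until` helper below is hypothetical and simplified; it is not Selenium's implementation.

```python
import time


def wait_until(predicate, timeout=10.0, poll_frequency=0.5):
    """Call predicate() repeatedly until it returns a truthy value.

    Returns that value, or raises TimeoutError once `timeout` seconds
    have passed without success.
    """
    deadline = time.monotonic() + timeout
    while True:
        value = predicate()
        if value:
            return value
        if time.monotonic() >= deadline:
            raise TimeoutError('condition not met within %s seconds' % timeout)
        time.sleep(poll_frequency)
```

In the scraping loop, the predicate is "the number of `desktop-comment` elements is greater than it was before the click", which is why the count is captured before each click.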

After this code runs, the DOM contains all of the comments. At that point you can start the actual scraping of the page.

comments = driver.find_elements_by_tag_name('desktop-comment')
for comment in comments:
    author = comment.find_element_by_xpath(".//div[@class='commentLayout-header']/a[contains(@href, 'individuals')]").text
    print('Comment by person : ' + author)

    # Expand the thread if it has a "more replies..." link
    has_more_replies = len(comment.find_elements_by_partial_link_text("more replies...")) > 0
    if has_more_replies:
        more_replies = comment.find_element_by_partial_link_text("more replies...")
        driver.execute_script('arguments[0].scrollIntoView()', more_replies)
        more_replies.click()
    reply_count = len(comment.find_elements_by_xpath(".//div[contains(@class, 'commentLayout-reply')]"))
    print('Number of replies to the comment : ' + str(reply_count))
    print('                                 -')

Its output looks like this:

Comment by person : Jeff Rudd
Number of replies to the comment : 1
                                 -
Comment by person : Martin Boyle
Number of replies to the comment : 1
                                 -
Comment by person : John Bickerton
Number of replies to the comment : 1
                                 -
Comment by person : Mikkel Taanning
Number of replies to the comment : 2
                                 -
Comment by person : Christopher Sams
Number of replies to the comment : 2
                                 -
Comment by person : Marc Vieux
Number of replies to the comment : 2
                                 -

........................

You can modify the for loop to extract more details from each comment.
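
For example, the loop body could be turned into a function that returns one record per comment. This is a sketch, assuming the same Selenium version and page markup as the code above. The `find_element_by_*` methods were removed in later Selenium 4 releases, where `find_element(By.XPATH, ...)` is used instead. The name `comment_to_record` and the dictionary keys are hypothetical.

```python
AUTHOR_XPATH = ".//div[@class='commentLayout-header']/a[contains(@href, 'individuals')]"
REPLY_XPATH = ".//div[contains(@class, 'commentLayout-reply')]"


def comment_to_record(comment):
    """Collect the author and reply details of one comment element.

    `comment` is expected to behave like a Selenium WebElement for a
    single `desktop-comment` node whose thread is already expanded.
    """
    author = comment.find_element_by_xpath(AUTHOR_XPATH).text
    replies = comment.find_elements_by_xpath(REPLY_XPATH)
    return {
        'author': author,
        'reply_count': len(replies),
        'reply_texts': [reply.text for reply in replies],
    }
```

Calling it as `records = [comment_to_record(c) for c in comments]` would give a list that can be written to a log or a file in one step.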
