For some reason, I can't get the text from a p tag with Selenium (Python)

Posted 2024-03-28 21:51:04


I want to scrape a page with Selenium. Sample HTML is shown below (taken from View Page Source):

<div class="col s12 m12 l4 xl4 therapist_contact_list">
<p class="col s6 m6 l6 xl6 noPaddingLeft lprofile-address hide-on-large-only"><i class="fa fa-map-marker" aria-hidden="true"></i> Birmingham, Alabama 35294</p>
<p class="col s12 read_content_par hide-on-large-only noPadding">Online Video & phone session only- etc
 </p>
</div>

So my Selenium code is:

location = listing.find_element_by_xpath('.//div[2]/p[1]').text
description = listing.find_element_by_xpath('.//div[2]/p[2]').text.replace(",","")

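This runs inside a for loop over the listings, so I believe the XPath is correct. Everything else I need to scrape works, but these two values both come back empty, and I don't know why.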

I don't know whether it matters, but the other elements I scrape are not p tags, and their text is not shown in quotes.


Edit: the page is https://www.goodtherapy.org/therapists/al/birmingham


3 Answers

You can use BeautifulSoup and Requests instead of Selenium. Because no browser has to start, this should cut roughly 10-20 seconds off the execution time. Here is how to do it:

from bs4 import BeautifulSoup
import requests

r = requests.get('https://www.goodtherapy.org/therapists/al/birmingham').text

soup = BeautifulSoup(r,'html5lib')

p_tags1 = soup.find_all('p',class_ = "col s6 m6 l6 xl6 noPaddingLeft lprofile-address hide-on-large-only")
p_tags2 = soup.find_all('p', class_ = "col s12 read_content_par hide-on-large-only noPadding")

for p in p_tags1:
    print(p.text.strip())

print("\n")

for p in p_tags2:
    print(p.text.strip())

Output:

Birmingham, Alabama 35294
Birmingham, Alabama 35209
Birmingham, Alabama 35209
Birmingham, Alabama 35209
Birmingham, Alabama 35223
Mountain Brook, Alabama 35223
Birmingham, Alabama 35223
Homewood, Alabama 35209
Hoover, Alabama 35226
Birmingham, Alabama 35216

Online Video & phone session only- Change begins with the first step. As a tele-mental health therapist I seek to walk on this journey with you, collaborate and provide a space where change is nurtured and encouraged. There will be tough moments, but with my warm and compassionate personality, I will encourage you to take the next step. I will pro
As a counselor, I believe meeting an individual where they are is a vital part of providing best practice. With over seven years of experience assisting individuals, families, child/adolescents, and teens with a wide range of socials/personal relationships, emotional/trauma, behavioral/crisis, substance abuse, mental health issues, and many other c
Thank you for visiting Love Out Loud Counseling and Consulting Services. My name is Stephanie Lett. As a Licensed Independent Clinical Social Worker, I have dedicated nearly a decade to serving others in order to improve society. My profession is much more than a tool for me to earn a livelihood-it is my passion. With an extensive background in c
The focus of my practice, is to assist adults, children, and families identify the root, of their concerns. I assess situations in your life which attribute to current emotions, behaviors, and thought patterns. With child and adolescent clients my focus is similar, with the addition of close communication with the family. Moreover, with child and a
Counseling is much more than problem solving. It's a place where you can celebrate your strengths and build on them to help prepare for life's challenges, big and small. I believe that first and foremost, successful counseling is built on trust and connection. Many people who struggle are reluctant to reach out. If you are seeking help you&
I enjoy helping people understand why they do what they do or feel how they feel. I believe everyone that walks thru my door wants to feel better and together we can make that happen. My first priority is to provide a safe/non-judgmental environment. I take time getting to know what experiences have brought a person to my office. I provide init
Taking the first step towards a better tomorrow for yourself can be difficult. Through your counseling journey, I want to help you feel safe & empowered as you work through your current roadblocks, both emotional & behavioral. I currently provide counseling sessions to both individuals and couples. I have training in mood & behavioral disorders, ad
I am a Licensed Professional Counselor in the state of Alabama with over 10 years experience working as a counselor, mentor, and life coach. I am licensed in Delaware as an LMHC and Georgia as an LPC. I provide in-office counseling with clients in the Birmingham, AL area and telehealth services for clients in Alabama, Delaware, and Georgia.
I
My approach to therapy is an integrative approach to wellness focusing on the whole person; Mind, Body, and Spirit. I utilize a person-centered and strength-based approach to help clients become empowered, to help them identify their strengths and help them utilize these strengths to succeed in life.
I understand from personal experience that a
Creating a place in which my clients feel safe is important to me. Video sessions are a great choice for many clients. Other choices I offer are "Walk and Talk" sessions and in-office sessions. Being in nature can be very helpful for some people, so "Walk and Talk" sessions are very popular. My office is set up to be a calming place
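The Requests approach works because the text is already present in the static HTML; judging by the class name hide-on-large-only, the paragraphs are probably just hidden by CSS on wide viewports (an assumption, not something the page source confirms). A minimal sketch of the same idea, with the standard-library html.parser swapped in for BeautifulSoup so that it runs without third-party packages. It matches a single class token instead of the full class string, and the names ParagraphText and paragraphs_with_class are made up for illustration:

```python
from html.parser import HTMLParser

# Sample markup copied from the question.
SAMPLE_HTML = """
<div class="col s12 m12 l4 xl4 therapist_contact_list">
<p class="col s6 m6 l6 xl6 noPaddingLeft lprofile-address hide-on-large-only"><i class="fa fa-map-marker" aria-hidden="true"></i> Birmingham, Alabama 35294</p>
<p class="col s12 read_content_par hide-on-large-only noPadding">Online Video & phone session only- etc
 </p>
</div>
"""


class ParagraphText(HTMLParser):
    """Collect the stripped text of every <p> whose class list contains a token."""

    def __init__(self, class_token):
        super().__init__()
        self.class_token = class_token
        self.results = []
        self._buffer = None  # None while we are outside a matching <p>

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            classes = (dict(attrs).get("class") or "").split()
            if self.class_token in classes:
                self._buffer = []

    def handle_data(self, data):
        if self._buffer is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "p" and self._buffer is not None:
            self.results.append("".join(self._buffer).strip())
            self._buffer = None


def paragraphs_with_class(html, class_token):
    """Return the text of all <p> elements carrying class_token."""
    parser = ParagraphText(class_token)
    parser.feed(html)
    parser.close()
    return parser.results


print(paragraphs_with_class(SAMPLE_HTML, "lprofile-address"))
print(paragraphs_with_class(SAMPLE_HTML, "read_content_par"))
```

Matching on one token such as lprofile-address is less brittle than matching the whole class attribute, which breaks as soon as the site reorders or adds a class.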

@KunduK is correct. You can use element.get_attribute("textContent") to get an element's inner text.

While going through the page, I was able to loop over each therapist, collect their information, and store it in a class object.

The TherapistData class

class TherapistData:
    name = ""
    title = ""
    skills = ""
    profileUrl = ""
    isVerified = False
    verifiedCredentialsText = ""
    readContentPar = ""
    contactProfileAddress = ""
    contactReadContentPar = ""

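I created a method called get_number_of_therapists that returns the number of therapists shown on the page. Then I collect the data for each therapist and put it into my class object.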

def scroll_to_element(self, xpath : str):
        element = self.chrome_driver.find_element(By.XPATH, xpath)
        self.chrome_driver.execute_script("return arguments[0].scrollIntoView();", element)


def get_number_of_therapists(self):
        return len(self.chrome_driver.find_elements(By.XPATH, "//ul[@class='therapist-list']//li"))


def get_therapist_record_by_index(self, index : int):
        # Scroll to our record
        xpath = "//ul[@class='therapist-list']//li[{0}]".format(index)
        self.scroll_to_element(xpath)
        
        # Scrape our data
        therapist = TherapistData()
        therapist.profileUrl = self.chrome_driver.find_element(By.XPATH, "({0}//a[contains(@href, 'profile')])[1]".format(xpath)).get_attribute("href")
        therapist.name = self.chrome_driver.find_element(By.XPATH, "{0}//div[contains(@class, 'therapist_middle_section')]//h2".format(xpath)).text
        therapist.title = self.chrome_driver.find_element(By.XPATH, "{0}//div[contains(@class, 'therapist_middle_section')]//h3".format(xpath)).text
        therapist.skills = self.chrome_driver.find_element(By.XPATH, "{0}//div[contains(@class, 'therapist_middle_section')]//h4".format(xpath)).text
        
        therapist.verifiedCredentialsText = self.chrome_driver.find_element(
            By.XPATH,"{0}//div[contains(@class, 'therapist_middle_section')]//p[contains(@class, 'verified-credentials')]".format(xpath)).get_attribute("textContent").strip()
        
        therapist.isVerified = "Verified" in therapist.verifiedCredentialsText
        
        therapist.readContentPar = self.chrome_driver.find_element(
            By.XPATH,"{0}//div[contains(@class, 'therapist_middle_section')]//p[contains(@class, 'read_content_par')]".format(xpath)).get_attribute("textContent").strip()
        
        therapist.contactProfileAddress = self.chrome_driver.find_element(
            By.XPATH, "{0}//div[contains(@class, 'therapist_contact_list')]//p[contains(@class, 'profile-address')]".format(xpath)).get_attribute("textContent").strip()
        
        therapist.contactReadContentPar = self.chrome_driver.find_element(
            By.XPATH, "{0}//div[contains(@class, 'therapist_contact_list')]//p[contains(@class, 'read_content_par')]".format(xpath)).get_attribute("textContent").strip()
        
        return therapist
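
The loop that ties these methods together is described above but not shown. A rough sketch of it, assuming the methods live on some scraper class whose instance is passed in as scraper (the answer does not name that class, and collect_therapists is a made-up helper name). It also assumes the list items are not nested, so that the count and the positional XPath agree:

```python
def collect_therapists(scraper):
    """Gather one record per listing using the scraper's own lookup methods."""
    count = scraper.get_number_of_therapists()
    # XPath positions start at 1, so iterate over 1..count inclusive.
    return [scraper.get_therapist_record_by_index(i) for i in range(1, count + 1)]
```

Starting the range at 1 matters here: li[0] matches nothing in XPath, so a zero-based loop would fail on the first lookup and skip the last record.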

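The elements you are looking at are not visible on the page. That is why .text returns nothing for them: Selenium's .text only returns text that is actually rendered.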

Use element.get_attribute("textContent") instead.
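
If some elements are visible and others are not, one option is a small fallback helper. This is only a sketch: element_text is a made-up name, and it relies on nothing beyond the .text property and get_attribute() method that Selenium's WebElement provides.

```python
def element_text(element):
    """Return the rendered text, or the raw textContent if nothing is rendered."""
    visible = element.text
    if visible and visible.strip():
        return visible.strip()
    # Hidden elements yield an empty .text; textContent ignores visibility.
    raw = element.get_attribute("textContent")
    return (raw or "").strip()
```

With the code from the question this would be called as element_text(listing.find_element_by_xpath('.//div[2]/p[1]')).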

To handle the dynamic page, use WebDriverWait() to wait for presence_of_all_elements_located(), then iterate over the results:

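# Wait up to 10 seconds for the contact blocks to be present in the DOM.
listing = WebDriverWait(driver, 10).until(EC.presence_of_all_elements_located((By.CSS_SELECTOR, "div.therapist_contact_list")))
for ele in listing:
    # find_element(By.XPATH, ...) replaces find_element_by_xpath, which newer Selenium 4 releases removed.
    print(ele.find_element(By.XPATH, "./p[1]").get_attribute("textContent"))
    print(ele.find_element(By.XPATH, "./p[2]").get_attribute("textContent"))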

You need the following imports:

from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

Console output:

Birmingham, Alabama 35294
Online Video & phone session only- Change begins with the first step. As a tele-mental health therapist I seek to walk on this journey with you, collaborate and provide a space where change is nurtured and encouraged. There will be tough moments, but with my warm and compassionate personality, I will encourage you to take the next step. I will pro

 Birmingham, Alabama 35209
As a counselor, I believe meeting an individual where they are is a vital part of providing best practice. With over seven years of experience assisting individuals, families, child/adolescents, and teens with a wide range of socials/personal relationships, emotional/trauma, behavioral/crisis, substance abuse, mental health issues, and many other c

 Birmingham, Alabama 35209
Thank you for visiting Love Out Loud Counseling and Consulting Services. My name is Stephanie Lett. As a Licensed Independent Clinical Social Worker, I have dedicated nearly a decade to serving others in order to improve society. My profession is much more than a tool for me to earn a livelihood-it is my passion. With an extensive background in c

 Birmingham, Alabama 35209
The focus of my practice, is to assist adults, children, and families identify the root, of their concerns. I assess situations in your life which attribute to current emotions, behaviors, and thought patterns. With child and adolescent clients my focus is similar, with the addition of close communication with the family. Moreover, with child and a

 Birmingham, Alabama 35223
Counseling is much more than problem solving. It's a place where you can celebrate your strengths and build on them to help prepare for life's challenges, big and small. I believe that first and foremost, successful counseling is built on trust and connection. Many people who struggle are reluctant to reach out. If you are seeking help you&

 Mountain Brook, Alabama 35223
I enjoy helping people understand why they do what they do or feel how they feel. I believe everyone that walks thru my door wants to feel better and together we can make that happen. My first priority is to provide a safe/non-judgmental environment. I take time getting to know what experiences have brought a person to my office. I provide init

 Birmingham, Alabama 35223
Taking the first step towards a better tomorrow for yourself can be difficult. Through your counseling journey, I want to help you feel safe & empowered as you work through your current roadblocks, both emotional & behavioral. I currently provide counseling sessions to both individuals and couples. I have training in mood & behavioral disorders, ad

 Homewood, Alabama 35209
I am a Licensed Professional Counselor in the state of Alabama with over 10 years experience working as a counselor, mentor, and life coach. I am licensed in Delaware as an LMHC and Georgia as an LPC. I provide in-office counseling with clients in the Birmingham, AL area and telehealth services for clients in Alabama, Delaware, and Georgia.
I

 Hoover, Alabama 35226
My approach to therapy is an integrative approach to wellness focusing on the whole person; Mind, Body, and Spirit. I utilize a person-centered and strength-based approach to help clients become empowered, to help them identify their strengths and help them utilize these strengths to succeed in life.
I understand from personal experience that a

 Birmingham, Alabama 35216
Creating a place in which my clients feel safe is important to me. Video sessions are a great choice for many clients. Other choices I offer are "Walk and Talk" sessions and in-office sessions. Being in nature can be very helpful for some people, so "Walk and Talk" sessions are very popular. My office is set up to be a calming place
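
As the output shows, textContent comes back with the page's raw whitespace (leading spaces, embedded newlines). A small sketch of how the values could be tidied and paired up before storing them; normalize_whitespace and pair_listings are made-up names, not part of Selenium:

```python
def normalize_whitespace(raw):
    """Collapse runs of whitespace, including newlines, into single spaces."""
    return " ".join((raw or "").split())


def pair_listings(locations, descriptions):
    """Zip raw location and description strings into cleaned-up records."""
    return [
        {
            "location": normalize_whitespace(location),
            "description": normalize_whitespace(description),
        }
        for location, description in zip(locations, descriptions)
    ]
```

Note that zip stops at the shorter input, so a listing with a missing paragraph would shift nothing but would silently drop trailing entries; collecting both values per listing inside the loop avoids that.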
