I need to format text from Selenium by removing all special characters

Posted 2024-06-07 11:01:36


I want to remove all special characters from name, but the code I wrote is messy. Does anyone know a better way to do this?

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.wait import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

# Set up the Chrome driver and the wait timeout
driver = webdriver.Chrome(executable_path=r"C:\Users\Gabri\anaconda3\chromedriver.exe")
wait: WebDriverWait = WebDriverWait(driver, 20)

# Open music link
driver.get('https://youtu.be/IJvV7qy0xQM')

# Set xpath of music name
Name_Music_xpath = '//*[@id="container"]/h1/yt-formatted-string'

# Wait until the name is visible
wait.until(EC.visibility_of_element_located((By.XPATH, Name_Music_xpath)))

# Save the name element (find_element with By.XPATH also works on Selenium 4)
name = driver.find_element(By.XPATH, Name_Music_xpath)

Original_name = name.text
Formatted_name = name.text

# I want to format the music name, but this is the only way I know how to do it
# I need to clean it by removing everything that is not a letter, such as ( ) & . -
Formatted_name = Formatted_name.replace(')', '')
Formatted_name = Formatted_name.replace('(', '')
Formatted_name = Formatted_name.replace('&', '')
Formatted_name = Formatted_name.replace('.', '')
Formatted_name = Formatted_name.replace('-', '')

# Print the original name and the formatted name that I need
print(Original_name)
print(Formatted_name)

Solved code:

import re
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.wait import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC


# Remove the characters ( ) & . - and collapse repeated whitespace
def remove_chars(text: str) -> str:
    # 'text' avoids shadowing the built-in input()
    cleaned_text = re.sub(r'[()&.-]', '', text)
    return " ".join(cleaned_text.split())


# Set up the Chrome driver and the wait timeout
driver = webdriver.Chrome(executable_path=r"C:\Users\Gabri\anaconda3\chromedriver.exe")
wait: WebDriverWait = WebDriverWait(driver, 20)

# Open music link
driver.get('https://youtu.be/IJvV7qy0xQM')

# Set xpath of music name
Name_Music_xpath = '//*[@id="container"]/h1/yt-formatted-string'

# Wait until the name is visible
wait.until(EC.visibility_of_element_located((By.XPATH, Name_Music_xpath)))

# Save the name element (find_element with By.XPATH also works on Selenium 4)
name = driver.find_element(By.XPATH, Name_Music_xpath)

Original_name = name.text
Formatted_name = remove_chars(name.text)

print(Original_name)
print(Formatted_name)
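
As a possible alternative to the regex in remove_chars, the same fixed set of characters can be deleted with str.translate. This is only a sketch: the name remove_chars_translate and the sample string are made up for illustration and are not part of the original post.

```python
# Translation table that deletes the same characters as the regex [()&.-]
_DELETE_TABLE = str.maketrans('', '', '()&.-')


def remove_chars_translate(text: str) -> str:
    # Drop the unwanted characters, then collapse repeated whitespace
    stripped = text.translate(_DELETE_TABLE)
    return " ".join(stripped.split())


# Hypothetical input, used only to show the behavior
print(remove_chars_translate("A & B - C (D.)"))
```

For a small, fixed character set either approach should work; the regex is easier to extend to patterns, while str.translate avoids importing re.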

1 answer
User
#1 · Posted 2024-06-07 11:01:36

This function removes the characters that your code strips out:

import re

def remove_chars(text: str) -> str:
    # 'text' avoids shadowing the built-in input()
    return re.sub(r'[()&.-]', '', text)

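This function uses Python's re module, which provides regular-expression support. It returns the input string with the characters ( ) & . - removed. The hyphen comes last inside the brackets so that it is treated as a literal hyphen rather than as a range.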
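
To make the effect of the pattern concrete, here is a small self-contained sketch. The sample title is invented for illustration and is not the title of the video in the question. It also shows that removing a character that sat between two spaces leaves a double space, which is why an extra whitespace-collapsing step can be useful.

```python
import re


def remove_chars(text: str) -> str:
    # Delete every occurrence of ( ) & . and -
    return re.sub(r'[()&.-]', '', text)


# Hypothetical title, used only for illustration
sample = "Artist A & Artist B - Song Name (feat. Someone)"

cleaned = remove_chars(sample)
print(cleaned)

# Optional: collapse the double spaces left behind by '&' and '-'
collapsed = " ".join(cleaned.split())
print(collapsed)
```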

You can use this function in your code like this:

Original_name = name.text
Formatted_name = remove_chars(name.text)
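
If the goal is to drop every special character rather than only the five listed, one option is to invert the pattern and keep only letters, digits, and whitespace. The following is a sketch under that assumption; the function name keep_letters_digits_spaces and the sample string are hypothetical and not from the original answer.

```python
import re


def keep_letters_digits_spaces(text: str) -> str:
    # [^\w\s] matches anything that is not a word character or whitespace.
    # The underscore counts as a word character, so it is matched separately.
    stripped = re.sub(r'[^\w\s]|_', '', text)
    # Collapse any runs of whitespace left behind
    return " ".join(stripped.split())


# Hypothetical input, used only to show the behavior
print(keep_letters_digits_spaces("Song_Name! (Live) #1 @ Home"))
```

With str patterns in Python 3, \w is Unicode-aware, so accented letters are kept as well.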
