Inserting text into a website and scraping the generated result

Posted 2024-04-25 08:36:53


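I will try to explain my problem in as much detail as possible. I don't even know the correct terminology, so I have not been able to search for it precisely. I want to use this website to count the syllables in a sentence, and use Python libraries to scrape the syllable count generated for each sentence into a .txt file. Here is what I want to do, step by step: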

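1- Open this URL: https://www.howmanysyllables.com/syllable_counter/

2- Enter some sentences in the syllable counter field

3- Click the "Count Syllables" button below the field

4- Scrape the generated numbers shown on the left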

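I can do step 4 by scraping the numbers from the website. What I am struggling with is steps 2 and 3. I was able to count syllables in my own code with a function I defined, but the results do not match those from the website.

I hope I have explained the problem as clearly as possible.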


2 answers

This script sends the sentence to the page as a parameter in a POST request and reads the result back as text:

import requests
from bs4 import BeautifulSoup

sentence = 'I will try to explain my problem as detailed as possible.'

url = 'https://www.howmanysyllables.com/syllable_counter/'
soup = BeautifulSoup( requests.get(url).content, 'html.parser' )

payload = {}
for i in soup.select('form[action="/syllable_counter/"] input[value]'):
    payload[i['name']] = i['value']
payload['UQ_txt'] = sentence

soup = BeautifulSoup( requests.post(url, data=payload).text, 'html.parser' )
for a in soup.select('#foot_M .Answer_Red'):
    print('{}{}'.format(a.text, a.find_next_sibling(string=True)))

Prints:

11 words
16 syllables
57 characters
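
To keep the numbers rather than just print them, the output lines can be parsed into a dictionary. This is only a minimal sketch: `parse_counts` is a hypothetical helper name, and it assumes each line has the "number label" shape shown above.

```python
import re

def parse_counts(lines):
    """Turn lines such as '16 syllables' into {'syllables': 16}."""
    counts = {}
    for line in lines:
        # Assumed shape: an integer, whitespace, then a single-word label.
        match = re.match(r'\s*(\d+)\s+(\w+)', line)
        if match:
            counts[match.group(2)] = int(match.group(1))
    return counts

result = parse_counts(['11 words', '16 syllables', '57 characters'])
print(result['syllables'])  # 16
```

In the script above, the same strings could be collected in a list inside the loop instead of being printed, then passed to the helper.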

Edit: to send multiple lines, you can use this example:

import requests
from bs4 import BeautifulSoup

sentences = [
    'This is line one.',
    'This is line two.',
    'This is line three.',
]

url = 'https://www.howmanysyllables.com/syllable_counter/'
soup = BeautifulSoup( requests.get(url).content, 'html.parser' )

payload = {}
for i in soup.select('form[action="/syllable_counter/"] input[value]'):
    payload[i['name']] = i['value']
payload['UQ_txt'] = '\n'.join(sentences)

soup = BeautifulSoup( requests.post(url, data=payload).text, 'html.parser' )
for a in soup.select('#foot_M .Answer_Red'):
    print('{}{}'.format(a.text, a.find_next_sibling(string=True)))

Prints:

12 words
12 syllables
53 characters
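
The multi-line request above returns totals for the whole text, while the question asks for one count per sentence in a .txt file. One possible way to get that is to call the counter once per sentence and write one line per result. The sketch below is an assumption, not part of the original answer: `write_syllable_report` and `count_fn` are hypothetical names, and `fake_count` is a stand-in that does not count real syllables; it only lets the sketch run without network access. In practice `count_fn` would wrap the POST request shown above and return a dictionary with a 'syllables' key.

```python
import os
import tempfile
from pathlib import Path

def write_syllable_report(sentences, count_fn, path):
    """Write one 'count<TAB>sentence' line per sentence to a text file."""
    lines = []
    for sentence in sentences:
        # count_fn is expected to return a dict such as {'syllables': 5}.
        counts = count_fn(sentence)
        lines.append('{}\t{}'.format(counts['syllables'], sentence))
    Path(path).write_text('\n'.join(lines) + '\n', encoding='utf-8')
    return lines

def fake_count(sentence):
    # Placeholder only: counts words, NOT syllables. Replace with a
    # function that sends the sentence to the site and parses the reply.
    return {'syllables': len(sentence.split())}

out_path = os.path.join(tempfile.mkdtemp(), 'syllables.txt')
report = write_syllable_report(['This is line one.', 'Hi'], fake_count, out_path)
print(report)
```

Passing the counter in as a function keeps the file-writing logic separate from the scraping, so either part can be changed or tested on its own. Sending one request per sentence is slower than a single request, so a short pause between calls may be polite to the site.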

Here is an example using Selenium.

It may need a driver to control the web browser (Firefox or Chrome).

Each line of the code is described in the comments.

from selenium import webdriver
from selenium.webdriver.common.by import By

url = 'https://www.howmanysyllables.com/syllable_counter/'

# open browser
driver = webdriver.Firefox()

# load page
driver.get(url)

# find field (the find_element_by_* helpers were removed in Selenium 4.3)
item = driver.find_element(By.ID, 'syl_input')

# put text
item.send_keys('Hello World')

# find button
item = driver.find_element(By.ID, 'button_submit')

# click button
item.click()

# find all red numbers
all_answers = driver.find_elements(By.CLASS_NAME, 'Answer_Red')
#for answer in all_answers:
#    print(answer.text)

# display numbers
print('words:', all_answers[0].text)
print('syllables:', all_answers[1].text)
print('characters:', all_answers[2].text)

By the way: sometimes it is easier to write this with Selenium, but the version using requests (in the other answer) should run faster.
