Trouble with two loops

Posted 2024-04-29 11:34:01


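I have two loops I would like to integrate into my script, but I don't know how, as I am still learning all of this.

Loop 1) An external list of book authors, taken out of the code. The list will grow over time, so moving it to an external file would help. How do I reference a list and add a variable in the code that iterates through each author in the search?

Loop 2) At this stage I have to repeat my code for every author. How can I write it once and have it repeat when searching for the next author?

*The end goal here is to export the search results as HTML so they are easy to read.

Thanks!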

from bs4 import BeautifulSoup
import urllib.request
import time

#Loop 1
var1 = 'Stephen%20King'
var2 = 'J.%20K.%20Rowling'
var3 = 'James%20Patterson'
var4 = 'John%20Grisham'

timestr = time.strftime("%m-%d-%Y")

#Loop 2
file = open('/var/script/exp/exp_' + timestr + '.html', 'a+')

with open('/var/script/exp/exp_' + timestr + '.html', 'a') as file_1, open('/var/script/src/header.html', 'r') as file_2:
    for line in file_2:
        file_1.write(line)

with open('/var/script/exp/exp_' + timestr + '.html', 'a') as file_1, open('/var/script/src/subheader.html', 'r') as file_2:
    for line in file_2:
        file_1.write(line)

for i in range(5):
    url = 'https://www.example.com/Listings?st=' + var1 + '&sg=&c=&s=&lp=0&hp=999999&p={}'.format(i)
    source = urllib.request.urlopen(url)
    soup = BeautifulSoup(source, 'html.parser')

    for products in soup.find_all('li', class_='widget'):
        image = products.find('img', class_='lazy-load')
        itemurl = products.find('a', class_='product')
        title = products.find('div', class_='title').text
        countdown = products.find(class_='product-countdown')
        price = products.find(class_='product-price').find(class_="price").text
        file = open('/var/script/exp/exp_' + timestr + '.html', 'a+')
        file.write('<div class="col-md-15 col-xs-3">')
        file.write('<div class="card mb-4 box-shadow">')
        file.write('<img class="card-img-top" src="')
        file.write(image.get('data-src'))
        file.write('" alt="Card image cap" height="200px">')
        file.write('<div class="card-body">')
        file.write('<div><p class="card-text"><a href="https://www.example.com' + itemurl.get('href')+'" target="_blank">' +  title + '</a>' + price +  '</p>')
        file.write('<div class="d-flex justify-content-between align-items-center">')
        file.write('<div class="btn-group">')
        file.write('<button type="button" class="btn btn-sm btn-outline-secondary"><a href="https://www.example.com' + itemurl.get('href')+'" target="_blank">View</a></button>')
        file.write('</div><small class="text-muted">')
        file.write(countdown.get('data-countdown'))
        file.write('</small></div></div></div></div></div>')
        print

    file.close()

print(var1)

#Repeated Code
file = open('/var/script/exp/exp_' + timestr + '.html', 'a+')

with open('/var/script/exp/exp_' + timestr + '.html', 'a') as file_1, open('/var/script/src/subheader.html', 'r') as file_2:
    for line in file_2:
        file_1.write(line)


for i in range(5):

    url = 'https://www.example.com/Listings?st=' + var2 + '&sg=&c=&s=&lp=0&hp=999999&p={}'.format(i)
    source = urllib.request.urlopen(url)
    soup = BeautifulSoup(source, 'html.parser')

    for products in soup.find_all('li', class_='widget'):
        image = products.find('img', class_='lazy-load')
        itemurl = products.find('a', class_='product')
        title = products.find('div', class_='title').text
        countdown = products.find(class_='product-countdown')
        price = products.find(class_='product-price').find(class_="price").text
        #print(image.get('data-src'))
        #file.write('<img src="', + image.get('data-src'), + '">')
        file = open('/var/script/exp/exp_' + timestr + '.html', 'a+')
        file.write('<div class="col-md-15 col-xs-3">')
        file.write('<div class="card mb-4 box-shadow">')
        file.write('<img class="card-img-top" src="')
        file.write(image.get('data-src'))
        file.write('" alt="Card image cap" height="200px">')
        file.write('<div class="card-body">')
        file.write('<div><p class="card-text"><a href="https://www.example.com' + itemurl.get('href')+'" target="_blank">' +  title + '</a>' + price +  '</p>')
        file.write('<div class="d-flex justify-content-between align-items-center">')
        file.write('<div class="btn-group">')
        file.write('<button type="button" class="btn btn-sm btn-outline-secondary"><a href="https://www.example.com' + itemurl.get('href')+'" target="_blank">View</a></button>')
        file.write('</div><small class="text-muted">')
        file.write(countdown.get('data-countdown'))
        file.write('</small></div></div></div></div></div>')
        print

    file.close()

print(var2)

Thank you very much for the reply. Here is the updated code. I can now use the author list, and I'm surprised I got this far! Haha

from bs4 import BeautifulSoup
import urllib.request
import time

for i in range(5): #searches through pages
    lines = open('C:\\Users\\ataylor_dev\\Documents\\VSCODE\\Python\\BeautifulSoup\\Training\\authors.txt').read().splitlines()
    for author in lines:
        url = 'https://www.example.com/Listings?st=' + author + '&sg=&p={}'.format(i) #adds authors and pages to
        print(url)
        #how to repeat code with next author

Output:

https://www.example.com/Listings?st=Stephen%20King&sg=&c=&s=&lp=0&hp=999999&p=0
https://www.example.com/Listings?st=J.%20K.%20Rowling&sg=&c=&s=&lp=0&hp=999999&p=0
https://www.example.com/Listings?st=James%20Patterson&sg=&c=&s=&lp=0&hp=999999&p=0
https://www.example.com/Listings?st=John%20Grisham&sg=&c=&s=&lp=0&hp=999999&p=0
John%20Grisham
https://www.example.com/Listings?st=Stephen%20King&sg=&c=&s=&lp=0&hp=999999&p=1
https://www.example.com/Listings?st=J.%20K.%20Rowling&sg=&c=&s=&lp=0&hp=999999&p=1
https://www.example.com/Listings?st=James%20Patterson&sg=&c=&s=&lp=0&hp=999999&p=1
https://www.example.com/Listings?st=John%20Grisham&sg=&c=&s=&lp=0&hp=999999&p=1
John%20Grisham
https://www.example.com/Listings?st=Stephen%20King&sg=&c=&s=&lp=0&hp=999999&p=2
https://www.example.com/Listings?st=J.%20K.%20Rowling&sg=&c=&s=&lp=0&hp=999999&p=2
https://www.example.com/Listings?st=James%20Patterson&sg=&c=&s=&lp=0&hp=999999&p=2
https://www.example.com/Listings?st=John%20Grisham&sg=&c=&s=&lp=0&hp=999999&p=2
John%20Grisham
https://www.example.com/Listings?st=Stephen%20King&sg=&c=&s=&lp=0&hp=999999&p=3
https://www.example.com/Listings?st=J.%20K.%20Rowling&sg=&c=&s=&lp=0&hp=999999&p=3
https://www.example.com/Listings?st=James%20Patterson&sg=&c=&s=&lp=0&hp=999999&p=3
https://www.example.com/Listings?st=John%20Grisham&sg=&c=&s=&lp=0&hp=999999&p=3
John%20Grisham
https://www.example.com/Listings?st=Stephen%20King&sg=&c=&s=&lp=0&hp=999999&p=4
https://www.example.com/Listings?st=J.%20K.%20Rowling&sg=&c=&s=&lp=0&hp=999999&p=4
https://www.example.com/Listings?st=James%20Patterson&sg=&c=&s=&lp=0&hp=999999&p=4
https://www.example.com/Listings?st=John%20Grisham&sg=&c=&s=&lp=0&hp=999999&p=4
John%20Grisham

Now, how do I repeat the code in the right order? Stephen%20King pages 1-5, then move on to the next author, pages 1-5.

I feel like I'm getting closer. Thanks again!


1 answer
User
#1 · Posted 2024-04-29 11:34:01

If I understand correctly, in Loop 1 you want to read the author names in from an external file. See the question (and snippet) below.

In Python, how do I read a file line-by-line into a list?

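with open('filename') as f:
    # strip() drops the trailing newline that readlines() would keep,
    # which would otherwise end up inside the search URL
    lines = [line.strip() for line in f]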
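A minimal sketch of how the author file could be loaded, assuming it holds one plain name per line. The helper name `load_authors` and the demo file are hypothetical. It also assumes you would rather store readable names and let `urllib.parse.quote` produce the `%20` encoding than keep pre-encoded strings in the file.

```python
import tempfile
from pathlib import Path
from urllib.parse import quote


def load_authors(path):
    """Return the non-empty lines of a text file with whitespace stripped."""
    with open(path, encoding="utf-8") as f:
        return [line.strip() for line in f if line.strip()]


# Demo with a throwaway file so the sketch runs anywhere (hypothetical data).
demo_path = Path(tempfile.gettempdir()) / "authors_demo.txt"
demo_path.write_text(
    "Stephen King\nJ. K. Rowling\n\nJames Patterson\n", encoding="utf-8"
)

authors = load_authors(demo_path)
# quote() percent-encodes the spaces so the names are safe inside a URL.
encoded = [quote(name) for name in authors]

print(authors)
print(encoded)
```

Skipping blank lines means a stray empty line at the end of the file does not turn into a search for an empty author name.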

Then, instead of copy-pasting the code for each variable, you loop over the list you read in.
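To get the order asked for in the follow-up (all pages for one author, then the next author), the author loop has to be the outer loop and the page loop the inner one. Here is a small sketch of just that ordering. The helper `search_url` is a hypothetical name, and the author list is hard-coded here only so the snippet runs on its own.

```python
def search_url(author, page):
    """Build the listing URL for one author and one page number."""
    return (
        'https://www.example.com/Listings?st=' + author
        + '&sg=&c=&s=&lp=0&hp=999999&p={}'.format(page)
    )


authors = [
    'Stephen%20King',
    'J.%20K.%20Rowling',
    'James%20Patterson',
    'John%20Grisham',
]

urls = []
for author in authors:        # outer loop: one author at a time
    for page in range(5):     # inner loop: pages 0-4 for that author
        urls.append(search_url(author, page))

for url in urls:
    print(url)
```

In the real script, the fetching and parsing would go inside the inner loop in place of `urls.append(...)`, so it is written once and runs for every author and page.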

This link explains it better and more thoroughly than I can, but here is a small snippet.

>>> li = ['a', 'b', 'c', 'd', 'e']
>>> for i in range(len(li)):
...     print(li[i])
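The index-based loop works, but in Python it is usually simpler to iterate over the items directly. A short sketch of both forms, using the same list; `numbered` is only there to show what `enumerate` yields.

```python
li = ['a', 'b', 'c', 'd', 'e']

# Iterate over the items directly; no index bookkeeping needed.
for item in li:
    print(item)

# When the position is also needed, enumerate() yields (index, item) pairs.
for index, item in enumerate(li, start=1):
    print(index, item)

# The same pairs collected into a list of labels.
numbered = [f"{index}: {item}" for index, item in enumerate(li, start=1)]
```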

To see how this helps you, compare the following:

var1 = 'Stephen%20King'
var2 = 'J.%20K.%20Rowling'
var3 = 'James%20Patterson'
var4 = 'John%20Grisham'

print(var1, var2, var3, var4)

#VS

vars = []

vars.append('Stephen%20King')
vars.append('J.%20K.%20Rowling')
vars.append('James%20Patterson')
vars.append('John%20Grisham')

print(vars)
print(vars[0], vars[1], vars[2], vars[3])

for author in vars:
    print(author)
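The same idea applies to the long run of `file.write` calls in the question: they can live in one function that is called once per product. Below is a rough sketch under some assumptions. The function name `render_card` and the sample values are made up for illustration, and `html.escape` is added on the assumption that titles or prices may contain characters such as `&`.

```python
from html import escape

BASE_URL = "https://www.example.com"


def render_card(image_src, item_href, title, price, countdown):
    """Build the HTML for one product card as a single string."""
    link = escape(BASE_URL + item_href, quote=True)
    return (
        '<div class="col-md-15 col-xs-3">'
        '<div class="card mb-4 box-shadow">'
        f'<img class="card-img-top" src="{escape(image_src, quote=True)}"'
        ' alt="Card image cap" height="200px">'
        '<div class="card-body">'
        f'<div><p class="card-text"><a href="{link}" target="_blank">'
        f'{escape(title)}</a>{escape(price)}</p>'
        '<div class="d-flex justify-content-between align-items-center">'
        '<div class="btn-group">'
        '<button type="button" class="btn btn-sm btn-outline-secondary">'
        f'<a href="{link}" target="_blank">View</a></button>'
        f'</div><small class="text-muted">{escape(countdown)}</small>'
        '</div></div></div></div></div>'
    )


# Hypothetical sample values standing in for what BeautifulSoup would return.
card = render_card("/img/1.jpg", "/item/42", "It & Misery", "$5.00", "2d 3h")
print(card)
```

Inside the scraping loop you would then open the output file once with a `with` block and write `render_card(...)` for each product, rather than reopening the file for every item.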
