Using while and for loops to build URLs in Python

Posted 2024-06-16 10:17:50


I want to scrape some tables from a website. The URL has two parameters, id and alpha, and both values change from table to table. An example URL:

http://resources.afaqs.com/index.html?id=123&category=AD+Agencies&alpha=A

I want to iterate over both the id and the alpha values. My code is as follows:

import csv
import bs4 as bs
import requests


data = ['1','2','3','7','A','B','C','D','E','F','G','H','I','J','K','L','M','N','O','P','Q','R','S','T','U','W','X','Y','Z']
number = None


while number < 500:
    for i in data:
        url = "http://resources.afaqs.com/index.html?id="
        if number is not None:
            url += str(number) + "&category=AD+Agencies&alpha={}".format(i)
        print(url)

        if number is None:
            number = 1
        else:
            number += 1

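This steps through the id numbers from 1 to 499 and the alpha values from A to Z together, advancing both on every pass. What I want instead is: for each id, the alpha value should run through the whole range from A to Z.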

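I tried rearranging the loops, for example putting the for loop before the while loop, or putting the for loop just before printing the url. Each of these combinations gives strange results rather than what I want.

Can anyone help?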


2 Answers

Don't use a while loop at all; use nested for loops:

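# Template with two placeholders: the id first, then the alpha value
url = "http://resources.afaqs.com/index.html?id={}&category=AD+Agencies&alpha={}"
for number in range(1, 500):  # ids 1..499
    for i in data:  # every alpha value for the current id
        print(url.format(number, i))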
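
For a quick check of the ordering, here is a minimal self-contained sketch of the same nested iteration, assuming Python 3. The `data` list is copied from the question (it contains the digits 1, 2, 3, 7 and has no "V"); `URL_TEMPLATE` and `build_urls` are names introduced here for illustration only.

```python
# Alpha values copied from the question (digits 1, 2, 3, 7; no "V").
data = ['1', '2', '3', '7'] + list("ABCDEFGHIJKLMNOPQRSTUWXYZ")

URL_TEMPLATE = ("http://resources.afaqs.com/index.html"
                "?id={}&category=AD+Agencies&alpha={}")


def build_urls(first_id, last_id, alphas):
    """Return URLs for ids first_id..last_id (inclusive), all alphas per id."""
    return [URL_TEMPLATE.format(number, alpha)
            for number in range(first_id, last_id + 1)  # outer loop: ids
            for alpha in alphas]                        # inner loop: alphas


urls = build_urls(1, 2, data)
print(len(urls))        # 2 ids x 29 alpha values
print(urls[0])          # first alpha value for id 1
print(urls[len(data)])  # first alpha value for id 2
```

The id only advances after every alpha value has been used, which is the order the question asks for.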

Assuming we need to iterate over the ids and, for each id, over the uppercase Latin letters, we can write

from string import ascii_uppercase


def get_urls(number_stop):
    url = "http://resources.afaqs.com/index.html?id={}&category=AD+Agencies&alpha={}"
    urls = []
    for number in range(1, number_stop):
        for letter in ascii_uppercase:
            urls.append(url.format(number, letter))
    return urls

Or use a generator

from string import ascii_uppercase


def generate_urls(number_stop):
    url = "http://resources.afaqs.com/index.html?id={}&category=AD+Agencies&alpha={}"
    for number in range(1, number_stop):
        for letter in ascii_uppercase:
            yield url.format(number, letter)
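
A short usage sketch, assuming Python 3: because the generator is lazy, `itertools.islice` can preview a few URLs without building all of them. The generator is repeated so the snippet runs on its own; `preview` is a name introduced here for illustration.

```python
from itertools import islice
from string import ascii_uppercase


def generate_urls(number_stop):
    # Same generator as above, repeated so this snippet runs on its own.
    url = "http://resources.afaqs.com/index.html?id={}&category=AD+Agencies&alpha={}"
    for number in range(1, number_stop):
        for letter in ascii_uppercase:
            yield url.format(number, letter)


# No URL is produced until the generator is consumed;
# islice stops after the first three items.
preview = list(islice(generate_urls(500), 3))
for u in preview:
    print(u)
```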

Or, finally, use a generator together with itertools.product to get rid of the nested loop

from itertools import product
from string import ascii_uppercase


def generate_urls(number_stop):
    url = "http://resources.afaqs.com/index.html?id={}&category=AD+Agencies&alpha={}"
    for number, letter in product(range(1, number_stop),
                                  ascii_uppercase):
        yield url.format(number, letter)
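
As a variation, the query string can be assembled with `urllib.parse.urlencode` instead of a hand-written template. This is only a sketch, assuming Python 3.7+ (dict order is preserved); `BASE`, `generate_encoded_urls`, and the `alphas` and `category` parameters are hypothetical names introduced here, not part of the answers above.

```python
from itertools import product
from string import ascii_uppercase
from urllib.parse import urlencode

BASE = "http://resources.afaqs.com/index.html"


def generate_encoded_urls(number_stop, alphas=ascii_uppercase,
                          category="AD Agencies"):
    """Yield one URL per (id, alpha) pair, ids 1..number_stop-1."""
    for number, alpha in product(range(1, number_stop), alphas):
        # urlencode escapes the values; the space in the category becomes "+".
        query = urlencode({"id": number, "category": category, "alpha": alpha})
        yield BASE + "?" + query


first = next(generate_encoded_urls(2))
print(first)
```

Passing a custom `alphas` iterable, such as the `data` list from the question, covers alpha values that are not uppercase letters.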
