Wikipedia queries + grequests

Posted 2024-04-19 01:37:00


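I want to query 50 Wikipedia pages. I have been using the requests package to make the GET requests, but I have been trying to switch to grequests because I have heard it performs much better.

For me, the performance gain is marginal at best. Am I doing something wrong?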

import requests
import grequests
from urllib.parse import quote
from time import time

url = 'https://en.wikipedia.org/w/api.php?action=query&titles={0}&prop=pageprops&ppprop=disambiguation&format=json'
titles = ['Harriet Tubman', 'Car', 'Underground Railroad', 'American Civil War', 'Kate Larson']
urls = [url.format(quote(title)) for title in titles]

def sync_test(urls):
    results = []
    s = time()
    for url in urls:
        results.append(requests.get(url))
    e = time()
    return e-s

def async_test(urls):
    s = time()
    results = grequests.map((grequests.get(url) for url in urls))
    e = time()
    return e-s

def iterate(urls, num):
    sync_time = 0
    async_time = 0
    for i in range(num):
        sync_time += sync_test(urls)
        async_time += async_test(urls)
    print("sync_time: {}\nasync_time: {}".format(sync_time, async_time))

Output:

sync_time: 8.945282936096191
async_time: 7.97578239440918

Thanks!


1 answer
User
Answer #1 · Posted 2024-04-19 01:37:00
import requests
import grequests
from urllib.parse import quote
from time import time

url = 'https://en.wikipedia.org/w/api.php?action=query&titles={0}&prop=pageprops&ppprop=disambiguation&format=json'
titles = ['Harriet Tubman', 'Car', 'Underground Railroad', 'American Civil War', 'Kate Larson']
urls = [url.format(quote(title)) for title in titles]

def sync_test(urls):
    results = []
    s = time()
    for url in urls:
        results.append(requests.get(url))
    e = time()
    return e-s

def async_test(urls):
    s = time()
    results = grequests.map((grequests.get(url) for url in urls))
    e = time()
    return e-s

def iterate(urls, num):
    sync_time = 0
    async_time = 0
    for i in range(num):
        sync_time += sync_test(urls)
        async_time += async_test(urls)
    print("sync_time: {}\nasync_time: {}".format(sync_time, async_time))

if __name__ == '__main__':
    iterate(urls,10)

This gives me:

sync_time: 22.14458918571472
async_time: 4.846134662628174

Process finished with exit code 0

I don't see anything wrong here.
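
To see why the concurrent version should win at all, here is a minimal sketch that needs neither the network nor grequests. It is not the code from the question: `fake_get` and `SIMULATED_LATENCY` are made-up stand-ins for `requests.get` and the round trip to Wikipedia, and it uses a standard-library thread pool rather than gevent. The idea is the same, though. When each call mostly waits, overlapping the waits brings the total close to one request's latency instead of the sum of all of them.

```python
import time
from concurrent.futures import ThreadPoolExecutor

# Hypothetical per-request latency in seconds; a real API call will vary.
SIMULATED_LATENCY = 0.05

def fake_get(url):
    """Stand-in for requests.get: only waits, the way a network call would."""
    time.sleep(SIMULATED_LATENCY)
    return url

def sync_fetch(urls):
    """Fetch one URL after another; total time is roughly n * latency."""
    start = time.perf_counter()
    results = [fake_get(u) for u in urls]
    return results, time.perf_counter() - start

def concurrent_fetch(urls):
    """Fetch all URLs at once in a thread pool; the waits overlap."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=len(urls)) as pool:
        results = list(pool.map(fake_get, urls))  # map keeps input order
    return results, time.perf_counter() - start

if __name__ == '__main__':
    urls = ['page-{}'.format(i) for i in range(5)]
    _, sync_time = sync_fetch(urls)
    _, conc_time = concurrent_fetch(urls)
    print("sync_time: {}\nconcurrent_time: {}".format(sync_time, conc_time))
```

If your real numbers show only a small gap, as in the question, the time may be going somewhere other than waiting on independent requests (for example connection setup or throttling on the server side), so it is worth repeating the measurement a few times before drawing conclusions.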
