requests-html: how do I fix "AttributeError: __aexit__" with asyncio?

Posted 2024-05-29 02:27:14


I can't make asynchronous requests to a URL. Instead of a response, I get this error:

  File "D:\Dev\Scripts\ol_as.py", line 28, in main
    async with requests_html.AsyncHTMLSession() as session:
AttributeError: __aexit__

import asyncio
import requests_html
from time import time
from bs4 import BeautifulSoup


async def fetch_content(url, session):
    async with session.get(url, allow_redirects=True) as response:
        data = await response.read()
        response.html.render()
        soup = BeautifulSoup(response.html.html, 'lxml')
        txt = soup.find_all('span', {'class': 'text'})
        print(txt)




async def main():
    url = 'http://quotes.toscrape.com/js/'
    tasks = []
    async with requests_html.AsyncHTMLSession() as session:
        for i in range(10):
            tasks.append(asyncio.create_task(fetch_content(url, session)))
        await asyncio.gather(*tasks)



if __name__ == '__main__':
    t0 = time()
    asyncio.run(main())
    print(time() - t0)

1 answer
Anonymous user
#1 · Posted 2024-05-29 02:27:14

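You're very close. From experimenting with AsyncHTMLSession, it doesn't like being used as a context manager and then passed to different coroutines. You also need r.html.arender rather than just render.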
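To see why `async with` fails, it may help to reproduce the error without requests_html at all. The sketch below uses a hypothetical `PlainSession` class (not part of any library) that has an awaitable `close()` but no `__aenter__`/`__aexit__`, plus a hypothetical `managed()` helper built on `contextlib.asynccontextmanager`. Depending on the Python version, the bare `async with` raises either AttributeError or TypeError. Whether wrapping a real AsyncHTMLSession this way behaves well is an assumption I haven't verified.

```python
import asyncio
from contextlib import asynccontextmanager


class PlainSession:
    """Hypothetical stand-in: has close(), but no __aenter__/__aexit__."""

    def __init__(self):
        self.closed = False

    async def close(self):
        self.closed = True


@asynccontextmanager
async def managed(session):
    """Hypothetical helper: give an object with an async close() an `async with` interface."""
    try:
        yield session
    finally:
        await session.close()


async def try_bare_async_with():
    # The object lacks the async context manager methods, so this raises.
    try:
        async with PlainSession():
            pass
    except (AttributeError, TypeError) as exc:
        return type(exc).__name__
    return None


async def try_wrapped():
    session = PlainSession()
    async with managed(session) as s:
        assert s is session
    # close() has been awaited by the time the block exits.
    return session.closed


bare_result = asyncio.run(try_bare_async_with())
wrapped_closed = asyncio.run(try_wrapped())
print(bare_result, wrapped_closed)
```

The simpler route, and the one I use in my code, is to skip `async with` entirely and create the session as a plain object.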

If you want a list of quotes for a given number of pages, here's what I came up with:

from requests_html import AsyncHTMLSession
import asyncio
import json
from itertools import chain


async def get_quotes(s, url):
    r = await s.get(url)
    await r.html.arender()
    var_data = r.html.find('script', containing='var data', first=True).text

    # This part could be improved; it just isolates the JSON literal assigned to `var data`:
    *_, var_data = var_data.split('var data =')
    var_data, *_ = var_data.split('; for (var i in data)')

    data = json.loads(var_data)
    quotes = [post['text'] for post in data]
    return quotes

async def main(max_pages=1):
    s = AsyncHTMLSession()
    tasks = []
    for page in range(1, max_pages + 1):
        url = f'http://quotes.toscrape.com/js/page/{page}'
        tasks.append(get_quotes(s, url))
    results = await asyncio.gather(*tasks)
    # Flatten the per-page lists into a single list of quotes.
    return list(chain.from_iterable(results))

all_quotes = asyncio.run(main(5))
print(all_quotes)
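As for the split-based extraction I flagged as improvable, one possible alternative is to let the JSON parser find where the literal ends, using `json.JSONDecoder().raw_decode`. This is only a sketch: `extract_var_data` is a hypothetical helper name, and `sample_script` is made-up text shaped like the page's script, not the real page content.

```python
import json


def extract_var_data(script_text, marker="var data ="):
    """Hypothetical helper: parse the JSON value that follows `marker`."""
    start = script_text.index(marker) + len(marker)
    # raw_decode does not skip leading whitespace, so strip it first.
    payload = script_text[start:].lstrip()
    # raw_decode parses one JSON value and ignores whatever text follows it.
    data, _end = json.JSONDecoder().raw_decode(payload)
    return data


# Made-up sample shaped like the inline script (an assumption, not real page data).
sample_script = '''
var data = [
    {"text": "First quote", "author": {"name": "A"}},
    {"text": "Second quote", "author": {"name": "B"}}
];
for (var i in data) { /* render */ }
'''

quotes = [post["text"] for post in extract_var_data(sample_script)]
print(quotes)
```

This avoids depending on the exact JavaScript that follows the array, though it still assumes the data is valid JSON and appears right after `var data =`.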
