How do I create a Scrapy Response without a Request?

Posted 2024-06-05 23:36:45


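I need to reprocess previously downloaded websites without downloading them again.

So I want to create multiple scrapy.Response objects without going through any scrapy.Request.

These responses should be processed before any newly downloaded responses.
Assume the response content (url, body, ...) is loaded from somewhere.
The built-in HTTP cache is not suitable, because it needs a Request.
I do not want the spider to handle this.

Maybe an extension could do it, or a middleware would work too. Minimal example: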

from scrapy import signals
from scrapy.http import Response

class ReprocessSnapshotsOnSpiderOpenExtension(object):

    def __init__(self, crawler):
        self.crawler = crawler
        crawler.signals.connect(self.send_the_existing_snapshots_as_new_response, signal=signals.spider_opened)

    @classmethod
    def from_crawler(cls, crawler):
        return cls(crawler)

    def send_the_existing_snapshots_as_new_response(self, spider):
        print("##### now in ReprocessSnapshotsOnSpiderOpenExtension.send_the_existing_snapshots_as_new_response()")

        response1 = Response("http://the_url_of_resp1", body=b"the body of resp1")
        response2 = Response("http://the_url_of_resp2", body=b"the body of resp2")
        # ....
        responseN = Response("http://the_url_of_respN", body=b"the body of respN")

        inject_response_somehow(response1) 
        inject_response_somehow(response2)
        # ...
        inject_response_somehow(responseN)

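So the question is: how can inject_response_somehow(...) be implemented?

Is it possible to control where the responses are injected (before the downloader middlewares or spider middlewares, somewhere in between, or after them)?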


0 answers

There are no answers yet.
