Python scrapy-feedstreaming package - PyPI

A fork of scrapy.extensions.feedexport.FeedExporter for streaming data in real time

Detailed description of the scrapy-feedstreaming Python project

scrapy-feedstreaming

Scrapy live streaming data. A fork of scrapy.extensions.feedexport.FeedExporter that exports items during scraping. See [https://medium.com/@alex_ber/scrapy-streaming-data-cdf97434dc15]

See CHANGELOG.md for a detailed description.

Getting Help

QuickStart

python3 -m pip install -U scrapy-feedstreaming

Installing from Github

python3 -m pip install -U https://github.com/alex-ber/scrapy-feedstreaming/archive/master.zip

Optionally, install the test requirements as well:

python3 -m pip install -U https://github.com/alex-ber/scrapy-feedstreaming/archive/master.zip#egg=alex-ber-utils[tests]

Or explicitly:

wget https://github.com/alex-ber/scrapy-feedstreaming/archive/master.zip -O master.zip; unzip master.zip; rm master.zip

Then install from source.

Installing from source

python3 -m pip install -r req.txt # only installs "required" (relaxed)

python3 -m pip install . # only installs "required"

python3 -m pip install .[tests] # installs dependencies for tests

Alternatively, you can install from the requirements files:

python3 -m pip install -r requirements.txt # only installs "required"

python3 -m pip install -r requirements-tests.txt # installs dependencies for tests

From the directory containing setup.py:

python3 setup.py test # run all tests

or

pytest

Installing new version

See https://docs.python.org/3.1/distutils/uploading.html

python3 setup.py sdist upload

Requirements

scrapy-feedstreaming requires the following modules.

Python 3.6+

Changelog

All notable changes to this project will be documented in this file.

[Unreleased]

[0.0.1] - 2020-07-12

Added

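Buffering was added to item_scraped().
S3FeedStorage: the ACL can be specified as the query part of the URI.
S3FeedStorage: support for region was added.
FEEDS: slot_key_param: new (not available in Scrapy itself). Specifies a (global) function that takes item and spider as parameters and returns the slot_key: given an item passing through the pipeline, the URI to which it should be sent. Falls back to a noop method, that is, a method that does nothing.
FEEDS: buff_capacity: new (not available in Scrapy itself). The number of items to accumulate before they are exported. The fallback value is 1.
_FeedSlot instances are created from your settings, one per supplied URI. Some extra information (compared to Scrapy) is stored on them, namely: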

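uri_template - available through the public API method get_slots(), see below.
spider_name - used by the public API method get_slots() to restrict the returned slots to those of the requesting spider.
buff_capacity - the buffer's capacity; if the number of items exceeds this number, the buffer is flushed.
buff - the buffer in which all items pending export are stored.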
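The routing and buffering described above can be pictured with a small standalone sketch. It does not use the package's classes: `ToySlot`, `route_item`, the module-level helpers, and the example URIs are hypothetical names chosen for illustration, and flushing as soon as the buffer holds `buff_capacity` items is an assumption about the exact threshold.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

BOOKS_URI = "s3://example-bucket/books.json"   # hypothetical destination
OTHER_URI = "s3://example-bucket/other.json"   # hypothetical destination


@dataclass
class ToySlot:
    """Hypothetical stand-in for a feed slot: one per destination URI."""
    uri_template: str
    spider_name: str
    buff_capacity: int = 1  # fallback value named in the changelog
    buff: List[dict] = field(default_factory=list)
    exported: List[List[dict]] = field(default_factory=list)  # batches "sent" so far

    def add(self, item: dict) -> None:
        self.buff.append(item)
        # Assumption: flush once the buffer holds buff_capacity items.
        if len(self.buff) >= self.buff_capacity:
            self.flush()

    def flush(self) -> None:
        if self.buff:
            self.exported.append(self.buff)
            self.buff = []


def route_item(item: dict, spider_name: str) -> str:
    """Hypothetical slot_key function: pick the destination URI for an item."""
    return BOOKS_URI if item.get("kind") == "book" else OTHER_URI


def item_scraped(slots: Dict[str, ToySlot],
                 slot_key: Callable[[dict, str], str],
                 item: dict, spider_name: str) -> None:
    # Items are sent while scraping, not only when the spider closes.
    slots[slot_key(item, spider_name)].add(item)


def close_spider(slots: Dict[str, ToySlot]) -> None:
    # Whatever is still buffered is exported at the end.
    for slot in slots.values():
        slot.flush()


slots = {
    BOOKS_URI: ToySlot(BOOKS_URI, "demo", buff_capacity=2),
    OTHER_URI: ToySlot(OTHER_URI, "demo"),
}
for scraped in [{"kind": "book", "id": 1}, {"kind": "dvd", "id": 2},
                {"kind": "book", "id": 3}, {"kind": "book", "id": 4}]:
    item_scraped(slots, route_item, scraped, "demo")
close_spider(slots)
```

With a capacity of 2, the book slot exports one batch of two items during the crawl and the remaining item when the spider closes; the other slot, left at the fallback capacity of 1, exports each item immediately.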

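FeedExporter has 1 extra public method:

get_slots() - this method is used to get the feed slots' information (see the implementation note above). It is populated from the settings. For example, you can retrieve any of the URIs to which items will be exported. Notes: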

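slot_key is the slot identifier described above. If you have only 1 URI, you can pass None for this value.
You can retrieve feed slots' information only from your own spider.
It has an optional force_create=True parameter. If you call this method early in the Scrapy life cycle, the feed slots' information may not have been created yet. In that case the default behavior is to create this information and return it to you. If you pass force_create=False, you will receive empty feed slots' information.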
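The effect of force_create can be illustrated with a toy model. This is not the package's implementation: `ToyExporter`, its constructor, the `spider_name` argument, and the return types shown here are all assumptions made only to show lazy creation versus an empty result.

```python
from typing import Dict, List, Optional


class ToyExporter:
    """Hypothetical model of lazily created feed-slot information."""

    def __init__(self, feed_uris: List[str]) -> None:
        self._feed_uris = feed_uris                    # URIs taken from settings
        self._slots: Dict[str, Dict[str, dict]] = {}   # spider_name -> {uri: info}

    def _create(self, spider_name: str) -> None:
        self._slots[spider_name] = {
            uri: {"uri_template": uri, "spider_name": spider_name}
            for uri in self._feed_uris
        }

    def get_slots(self, spider_name: str, slot_key: Optional[str] = None,
                  force_create: bool = True) -> dict:
        if spider_name not in self._slots:
            if not force_create:
                return {}  # nothing created yet and the caller declined creation
            self._create(spider_name)
        slots = self._slots[spider_name]
        if slot_key is None:
            # None is only unambiguous when a single URI is configured.
            if len(slots) != 1:
                raise ValueError("slot_key is required when several URIs are configured")
            return next(iter(slots.values()))
        return slots[slot_key]


exporter = ToyExporter(["s3://example-bucket/items.json"])
early = exporter.get_slots("demo", force_create=False)  # called before creation
info = exporter.get_slots("demo")                       # default: create, then return
```

In this model `early` is empty because nothing existed yet, while the second call creates the information on demand and returns the single slot.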

S3FeedStorage has the following additional public members:

botocore_session
botocore_client
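botocore_base_kwargs - a dict of the minimal parameters for the botocore_client.put_object() method, as supplied in the settings.
botocore_kwargs - a dict of all parameters for the botocore_client.put_object() method, as supplied in the settings. For example, it will contain the ACL parameter if one was supplied, while botocore_base_kwargs will not contain it.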
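The difference between the two dicts can be sketched without touching S3 at all. The helper below is hypothetical and only mirrors the description above; in particular, the query parameter name `acl` and the choice of `Bucket` and `Key` as the minimal parameters are assumptions, not the package's documented behavior.

```python
from urllib.parse import parse_qs, urlparse


def build_put_object_kwargs(uri: str):
    """Split an S3 feed URI into minimal and full put_object() keyword arguments.

    Assumption: the ACL travels in a query parameter named "acl".
    """
    parsed = urlparse(uri)
    # Minimal arguments: where the object goes.
    base_kwargs = {"Bucket": parsed.netloc, "Key": parsed.path.lstrip("/")}
    # Full arguments: the minimal ones plus anything optional found in the URI.
    kwargs = dict(base_kwargs)
    acl = parse_qs(parsed.query).get("acl")
    if acl:
        kwargs["ACL"] = acl[0]
    return base_kwargs, kwargs


base_kwargs, kwargs = build_put_object_kwargs(
    "s3://example-bucket/feeds/items.json?acl=public-read"
)
```

Here `kwargs` carries the extra `ACL` entry while `base_kwargs` stays at the bucket and key, which matches the distinction drawn between botocore_kwargs and botocore_base_kwargs.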

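Changed

There can be multiple URIs for exporting.
The logic of sending items was moved from close_spider() to item_scraped().
Backported the fix for the missing storage.store() call in FeedExporter.close_spider() [https://github.com/scrapy/scrapy/pull/4626]
Backported the fix for duplicated feed logs [https://github.com/scrapy/scrapy/pull/4629]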

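Removed

Removed deprecated behavior: falling back to the boto library if the botocore library is not found.
Removed deprecated behavior: implicit retrieval of settings from the project. Settings are now passed explicitly.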
