A CLI tool for executing async tasks and transformations



Streamline

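The goal of this project is to make data accessible and actionable on the CLI. streamline does this by implementing the utility functions needed to take an input stream and process it in parallel.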

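Sequence of operations in an invocation:

  1. A "generator" object loads entries from the source data stream (usually stdin).
  2. The selected "streamers" are given each input value in order. Each streamer filters and modifies the values, then yields them to the next streamer.
  3. A "consumer" object takes the entries yielded by the last streamer and usually outputs them for storage or viewing (usually to stdout).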
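
The three stages above can be sketched with plain Python async generators. This is only an illustration of the flow, not streamline's actual implementation; the names `generate`, `upper_streamer`, `consume`, and `run_pipeline` are hypothetical.

```python
import asyncio
from typing import AsyncIterator, Iterable, List


async def generate(lines: Iterable[str]) -> AsyncIterator[str]:
    """Generator stage: load entries from a source (a list stands in for stdin)."""
    for line in lines:
        yield line.rstrip("\n")


async def upper_streamer(source: AsyncIterator[str]) -> AsyncIterator[str]:
    """Streamer stage: filter and modify each value, then yield it onward."""
    async for value in source:
        if value:  # drop empty values
            yield value.upper()


async def consume(source: AsyncIterator[str]) -> List[str]:
    """Consumer stage: collect everything the last streamer yields."""
    return [value async for value in source]


def run_pipeline(lines: Iterable[str]) -> List[str]:
    """Wire generator -> streamer -> consumer and run to completion."""
    return asyncio.run(consume(upper_streamer(generate(lines))))


if __name__ == "__main__":
    print(run_pipeline(["foo\n", "", "bar\n"]))
```

Because every stage is an async generator, more streamers can be chained by wrapping one around another, which mirrors how several streamers are listed after `-s` on the command line.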

Installation

  • Requires Python 3.6+ and pip
  • pip install streamline

Guide

The simplest invocation specifies no stream operations. It simply reads from stdin and writes what it read to stdout:

  $ printf "foo\nbar" | streamline
  foo
  bar

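By default, streamline takes input from stdin and writes to stdout. This is very flexible because it makes the tool compatible with other CLI tools. However, you can also use the --input and --output flags to control where data is read from and written to. Suppose you have a file containing the same data that was just sent with printf: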

  $ streamline --input my_source_file.txt --output my_target_file.txt
  $ cat my_target_file.txt
  foo
  bar

Now let's do something useful. Let's use the "shell" streamer to execute a shell command for each entry and check whether the hosts are listening for https traffic:

  $ printf "www.google.com\nslashdot.org" | streamline -s shell -- "nc -zv {value} 443"
  {"stdout": "", "stderr": "Connection to www.google.com 443 port [tcp/https] succeeded!\n", "exit_code": 0}
  {"stdout": "", "stderr": "Connection to slashdot.org 443 port [tcp/https] succeeded!\n", "exit_code": 0}

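Streamline modules are designed to provide all useful information in the form of objects, since the output can be tailored with other streamer modules. For example, to take the output above and keep only the exit code, which tells us whether the port is open, we just add the "headers" streamer, which prefixes each output with the original input, and the "extract" streamer (with its --selector option), which sets the value to the exit_code attribute of each result:

  $ printf "www.google.com\nslashdot.org" | streamline -s shell extract headers -- "nc -zv {value} 443" --selector exit_code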
  www.google.com: 0
  slashdot.org: 0
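
As a rough model of what this combination does to each result, the sketch below applies an extract-like step and a headers-like step to plain dictionaries. It is inferred from the output shown above, not taken from the library's code, and the function names `extract`, `with_header`, and `summarize` are hypothetical.

```python
from typing import Any, Dict, List, Tuple


def extract(result: Dict[str, Any], selector: str) -> Any:
    """Replace a result object with one of its attributes (here, a dict key)."""
    return result[selector]


def with_header(original: str, value: Any) -> str:
    """Force the value to a string and prefix it with the original input."""
    return f"{original}: {value}"


def summarize(results: List[Tuple[str, Dict[str, Any]]], selector: str) -> List[str]:
    """Apply the extract step, then the headers step, to (input, result) pairs."""
    return [
        with_header(original, extract(result, selector))
        for original, result in results
    ]


# Sample data shaped like the shell streamer output above (stderr shortened).
sample = [
    ("www.google.com", {"stdout": "", "stderr": "", "exit_code": 0}),
    ("slashdot.org", {"stdout": "", "stderr": "", "exit_code": 0}),
]

for line in summarize(sample, "exit_code"):
    print(line)
```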

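Built-in modules

Many modules are available for performing async jobs and transformations on the input. To see all available modules, use the main help option, which lists them along with examples: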

$ streamline --help

===============Streamline===============


usage: streamline [--generator GENERATOR] [--consumer CONSUMER] [-s [STREAMERS [STREAMERS ...]]] [-h]

optional arguments:
  --generator GENERATOR
                        Entry Generator Module
  --consumer CONSUMER   Entry Consumer/Writer Module
  -s [STREAMERS [STREAMERS ...]], --streamers [STREAMERS [STREAMERS ...]]
                        Additional streamers to apply (-s is optional)
  -h, --help            Print help
  -p {buffer,stream-output,streaming}, --progress {buffer,stream-output,streaming}
                        Print progress to stdout. ("buffer": buffers input and
                        output, "stream-output" buffers only input, "stream" for no buffering at all)
  -w WORKERS, --workers WORKERS
                        Number of concurrent workers for any one async
                        execution module to have

===============Streamers===============

::extract::
	Description: Change the value to an attribute of the current value
	Example: streamline -s extract -- --selector exit_code

::py::
	Description: Translate each value by assigning it to the result of a python expression
	Example: streamline -s py -- "value.upper()"

::pyfilter::
	Description: Filter out values that dont have a truthy result to a particular python expression
	Example: streamline -s pyfilter -- "'foobar' in value"

::truthy::
	Description: Filter out values that are not truthy
	Example: streamline -s truthy -- 

::noop::
	Description: No operation. Just for testing.
	Example: streamline -s noop -- 

::split_list::
	Description: Take any values that are an array and treat each value of an array as a separate input 
	Example: streamline -s split_list -- 

::split::
	Description: Take any values that are an array and treat each value of an array as a separate input 
	Example: streamline -s split -- 

::breakdown::
	Description: Show a report of how many input values ended up with a particular result value
	Example: streamline -s breakdown -- 

::headers::
	Description: Force each value to a string and prefix each with the original input value
	Example: streamline -s headers -- 

::filter_out_errors::
	Description: Filter out any entries that have produced an error
	Example: streamline -s filter_out_errors -- 

::errors::
	Description: Use the latest error on the entry as the value
	Example: streamline -s errors -- 

::buffer::
	Description: Hold entries in memory until a certain number is reached (give no args to buffer all)
	Example: streamline -s buffer -- --buffer 20

::json::
	Description: Take json strings and parse them into objects so other streamers can inspect attributes
	Example: streamline -s json -- 

::strip::
	Description: Strip surrounding whitespace from each string entry, removing entries that are only whitespace
	Example: streamline -s strip -- --buffer 20

::head::
	Description: Only take the first X entries (Default 1)
	Example: streamline -s head -- --count 20

::readfile::
	Description: Read the file indicated by the file
	Example: streamline -s readfile -- --path ~/dir/{value}.json

::combine::
	Description: Combine two previous historical values by setting an attribute
	Example: streamline -s combine -- --source "-1" --target "-2"

::http::
	Description: Use a template to execute an HTTP request for each value
	Example: streamline -s http -- "https://{value}/"

::ssh::
	Description: Treat each value as a host to connect to. SSH in and run a command returning the output
	Example: streamline -s ssh -- "uptime"

::ssh_exec::
	Description: Copy a script to target machine and execute
	Example: streamline -s ssh_exec -- ~/dostuff.sh

::shell::
	Description: Run a shell command for each value
	Example: streamline -s shell -- "nc -zv {value} 22"

::scp::
	Description: Treat each value as a host to connect to. Copy a file to or from this host
	Example: streamline -s scp -- "/tmp/file.txt" "{value}:/tmp/file.txt"

::sleep::
	Description: Sleep for a second (or for {value} seconds) for each entry making no change to its value
	Example: streamline -s sleep -- 

::history:push::
	Description: Start a new history tree
	Example: streamline -s history:push -- 

::history:pop::
	Description: Walk back up one level in the history tree
	Example: streamline -s history:pop -- 

::history:collapse::
	Description: Treat the latest value as the original
	Example: streamline -s history:collapse -- 

::history:reset::
	Description: Clear all levels of history
	Example: streamline -s history:reset -- 

::history:values::
	Description: Set the current value to a list of all previous values
	Example: streamline -s history:values -- 
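
The --workers option in the help output above caps how many jobs an async execution module runs at once. One common way to get that effect in asyncio is a semaphore. The sketch below shows that technique, assuming each job is an awaitable call; it is not taken from streamline's source, and `run_with_workers`, `demo`, and `fake_job` are hypothetical names.

```python
import asyncio
from typing import Awaitable, Callable, Iterable, List, Tuple, TypeVar

T = TypeVar("T")
R = TypeVar("R")


async def run_with_workers(
    job: Callable[[T], Awaitable[R]], values: Iterable[T], workers: int
) -> List[R]:
    """Run job(value) for every value, with at most `workers` running at once."""
    semaphore = asyncio.Semaphore(workers)

    async def guarded(value: T) -> R:
        async with semaphore:  # wait here while `workers` jobs are in flight
            return await job(value)

    # gather returns results in the same order as the inputs
    return list(await asyncio.gather(*(guarded(v) for v in values)))


async def demo() -> Tuple[List[int], int]:
    """Run six fake jobs with two workers and record the peak concurrency."""
    running = 0
    peak = 0

    async def fake_job(value: int) -> int:
        nonlocal running, peak
        running += 1
        peak = max(peak, running)
        await asyncio.sleep(0.01)  # stands in for network or process I/O
        running -= 1
        return value * 2

    results = await run_with_workers(fake_job, range(6), workers=2)
    return results, peak


if __name__ == "__main__":
    print(asyncio.run(demo()))
```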

To get the options available for a specific module, run the following (replace "http" with the module you are interested in):

streamline -s http --help

YAML support

YAML files that define streamers and their options are supported.

Given the following YAML file:

streamers:
  - name: split
  - name: http
    output: code
    input: value
    target: status_code
    options:
      url: 'https://{value}'

It can be used to split a series of domains and get the HTTP status code for each one:

$ echo "www.google.com www.slashdot.org" | streamline -y http_codes.yaml
{"base": "www.google.com", "status_code": 200}
{"base": "www.slashdot.org", "status_code": 200}

You can also specify the generator and consumer in the same way. Given the following YAML and CSV files:

generator:
  name: csv
  options:
    input: domains.csv
consumer:
  name: csv
streamers:
  - name: http
    output: code
    input: domain
    target: status_code
    options:
      url: 'https://{value}'

domain,label
www.google.com,The famous big B
www.slashdot.org,Mosh pit of opinions

You can use them as follows:

$ streamline -y http_codes.yaml 
domain,label,status_code
www.google.com,The famous big B,200
www.slashdot.org,Mosh pit of opinions,200
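
To make the data flow in this CSV example concrete, here is a minimal Python sketch using only the standard `csv` module. It assumes the generator yields one dict per row, the streamer stores its result under the target column, and the consumer writes all columns back out. The HTTP request is replaced by a stub (`fake_status`, hypothetical) so the sketch runs offline; the other function names are hypothetical as well.

```python
import csv
import io
from typing import Callable, Dict, Iterable, Iterator


def read_rows(text: str) -> Iterator[Dict[str, str]]:
    """Generator stage: one dict per CSV row, keyed by the header line."""
    for row in csv.DictReader(io.StringIO(text)):
        yield dict(row)


def add_column(
    rows: Iterable[Dict[str, object]],
    source: str,
    target: str,
    fn: Callable[[str], object],
) -> Iterator[Dict[str, object]]:
    """Streamer stage: compute fn(row[source]) and store it under row[target]."""
    for row in rows:
        updated = dict(row)
        updated[target] = fn(str(row[source]))
        yield updated


def write_rows(rows: Iterable[Dict[str, object]]) -> str:
    """Consumer stage: write the rows back out as CSV text."""
    materialized = list(rows)
    if not materialized:
        return ""
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=list(materialized[0].keys()), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(materialized)
    return out.getvalue()


def fake_status(domain: str) -> int:
    """Stub standing in for an HTTP request; always reports 200."""
    return 200


SOURCE = (
    "domain,label\n"
    "www.google.com,The famous big B\n"
    "www.slashdot.org,Mosh pit of opinions\n"
)

print(write_rows(add_column(read_rows(SOURCE), "domain", "status_code", fake_status)))
```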

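Technical vocabulary

  • entry: A small wrapper around a value passed through the stream. Usually one line of input.
  • generator: An async generator function that takes no input and yields entry objects.
  • executor: An async function that takes a value and returns a new value. It usually completes some unit of work and returns the result of that work as the new value.
  • streamer: An async generator function that takes, as its argument, an async source iterable that yields entry objects. Typically a streamer performs some operation on each entry it takes from the source iterable, then sets a new value on the entry.
  • consumer: An async function that reads all items of an async source iterable. Usually this function writes to some output (such as stdout).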
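
The relationship between these terms can be sketched in a few lines of Python. The `Entry` class and the `lift` helper below are hypothetical illustrations of the definitions above (an entry wrapping a value, an executor mapping a value to a new value, a streamer built around an executor); streamline's real classes may look different.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Any, AsyncIterator, Awaitable, Callable, Iterable, List


@dataclass
class Entry:
    """Small wrapper around a value; earlier values are kept as history."""
    value: Any
    history: List[Any] = field(default_factory=list)

    def set(self, new_value: Any) -> None:
        """Record the current value in history, then replace it."""
        self.history.append(self.value)
        self.value = new_value


Executor = Callable[[Any], Awaitable[Any]]
Streamer = Callable[[AsyncIterator[Entry]], AsyncIterator[Entry]]


def lift(executor: Executor) -> Streamer:
    """Turn an executor (value -> new value) into a streamer over entries."""
    async def streamer(source: AsyncIterator[Entry]) -> AsyncIterator[Entry]:
        async for entry in source:
            entry.set(await executor(entry.value))
            yield entry
    return streamer


async def length(value: str) -> int:
    """Example executor: one unit of work per value."""
    return len(value)


async def entries(values: Iterable[Any]) -> AsyncIterator[Entry]:
    """Example generator: wrap each raw value in an Entry."""
    for value in values:
        yield Entry(value)


async def collect(values: Iterable[Any]) -> List[Entry]:
    """Example consumer: gather every entry produced by the streamer."""
    return [entry async for entry in lift(length)(entries(values))]


if __name__ == "__main__":
    for entry in asyncio.run(collect(["foo", "quux"])):
        print(entry.history, "->", entry.value)
```

Keeping the previous values on the entry is one plausible way to support streamers such as headers and the history:* family listed earlier, which refer back to the original input.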
