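jf Python package - PyPI

Python jsonl query engine

Detailed description of the jf Python project

JF

jf, also known as "jndex fingers" or, more commonly, "json filter pipeline", is a jq clone written in Python. It supports evaluating Python one-liners, which makes it especially appealing to people who are used to working in Python.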

Installation

pip install jf

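How does it work

JF works by running items through a map/filter pipeline. The pipeline is built from a string that represents a comma-separated list of filters and mappers. The query parser assumes that every function in the pipeline reads its items from a generator. This generator is passed as the last non-keyword argument of the function, so "map(conversion)" is interpreted as "map(conversion, input generator)". The result of each function is passed as input to the next function in the pipeline. The pipeline transformation is shown in the following pseudocode: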

def build_pipeline(input, conversions):
    pipeline = input
    for convert in conversions:
        pipeline = convert(pipeline)
    return pipeline

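The pipeline returned by the function above is then iterated and the results are printed for the user. The basic building blocks of a pipeline are:

map(val) = map an item to a new object
filter(cond) = show only the items that match the condition
update(val) = update item values
hide(dict_keys) = hide dict_keys from the output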
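
The four building blocks can be sketched as ordinary Python generator functions. This is a minimal illustration, not jf's actual implementation: the names map_items, filter_items, update_items and hide_keys are hypothetical, and plain lambdas stand in for jf's query expressions. functools.partial plays the role of the query parser here by leaving the input generator as the last positional argument.

```python
from functools import partial


def map_items(fn, items):
    """Map every item to a new object."""
    for item in items:
        yield fn(item)


def filter_items(cond, items):
    """Pass through only the items that match the condition."""
    for item in items:
        if cond(item):
            yield item


def update_items(fn, items):
    """Merge the dict returned by fn into a copy of every item."""
    for item in items:
        yield {**item, **fn(item)}


def hide_keys(keys, items):
    """Drop the given keys from every item."""
    for item in items:
        yield {k: v for k, v in item.items() if k not in keys}


def build_pipeline(source, conversions):
    """Chain the conversions so each one consumes the previous generator."""
    pipeline = source
    for convert in conversions:
        pipeline = convert(pipeline)
    return pipeline


data = [{"id": 1, "secret": "a"}, {"id": 2, "secret": "b"}, {"id": 3, "secret": "c"}]
steps = [
    partial(filter_items, lambda x: x["id"] > 1),
    partial(update_items, lambda x: {"double": x["id"] * 2}),
    partial(hide_keys, {"secret"}),
    partial(map_items, lambda x: (x["id"], x["double"])),
]
result = list(build_pipeline(iter(data), steps))
print(result)  # [(2, 4), (3, 6)]
```

Nothing is computed until the final list() call consumes the pipeline, which is what lets this style work on inputs that do not fit in memory.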

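Some built-in function headers have been remodeled to be more intuitive within the framework. The most notable is the sorted function, which normally takes the key as a keyword argument. The change was made because sorting items by id by writing "sorted(x.id)" seems more natural than "sorted(key=lambda x: x.id)". Similar changes have been made to some other useful functions:

islice(stop) => islice(arr, start=0, stop, step=1)
islice(start, stop, step=1) => islice(arr, start, stop, step)
first(n=1) => islice(arr, n)
last(n=1) => iter(deque(arr, maxlen=n))
i = arr (identity operation)
yield_from(x) => yield items from x
group_by(key) => group items by data key value
chain() => chain(*arr) - combine items into a list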
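
A few of these remodeled helpers can be approximated with the standard library. The sketch below is an assumption about their behaviour based only on the list above: sorted_by is a hypothetical stand-in for jf's sorted (renamed to avoid shadowing the built-in), the key is an explicit lambda rather than a query expression, and the input is passed as the first argument for simplicity.

```python
from collections import deque
from itertools import islice


def sorted_by(key, arr, reverse=False):
    """sorted() with the key function as the first positional argument."""
    return iter(sorted(arr, key=key, reverse=reverse))


def first(arr, n=1):
    """Lazily take the first n items."""
    return islice(arr, n)


def last(arr, n=1):
    """Keep only the last n items; the deque discards older ones as it fills."""
    return iter(deque(arr, maxlen=n))


items = [{"id": 3}, {"id": 1}, {"id": 2}]
ordered = list(sorted_by(lambda x: x["id"], items))
head = list(first(iter(ordered), 2))
tail = list(last(iter(ordered), 1))
print(ordered, head, tail)
```

Note that first() stays lazy, while last() and sorted_by() have to consume the whole input before they can yield anything.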

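For datetime processing, two useful helper functions are imported by default:

date(string) for parsing a string into a Python datetime object
age(string) for calculating the timedelta between now() and date(string)

These are useful for sorting or filtering items based on their timestamps. Some of the functions above have predefined aliases, such as head(), tail(), yield_all(), group() and reduce_list().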
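
The two datetime helpers can be approximated as follows. This is a sketch under stated assumptions, not jf's implementation: it accepts only ISO 8601 strings, treats naive timestamps as UTC, and takes a hypothetical now parameter so the result is reproducible. It does not handle relative expressions such as age("456 days"), which appear in the usage examples.

```python
from datetime import datetime, timedelta, timezone


def date(string):
    """Parse an ISO 8601 string into a timezone-aware datetime."""
    parsed = datetime.fromisoformat(string)
    if parsed.tzinfo is None:
        # Assumption: timestamps without an offset are UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed


def age(string, now=None):
    """Time elapsed between date(string) and now."""
    now = now or datetime.now(timezone.utc)
    return now - date(string)


fixed_now = datetime(2018, 2, 1, 12, 0, tzinfo=timezone.utc)
commit_age = age("2018-01-30T18:35:27+00:00", now=fixed_now)
print(commit_age)                      # 1 day, 17:24:33
print(commit_age < timedelta(days=2))  # True
```

Because age() returns a timedelta, the results compare and sort directly, which is what makes filters like "younger than one day" a one-liner.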

In the abbreviated syntax, '{…}' is interpreted as 'map({…})' and '(…)' is interpreted as 'filter(…)'.
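
The shorthand expansion can be pictured as a small rewriting step applied before the query is parsed. The sketch below is an assumption about how such a step could work, not jf's parser: split_top_level and expand_shorthand are hypothetical names, escaped quotes are not handled, and a stage such as "(a) and (b)" would be wrapped incorrectly.

```python
def split_top_level(query):
    """Split a query on commas that are not nested in brackets or quotes."""
    parts, current, depth, quote = [], [], 0, None
    for ch in query:
        if quote:
            if ch == quote:
                quote = None  # closing quote
        elif ch in "\"'":
            quote = ch  # opening quote
        elif ch in "([{":
            depth += 1
        elif ch in ")]}":
            depth -= 1
        elif ch == "," and depth == 0:
            # A top-level comma ends the current pipeline stage.
            parts.append("".join(current).strip())
            current = []
            continue
        current.append(ch)
    parts.append("".join(current).strip())
    return parts


def expand_shorthand(query):
    """Rewrite '{...}' stages to map({...}) and '(...)' stages to filter(...)."""
    expanded = []
    for part in split_top_level(query):
        if part.startswith("{") and part.endswith("}"):
            part = "map(%s)" % part
        elif part.startswith("(") and part.endswith(")"):
            part = "filter%s" % part
        expanded.append(part)
    return ", ".join(expanded)


query = '{id: x.id, subject: x.fields.subject}, (x.id == "87114792")'
print(expand_shorthand(query))
# map({id: x.id, subject: x.fields.subject}), filter(x.id == "87114792")
```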

Basic usage

Filter selected fields

$ cat samples.jsonl | jf 'map({id: x.id, subject: x.fields.subject})'
{"id": "87086895", "subject": "Swedish children stories"}
{"id": "87114792", "subject": "New Finnish storybooks"}
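
For comparison, the same projection in plain Python looks roughly like this. The two input records are an assumption reconstructed from the query and its output; the real samples.jsonl is not shown here and presumably contains more fields. read_jsonl is a hypothetical helper.

```python
import io
import json


def read_jsonl(stream):
    """Yield one parsed object per non-empty line."""
    for line in stream:
        line = line.strip()
        if line:
            yield json.loads(line)


# Assumed input shape: only the fields used by the query are included.
sample = io.StringIO(
    '{"id": "87086895", "fields": {"subject": "Swedish children stories"}}\n'
    '{"id": "87114792", "fields": {"subject": "New Finnish storybooks"}}\n'
)
selected = [
    {"id": x["id"], "subject": x["fields"]["subject"]} for x in read_jsonl(sample)
]
for item in selected:
    print(json.dumps(item))
```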

Filter selected items

$ cat samples.jsonl | jf 'map({id: x.id, subject: x.fields.subject}),
        filter(x.id == "87114792")'
{"id": "87114792", "subject": "New Finnish storybooks"}

Filter selected items with the shortened syntax

$ cat samples.jsonl | jf '{id: x.id, subject: x.fields.subject},
        (x.id == "87114792")'
{"id": "87114792", "subject": "New Finnish storybooks"}

Filter selected values

$ cat samples.jsonl | jf 'map(x.id)'
"87086895"
"87114792"

Filter items by age (and output yaml)

$ cat samples.jsonl | jf 'map({id: x.id, datetime: x["content-datetime"]}),
        filter(age(x.datetime) > age("456 days")),
        update({age: age(x.datetime)})' --indent=5 --yaml
age: 457 days, 4:07:54.932587
datetime: '2016-10-29 10:55:42+03:00'
id: '87086895'

Sort items by age and print their id, length and age

$ cat samples.jsonl|jf 'update({age: age(x["content-datetime"])}),
        sorted(x.age),
        map(.id, "length: %d" % len(.content), .age)' --indent=3 --yaml
- '14941692'
- 'length: 63'
- 184 days, 0:02:20.421829
- '90332110'
- 'length: 191'
- 215 days, 22:15:46.403613
- '88773908'
- 'length: 80'
- 350 days, 3:11:06.412088
- '14558799'
- 'length: 1228'
- 450 days, 6:30:54.419461

Filter items after a given datetime (test.json is a git commit history):

$ jf 'update({age: age(.commit.author.date)}),
        filter(date(.commit.author.date) > date("2018-01-30T17:00:00Z")),
        sorted(x.age, reverse=True), map(.sha, .age, .commit.author.date)' test.json
[
  "68fe662966c57443ae7bf6939017f8ffa4b182c2",
  "2 days, 9:40:12.137919",
  "2018-01-30T18:35:27Z"
]
[
  "d3211e1141d8b2bf480cbbebd376b57bae9d8bdf",
  "2 days, 9:18:07.134418",
  "2018-01-30T18:57:32Z"
]
[
  "f8ba0ba559e39611bc0b63f236a3e67085fe8b40",
  "2 days, 8:50:09.129790",
  "2018-01-30T19:25:30Z"
]

Import your own modules and hide fields:

$ cat test.json|jf --import_from modules/ --import demomodule --yaml 'update({id: x.sha}),
        demomodule.timestamppipe(),
        hide("sha", "committer", "parents", "html_url", "author", "commit",
             "comments_url"), islice(3,5)'
- Pipemod: was here at 2018-01-31 09:26:12.366465
  id: f5f879dd7303c35fa3712586af1e7df884a5b98b
  url: https://api.github.com/repos/alhoo/jf/commits/f5f879dd7303c35fa3712586af1e7df884a5b98b
- Pipemod: was here at 2018-01-31 09:26:12.368438
  id: b393d09215efc4fc0382dd82ec3f38ae59a287e5
  url: https://api.github.com/repos/alhoo/jf/commits/b393d09215efc4fc0382dd82ec3f38ae59a287e5
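
The source of demomodule is not shown here. Judging only from the "Pipemod" field in the output and from the rule that pipeline functions receive the input generator as their last positional argument, a function like timestamppipe could look like the following hypothetical sketch. The sample ids are shortened placeholders.

```python
from datetime import datetime


def timestamppipe(arr):
    """Add a 'Pipemod' note with the processing time to every item."""
    for item in arr:
        stamped = dict(item)  # copy so the caller's data is left untouched
        stamped["Pipemod"] = "was here at %s" % datetime.now()
        yield stamped


commits = [{"id": "f5f879dd"}, {"id": "b393d092"}]
stamped = list(timestamppipe(iter(commits)))
for item in stamped:
    print(item)
```

Any generator function with this shape can be dropped into a pipeline stage, since it consumes items lazily and yields them onward.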

Read yaml:

$ cat test.yaml | jf --yamli 'update({id: x.sha, age: age(x.commit.author.date)}),
        filter(x.age < age("1 days"))' --indent=2 --yaml
- age: 0 days, 22:45:56.388477
  author:
    avatar_url: https://avatars1.githubusercontent.com/u/8501204?v=4
    events_url: https://api.github.com/users/hyyry/events{/privacy}
    followers_url: https://api.github.com/users/hyyry/followers
    ...

Group duplicates (items whose age falls within the same hour):

$ cat test.json|jf --import_from modules/ --import demomodule 'update({id: x.sha}),
        sorted(.commit.author.date, reverse=True),
        demomodule.DuplicateRemover(int(age(.commit.author.date).total_seconds()/3600),
        group=1).process(lambda x: {"duplicate": x.id}),
        map(list(map(lambda y: {age: age(y.commit.author.date), id: y.id,
                     date: y.commit.author.date, duplicate_of: y["duplicate"],
                     comment: y.commit.message}, x))),
        first(2)'
[
  {
    "comment": "Add support for hiding fields",
    "duplicate_of": null,
    "id": "f8ba0ba559e39611bc0b63f236a3e67085fe8b40",
    "age": "16:19:00.102299",
    "date": "2018-01-30 19:25:30+00:00"
  },
  {
    "comment": "Enhance error handling",
    "duplicate_of": "f8ba0ba559e39611bc0b63f236a3e67085fe8b40",
    "id": "d3211e1141d8b2bf480cbbebd376b57bae9d8bdf",
    "age": "16:46:58.104188",
    "date": "2018-01-30 18:57:32+00:00"
  }
]
[
  {
    "comment": "Reduce verbosity when debugging",
    "duplicate_of": null,
    "id": "f5f879dd7303c35fa3712586af1e7df884a5b98b",
    "age": "19:26:00.106777",
    "date": "2018-01-30 16:18:30+00:00"
  },
  {
    "comment": "Print help if no input is given",
    "duplicate_of": "f5f879dd7303c35fa3712586af1e7df884a5b98b",
    "id": "b393d09215efc4fc0382dd82ec3f38ae59a287e5",
    "age": "19:35:16.108654",
    "date": "2018-01-30 16:09:14+00:00"
  }
]
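
The grouping key in this query is the age truncated to whole hours. The sketch below shows that bucketing idea with itertools.groupby; it is not the implementation of demomodule.DuplicateRemover, whose source is not shown. mark_duplicates and hour_bucket are hypothetical names, and the ids are shortened versions of the ones in the output above.

```python
from datetime import timedelta
from itertools import groupby


def hour_bucket(item):
    """Truncate the item's age to whole hours."""
    return int(item["age"].total_seconds() / 3600)


def mark_duplicates(items, key):
    """Group consecutive items with equal keys; later items point to the first."""
    for _, group in groupby(items, key=key):
        group = list(group)
        first_id = group[0]["id"]
        yield [
            {**item, "duplicate_of": None if i == 0 else first_id}
            for i, item in enumerate(group)
        ]


commits = [
    {"id": "f8ba0ba", "age": timedelta(hours=16, minutes=19)},
    {"id": "d3211e1", "age": timedelta(hours=16, minutes=46, seconds=58)},
    {"id": "f5f879d", "age": timedelta(hours=19, minutes=26)},
    {"id": "b393d09", "age": timedelta(hours=19, minutes=35, seconds=16)},
]
groups = list(mark_duplicates(commits, hour_bucket))
for group in groups:
    print([(c["id"], c["duplicate_of"]) for c in group])
```

groupby only merges adjacent items, so the input has to be sorted by the same key first, which is why the query sorts by commit date before grouping.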

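Use pythonic conditional expressions, string.split() and complex string and date formatting with built-in Python syntax. This can also be combined with regular expressions by importing the re library.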

$ jf --import_from modules/ --import re --import demomodule --input skype.json 'yield_from(x.messages),
        update({from: x.from.split(":")[-1], mid: x.skypeeditedid if x.skypeeditedid else x.clientmessageid}),
        sorted(age(x.composetime), reverse=True),
        demomodule.DuplicateRemover(x.mid, group=1).process(),
        map(last(x)),
        yield_from(x),
        sorted(age(.composetime), reverse=True),
        map("%s %s: %s" % (date(x.composetime).strftime("%d.%m.%Y %H:%M"), x.from, re.sub(r"(<[^>]+>)+", " ", x.content)))' --raw
27.01.2018 11:02 2296ead9324b68aef4bc105c8e90200c@thread.skype:  1518001760666 8:live:matti_3426 8:live:matti_6656 8:hyyrynen.london 8:live:suvi_56 8:jukka.mattinen
27.01.2018 11:12 matti_7626: Required competence: PHP programmer (Mika D, Markus H, Heidi), some JavaScript (e.g. for GUI)
27.01.2018 11:12 matti_7626: Matti: parameters part
27.01.2018 11:15 matti_7626: 1.) Clarify customer requirements - AP: Suvi/Joseph
27.01.2018 11:22 matti_7626: This week - initial installation and setup
27.01.2018 11:22 matti_7626: Next week (pending customer requirements) - system configuration
27.01.2018 11:25 matti_7626: configuration = parameters, configuration files (audio files, from customer, ask Suvi to request today?), add audio files to system (via GUI)
27.01.2018 11:26 matti_7626: Testing = specify how we do testing, for example written test cases by the customer.
27.01.2018 11:28 matti_7626: Need test group (testgroup 1 prob easiest to recognise says Lasse)
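
The final map step of this query combines three standard Python tools: str.split, strftime and re.sub. The sketch below applies them to a single made-up message; the record is an assumption for illustration and format_message is a hypothetical name.

```python
import re
from datetime import datetime

# One or more adjacent tags collapse into a single space.
TAG_RUN = re.compile(r"(<[^>]+>)+")


def format_message(msg):
    """Render one message as 'DD.MM.YYYY HH:MM sender: text'."""
    sender = msg["from"].split(":")[-1]
    when = datetime.fromisoformat(msg["composetime"]).strftime("%d.%m.%Y %H:%M")
    text = TAG_RUN.sub(" ", msg["content"])
    return "%s %s: %s" % (when, sender, text)


message = {
    "from": "8:live:matti_7626",
    "composetime": "2018-01-27T11:12:00",
    "content": "Matti:<b></b>parameters<br/>part",
}
line = format_message(message)
print(line)  # 27.01.2018 11:12 matti_7626: Matti: parameters part
```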

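Features

json, jsonl and yaml files for input and output
bz2 and gzip compressed input for json, jsonl and yaml
csv and xlsx support if pandas and xlrd are installed
markdown table output support
construct generator pipelines with map, hide and filter
access json dicts as classes, using dot notation for attributes
datetime and timedelta comparison
age() for the timedelta between a datetime and the current time
first(N), last(N), islice(start, stop, step)
head and tail aliases for first and last
firstnlast(N) (or headntail(N))
import your own modules for more complex filtering
support for stateful classes for complex interactions between items
drop your filtered data into IPython for manual data exploration
Pandas profiling support for quick data exploration
uses an ordered dict to keep items in order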
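
Dot-notation access to json dicts can be sketched with a small dict subclass. This is an illustration of the idea, not the class jf uses internally; DotDict is a hypothetical name.

```python
class DotDict(dict):
    """dict whose keys can also be read as attributes."""

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails, so dict methods
        # such as .items() keep working.
        try:
            value = self[name]
        except KeyError:
            raise AttributeError(name) from None
        # Wrap nested dicts so chained access like x.fields.subject works.
        return DotDict(value) if isinstance(value, dict) else value


item = DotDict({
    "id": "87086895",
    "fields": {"subject": "Swedish children stories"},
    "content-datetime": "2016-10-29 10:55:42+03:00",
})
subject = item.fields.subject
timestamp = item["content-datetime"]  # not a valid identifier, so use brackets
print(item.id, subject, timestamp)
```

Keys that are not valid Python identifiers still need bracket access, which matches the x["content-datetime"] form used in the usage examples.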

Known bugs

IPython does not start up cleanly when the data is piped in

jf 0.6.4
