Python csv2es package - PyPI

Bulk import a CSV or TSV into Elasticsearch

csv2es project description

https://img.shields.io/pypi/v/csv2es.svg

https://img.shields.io/travis/rholder/csv2es.svg

https://img.shields.io/pypi/dm/csv2es.svg

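The csv2es project is an Apache 2.0 licensed command-line utility, written in Python, that loads a CSV (or TSV) file into an Elasticsearch instance. That's pretty much it. That's all it does. The first row of the file should contain the field names intended for the Elasticsearch documents, otherwise things will get weird. There's a little trick below for adding a header row with sed in case the file is missing one.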

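Features

Minimal command-line interface
Loads CSV or TSV files
Customizable delimiter
Uses the Elasticsearch bulk API
Parallel bulk uploads
Retries on errors with exponential backoff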
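
For readers unfamiliar with the last feature, here is a minimal sketch of retrying with exponential backoff. It is not csv2es's own code: the function name retry_with_backoff, its parameters, and the default delays are all assumptions made for illustration.

```python
import time


def retry_with_backoff(func, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call func() until it succeeds, doubling the wait after each failure.

    Hypothetical helper for illustration; csv2es may implement retries differently.
    """
    for attempt in range(max_attempts):
        try:
            return func()
        except Exception:
            # Give up after the final attempt and surface the original error.
            if attempt == max_attempts - 1:
                raise
            # Wait base_delay, then 2x, then 4x, and so on.
            sleep(base_delay * (2 ** attempt))
```

Passing sleep as a parameter keeps the helper easy to exercise without actually waiting.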

Installation

To install csv2es, simply:

$ pip install csv2es

Usage

Usage: csv2es [OPTIONS]

  Bulk import a delimited file into a target Elasticsearch instance. Common
  delimited files include things like CSV and TSV.

  Load a CSV file:
    csv2es --index-name potatoes --doc-type potato --import-file potatoes.csv

  For a TSV file, note the tab delimiter option
    csv2es --index-name tomatoes --doc-type tomato --import-file tomatoes.tsv --tab

  For a nifty pipe-delimited file (delimiters must be one character):
    csv2es --index-name pipes --doc-type pipe --import-file pipes.psv --delimiter '|'

Options:
  --index-name TEXT          Index name to load data into           [required]
  --doc-type TEXT            The document type (like user_records)  [required]
  --import-file TEXT         File to import (or '-' for stdin)      [required]
  --mapping-file TEXT        JSON mapping file for index
  --delimiter TEXT           The field delimiter to use, defaults to CSV
  --tab                      Assume tab-separated, overrides delimiter
  --host TEXT                The Elasticsearch host (http://127.0.0.1:9200/)
  --docs-per-chunk INTEGER   The documents per chunk to upload (5000)
  --bytes-per-chunk INTEGER  The bytes per chunk to upload (100000)
  --parallel INTEGER         Parallel uploads to send at once, defaults to 1
  --delete-index             Delete existing index if it exists
  --quiet                    Minimize console output
  --version                  Show the version and exit.
  --help                     Show this message and exit.
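
The help text does not say how --docs-per-chunk and --bytes-per-chunk interact. The sketch below shows one plausible reading, assumed here rather than taken from csv2es: a chunk is closed as soon as adding another line would exceed either limit. The function chunk_lines is hypothetical.

```python
def chunk_lines(lines, docs_per_chunk=5000, bytes_per_chunk=100000):
    """Yield lists of lines, closing a chunk when either limit would be exceeded.

    Assumption: both limits apply at once. A single oversized line still
    gets a chunk of its own rather than being dropped.
    """
    chunk, size = [], 0
    for line in lines:
        line_size = len(line.encode("utf-8"))
        too_many_docs = len(chunk) >= docs_per_chunk
        too_many_bytes = size + line_size > bytes_per_chunk
        if chunk and (too_many_docs or too_many_bytes):
            yield chunk
            chunk, size = [], 0
        chunk.append(line)
        size += line_size
    if chunk:
        # Emit whatever is left after the last full chunk.
        yield chunk
```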

Examples

Let's say we've got a potatoes.csv file with a nice header that looks like this:

potato_id,potato_type,description
33,sweet,"kinda oval"
17,regular,bumpy
91,regular,"perfectly round"
18,sweet,delightful
42,fried,crispy
37,"extra special",crispy

Now we can stuff it into Elasticsearch:

csv2es --index-name potatoes --doc-type potato --import-file potatoes.csv
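
To make the role of the header row concrete, here is a rough sketch of how a delimited file can be turned into a bulk API request body: each row becomes a document keyed by the header names, preceded by an index action line. The function rows_to_bulk_body is hypothetical and only approximates what a tool like csv2es has to do. Note that csv.DictReader leaves every value as a string.

```python
import csv
import io
import json


def rows_to_bulk_body(text, index_name, doc_type, delimiter=","):
    """Build a newline-delimited bulk body from delimited text with a header row."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    lines = []
    for row in reader:
        # Action line telling Elasticsearch where the next document goes.
        lines.append(json.dumps({"index": {"_index": index_name, "_type": doc_type}}))
        # The document itself, with field names taken from the header row.
        lines.append(json.dumps(row))
    # The bulk API expects the body to end with a newline.
    return "\n".join(lines) + "\n"
```

Without a header row, the first data row would be consumed as the field names, which is why things get weird.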

But what if it's tomatoes, in a TSV separated by tabs? Well, we can do this:

csv2es --index-name tomatoes --doc-type tomato --import-file tomatoes.tsv --tab

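Advanced Examples

What if we have a super cool pipe-delimited file and want to wipe out the existing "pipes" index every time we load it? This should handle that case: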

csv2es --index-name pipes --delete-index --doc-type pipe --import-file pipes.psv --delimiter '|'

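Elasticsearch is great, but it does something strange to our documents when we try to facet by certain fields. Let's create our own custom mapping file, called potatoes.mapping.json, to specify the fields used in Elasticsearch for these potato documents: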

{
    "dynamic": "true",
    "properties": {
        "potato_id": {"type": "long"},
        "potato_type": {"type": "string", "index": "not_analyzed"},
        "description": {"type": "string", "index": "not_analyzed"}
    }
}

Now, let's load the data using the custom mapping file:

csv2es --index-name potatoes --doc-type potato --mapping-file potatoes.mapping.json --import-file potatoes.csv
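
Before pointing --mapping-file at a hand-written file, it can help to confirm that it parses as strict JSON, since something as small as a trailing comma is rejected by Python's json module. The helper below is a hypothetical pre-flight check, not part of csv2es.

```python
import json


def check_mapping(text):
    """Parse mapping JSON and return the sorted field names it declares.

    Raises json.JSONDecodeError (a ValueError) if the text is not valid JSON.
    """
    mapping = json.loads(text)
    return sorted(mapping.get("properties", {}))
```

For example, check_mapping(open("potatoes.mapping.json").read()) either returns the declared field names or fails with the position of the syntax error.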

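What if my file is missing the header row, and it's super huge because there are so many potatoes in it, and everything is terrible? We can use sed to slap on a nice header, something like this: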

sed -i 1i"potato_id,potato_type,description" potatoes.csv

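As long as free disk space exceeds the size of the file, this should be fine, since sed -i writes a full copy before replacing the original.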
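
Where sed's in-place insert is not available, the same trick can be sketched in Python. This is an illustrative alternative, not something csv2es provides, and the function name prepend_header is made up. It streams the original file into a temporary file in the same directory, so it has the same disk-space requirement.

```python
import os
import shutil
import tempfile


def prepend_header(path, header):
    """Write header plus the original contents to a temp file, then swap it in."""
    directory = os.path.dirname(os.path.abspath(path))
    # Binary mode avoids any re-encoding of the existing data.
    with tempfile.NamedTemporaryFile("wb", dir=directory, delete=False) as tmp:
        tmp.write(header.encode("utf-8") + b"\n")
        with open(path, "rb") as src:
            # Copy in blocks so a huge file never has to fit in memory.
            shutil.copyfileobj(src, tmp)
    # Replace the original only after the copy has been fully written.
    os.replace(tmp.name, path)
```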

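Contribute

Check for open issues or open a fresh issue to start a discussion around a feature idea or a bug.
Fork the repository on GitHub to start making your changes to the master branch (or branch off of it).
Write a test which shows that the bug was fixed or that the feature works as expected.
Send a pull request and bug the maintainer until it gets merged and published. :) Make sure to add yourself to AUTHORS.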

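History

1.0.1 (2015-06-02)

Add option to stream from stdin

1.0.0 (2015-04-23)

Add retry support with exponential backoff for every chunk of a bulk upload
Add parallel bulk uploads via joblib
Stable release

1.0.0.dev3 (2015-04-19)

Switch to Click for handling the executable
Fix --delete-index flag
Add --version option

1.0.0.dev2 (2015-04-19)

Fix import errors

1.0.0.dev1 (2015-04-18)

Fix documentation and PyPI updates

1.0.0.dev0 (2015-04-18)

First dev version now exists
Apache 2.0 license applied
Finalize command-line interface
Clean up setup.py and get the test suite running
Added Travis CI support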
