Databricks client SDK, with a command-line client for the Databricks REST API
pyspark-me
Databricks client SDK for Python, with a command-line interface for the Databricks REST API.
Introduction
The pysparkme package provides a Python SDK for the Databricks REST API:
- dbfs
- workspace
- jobs
- runs
The package also ships with a convenient CLI, which can be very helpful for automation.
Python client SDK for the Databricks REST API
Create a Databricks connection
# Get Databricks workspace connection
dbc = pysparkme.databricks.connect(
    bearer_token='dapixyzabcd09rasdf',
    url='https://westeurope.azuredatabricks.net')
DBFS
Databricks workspace
# List root workspace directory
dbc.workspace.ls('/')
# Check if workspace item exists
dbc.workspace.exists('/explore')
# Check if workspace item is a directory
dbc.workspace.is_directory('/')
# Export notebook in default (SOURCE) format
dbc.workspace.export('/my_notebook')
# Export notebook in HTML format
dbc.workspace.export('/my_notebook', 'HTML')
Databricks command line: dbr-me
You can invoke the Databricks CLI using the convenient shell command dbr-me:
$ dbr-me --help
Or using the Python module:
$ python -m pysparkme.databricks.cli --help
To connect to a Databricks cluster, you can provide the parameters on the command line:
--bearer-token
--url
--cluster-id
Alternatively, you can define environment variables. Command-line parameters take precedence.
export DATABRICKS_URL='https://westeurope.azuredatabricks.net/'
export DATABRICKS_BEARER_TOKEN='dapixyz89u9ufsdfd0'
export DATABRICKS_CLUSTER_ID='1234-456778-abc234'
export DATABRICKS_ORG_ID='87287878293983984'
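The precedence rule above can be sketched in Python. This is a minimal illustration, not the package's actual implementation; the function name resolve_param is hypothetical:

```python
import os

def resolve_param(cli_value, env_name):
    """Return the CLI value when given; otherwise fall back to the
    environment variable, or None if neither is set."""
    if cli_value is not None:
        return cli_value
    return os.environ.get(env_name)

# The environment variable is only used when no CLI value is passed
os.environ['DATABRICKS_URL'] = 'https://westeurope.azuredatabricks.net/'
print(resolve_param(None, 'DATABRICKS_URL'))                    # env fallback
print(resolve_param('https://example.net/', 'DATABRICKS_URL'))  # CLI wins
```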
Workspace
####################
# List workspace
# Default path is root - '/'
dbr-me workspace ls
# auto-add leading '/'
dbr-me workspace ls 'Users'
# Space-indented JSON output with number of spaces
dbr-me workspace --json-indent 4 ls
# Custom indent string
dbr-me workspace ls --json-indent='>'

####################
# Export workspace items
# Export everything in source format using defaults: format=SOURCE, path=/
dbr-me workspace export -o ./.dev/export
# Export everything in DBC format
dbr-me workspace export -f DBC -o ./.dev/export
# When path is a folder, export is recursive
dbr-me workspace export -o ./.dev/export-utils 'Utils'
# Export single item
dbr-me workspace export -o ./.dev/GetML 'Utils/Download MovieLens.py'
DBFS
List DBFS items
# List items on DBFS
dbr-me dbfs ls --json-indent 3 FileStore/movielens
[
   {
      "path": "/FileStore/movielens/ml-latest-small",
      "is_dir": true,
      "file_size": 0,
      "is_file": false,
      "human_size": "0 B"
   }
]
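Because the CLI emits plain JSON, its output composes well with scripts. A small sketch of post-processing the listing with Python's json module (the data below is copied from the example output above):

```python
import json

# Copy of the dbfs ls example output above
listing = json.loads("""
[{"path": "/FileStore/movielens/ml-latest-small", "is_dir": true,
  "file_size": 0, "is_file": false, "human_size": "0 B"}]
""")

# Collect the paths of all directories in the listing
dirs = [item['path'] for item in listing if item['is_dir']]
print(dirs)  # ['/FileStore/movielens/ml-latest-small']
```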
# Download a file and print to STDOUT
dbr-me dbfs get ml-latest-small/movies.csv
# Download recursively entire directory and store locally
dbr-me dbfs get -o ml-local ml-latest-small
Runs
Submit a notebook
Implements: https://docs.databricks.com/dev-tools/api/latest/jobs.html#runs-submit
$ dbr-me runs submit "Utils/Download MovieLens"
{"run_id": 4}
You can retrieve the job information using runs get:
$ dbr-me runs get 4 -i 3
Get run metadata
$ dbr-me runs get -i 3 6
{
   "job_id": 6,
   "run_id": 6,
   "creator_user_name": "your.name@gmail.com",
   "number_in_job": 1,
   "original_attempt_run_id": null,
   "state": {
      "life_cycle_state": "TERMINATED",
      "result_state": "SUCCESS",
      "state_message": ""
   },
   "schedule": null,
   "task": {
      "notebook_task": {
         "notebook_path": "/Utils/Download MovieLens"
      }
   },
   "cluster_spec": {
      "existing_cluster_id": "xxxx-yyyyy-zzzzzz"
   },
   "cluster_instance": {
      "cluster_id": "xxxx-yyyyy-zzzzzz",
      "spark_context_id": "783487348734873873"
   },
   "overriding_parameters": null,
   "start_time": 1592062497162,
   "setup_duration": 0,
   "execution_duration": 11000,
   "cleanup_duration": 0,
   "trigger": null,
   "run_name": "pyspark-me-1592062494",
   "run_page_url": "https://westeurope.azuredatabricks.net/?o=398348734873487#job/6/run/1",
   "run_type": "SUBMIT_RUN"
}
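Since the metadata is JSON, fields such as the result state and the phase durations are easy to pull out in a script. An illustrative sketch using a trimmed-down copy of the metadata shown above:

```python
import json

# Trimmed-down copy of the run metadata shown above
metadata = json.loads("""
{"run_id": 6,
 "state": {"life_cycle_state": "TERMINATED",
           "result_state": "SUCCESS",
           "state_message": ""},
 "setup_duration": 0,
 "execution_duration": 11000,
 "cleanup_duration": 0}
""")

# Total wall time of the run in milliseconds (durations are reported in ms)
total_ms = (metadata['setup_duration']
            + metadata['execution_duration']
            + metadata['cleanup_duration'])
print(metadata['state']['result_state'], total_ms)  # SUCCESS 11000
```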
List runs
$ dbr-me runs ls
To get only the runs of a specific job:
# Get runs of the job with job-id=4
$ dbr-me runs ls 4 -i 3
{
   "runs": [
      {
         "job_id": 4,
         "run_id": 4,
         "creator_user_name": "your.name@gmail.com",
         "number_in_job": 1,
         "original_attempt_run_id": null,
         "state": {
            "life_cycle_state": "PENDING",
            "state_message": ""
         },
         "schedule": null,
         "task": {
            "notebook_task": {
               "notebook_path": "/Utils/Download MovieLens"
            }
         },
         "cluster_spec": {
            "existing_cluster_id": "xxxxx-yyyy-zzzzzzz"
         },
         "cluster_instance": {
            "cluster_id": "xxxxx-yyyy-zzzzzzz"
         },
         "overriding_parameters": null,
         "start_time": 1592058826123,
         "setup_duration": 0,
         "execution_duration": 0,
         "cleanup_duration": 0,
         "trigger": null,
         "run_name": "pyspark-me-1592058823",
         "run_page_url": "https://westeurope.azuredatabricks.net/?o=abcdefghasdf#job/4/run/1",
         "run_type": "SUBMIT_RUN"
      }
   ],
   "has_more": false
}
Export run
Implements: Databricks REST runs/export
$ dbr-me runs export --content-only 4 > .dev/run-view.html
Get run output
Implements: Databricks REST runs/get-output
{
   "notebook_output": {
      "result": "Downloaded files: README.txt, links.csv, movies.csv, ratings.csv, tags.csv",
      "truncated": false
   },
   "error": null,
   "metadata": {
      "job_id": 5,
      "run_id": 5,
      "creator_user_name": "your.name@gmail.com",
      "number_in_job": 1,
      "original_attempt_run_id": null,
      "state": {
         "life_cycle_state": "TERMINATED",
         "result_state": "SUCCESS",
         "state_message": ""
      },
      "schedule": null,
      "task": {
         "notebook_task": {
            "notebook_path": "/Utils/Download MovieLens"
         }
      },
      "cluster_spec": {
         "existing_cluster_id": "xxxx-yyyyy-zzzzzzz"
      },
      "cluster_instance": {
         "cluster_id": "xxxx-yyyyy-zzzzzzz",
         "spark_context_id": "8973498743973498"
      },
      "overriding_parameters": null,
      "start_time": 1592062147101,
      "setup_duration": 1000,
      "execution_duration": 11000,
      "cleanup_duration": 0,
      "trigger": null,
      "run_name": "pyspark-me-1592062135",
      "run_page_url": "https://westeurope.azuredatabricks.net/?o=89798374987987#job/5/run/1",
      "run_type": "SUBMIT_RUN"
   }
}
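A submitted run's output only becomes available once the run has finished, so automation typically polls runs get until life_cycle_state reaches a terminal value before calling runs get-output. Per the Databricks Jobs API, TERMINATED, SKIPPED and INTERNAL_ERROR are the terminal life-cycle states; the helper below is an illustrative sketch, not part of the package:

```python
# Life-cycle states that the Databricks Jobs API reports as terminal
TERMINAL_STATES = {'TERMINATED', 'SKIPPED', 'INTERNAL_ERROR'}

def run_is_finished(run_metadata):
    """True once the run's life_cycle_state is terminal."""
    return run_metadata['state']['life_cycle_state'] in TERMINAL_STATES

print(run_is_finished({'state': {'life_cycle_state': 'TERMINATED'}}))  # True
print(run_is_finished({'state': {'life_cycle_state': 'PENDING'}}))     # False
```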
To get only the exit output:
$ dbr-me runs get-output -r 6
Downloaded files: README.txt, links.csv, movies.csv, ratings.csv, tags.csv
Build and publish
python setup.py sdist bdist_wheel
python -m twine upload dist/*