Ease-of-use utility tools for Databricks notebooks.

Detailed description of the databricks-utils Python project


databricks-utils


databricks-utils is a Python package that provides several utility classes and functions that improve ease of use in Databricks notebooks.

Installation

pip install databricks-utils

Features

Documentation

API documentation can be found at https://e2fyi.github.io/databricks-utils/.

Quick start

s3bucket

import json
from databricks_utils.aws import S3Bucket

# need to attach notebook's dbutils
# before S3Bucket can be used
S3Bucket.attach_dbutils(dbutils)

# create an instance of the s3 bucket
bucket = (S3Bucket("somebucketname", "SOMEACCESSKEY", "SOMESECRETKEY")
          .allow_spark(sc)  # local spark context
          .mount("somebucketname"))  # mount location name (resolves as `/mnt/somebucketname`)

# show list of files/folders in the bucket "resource" folder
bucket.ls("resource/")

# read in a json file from the bucket
# (the "r" mode belongs to open(), not to bucket.local())
with open(bucket.local("resource/somefile.json"), "r") as f:
    data = json.load(f)

# read from parquet via spark
dataframe = spark.read.parquet(bucket.s3("resource/somedf.parquet"))

# umount
bucket.umount()

vega
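Vega and Vega-Lite are high-level grammars for interactive graphics. They provide a concise JSON syntax for rapidly generating visualizations to support analysis.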

from databricks_utils.vega import vega_embed

# vega-lite spec for a bar chart
spec = {
    "data": {
        "values": [
            {"a": "A", "b": 28}, {"a": "B", "b": 55}, {"a": "C", "b": 43},
            {"a": "D", "b": 91}, {"a": "E", "b": 81}, {"a": "F", "b": 53},
            {"a": "G", "b": 19}, {"a": "H", "b": 87}, {"a": "I", "b": 52}
        ]
    },
    "mark": "bar",
    "encoding": {
        "x": {"field": "a", "type": "ordinal"},
        "y": {"field": "b", "type": "quantitative"}
    }
}

# plot out the vega chart in databricks notebook
displayHTML(vega_embed(spec=spec))
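
When the data comes from a query result rather than a hand-written literal, the same bar chart spec can be assembled programmatically. The following is a minimal sketch in plain Python. The helper name `bar_chart_spec` is hypothetical and not part of databricks-utils; it only builds the dictionary, and it assumes the x field is ordinal and the y field is quantitative, as in the example above.

```python
def bar_chart_spec(records, x_field, y_field):
    """Build a minimal Vega-Lite bar chart spec from a list of dicts.

    Hypothetical helper for illustration; not part of databricks-utils.
    """
    # Fail early if any record lacks one of the encoded fields.
    missing = [r for r in records if x_field not in r or y_field not in r]
    if missing:
        raise ValueError(
            f"{len(missing)} record(s) lack {x_field!r} or {y_field!r}"
        )
    return {
        "data": {"values": list(records)},
        "mark": "bar",
        "encoding": {
            "x": {"field": x_field, "type": "ordinal"},
            "y": {"field": y_field, "type": "quantitative"},
        },
    }


# Sample rows taken from the spec above.
records = [{"a": "A", "b": 28}, {"a": "B", "b": 55}, {"a": "C", "b": 43}]
spec = bar_chart_spec(records, "a", "b")
```

In a Databricks notebook the resulting dictionary can then be rendered the same way as the literal spec, with displayHTML(vega_embed(spec=spec)).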

Developer

# add a version to git tag and publish to pypi
. add_tag.sh <VERSION>
