openpdi: Python package on PyPI

A library for working with data submitted to the PDI.

Detailed description of the openpdi Python project

OpenPDI

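OpenPDI is an unofficial effort to document and standardize data submitted to the Police Data Initiative (PDI). The goal is to make the data more accessible by addressing several problems related to standardization, namely: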

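File types: Some agencies use the Socrata Open Data API, while many others provide their data as raw .csv, .xlsx, or .xls files with differing structures.
Column names: Many columns that represent the same data (for example, the race of the officer) have different names across departments, cities, and states.
Value formats: Comparable values are submitted in many different formats.
Column availability: It is currently difficult to identify the data sources that contain certain columns, for example Use of Force data that indicates the hire date of the officer involved.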
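To make the column-name problem concrete, here is a minimal sketch of mapping differently spelled headers onto one canonical name. It is not OpenPDI's implementation: the `ALIASES` table, the header spellings in it, and the `canonical_name` helper are all hypothetical and chosen only for illustration.

```python
# Hypothetical alias table: canonical column name -> header spellings that
# different agencies might use. These spellings are illustrative only and
# are not taken from real PDI submissions.
ALIASES = {
    "officer_race": {"officer race", "race of officer", "ofc_race"},
    "reason": {"reason for force", "uof reason"},
}


def canonical_name(header, aliases=ALIASES):
    """Return the canonical column name for a raw header, or None if unknown."""
    key = header.strip().lower()
    for canonical, spellings in aliases.items():
        if key == canonical or key in spellings:
            return canonical
    return None


print(canonical_name("  Race of Officer "))
print(canonical_name("Hire Date"))
```

A lookup table like this keeps the per-agency differences in data rather than in code, which is the same idea behind the JSON descriptions discussed later in this document.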

Getting Started

Installation

$ pip install openpdi

Usage

Dataset	ID	Source
Use of Force	uof	https://www.policedatainitiative.org/datasets/use-of-force/

import csv

import openpdi

# The library has a single entry point:
dataset = openpdi.Dataset(
    # The dataset ID (see the table above).
    "uof",
    # Limit the data sources to a specific state using its two-letter code.
    #
    # Default: `scope=[]`.
    scope=["TX"],
    # A list of columns that must be provided in every data source included in
    # this dataset. See `openpdi/meta/{ID}/schema.json` for the available
    # columns.
    #
    # Default: `columns=[]`.
    columns=["reason"],
    # If `True`, only return the user-specified columns -- i.e., those listed
    # in the `columns` parameter.
    #
    # Default: `strict=False`.
    strict=False)

# The names of the agencies included in this dataset:
print(dataset.agencies)

# The URLs of the external data sources included in this dataset:
print(dataset.sources)

# `gen` is a generator object for iterating over the CSV-formatted dataset.
gen = dataset.download()

# Write to a CSV file:
with open("dataset.csv", "w+") as f:
    writer = csv.writer(f, delimiter=",", quoting=csv.QUOTE_ALL)
    writer.writerows(gen)

Datasets

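To avoid unnecessary bloat (in terms of GBs), we don't actually store any PDI data in this repository. Instead, we store small, JSON-formatted descriptions of externally hosted datasets, for example: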

[
  {
    "url": "https://www.norwichct.org/Archive.aspx?AMID=61&Type=Recent",
    "type": "csv",
    "start": 1,
    "columns": {
      "date": {"index": 0, "specifier": "%m/%d/%Y"},
      "city": {"raw": "Richmond"},
      "state": {"raw": "CA"},
      "service_type": {"index": 1},
      "force_type": {"index": 10},
      "light_conditions": {"index": 8},
      "weather_conditions": {"index": 7},
      "reason": {"index": 2},
      "officer_injured": {"index": 6},
      "officer_race": {"index": 9},
      "subject_injured": {"index": 5},
      "aggravating_factors": {"index": 3},
      "arrested": {"index": 4}
    }
  }
]

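This file describes a Use of Force (uof) dataset from Richmond, California. Each entry in the array maps columns of the externally hosted data to columns in the dataset's schema file.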
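As a rough illustration of how such a description could be applied, the sketch below turns raw CSV rows into records keyed by schema column names. It is not OpenPDI's code, and it rests on assumptions read off the example: that `start` is the index of the first data row, that `raw` supplies a constant value, that `index` is a zero-based position in the row, and that `specifier` is a `strptime` format. The `apply_descriptor` helper, the sample CSV text, and the choice to emit ISO 8601 dates are hypothetical.

```python
import csv
import io
from datetime import datetime


def apply_descriptor(descriptor, rows):
    """Yield one dict per data row, keyed by schema column name."""
    # Assumption: "start" is the index of the first data row (skips headers).
    for row in rows[descriptor.get("start", 0):]:
        record = {}
        for name, spec in descriptor["columns"].items():
            if "raw" in spec:
                # Assumption: "raw" is a constant applied to every row.
                record[name] = spec["raw"]
                continue
            value = row[spec["index"]]
            if "specifier" in spec:
                # Assumption: "specifier" is a strptime format string.
                value = datetime.strptime(value, spec["specifier"]).date().isoformat()
            record[name] = value
        yield record


# A trimmed-down descriptor in the same shape as the example above.
descriptor = {
    "type": "csv",
    "start": 1,
    "columns": {
        "date": {"index": 0, "specifier": "%m/%d/%Y"},
        "city": {"raw": "Richmond"},
        "reason": {"index": 2},
    },
}

# Made-up sample data standing in for the externally hosted file.
sample = "Date,Service,Reason\n01/31/2017,Patrol,Resisting\n"
rows = list(csv.reader(io.StringIO(sample)))
records = list(apply_descriptor(descriptor, rows))
print(records)
```

Keeping the mapping in a declarative file means a new data source can be added by writing a description rather than new parsing code.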

flow

The schema.json file assigns a format to every possible column in a given dataset. A format is a Python function responsible for standardizing raw column values.
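The idea of a format as a standardizing function can be sketched as follows. Everything here is hypothetical: the format names `boolean` and `text`, the `SCHEMA` mapping, and the helper functions are invented for illustration and do not reflect the actual contents of schema.json or OpenPDI's format functions.

```python
def to_boolean(raw):
    """Standardize yes/no style values to "true", "false", or "" if unknown."""
    text = raw.strip().lower()
    if text in {"yes", "y", "true", "1"}:
        return "true"
    if text in {"no", "n", "false", "0"}:
        return "false"
    return ""


def to_text(raw):
    """Collapse runs of whitespace and upper-case free-text values."""
    return " ".join(raw.split()).upper()


# Hypothetical registry of format names and a hypothetical schema that
# assigns one format to each column.
FORMATS = {"boolean": to_boolean, "text": to_text}
SCHEMA = {"officer_injured": "boolean", "reason": "text"}


def standardize(record, schema=SCHEMA, formats=FORMATS):
    """Apply each column's assigned format function to its raw value."""
    return {col: formats[schema[col]](value) for col, value in record.items()}


clean = standardize({"officer_injured": " Yes ", "reason": "resisting   arrest"})
print(clean)
```

Routing every value through a named format keeps the standardization rules in one place, so two sources that spell the same value differently end up with identical output.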

openpdi 0.1.3
