Detailed description of the dataplay Python project
Data Processing Guide
The one stop shop to learn about data intake, processing, and visualization.
This Dataplay handbook uses the techniques introduced in the Datalabs guidebook.
Install
The code is on PyPI, so you can run:
pip install dataplay geopandas dexplot
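from a terminal to install the code and its dependencies.
How to use
Import the installed modules into your code and call their functions, for example: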
from dataplay.merge import mergeDatasets
mergeDatasets(left_ds=False, right_ds=False, crosswalk_ds=False, use_crosswalk=True, left_col=False, right_col=False, crosswalk_left_col=False, crosswalk_right_col=False, merge_how=False, interactive=True)
Here is an example:
Define our download parameters.
More information on these parameters can be found in the tutorials!
tract = '*'
county = '510'
state = '24'
tableId = 'B19001'
year = '17'
saveAcs = False
df = retrieve_acs_data(state, county, tract, tableId, year, saveAcs)
df.head()
Number of Columns 17
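The state, county, and tract values above are FIPS codes. A census tract GEOID is the 2-digit state code, the 3-digit county code, and the 6-digit tract code concatenated, which is why the crosswalk used later maps 6-digit tract numbers to 11-digit GEOIDs. A minimal sketch of that composition follows; `build_geoid` is a hypothetical helper written for illustration and is not part of dataplay.

```python
def build_geoid(state: str, county: str, tract: str) -> str:
    """Concatenate FIPS codes into an 11-digit census tract GEOID.

    Hypothetical helper for illustration only; not a dataplay function.
    Each part is zero-padded to its fixed width: state 2, county 3, tract 6.
    """
    return f"{int(state):02d}{int(county):03d}{int(tract):06d}"


# Baltimore City (state 24, county 510), Census Tract 1901 (tract code 190100)
geoid = build_geoid('24', '510', '190100')
print(geoid)  # 24510190100
```

Keeping GEOIDs as strings rather than integers preserves leading zeros for states whose FIPS code starts with 0.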
# Primary Table
left_ds = df
left_col = 'tract'
# Crosswalk Table
# Table: Crosswalk Census Communities
# 'TRACT2010', 'GEOID2010', 'CSA2010'
crosswalk_ds = 'https://docs.google.com/spreadsheets/d/e/2PACX-1vREwwa_s8Ix39OYGnnS_wA8flOoEkU7reIV4o3ZhlwYhLXhpNEvnOia_uHUDBvnFptkLLHHlaQNvsQE/pub?output=csv'
use_crosswalk = True
crosswalk_left_col = 'TRACT2010'
crosswalk_right_col = 'GEOID2010'
# Secondary Table
# Table: Baltimore Boundaries
# 'TRACTCE10', 'GEOID10', 'CSA', 'NAME10', 'Tract', 'geometry'
right_ds = 'https://docs.google.com/spreadsheets/d/e/2PACX-1vQ8xXdUaT17jkdK0MWTJpg3GOy6jMWeaXTlguXNjCSb8Vr_FanSZQRaTU-m811fQz4kyMFK5wcahMNY/pub?gid=886223646&single=true&output=csv'
right_col = 'GEOID10'
interactive = True
merge_how = 'outer'
banksPd = mergeDatasets( left_ds=left_ds, left_col=left_col,
use_crosswalk=use_crosswalk, crosswalk_ds=crosswalk_ds,
crosswalk_left_col = crosswalk_left_col, crosswalk_right_col = crosswalk_right_col,
right_ds=right_ds, right_col=right_col,
merge_how=merge_how, interactive = interactive )
Handling Left Dataset
retrieveDatasetFromUrl B19001_001E_Total \
NAME
Census Tract 1901 796
Census Tract 1902 695
Census Tract 2201 2208
Census Tract 2303 632
Census Tract 2502.07 836
... ...
Census Tract 2720.05 1219
Census Tract 1202.01 883
Census Tract 2720.04 1835
Census Tract 2720.06 1679
Baltimore City 239791
B19001_002E_Total_Less_than_$10_000 \
NAME
Census Tract 1901 237
Census Tract 1902 63
Census Tract 2201 137
Census Tract 2303 3
Census Tract 2502.07 102
... ...
Census Tract 2720.05 84
Census Tract 1202.01 78
Census Tract 2720.04 155
Census Tract 2720.06 347
Baltimore City 29106
B19001_003E_Total_$10_000_to_$14_999 \
NAME
Census Tract 1901 76
Census Tract 1902 87
Census Tract 2201 229
Census Tract 2303 20
Census Tract 2502.07 28
... ...
Census Tract 2720.05 41
Census Tract 1202.01 27
Census Tract 2720.04 109
Census Tract 2720.06 165
Baltimore City 15759
... \
NAME ...
Census Tract 1901 ...
Census Tract 1902 ...
Census Tract 2201 ...
Census Tract 2303 ...
Census Tract 2502.07 ...
... ...
Census Tract 2720.05 ...
Census Tract 1202.01 ...
Census Tract 2720.04 ...
Census Tract 2720.06 ...
Baltimore City ...
state \
NAME
Census Tract 1901 24
Census Tract 1902 24
Census Tract 2201 24
Census Tract 2303 24
Census Tract 2502.07 24
... ...
Census Tract 2720.05 24
Census Tract 1202.01 24
Census Tract 2720.04 24
Census Tract 2720.06 24
Baltimore City 24
county \
NAME
Census Tract 1901 510
Census Tract 1902 510
Census Tract 2201 510
Census Tract 2303 510
Census Tract 2502.07 510
... ...
Census Tract 2720.05 510
Census Tract 1202.01 510
Census Tract 2720.04 510
Census Tract 2720.06 510
Baltimore City 510
tract
NAME
Census Tract 1901 190100
Census Tract 1902 190200
Census Tract 2201 220100
Census Tract 2303 230300
Census Tract 2502.07 250207
... ...
Census Tract 2720.05 272005
Census Tract 1202.01 120201
Census Tract 2720.04 272004
Census Tract 2720.06 272006
Baltimore City 10000
[201 rows x 20 columns]
checkDataSetExists True
checkDataSetExists True
checkDataSetExists True
Left Dataset and Columns are Valid
Handling Right Dataset
retrieveDatasetFromUrl https://docs.google.com/spreadsheets/d/e/2PACX-1vQ8xXdUaT17jkdK0MWTJpg3GOy6jMWeaXTlguXNjCSb8Vr_FanSZQRaTU-m811fQz4kyMFK5wcahMNY/pub?gid=886223646&single=true&output=csv
checkDataSetExists False
retrieveDatasetFromUrl https://docs.google.com/spreadsheets/d/e/2PACX-1vQ8xXdUaT17jkdK0MWTJpg3GOy6jMWeaXTlguXNjCSb8Vr_FanSZQRaTU-m811fQz4kyMFK5wcahMNY/pub?gid=886223646&single=true&output=csv
checkDataSetExists True
checkDataSetExists True
checkDataSetExists True
Right Dataset and Columns are Valid
Checking the merge_how Parameter
merge_how operator is Valid outer
checkDataSetExists False
Checking the Crosswalk Parameter
Handling Crosswalk Left Dataset Loading
retrieveDatasetFromUrl https://docs.google.com/spreadsheets/d/e/2PACX-1vREwwa_s8Ix39OYGnnS_wA8flOoEkU7reIV4o3ZhlwYhLXhpNEvnOia_uHUDBvnFptkLLHHlaQNvsQE/pub?output=csv
checkDataSetExists False
retrieveDatasetFromUrl https://docs.google.com/spreadsheets/d/e/2PACX-1vREwwa_s8Ix39OYGnnS_wA8flOoEkU7reIV4o3ZhlwYhLXhpNEvnOia_uHUDBvnFptkLLHHlaQNvsQE/pub?output=csv
checkDataSetExists True
checkDataSetExists True
checkDataSetExists True
Handling Crosswalk Right Dataset Loading
retrieveDatasetFromUrl https://docs.google.com/spreadsheets/d/e/2PACX-1vREwwa_s8Ix39OYGnnS_wA8flOoEkU7reIV4o3ZhlwYhLXhpNEvnOia_uHUDBvnFptkLLHHlaQNvsQE/pub?output=csv
checkDataSetExists False
retrieveDatasetFromUrl https://docs.google.com/spreadsheets/d/e/2PACX-1vREwwa_s8Ix39OYGnnS_wA8flOoEkU7reIV4o3ZhlwYhLXhpNEvnOia_uHUDBvnFptkLLHHlaQNvsQE/pub?output=csv
checkDataSetExists True
checkDataSetExists True
checkDataSetExists True
Assessment Completed
Ensuring Left->Crosswalk compatability
Ensuring Crosswalk->Right compatability
PERFORMING MERGE LEFT->CROSSWALK
left_on TRACT2010 right_on GEOID2010 how outer
PERFORMING MERGE LEFT->RIGHT
left_col GEOID2010 right_col GEOID10 how outer
Local Column Values Not Matched
[0]
1
Crosswalk Unique Column Values
[24510151000 24510080700 24510080500 24510150500 24510120100 24510090900
24510280301 24510130803 24510130700 24510130600 24510100100 24510110100
24510270501 24510270302 24510270401 24510120700 24510271200 24510110200
24510271002 24510280404 24510270804 24510260203 24510260101 24510260102
24510090800 24510090300 24510270801 24510120400 24510090200 24510271001
24510130200 24510140100 24510270600 24510270701 24510130100 24510270803
24510280200 24510280302 24510130804 24510271101 24510271102 24510150800
24510270301 24510170100 24510090500 24510170200 24510090600 24510120300
24510120500 24510130300 24510120600 24510100200 24510150400 24510261000
24510280403 24510010400 24510250303 24510260303 24510200701 24510272003
24510070200 24510280102 24510151200 24510260900 24510200400 24510261100
24510200500 24510250103 24510260301 24510200600 24510130806 24510270702
24510180200 24510190100 24510270805 24510200200 24510150702 24510270402
24510250206 24510150701 24510151100 24510040100 24510270101 24510270200
24510190200 24510271501 24510210100 24510180300 24510180100 24510150100
24510200300 24510200100 24510090700 24510190300 24510090400 24510200702
24510250500 24510280401 24510160801 24510160802 24510270703 24510220100
24510250301 24510270502 24510030100 24510020200 24510250600 24510240200
24510150900 24510020300 24510270102 24510250207 24510030200 24510250101
24510280402 24510080102 24510040200 24510200800 24510270903 24510060200
24510260800 24510160400 24510280101 24510250401 24510240400 24510250102
24510250205 24510240300 24510271802 24510060100 24510010300 24510010200
24510270902 24510010100 24510270901 24510270802 24510260605 24510250402
24510271801 24510260201 24510260401 24510271300 24510230100 24510080101
24510060300 24510140200 24510160100 24510160200 24510260404 24510150300
24510150200 24510160700 24510260202 24510271400 24510130805 24510140300
24510170300 24510080302 24510100300 24510260501 24510160300 24510130400
24510160600 24510271600 24510271700 24510151300 24510210200 24510271503
24510060400 24510250204 24510070400 24510230200 24510240100 24510020100
24510260604 24510120202 24510272007 24510272005 24510230300 24510260302
24510080200 24510080301 24510010500 24510070100 24510250203 24510070300
24510080600 24510271900 24510080400 24510120201 24510272004 24510272006
24510280500 24510260403 24510150600 24510080800 24510160500 24510090100
24510260402 24510260700]
/usr/local/lib/python3.6/dist-packages/pandas/core/ops/array_ops.py:253: FutureWarning: elementwise comparison failed; returning scalar instead, but in the future will perform elementwise comparison
res_values = method(rvalues)
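The log above reports two merges in sequence: left table to crosswalk, then the result to the right table, both with `how='outer'`. The sketch below reproduces that two-step pattern in plain pandas on a toy subset of the values shown in this example. It is an approximation for illustration, not dataplay's actual implementation; the `households` column name and the way unmatched keys are collected are assumptions.

```python
import pandas as pd

# Toy stand-ins for the three tables (keys kept as strings).
left = pd.DataFrame({'tract': ['190100', '190200', '10000'],
                     'households': [796, 695, 239791]})
crosswalk = pd.DataFrame({'TRACT2010': ['190100', '190200'],
                          'GEOID2010': ['24510190100', '24510190200']})
right = pd.DataFrame({'GEOID10': ['24510190100', '24510190200'],
                      'CSA': ['Southwest Baltimore', 'Southwest Baltimore']})

# Step 1: left -> crosswalk. indicator=True adds a '_merge' column that
# records whether each row came from the left table, the right, or both.
step1 = left.merge(crosswalk, left_on='tract', right_on='TRACT2010',
                   how='outer', indicator=True)

# Left-table keys that found no partner in the crosswalk.
unmatched = step1.loc[step1['_merge'] == 'left_only', 'tract'].tolist()
print(unmatched)  # ['10000']

# Drop the helper columns before the second join.
step1 = step1.drop(columns=['TRACT2010', '_merge'])

# Step 2: (left + crosswalk) -> right, joined on the GEOID.
merged = step1.merge(right, left_on='GEOID2010', right_on='GEOID10',
                     how='outer')
print(merged)
```

With an outer join the unmatched city-wide row is kept with empty values in the crosswalk and right-table columns; an inner join would drop it instead.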
banksPd.head()
type(banksPd)
pandas.core.frame.DataFrame
from dataplay.geoms import readInGeometryData
csaMap = readInGeometryData(url=banksPd, porg='g', geom='geometry', lat=False, lng=False, revgeocode=False, save=False, in_crs=2248, out_crs=2248)
isGeoDataframe
RECIEVED url: B19001_001E_Total \
0 796
1 695
2 2208
3 632
4 836
.. ...
195 1848
196 1219
197 883
198 1835
199 1679
B19001_002E_Total_Less_than_$10_000 \
0 237
1 63
2 137
3 3
4 102
.. ...
195 153
196 84
197 78
198 155
199 347
B19001_003E_Total_$10_000_to_$14_999 \
0 76
1 87
2 229
3 20
4 28
.. ...
195 68
196 41
197 27
198 109
199 165
... \
0 ...
1 ...
2 ...
3 ...
4 ...
.. ...
195 ...
196 ...
197 ...
198 ...
199 ...
CSA \
0 Southwest Baltimore
1 Southwest Baltimore
2 Inner Harbor/Fed...
3 South Baltimore
4 Cherry Hill
.. ...
195 Glen-Fallstaff
196 Cross-Country/Ch...
197 Greater Charles ...
198 Cross-Country/Ch...
199 Glen-Fallstaff
Tract \
0 1901.0
1 1902.0
2 2201.0
3 2303.0
4 2502.0
.. ...
195 2720.0
196 2720.0
197 1202.0
198 2720.0
199 2720.0
geometry
0 POLYGON ((-76.63...
1 POLYGON ((-76.63...
2 MULTIPOLYGON (((...
3 MULTIPOLYGON (((...
4 POLYGON ((-76.62...
.. ...
195 POLYGON ((-76.69...
196 POLYGON ((-76.69...
197 POLYGON ((-76.60...
198 POLYGON ((-76.69...
199 POLYGON ((-76.68...
[200 rows x 27 columns],
porg: g,
geom: geometry,
lat: False,
lng: False,
revgeocode: False,
in_crs: 2248,
out_crs: 2248
Index(['B19001_001E_Total',
'B19001_002E_Total_Less_than_$10_000',
'B19001_003E_Total_$10_000_to_$14_999',
'B19001_004E_Total_$15_000_to_$19_999',
'B19001_005E_Total_$20_000_to_$24_999',
'B19001_006E_Total_$25_000_to_$29_999',
'B19001_007E_Total_$30_000_to_$34_999',
'B19001_008E_Total_$35_000_to_$39_999',
'B19001_009E_Total_$40_000_to_$44_999',
'B19001_010E_Total_$45_000_to_$49_999',
'B19001_011E_Total_$50_000_to_$59_999',
'B19001_012E_Total_$60_000_to_$74_999',
'B19001_013E_Total_$75_000_to_$99_999',
'B19001_014E_Total_$100_000_to_$124_999',
'B19001_015E_Total_$125_000_to_$149_999',
'B19001_016E_Total_$150_000_to_$199_999',
'B19001_017E_Total_$200_000_or_more',
'state',
'county',
'tract',
'GEOID2010',
'TRACTCE10',
'GEOID10',
'NAME10',
'CSA',
'Tract',
'geometry'],
dtype='object')
csaMap.columns
Index(['B19001_001E_Total',
'B19001_002E_Total_Less_than_$10_000',
'B19001_003E_Total_$10_000_to_$14_999',
'B19001_004E_Total_$15_000_to_$19_999',
'B19001_005E_Total_$20_000_to_$24_999',
'B19001_006E_Total_$25_000_to_$29_999',
'B19001_007E_Total_$30_000_to_$34_999',
'B19001_008E_Total_$35_000_to_$39_999',
'B19001_009E_Total_$40_000_to_$44_999',
'B19001_010E_Total_$45_000_to_$49_999',
'B19001_011E_Total_$50_000_to_$59_999',
'B19001_012E_Total_$60_000_to_$74_999',
'B19001_013E_Total_$75_000_to_$99_999',
'B19001_014E_Total_$100_000_to_$124_999',
'B19001_015E_Total_$125_000_to_$149_999',
'B19001_016E_Total_$150_000_to_$199_999',
'B19001_017E_Total_$200_000_or_more',
'state',
'county',
'tract',
'GEOID2010',
'TRACTCE10',
'GEOID10',
'NAME10',
'CSA',
'Tract',
'geometry'],
dtype='object')
csaMap.plot(column='B19001_002E_Total_Less_than_$10_000')
<matplotlib.axes._subplots.AxesSubplot at 0x7f277d7b0630>
foodPantryLocationsUrl = 'https://docs.google.com/spreadsheets/d/e/2PACX-1vT3lG0n542sIGE2O-C8fiXx-qUZG2WDO6ezRGcNsS4z8MM30XocVZ90P1UQOIXO2w/pub?gid=1152681223&single=true&output=csv'
crs = {'init' :'epsg:2248'}
foodPantryLocations = readInGeometryData(url=foodPantryLocationsUrl, porg='p', geom=False, lat='Y', lng='X', revgeocode=False, save=False, in_crs=crs, out_crs=crs)
panp = workWithGeometryData( 'pandp', foodPantryLocations[ foodPantryLocations.City_1 == 'Baltimore' ], csaMap, pntsClr='red', polyColorCol='B19001_002E_Total_Less_than_$10_000')
RECIEVED url: https://docs.google.com/spreadsheets/d/e/2PACX-1vT3lG0n542sIGE2O-C8fiXx-qUZG2WDO6ezRGcNsS4z8MM30XocVZ90P1UQOIXO2w/pub?gid=1152681223&single=true&output=csv,
porg: p,
geom: False,
lat: Y,
lng: X,
revgeocode: False,
in_crs: {'init': 'epsg:2248'},
out_crs: {'init': 'epsg:2248'}
Index(['X',
'Y',
'OBJECTID',
'Name',
'Address',
'City_1',
'State',
'Zip',
'# in Zip',
'FIPS'],
dtype='object')
mapPointsandPolygons
/usr/local/lib/python3.6/dist-packages/pyproj/crs/crs.py:53: FutureWarning: '+init=<authority>:<code>' syntax is deprecated. '<authority>:<code>' is the preferred initialization method. When making the change, be mindful of axis order changes: https://pyproj4.github.io/pyproj/stable/gotchas.html#axis-order-changes-in-proj-6
return _prepare_from_string(" ".join(pjargs))
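The pyproj FutureWarning above says the `{'init': 'epsg:2248'}` form is deprecated in favor of a plain `'<authority>:<code>'` string. A small sketch of converting the legacy dict into that string follows. `normalize_crs` is a hypothetical helper, and whether `readInGeometryData` accepts the string form is an assumption that this document does not confirm; the warning also notes that axis order can change when switching forms.

```python
def normalize_crs(crs):
    """Turn a legacy {'init': 'authority:code'} dict into 'AUTHORITY:CODE'.

    Hypothetical helper for illustration only. Any value that is not a
    dict with an 'init' key is returned unchanged.
    """
    if isinstance(crs, dict) and 'init' in crs:
        authority, code = crs['init'].split(':', 1)
        return f"{authority.upper()}:{code}"
    return crs


print(normalize_crs({'init': 'epsg:2248'}))  # EPSG:2248
```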
from dataplay.geoms import map_points
map_points(foodPantryLocations, lat_col='Y', lon_col='X', zoom_start=11, plot_points=True, pt_radius=15, draw_heatmap=True, heat_map_weights_col=None, heat_map_weights_normalize=True, heat_map_radius=15)
/usr/local/lib/python3.6/dist-packages/dataplay/geoms.py:190: FutureWarning: Method `add_children` is deprecated. Please use `add_child` instead.
curr_map.add_children(plugins.HeatMap(stations, radius=heat_map_radius))
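Legal
Disclaimer
Views Expressed: All views expressed in this tutorial are the author's own and do not represent the opinions of any entity with which they have been, are now, or will be affiliated.
Responsibility, Errors and Omissions: The author makes no assurance about the reliability of the information. The author takes no responsibility for updating the tutorial or for maintaining its performance. Under no circumstances shall the author or its affiliates be liable for any indirect, incidental, consequential, special, or exemplary damages arising out of or in connection with this tutorial. Information is provided "as is" and may contain errors and omissions. Information found within the contents is covered by an MIT license. Please refer to the License for more information.
Use at Risk: Any action you take upon the information in this tutorial is strictly at your own risk, and the author will not be liable for any losses or damages in connection with the use of this tutorial and subsequent products.
Fair Use: This site contains copyrighted material, the use of which has not always been specifically authorized by the copyright owner. While no intention is made to unlawfully use copyrighted work, circumstances may arise in which such material is made available in an effort to advance scientific literacy. We believe this constitutes a "fair use" of any such copyrighted material as provided for in Section 107 of the US Copyright Law. In accordance with Title 17 U.S.C. Section 108, the material in this tutorial is distributed without profit to those who have expressed a prior interest in receiving the included information for research and educational purposes.
For more information go to: http://www.law.cornell.edu/uscode/17/107.shtml. If you wish to use copyrighted material from this site for purposes that go beyond "fair use", you must obtain permission from the copyright owner.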
License
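Copyright © 2019 BNIA-JFI
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.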