
dataplay: Python project description


Data Handling Guide

The one stop shop to learn about data intake, processing, and visualization.

This Dataplay handbook uses techniques covered in the Datalabs Guidebook.


Install

The code is on PyPI, so you can run:

pip install dataplay geopandas dexplot

from a terminal to install the code and its dependencies.

How to use

Import the installed module into your code and use it like so:

[code sample missing from source]

and

from dataplay.merge import mergeDatasets
mergeDatasets(left_ds=False, right_ds=False, crosswalk_ds=False,  use_crosswalk = True, left_col=False, right_col=False, crosswalk_left_col = False, crosswalk_right_col = False, merge_how=False, interactive=True)
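
The crosswalk parameters in the signature above suggest a two-step join: the left table is first matched to a crosswalk table, and the crosswalk's other key is then matched to the right table. A minimal sketch of that idea in plain pandas follows. It is not dataplay's implementation; the toy tables are assumptions modelled on a few values that appear in the example output, and only standard pandas calls are used.

```python
import pandas as pd

# Toy stand-ins for the three tables (illustrative values only).
left = pd.DataFrame({"tract": [190100, 190200], "households": [796, 695]})
crosswalk = pd.DataFrame({
    "TRACT2010": [190100, 190200],
    "GEOID2010": [24510190100, 24510190200],
})
right = pd.DataFrame({
    "GEOID10": [24510190100, 24510190200],
    "CSA": ["Southwest Baltimore", "Southwest Baltimore"],
})

# Step 1: attach the crosswalk's right-hand key to the left table.
step1 = left.merge(crosswalk, left_on="tract", right_on="TRACT2010", how="outer")

# Step 2: use that key to join the right table.
merged = step1.merge(right, left_on="GEOID2010", right_on="GEOID10", how="outer")

print(merged[["tract", "households", "CSA"]])
```

The crosswalk is only needed because the two tables do not share a key column; when they do, a single merge is enough.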

Here is an example:

Define our download parameters.

More information on these parameters can be found in the tutorials!

tract = '*'
county = '510'
state = '24'
tableId = 'B19001'
year = '17'
saveAcs = False
df = retrieve_acs_data(state, county, tract, tableId, year, saveAcs)
df.head()
Number of Columns 17
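
The state and county values above are FIPS codes (24 is Maryland, 510 is Baltimore City). Together with a 6-digit tract code they form the 11-digit census tract GEOID, which is the shape of the values in the crosswalk's GEOID2010 column. A small sketch of that composition, where build_geoid is a hypothetical helper introduced here and not part of dataplay:

```python
def build_geoid(state: str, county: str, tract: str) -> str:
    """Concatenate zero-padded FIPS parts into an 11-digit tract GEOID.

    Layout: 2-digit state + 3-digit county + 6-digit tract.
    """
    return f"{int(state):02d}{int(county):03d}{int(tract):06d}"


# Census Tract 1901 in Baltimore City has tract code 190100.
print(build_geoid("24", "510", "190100"))  # prints 24510190100
```

The wildcard tract = '*' used in the download parameters asks for every tract, so it has no GEOID of its own; the helper only applies to concrete tract codes.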
# Primary Table
left_ds = df
left_col = 'tract'

# Crosswalk Table
# Table: Crosswalk Census Communities
# 'TRACT2010', 'GEOID2010', 'CSA2010'
crosswalk_ds = 'https://docs.google.com/spreadsheets/d/e/2PACX-1vREwwa_s8Ix39OYGnnS_wA8flOoEkU7reIV4o3ZhlwYhLXhpNEvnOia_uHUDBvnFptkLLHHlaQNvsQE/pub?output=csv'
use_crosswalk = True
crosswalk_left_col = 'TRACT2010'
crosswalk_right_col = 'GEOID2010'

# Secondary Table
# Table: Baltimore Boundaries
# 'TRACTCE10', 'GEOID10', 'CSA', 'NAME10', 'Tract', 'geometry'
right_ds = 'https://docs.google.com/spreadsheets/d/e/2PACX-1vQ8xXdUaT17jkdK0MWTJpg3GOy6jMWeaXTlguXNjCSb8Vr_FanSZQRaTU-m811fQz4kyMFK5wcahMNY/pub?gid=886223646&single=true&output=csv'
right_col ='GEOID10'

interactive = True
merge_how = 'outer'

banksPd = mergeDatasets( left_ds=left_ds, left_col=left_col, 
              use_crosswalk=use_crosswalk, crosswalk_ds=crosswalk_ds,
              crosswalk_left_col = crosswalk_left_col, crosswalk_right_col = crosswalk_right_col,
              right_ds=right_ds, right_col=right_col, 
              merge_how=merge_how, interactive = interactive )
 Handling Left Dataset
retrieveDatasetFromUrl                       B19001_001E_Total  \
NAME                                      
Census Tract 1901                   796   
Census Tract 1902                   695   
Census Tract 2201                  2208   
Census Tract 2303                   632   
Census Tract 2502.07                836   
...                                 ...   
Census Tract 2720.05               1219   
Census Tract 1202.01                883   
Census Tract 2720.04               1835   
Census Tract 2720.06               1679   
Baltimore City                   239791   

                      B19001_002E_Total_Less_than_$10_000  \
NAME                                                        
Census Tract 1901                     237                   
Census Tract 1902                      63                   
Census Tract 2201                     137                   
Census Tract 2303                       3                   
Census Tract 2502.07                  102                   
...                                   ...                   
Census Tract 2720.05                   84                   
Census Tract 1202.01                   78                   
Census Tract 2720.04                  155                   
Census Tract 2720.06                  347                   
Baltimore City                      29106                   

                      B19001_003E_Total_$10_000_to_$14_999  \
NAME                                                         
Census Tract 1901                      76                    
Census Tract 1902                      87                    
Census Tract 2201                     229                    
Census Tract 2303                      20                    
Census Tract 2502.07                   28                    
...                                   ...                    
Census Tract 2720.05                   41                    
Census Tract 1202.01                   27                    
Census Tract 2720.04                  109                    
Census Tract 2720.06                  165                    
Baltimore City                      15759                    

                      ...  \
NAME                  ...   
Census Tract 1901     ...   
Census Tract 1902     ...   
Census Tract 2201     ...   
Census Tract 2303     ...   
Census Tract 2502.07  ...   
...                   ...   
Census Tract 2720.05  ...   
Census Tract 1202.01  ...   
Census Tract 2720.04  ...   
Census Tract 2720.06  ...   
Baltimore City        ...   

                      state  \
NAME                          
Census Tract 1901        24   
Census Tract 1902        24   
Census Tract 2201        24   
Census Tract 2303        24   
Census Tract 2502.07     24   
...                     ...   
Census Tract 2720.05     24   
Census Tract 1202.01     24   
Census Tract 2720.04     24   
Census Tract 2720.06     24   
Baltimore City           24   

                      county  \
NAME                           
Census Tract 1901        510   
Census Tract 1902        510   
Census Tract 2201        510   
Census Tract 2303        510   
Census Tract 2502.07     510   
...                      ...   
Census Tract 2720.05     510   
Census Tract 1202.01     510   
Census Tract 2720.04     510   
Census Tract 2720.06     510   
Baltimore City           510   

                       tract  
NAME                          
Census Tract 1901     190100  
Census Tract 1902     190200  
Census Tract 2201     220100  
Census Tract 2303     230300  
Census Tract 2502.07  250207  
...                      ...  
Census Tract 2720.05  272005  
Census Tract 1202.01  120201  
Census Tract 2720.04  272004  
Census Tract 2720.06  272006  
Baltimore City         10000  

[201 rows x 20 columns]
checkDataSetExists True
checkDataSetExists True
checkDataSetExists True
Left Dataset and Columns are Valid

 Handling Right Dataset
retrieveDatasetFromUrl https://docs.google.com/spreadsheets/d/e/2PACX-1vQ8xXdUaT17jkdK0MWTJpg3GOy6jMWeaXTlguXNjCSb8Vr_FanSZQRaTU-m811fQz4kyMFK5wcahMNY/pub?gid=886223646&single=true&output=csv
checkDataSetExists False
retrieveDatasetFromUrl https://docs.google.com/spreadsheets/d/e/2PACX-1vQ8xXdUaT17jkdK0MWTJpg3GOy6jMWeaXTlguXNjCSb8Vr_FanSZQRaTU-m811fQz4kyMFK5wcahMNY/pub?gid=886223646&single=true&output=csv
checkDataSetExists True
checkDataSetExists True
checkDataSetExists True
Right Dataset and Columns are Valid

 Checking the merge_how Parameter
merge_how operator is Valid outer
checkDataSetExists False

 Checking the Crosswalk Parameter

 Handling Crosswalk Left Dataset Loading
retrieveDatasetFromUrl https://docs.google.com/spreadsheets/d/e/2PACX-1vREwwa_s8Ix39OYGnnS_wA8flOoEkU7reIV4o3ZhlwYhLXhpNEvnOia_uHUDBvnFptkLLHHlaQNvsQE/pub?output=csv
checkDataSetExists False
retrieveDatasetFromUrl https://docs.google.com/spreadsheets/d/e/2PACX-1vREwwa_s8Ix39OYGnnS_wA8flOoEkU7reIV4o3ZhlwYhLXhpNEvnOia_uHUDBvnFptkLLHHlaQNvsQE/pub?output=csv
checkDataSetExists True
checkDataSetExists True
checkDataSetExists True

 Handling Crosswalk Right Dataset Loading
retrieveDatasetFromUrl https://docs.google.com/spreadsheets/d/e/2PACX-1vREwwa_s8Ix39OYGnnS_wA8flOoEkU7reIV4o3ZhlwYhLXhpNEvnOia_uHUDBvnFptkLLHHlaQNvsQE/pub?output=csv
checkDataSetExists False
retrieveDatasetFromUrl https://docs.google.com/spreadsheets/d/e/2PACX-1vREwwa_s8Ix39OYGnnS_wA8flOoEkU7reIV4o3ZhlwYhLXhpNEvnOia_uHUDBvnFptkLLHHlaQNvsQE/pub?output=csv
checkDataSetExists True
checkDataSetExists True
checkDataSetExists True

 Assessment Completed

 Ensuring Left->Crosswalk compatability

 Ensuring Crosswalk->Right compatability
PERFORMING MERGE LEFT->CROSSWALK
left_on TRACT2010 right_on GEOID2010 how outer
PERFORMING MERGE LEFT->RIGHT
left_col GEOID2010 right_col GEOID10 how outer

 Local Column Values Not Matched 
[0]
1

 Crosswalk Unique Column Values
[24510151000 24510080700 24510080500 24510150500 24510120100 24510090900
 24510280301 24510130803 24510130700 24510130600 24510100100 24510110100
 24510270501 24510270302 24510270401 24510120700 24510271200 24510110200
 24510271002 24510280404 24510270804 24510260203 24510260101 24510260102
 24510090800 24510090300 24510270801 24510120400 24510090200 24510271001
 24510130200 24510140100 24510270600 24510270701 24510130100 24510270803
 24510280200 24510280302 24510130804 24510271101 24510271102 24510150800
 24510270301 24510170100 24510090500 24510170200 24510090600 24510120300
 24510120500 24510130300 24510120600 24510100200 24510150400 24510261000
 24510280403 24510010400 24510250303 24510260303 24510200701 24510272003
 24510070200 24510280102 24510151200 24510260900 24510200400 24510261100
 24510200500 24510250103 24510260301 24510200600 24510130806 24510270702
 24510180200 24510190100 24510270805 24510200200 24510150702 24510270402
 24510250206 24510150701 24510151100 24510040100 24510270101 24510270200
 24510190200 24510271501 24510210100 24510180300 24510180100 24510150100
 24510200300 24510200100 24510090700 24510190300 24510090400 24510200702
 24510250500 24510280401 24510160801 24510160802 24510270703 24510220100
 24510250301 24510270502 24510030100 24510020200 24510250600 24510240200
 24510150900 24510020300 24510270102 24510250207 24510030200 24510250101
 24510280402 24510080102 24510040200 24510200800 24510270903 24510060200
 24510260800 24510160400 24510280101 24510250401 24510240400 24510250102
 24510250205 24510240300 24510271802 24510060100 24510010300 24510010200
 24510270902 24510010100 24510270901 24510270802 24510260605 24510250402
 24510271801 24510260201 24510260401 24510271300 24510230100 24510080101
 24510060300 24510140200 24510160100 24510160200 24510260404 24510150300
 24510150200 24510160700 24510260202 24510271400 24510130805 24510140300
 24510170300 24510080302 24510100300 24510260501 24510160300 24510130400
 24510160600 24510271600 24510271700 24510151300 24510210200 24510271503
 24510060400 24510250204 24510070400 24510230200 24510240100 24510020100
 24510260604 24510120202 24510272007 24510272005 24510230300 24510260302
 24510080200 24510080301 24510010500 24510070100 24510250203 24510070300
 24510080600 24510271900 24510080400 24510120201 24510272004 24510272006
 24510280500 24510260403 24510150600 24510080800 24510160500 24510090100
 24510260402 24510260700]


/usr/local/lib/python3.6/dist-packages/pandas/core/ops/array_ops.py:253: FutureWarning: elementwise comparison failed; returning scalar instead, but in the future will perform elementwise comparison
  res_values = method(rvalues)
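
The log above reports key values that did not find a match. One way to run a similar check by hand, sketched here with plain pandas rather than dataplay, is an outer merge with indicator=True, which labels each row as coming from the left table, the right table, or both. The toy keys are assumptions chosen for illustration.

```python
import pandas as pd

# Toy key columns; 10000 and 220100 each appear on one side only.
left = pd.DataFrame({"tract": [190100, 190200, 10000]})
crosswalk = pd.DataFrame({"TRACT2010": [190100, 190200, 220100]})

# indicator=True adds a "_merge" column: left_only, right_only, or both.
check = left.merge(crosswalk, left_on="tract", right_on="TRACT2010",
                   how="outer", indicator=True)

left_only = check.loc[check["_merge"] == "left_only", "tract"].tolist()
right_only = check.loc[check["_merge"] == "right_only", "TRACT2010"].tolist()

print("left keys without a match:", left_only)
print("crosswalk keys without a match:", right_only)
```

Because the outer merge introduces missing values, pandas converts the integer key columns to floats in the result, so the unmatched keys print as floats.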
banksPd.head()
type(banksPd)
pandas.core.frame.DataFrame
from dataplay.geoms import readInGeometryData
csaMap = readInGeometryData(url=banksPd, porg='g', geom='geometry', lat=False, lng=False, revgeocode=False, save=False, in_crs=2248, out_crs=2248)
isGeoDataframe
RECIEVED url:      B19001_001E_Total  \
0                  796   
1                  695   
2                 2208   
3                  632   
4                  836   
..                 ...   
195               1848   
196               1219   
197                883   
198               1835   
199               1679   

     B19001_002E_Total_Less_than_$10_000  \
0                    237                   
1                     63                   
2                    137                   
3                      3                   
4                    102                   
..                   ...                   
195                  153                   
196                   84                   
197                   78                   
198                  155                   
199                  347                   

     B19001_003E_Total_$10_000_to_$14_999  \
0                     76                    
1                     87                    
2                    229                    
3                     20                    
4                     28                    
..                   ...                    
195                   68                    
196                   41                    
197                   27                    
198                  109                    
199                  165                    

     ...  \
0    ...   
1    ...   
2    ...   
3    ...   
4    ...   
..   ...   
195  ...   
196  ...   
197  ...   
198  ...   
199  ...   

                     CSA  \
0    Southwest Baltimore   
1    Southwest Baltimore   
2    Inner Harbor/Fed...   
3        South Baltimore   
4            Cherry Hill   
..                   ...   
195       Glen-Fallstaff   
196  Cross-Country/Ch...   
197  Greater Charles ...   
198  Cross-Country/Ch...   
199       Glen-Fallstaff   

      Tract  \
0    1901.0   
1    1902.0   
2    2201.0   
3    2303.0   
4    2502.0   
..      ...   
195  2720.0   
196  2720.0   
197  1202.0   
198  2720.0   
199  2720.0   

                geometry  
0    POLYGON ((-76.63...  
1    POLYGON ((-76.63...  
2    MULTIPOLYGON (((...  
3    MULTIPOLYGON (((...  
4    POLYGON ((-76.62...  
..                   ...  
195  POLYGON ((-76.69...  
196  POLYGON ((-76.69...  
197  POLYGON ((-76.60...  
198  POLYGON ((-76.69...  
199  POLYGON ((-76.68...  

[200 rows x 27 columns], 
 porg: g, 
 geom: geometry, 
 lat: False, 
 lng: False, 
 revgeocode: False, 
 in_crs: 2248, 
 out_crs: 2248
Index(['B19001_001E_Total',
       'B19001_002E_Total_Less_than_$10_000',
       'B19001_003E_Total_$10_000_to_$14_999',
       'B19001_004E_Total_$15_000_to_$19_999',
       'B19001_005E_Total_$20_000_to_$24_999',
       'B19001_006E_Total_$25_000_to_$29_999',
       'B19001_007E_Total_$30_000_to_$34_999',
       'B19001_008E_Total_$35_000_to_$39_999',
       'B19001_009E_Total_$40_000_to_$44_999',
       'B19001_010E_Total_$45_000_to_$49_999',
       'B19001_011E_Total_$50_000_to_$59_999',
       'B19001_012E_Total_$60_000_to_$74_999',
       'B19001_013E_Total_$75_000_to_$99_999',
       'B19001_014E_Total_$100_000_to_$124_999',
       'B19001_015E_Total_$125_000_to_$149_999',
       'B19001_016E_Total_$150_000_to_$199_999',
       'B19001_017E_Total_$200_000_or_more',
       'state',
       'county',
       'tract',
       'GEOID2010',
       'TRACTCE10',
       'GEOID10',
       'NAME10',
       'CSA',
       'Tract',
       'geometry'],
      dtype='object')
csaMap.columns
Index(['B19001_001E_Total',
       'B19001_002E_Total_Less_than_$10_000',
       'B19001_003E_Total_$10_000_to_$14_999',
       'B19001_004E_Total_$15_000_to_$19_999',
       'B19001_005E_Total_$20_000_to_$24_999',
       'B19001_006E_Total_$25_000_to_$29_999',
       'B19001_007E_Total_$30_000_to_$34_999',
       'B19001_008E_Total_$35_000_to_$39_999',
       'B19001_009E_Total_$40_000_to_$44_999',
       'B19001_010E_Total_$45_000_to_$49_999',
       'B19001_011E_Total_$50_000_to_$59_999',
       'B19001_012E_Total_$60_000_to_$74_999',
       'B19001_013E_Total_$75_000_to_$99_999',
       'B19001_014E_Total_$100_000_to_$124_999',
       'B19001_015E_Total_$125_000_to_$149_999',
       'B19001_016E_Total_$150_000_to_$199_999',
       'B19001_017E_Total_$200_000_or_more',
       'state',
       'county',
       'tract',
       'GEOID2010',
       'TRACTCE10',
       'GEOID10',
       'NAME10',
       'CSA',
       'Tract',
       'geometry'],
      dtype='object')
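
The B19001 columns listed above are household counts per income bracket, so tracts of different sizes are easier to compare as shares of the total. A minimal sketch, assuming plain pandas and using two rows copied from the printed output (Census Tracts 1901 and 1902); share_under_10k is a hypothetical column name introduced here.

```python
import pandas as pd

# Two rows copied from the output shown earlier in this example.
df = pd.DataFrame({
    "B19001_001E_Total": [796, 695],
    "B19001_002E_Total_Less_than_$10_000": [237, 63],
})

# Share of households in the lowest income bracket.
df["share_under_10k"] = (
    df["B19001_002E_Total_Less_than_$10_000"] / df["B19001_001E_Total"]
)

print(df["share_under_10k"].round(3).tolist())
```

The same new column could then be passed as the column argument of a plot call in place of the raw count.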
csaMap.plot(column='B19001_002E_Total_Less_than_$10_000')
<matplotlib.axes._subplots.AxesSubplot at 0x7f277d7b0630>

[Figure: csaMap plotted by column 'B19001_002E_Total_Less_than_$10_000']

[code sample missing from source]
foodPantryLocationsUrl = 'https://docs.google.com/spreadsheets/d/e/2PACX-1vT3lG0n542sIGE2O-C8fiXx-qUZG2WDO6ezRGcNsS4z8MM30XocVZ90P1UQOIXO2w/pub?gid=1152681223&single=true&output=csv'
crs = {'init' :'epsg:2248'} 
foodPantryLocations = readInGeometryData(url=foodPantryLocationsUrl, porg='p', geom=False, lat='Y', lng='X', revgeocode=False,  save=False, in_crs=crs, out_crs=crs)

panp = workWithGeometryData( 'pandp', foodPantryLocations[ foodPantryLocations.City_1 == 'Baltimore' ], csaMap, pntsClr='red', polyColorCol='B19001_002E_Total_Less_than_$10_000')
RECIEVED url: https://docs.google.com/spreadsheets/d/e/2PACX-1vT3lG0n542sIGE2O-C8fiXx-qUZG2WDO6ezRGcNsS4z8MM30XocVZ90P1UQOIXO2w/pub?gid=1152681223&single=true&output=csv, 
 porg: p, 
 geom: False, 
 lat: Y, 
 lng: X, 
 revgeocode: False, 
 in_crs: {'init': 'epsg:2248'}, 
 out_crs: {'init': 'epsg:2248'}
Index(['X',
       'Y',
       'OBJECTID',
       'Name',
       'Address',
       'City_1',
       'State',
       'Zip',
       '# in Zip',
       'FIPS'],
      dtype='object')
mapPointsandPolygons


/usr/local/lib/python3.6/dist-packages/pyproj/crs/crs.py:53: FutureWarning: '+init=<authority>:<code>' syntax is deprecated. '<authority>:<code>' is the preferred initialization method. When making the change, be mindful of axis order changes: https://pyproj4.github.io/pyproj/stable/gotchas.html#axis-order-changes-in-proj-6
  return _prepare_from_string(" ".join(pjargs))
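
The FutureWarning above says the {'init': 'epsg:2248'} form is deprecated and that '<authority>:<code>' is preferred. A small sketch of a normaliser for the CRS forms used in this example (an integer such as 2248, the legacy dict, or a string) is shown below. to_authority_code is a hypothetical helper, not part of dataplay or pyproj, and whether readInGeometryData accepts the resulting string is an assumption that should be checked.

```python
def to_authority_code(crs):
    """Return an 'AUTHORITY:CODE' string such as 'EPSG:2248'.

    Accepts an integer EPSG code, a legacy {'init': 'epsg:NNNN'} dict,
    or a string with or without a leading '+init='.
    """
    if isinstance(crs, int):
        return f"EPSG:{crs}"
    if isinstance(crs, dict) and "init" in crs:
        return crs["init"].upper()
    if isinstance(crs, str):
        prefix = "+init="
        value = crs[len(prefix):] if crs.startswith(prefix) else crs
        return value.upper()
    raise TypeError(f"unsupported CRS value: {crs!r}")


print(to_authority_code({"init": "epsg:2248"}))  # prints EPSG:2248
print(to_authority_code(2248))                   # prints EPSG:2248
```

As the warning notes, switching syntax can change the axis order that pyproj assumes, so coordinates should be spot-checked after making the change.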

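[Figure: Baltimore food pantry locations (red points) over csaMap polygons colored by 'B19001_002E_Total_Less_than_$10_000']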

from dataplay.geoms import map_points
map_points(foodPantryLocations, lat_col='Y', lon_col='X', zoom_start=11, plot_points=True, pt_radius=15, draw_heatmap=True, heat_map_weights_col=None, heat_map_weights_normalize=True, heat_map_radius=15)
/usr/local/lib/python3.6/dist-packages/dataplay/geoms.py:190: FutureWarning: Method `add_children` is deprecated. Please use `add_child` instead.
  curr_map.add_children(plugins.HeatMap(stations, radius=heat_map_radius))

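Legal

Disclaimer

Views Expressed: All views expressed in this tutorial are the author's own and do not represent the opinions of any entity with which they have been, are now, or will be affiliated.

Responsibility, Errors and Omissions: The author makes no assurance about the reliability of the information. The author takes no responsibility for updating the tutorial or for maintaining its performance. Under no circumstances shall the author or their affiliates be liable for any indirect, incidental, consequential, special, or exemplary damages arising out of or in connection with this tutorial. Information is provided "as is", with a distinct possibility of errors and omissions. Information found within the contents is covered by an MIT license. Please refer to the License for more information.

Use at Risk: Any action you take upon the information in this tutorial is strictly at your own risk, and the author will not be liable for any losses or damages in connection with the use of this tutorial and subsequent products.

Fair Use: This site contains copyrighted material, the use of which has not always been specifically authorized by the copyright owner. While no unlawful use of copyrighted work is intended, circumstances may arise in which such material is made available in an effort to advance scientific literacy. We believe this constitutes a "fair use" of any such copyrighted material as provided for in Section 107 of the US Copyright Law. In accordance with Title 17 U.S.C. Section 108, the material in this tutorial is distributed without profit to those who have expressed a prior interest in receiving the included information for research and educational purposes.

For more information go to: http://www.law.cornell.edu/uscode/17/107.shtml. If you wish to use copyrighted material from this site for purposes that go beyond "fair use", you must obtain permission from the copyright owner.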

License

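Copyright © 2019 BNIA-JFI

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.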
