bcdata: Python tools for quick data access via WFS
bcdata
Python and command line tools for quick access to DataBC geographic data available via WFS/WCS.
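There is a wealth of British Columbia geographic information available as open data, but direct file download URLs are not available, and the syntax for accessing WFS via ogr2ogr and/or curl/wget can be awkward.
This Python module and CLI attempt to simplify downloads of BC geographic data and to integrate smoothly with existing Python GIS tools.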
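Notes
- It is the user's responsibility to check the licensing of any downloads; data are generally licensed as OGL-BC
- This tool is not specifically endorsed by the Province of British Columbia or DataBC
- Use with care; please do not overload the service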
Installation
$ pip install bcdata
To enable autocompletion of dataset names (full object names only) in the command line tools, add this line to your .bashrc:
eval "$(_BCDATA_COMPLETE=source bcdata)"
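Usage
Typical usage involves manually searching the DataBC Catalogue to find a layer of interest. Once you have found the dataset, note the key with which to retrieve it. This can be either the id/package name (the last portion of the URL) or the Object Name (under Object Description).
For example, for BC Airports, either of these keys will work:
- id/package name: bc-airports
- object name: WHSE_IMAGERY_AND_BASE_MAPS.GSR_AIRPORTS_SVW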
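The two kinds of key look quite different, so a script can make a reasonable guess about which one it has been given. The sketch below is only a heuristic based on the two example keys above; the function name classify_key and both patterns are assumptions for illustration, not part of bcdata.

```python
import re

# Heuristic patterns inferred from the two example keys in this README.
# They are assumptions for illustration, not taken from the bcdata source.
OBJECT_NAME = re.compile(r"^[A-Z0-9_]+\.[A-Z0-9_]+$")
PACKAGE_ID = re.compile(r"^[a-z0-9]+(-[a-z0-9]+)*$")


def classify_key(key: str) -> str:
    """Guess whether a dataset key is an object name or an id/package name."""
    if OBJECT_NAME.match(key):
        return "object name"
    if PACKAGE_ID.match(key):
        return "id/package name"
    return "unknown"


for key in ("bc-airports", "WHSE_IMAGERY_AND_BASE_MAPS.GSR_AIRPORTS_SVW"):
    print(key, "->", classify_key(key))
```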
Python module
>>> import bcdata
>>> geojson = bcdata.get_data('bc-airports', query="AIRPORT_NAME='Terrace (Northwest Regional) Airport'")
>>> geojson
{'type': 'FeatureCollection', 'features': [{'type': 'Feature', 'id': 'WHSE_IMAGERY_AND_BASE_MAPS.GSR_AIRPORTS_SVW.fid-f0cdbe4_16811fe142b_-6f34', 'geometry': {'type': 'Point', ...
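The returned object is a plain dictionary in GeoJSON FeatureCollection form, so it can be inspected with ordinary Python. The sketch below uses a small hand-written stand-in instead of a live request; the id, coordinates and the property_values helper are placeholders for illustration, and it assumes each feature carries its attributes under "properties" as standard GeoJSON does.

```python
# A tiny stand-in for the dictionary returned by bcdata.get_data().
# The id and coordinates are placeholders, not real data.
geojson = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "id": "sample.1",
            "geometry": {"type": "Point", "coordinates": [0.0, 0.0]},
            "properties": {
                "AIRPORT_NAME": "Terrace (Northwest Regional) Airport"
            },
        }
    ],
}


def property_values(collection, field):
    """Return the value of one attribute for every feature in a collection."""
    return [f["properties"].get(field) for f in collection["features"]]


print(len(geojson["features"]), "feature(s)")
print(property_values(geojson, "AIRPORT_NAME"))
```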
CLI
Several commands are available:
$ bcdata --help
Usage: bcdata [OPTIONS] COMMAND [ARGS]...
Options:
--help Show this message and exit.
Commands:
bc2pg Download a DataBC WFS layer to postgres - an ogr2ogr wrapper.
cat Write DataBC features to stdout as GeoJSON feature objects.
dem Dump BC DEM to TIFF
dump Write DataBC features to stdout as GeoJSON feature collection.
info Print basic metadata about a DataBC WFS layer as JSON.
list List DataBC layers available via WFS
list
$ bcdata list --help
Usage: bcdata list [OPTIONS]
List DataBC layers available via WFS
Options:
-r, --refresh Refresh the cached list
--help Show this message and exit.
info
$ bcdata info --help
Usage: bcdata info [OPTIONS] DATASET
Print basic metadata about a DataBC WFS layer as JSON.
Optionally print a single metadata item as a string.
Options:
--indent INTEGER Indentation level for JSON output
--count Print the count of features.
--name Print the datasource's name.
--help Show this message and exit.
dump
$ bcdata dump --help
Usage: bcdata dump [OPTIONS] DATASET
Dump a data layer from DataBC WFS to GeoJSON
$ bcdata dump bc-airports
$ bcdata dump bc-airports --query "AIRPORT_NAME='Victoria Harbour (Shoal Point) Heliport'"
$ bcdata dump bc-airports --bounds xmin ymin xmax ymax
The values of --bounds must be in BC Albers.
It can also be combined to read bounds of a feature dataset using Fiona:
$ bcdata dump bc-airports --bounds $(fio info aoi.shp --bounds)
Options:
--query TEXT A valid CQL or ECQL query, quote enclosed (https://docs.geoserver.org/stable/en/user/tutorials/cql/cql_tutorial.html)
-o, --out_file TEXT Output file
--bounds TEXT Bounds: "left bottom right top" or "[left, bottom,
right, top]".
--help Show this message and exit.
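The help text above accepts bounds either as "left bottom right top" or as "[left, bottom, right, top]". A minimal sketch of how a script might normalise both forms before passing them on is shown below; parse_bounds is a hypothetical helper, not bcdata's own parser, and the numbers are arbitrary illustrative values.

```python
import json


def parse_bounds(text):
    """Parse 'left bottom right top' or '[left, bottom, right, top]'.

    Returns a (left, bottom, right, top) tuple of floats.
    """
    text = text.strip()
    if text.startswith("["):
        values = json.loads(text)                   # JSON-style list
    else:
        values = text.replace(",", " ").split()     # whitespace separated
    bounds = tuple(float(v) for v in values)
    if len(bounds) != 4:
        raise ValueError("bounds must have exactly four values")
    left, bottom, right, top = bounds
    if left >= right or bottom >= top:
        raise ValueError("bounds must satisfy left < right and bottom < top")
    return bounds


# Arbitrary illustrative values, not a real area of interest.
print(parse_bounds("1000000 500000 1100000 600000"))
print(parse_bounds("[1000000, 500000, 1100000, 600000]"))
```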
cat
$ bcdata cat --help
Usage: bcdata cat [OPTIONS] DATASET
Download a DataBC WFS layer and write to stdout as GeoJSON feature
objects. In this case, cat does not concatenate.
Options:
--query TEXT A valid `CQL` or `ECQL` query (https://docs.geoserver.org/stable/en/user/tutorials/cql/cql_tutorial.html)
--indent INTEGER Indentation level for JSON output
--bounds TEXT Bounds: "left bottom right top" or "[left,
bottom, right, top]".
--compact / --not-compact Use compact separators (',', ':').
--dst-crs, --dst_crs TEXT Destination CRS.
-p, --pagesize INTEGER Max number of records to request
-s, --sortby TEXT Name of sort field
--help Show this message and exit.
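Because cat writes individual feature objects rather than a single feature collection, downstream code has to gather them itself (this is the job fio collect does in the CLI examples). A rough Python sketch, assuming each feature arrives on its own line (that is, without --indent), might look like this; the sample features are made up.

```python
import io
import json


def collect(lines):
    """Gather line-delimited GeoJSON features into one FeatureCollection."""
    features = [json.loads(line) for line in lines if line.strip()]
    return {"type": "FeatureCollection", "features": features}


# Made-up stand-in for the output stream of `bcdata cat`.
sample = io.StringIO(
    '{"type": "Feature", "geometry": null, "properties": {"n": 1}}\n'
    '{"type": "Feature", "geometry": null, "properties": {"n": 2}}\n'
)

collection = collect(sample)
print(len(collection["features"]), "features collected")
```

In a real pipeline the same function could read from sys.stdin instead of the in-memory sample.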
dem
$ bcdata dem --help
Usage: bcdata dem [OPTIONS]
Dump BC DEM to TIFF
Options:
-o, --out_file TEXT Output file
--bounds TEXT Bounds: "left bottom right top" or "[left,
bottom, right, top]". [required]
--dst-crs, --dst_crs TEXT Destination CRS.
-r, --resolution INTEGER
--help Show this message and exit.
bc2pg
$ bcdata bc2pg --help
Usage: bcdata bc2pg [OPTIONS] DATASET
Download a DataBC WFS layer to postgres - an ogr2ogr wrapper.
$ bcdata bc2pg bc-airports --db_url postgresql://postgres:postgres@localhost:5432/postgis
The default target database can be specified by setting the $DATABASE_URL
environment variable.
https://docs.sqlalchemy.org/en/latest/core/engines.html#database-urls
Options:
-db, --db_url TEXT SQLAlchemy database url
--table TEXT Destination table name
--schema TEXT Destination schema name
--query TEXT A valid `CQL` or `ECQL` query (https://docs.geoserver.org/stable/en/user/tutorials/cql/cql_tutorial.html)
-p, --pagesize INTEGER Max number of records to request
-s, --sortby TEXT Name of sort field
-w, --max_workers INTEGER Max number of concurrent requests
--dim TEXT Force the coordinate dimension to val (valid
values are XY, XYZ)
--fid TEXT Primary key of dataset
--help Show this message and exit.
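The --db_url option and the $DATABASE_URL fallback described above can be mimicked in a few lines. This is only a sketch of the idea using the standard library; resolve_db_url and describe_db_url are hypothetical helpers, not functions provided by bcdata.

```python
import os
from urllib.parse import urlsplit


def resolve_db_url(db_url=None, environ=os.environ):
    """Prefer an explicit URL, otherwise fall back to $DATABASE_URL."""
    url = db_url or environ.get("DATABASE_URL")
    if not url:
        raise ValueError("no database URL given and DATABASE_URL is not set")
    return url


def describe_db_url(url):
    """Split a SQLAlchemy-style database URL into its main parts."""
    parts = urlsplit(url)
    return {
        "driver": parts.scheme,
        "user": parts.username,
        "host": parts.hostname,
        "port": parts.port,
        "database": parts.path.lstrip("/"),
    }


url = resolve_db_url("postgresql://postgres:postgres@localhost:5432/postgis")
print(describe_db_url(url))
```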
CLI examples
Search the data listing for airports:
$ bcdata list | grep AIRPORTS
WHSE_IMAGERY_AND_BASE_MAPS.GSR_AIRPORTS_SVW
Describe a dataset. Note that if we know the id of a dataset, we can use that rather than the object name:
$ bcdata info bc-airports --indent 2
{
"name": "WHSE_IMAGERY_AND_BASE_MAPS.GSR_AIRPORTS_SVW",
"count": 455,
"schema": {
"properties": {
"CUSTODIAN_ORG_DESCRIPTION": "string",
"BUSINESS_CATEGORY_CLASS": "string",
"BUSINESS_CATEGORY_DESCRIPTION": "string",
"OCCUPANT_TYPE_DESCRIPTION": "string",
...etc...
},
"geometry": "GeometryCollection",
"geometry_column": "SHAPE"
}
}
The JSON output can be manipulated with jq. For example, to show only the fields available in the dataset:
$ bcdata info bc-airports | jq '.schema.properties'
{
"CUSTODIAN_ORG_DESCRIPTION": "string",
"BUSINESS_CATEGORY_CLASS": "string",
"BUSINESS_CATEGORY_DESCRIPTION": "string",
"OCCUPANT_TYPE_DESCRIPTION": "string",
etc...
}
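The same selection can be made in Python once the output of bcdata info has been captured as text. The sketch below parses a shortened copy of the JSON shown above (only the four properties listed in this README; the real output has more) and lists the field names.

```python
import json

# Shortened copy of the `bcdata info bc-airports` output shown above;
# the real schema contains more properties than these four.
info_text = """
{
  "name": "WHSE_IMAGERY_AND_BASE_MAPS.GSR_AIRPORTS_SVW",
  "count": 455,
  "schema": {
    "properties": {
      "CUSTODIAN_ORG_DESCRIPTION": "string",
      "BUSINESS_CATEGORY_CLASS": "string",
      "BUSINESS_CATEGORY_DESCRIPTION": "string",
      "OCCUPANT_TYPE_DESCRIPTION": "string"
    },
    "geometry": "GeometryCollection",
    "geometry_column": "SHAPE"
  }
}
"""

info = json.loads(info_text)
fields = list(info["schema"]["properties"])  # equivalent of jq '.schema.properties | keys_unsorted'
print(fields)
```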
Dump data to GeoJSON:
$ bcdata dump bc-airports > bc-airports.geojson
Get a single feature and send it to geojsonio (requires geojsonio-cli). Note the double quotes required around the CQL filter provided to the --query option.
$ bcdata dump \
WHSE_IMAGERY_AND_BASE_MAPS.GSR_AIRPORTS_SVW \
--query "AIRPORT_NAME='Terrace (Northwest Regional) Airport'" \
| geojsonio
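When the filter is built in a script rather than typed by hand, the quoting is easy to get wrong. The helper below is a hypothetical sketch (cql_literal and cql_equals are not part of bcdata) and assumes that ECQL string literals are wrapped in single quotes, with an embedded single quote escaped by doubling it; check the GeoServer ECQL reference before relying on it.

```python
def cql_literal(value):
    """Quote a value as a CQL string literal (assumes '' escapes a quote)."""
    return "'" + str(value).replace("'", "''") + "'"


def cql_equals(field, value):
    """Build a simple equality filter such as FIELD='value'."""
    return f"{field}={cql_literal(value)}"


print(cql_equals("AIRPORT_NAME", "Terrace (Northwest Regional) Airport"))
```

The resulting string can be passed as the query argument of bcdata.get_data(), or wrapped in double quotes for the --query option in a shell.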
Save a layer to a GeoPackage in BC Albers:
$ bcdata cat bc-airports --dst-crs EPSG:3005 \
| fio collect \
| fio load -f GPKG --dst-crs EPSG:3005 airports.gpkg
Load a layer to Postgres:
$ bcdata bc2pg \
bc-airports \
--db_url postgresql://postgres:postgres@localhost:5432/postgis
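Projections / CRS
CLI
bcdata dump returns GeoJSON in WGS84 (EPSG:4326).
bcdata cat provides the --dst-crs option; use any CRS the WFS server supports.
bcdata bc2pg loads data to PostgreSQL in BC Albers (EPSG:3005).
Python module
bcdata.get_data() defaults to EPSG:4326, but any CRS the server will accept can be specified.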
Development and testing
$ mkdir bcdata_env
$ virtualenv bcdata_env
$ source bcdata_env/bin/activate
(bcdata_env)$ git clone git@github.com:smnorris/bcdata.git
(bcdata_env)$ cd bcdata
(bcdata_env)$ pip install -e .[test]
(bcdata_env)$ py.test
Other implementations
OWSLib has basic WFS capabilities
GDAL / curl / wget:
# list all layers
# querying the endpoint this way doesn't seem to work with `VERSION=2.0.0`
ogrinfo WFS:http://openmaps.gov.bc.ca/geo/ows?VERSION=1.1.0

# define a request url for airports
airports_url="https://openmaps.gov.bc.ca/geo/pub/WHSE_IMAGERY_AND_BASE_MAPS.GSR_AIRPORTS_SVW/wfs?service=WFS&version=2.0.0&request=GetFeature&typeName=WHSE_IMAGERY_AND_BASE_MAPS.GSR_AIRPORTS_SVW&outputFormat=json&SRSNAME=epsg%3A3005"

# describe airports
ogrinfo -so $airports_url OGRGeoJSON

# dump airports to geojson
ogr2ogr \
  -f GeoJSON \
  airports.geojson \
  $airports_url

# load airports to postgres
ogr2ogr \
  -f PostgreSQL \
  PG:"host=localhost user=postgres dbname=postgis password=postgres" \
  -lco SCHEMA=whse_imagery_and_base_maps \
  -lco GEOMETRY_NAME=geom \
  -nln gsr_airports_svw \
  $airports_url

# Try requesting a larger dataset - ungulate winter range
uwr_url="https://openmaps.gov.bc.ca/geo/pub/WHSE_WILDLIFE_MANAGEMENT.WCP_UNGULATE_WINTER_RANGE_SP/wfs?service=WFS&version=2.0.0&request=GetFeature&typeName=WHSE_WILDLIFE_MANAGEMENT.WCP_UNGULATE_WINTER_RANGE_SP&outputFormat=json&SRSNAME=epsg%3A3005"

# The request only returns the first 10,000 records
ogr2ogr \
  uwr.shp \
  -dsco OGR_WFS_PAGING_ALLOWED=ON \
  $uwr_url

# wget works too, but still only 10k records
wget -O uwr.geojson $uwr_url
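One way around the 10,000 record limit is to split a request into pages. The sketch below only builds the request URLs; it assumes the server honours the WFS 2.0.0 startIndex and count parameters, and the total of 25,000 records is a made-up figure for illustration. paged_urls is a hypothetical helper, not how bcdata itself is implemented.

```python
from urllib.parse import urlencode

# URL pattern taken from the ogr2ogr examples above.
BASE = "https://openmaps.gov.bc.ca/geo/pub/{layer}/wfs"


def paged_urls(layer, total, pagesize=10000):
    """Yield one GetFeature URL per page of at most `pagesize` records."""
    for start in range(0, total, pagesize):
        params = {
            "service": "WFS",
            "version": "2.0.0",
            "request": "GetFeature",
            "typeName": layer,
            "outputFormat": "json",
            "SRSNAME": "epsg:3005",
            "startIndex": start,   # assumed WFS 2.0.0 paging parameter
            "count": pagesize,     # assumed WFS 2.0.0 paging parameter
        }
        yield BASE.format(layer=layer) + "?" + urlencode(params)


layer = "WHSE_WILDLIFE_MANAGEMENT.WCP_UNGULATE_WINTER_RANGE_SP"
urls = list(paged_urls(layer, total=25000))  # 25000 is a made-up total
for url in urls:
    print(url)
```

Paged requests generally need a stable sort order to avoid duplicated or skipped records, which is presumably why the CLI exposes --sortby alongside --pagesize.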