Converts RPN standard files (from Environment Canada) to netCDF files.
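Overview
This module provides a mechanism for converting between the FSTD and netCDF file formats, either from Python or from the command line.
Basic Usage
From the command line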
python -m fstd2nc [options] <infile> <outfile>
optional arguments:
-h, --help show this help message and exit
--version show program's version number and exit
--no-progress Disable the progress bar.
--minimal-metadata Don't include RPN record attributes and other internal
information in the output metadata. This is the
default behaviour.
--rpnstd-metadata Include all RPN record attributes in the output
metadata.
--rpnstd-metadata-list nomvar,...
Specify a minimal set of RPN record attributes to
include in the output file.
--ignore-typvar Tells the converter to ignore the typvar when deciding
if two records are part of the same field. Default is
to split the variable on different typvars.
--ignore-etiket Tells the converter to ignore the etiket when deciding
if two records are part of the same field. Default is
to split the variable on different etikets.
--vars VAR1,VAR2,... Comma-separated list of variables to convert. By
default, all variables are converted.
--fill-value FILL_VALUE
The fill value to use for masked (missing) data. Gets
stored as '_FillValue' attribute in the metadata.
Default is '1e+30'.
--datev, --squash-forecasts
Use the date of validity for the "time" axis. This is
the default.
--dateo, --forecast-axis
Use the date of original analysis for the time axis,
and put the forecast times into a separate "forecast"
axis.
--ensembles Collect different etikets for the same variable
together into an "ensemble" axis.
--profile-momentum-vars VAR1,VAR2,...
Comma-separated list of variables that use momentum
levels.
--profile-thermodynamic-vars VAR1,VAR2,...
Comma-separated list of variables that use
thermodynamic levels.
--missing-bottom-profile-level
Assume the bottom level of the profile data is
missing.
--strict-vcoord-match
Require the IP1/IP2/IP3 parameters of the vertical
coordinate to match the IG1/IG2/IG3 parameters of the
field in order to be used. The default behaviour is to
use the vertical record anyway if it's the only one in
the file.
--diag-as-model-level
Treat diagnostic (near-surface) data as model level
'1.0'. Normally, this data goes in a separate variable
because it has incompatible units for the vertical
coordinate. Use this option if your variables are
getting split with suffixes '_vgrid4' and '_vgrid5',
and you'd rather keep both sets of levels together in
one variable.
--ignore-diag-level Ignore data on diagnostic (near-surface) height.
--subgrid-axis For data on supergrids, split the subgrids along a
"subgrid" axis. The default is to leave the subgrids
stacked together as they are in the RPN file.
--filter CONDITION Subset RPN file records using the given criteria. For
example, to convert only 24-hour forecasts you could
use --filter ip2==24
--exclude NAME,NAME,...
Exclude some axes or derived variables from the
output. Note that axes will only be excluded if they
have a length of 1.
--metadata-file METADATA_FILE
Use metadata from the specified file. You can repeat
this option multiple times to build metadata from
different sources.
--rename OLDNAME=NEWNAME,...
Apply the specified name changes to the variables.
--time-units {seconds,minutes,hours,days}
The units for the output time axis. Default is hours.
--reference-date YYYY-MM-DD
The reference date for the output time axis. The
default is the starting date in the RPN file.
--msglvl {0,DEBUG,2,INFORM,4,WARNIN,6,ERRORS,8,FATALE,10,SYSTEM,CATAST}
How much information to print to stdout during the
conversion. Default is WARNIN.
--nc-format {NETCDF4,NETCDF4_CLASSIC,NETCDF3_CLASSIC,NETCDF3_64BIT_OFFSET,NETCDF3_64BIT_DATA}
Which variant of netCDF to write. Default is NETCDF4.
--zlib Turn on compression for the netCDF file. Only works
for NETCDF4 and NETCDF4_CLASSIC formats.
-f, --force Overwrite the output file if it already exists.
--no-history Don't put the command-line invocation in the netCDF
metadata.
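When the converter is driven from a batch script, it can help to assemble the command line programmatically. The following is a minimal sketch, not part of fstd2nc itself: the helper name build_fstd2nc_command and its parameter names are hypothetical, and it only uses flags that appear in the help text above (--vars, --filter, --nc-format, --zlib, --force).

```python
import sys


def build_fstd2nc_command(infile, outfile, variables=None, record_filter=None,
                          nc_format=None, zlib=False, force=False):
    """Assemble an argv list for 'python -m fstd2nc'.

    Hypothetical helper for illustration; it only builds the list and
    does not run the conversion.
    """
    cmd = [sys.executable, "-m", "fstd2nc"]
    if variables:
        # --vars takes a comma-separated list of variable names.
        cmd += ["--vars", ",".join(variables)]
    if record_filter:
        # --filter takes a condition such as "ip2==24".
        cmd += ["--filter", record_filter]
    if nc_format:
        cmd += ["--nc-format", nc_format]
    if zlib:
        cmd.append("--zlib")
    if force:
        cmd.append("--force")
    # Options come first, followed by <infile> and <outfile>.
    cmd += [infile, outfile]
    return cmd


cmd = build_fstd2nc_command(
    "myfile.fst", "myfile.nc",
    variables=["TT", "HU"],
    record_filter="ip2==24",
    zlib=True,
    force=True,
)
print(" ".join(cmd[1:]))

# To actually run the conversion (requires fstd2nc to be installed):
# import subprocess
# subprocess.run(cmd, check=True)
```

Passing the arguments as a list rather than a single shell string avoids quoting problems with filter expressions such as ip2==24.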
Using in a Python script
Simple conversion
import fstd2nc
data = fstd2nc.Buffer("myfile.fst")
data.to_netcdf("myfile.nc")
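You can control fstd2nc.Buffer using parameters similar to the command-line arguments. The usual convention is that --arg-name on the command line is passed as arg_name from Python.
For example: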
import fstd2nc

# Select only TT,HU variables.
data = fstd2nc.Buffer("myfile.fst", vars=['TT','HU'])

# Set the reference date to Jan 1, 2000 in the netCDF file.
data.to_netcdf("myfile.nc", reference_date='2000-01-01')
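The naming convention between command-line flags and Python keyword arguments can be sketched as a simple string transformation. The function name cli_flag_to_kwarg is hypothetical and is not part of fstd2nc. It only illustrates the usual convention, so it says nothing about which call (fstd2nc.Buffer or to_netcdf) accepts a given keyword, and individual options may not follow the pattern.

```python
def cli_flag_to_kwarg(flag):
    """Map a long command-line flag to the matching Python keyword name.

    Hypothetical helper: strips the leading dashes and replaces the
    remaining dashes with underscores.
    """
    return flag.lstrip("-").replace("-", "_")


for flag in ["--reference-date", "--ignore-typvar", "--fill-value", "--vars"]:
    print(flag, "->", cli_flag_to_kwarg(flag))
```

Under this convention, --reference-date corresponds to the reference_date keyword used in the example above.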
Interfacing with xarray
For more complicated conversions, you can use the to_xarray() method to manipulate the data as an xarray.Dataset object:
import fstd2nc

# Open the FSTD file.
data = fstd2nc.Buffer("myfile.fst")

# Access the data as an xarray.Dataset object.
dataset = data.to_xarray()
print(dataset)

# Convert surface pressure to Pa.
dataset['P0'] *= 100
dataset['P0'].attrs['units'] = 'Pa'

# (Can further manipulate the dataset here)
# ...

# Write the final result to netCDF using xarray:
dataset.to_netcdf("myfile.nc")
Interfacing with iris
You can interface with iris by using the .to_iris() method (requires iris version 2.0 or greater).
This will give you an iris.cube.CubeList object:
import fstd2nc
import iris.quickplot as qp
from matplotlib import pyplot as pl

# Open the FSTD file.
data = fstd2nc.Buffer("myfile.fst")

# Access the data as an iris.cube.CubeList object.
cubes = data.to_iris()
print(cubes)

# Plot all the data (assuming we have 2D fields)
for cube in cubes:
    qp.contourf(cube)
    pl.gca().coastlines()

pl.show()
Interfacing with pygeode
You can create a pygeode.Dataset object using the .to_pygeode() method (requires pygeode version 1.2.2 or greater):
import fstd2nc

# Open the FSTD file.
data = fstd2nc.Buffer("myfile.fst")

# Access the data as a pygeode.Dataset object.
dataset = data.to_pygeode()
print(dataset)
Installing
The easiest way to install is using pip:
pip install fstd2nc
If you're processing many input files into a single netCDF file, you can get some useful features (progress bar, quick file scans) by running:
pip install fstd2nc[manyfiles]
Alternatively, you can install it in a conda environment:
conda install -c neishm fstd2nc
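Using in a Pydap server
This package includes a handler for Pydap, which enables you to serve your FSTD files via the OPeNDAP protocol.
To install all the prerequisites: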
pip install pydap fstd2nc[dap]
You can then run:
pydap -d [your data directory]
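Requirements
Basic requirements
This package requires Python-RPN for reading/writing FSTD files, and netcdf4-python for reading/writing netCDF files.
Optional requirements
For reading large numbers of input files (>100), this utility can leverage pandas to quickly process the FSTD record headers.
The --progress option requires the progress module.
The .to_iris() Python method requires the iris package, along with the .to_xarray() dependencies.
The .to_pygeode() Python method requires the pygeode package, along with the .to_xarray() dependencies.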
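To see which of these optional features are usable in a given environment, one option is to probe for the packages without importing them. This is a minimal sketch using only the standard library. The function names and the feature labels are hypothetical, and the import names (pandas, progress, iris, pygeode) are assumed to match the package names mentioned above.

```python
import importlib.util

# Feature label -> top-level module assumed to provide it.
OPTIONAL_MODULES = {
    "fast header scan for many files": "pandas",
    "progress bar": "progress",
    ".to_iris()": "iris",
    ".to_pygeode()": "pygeode",
}


def is_available(module_name):
    """Return True if a top-level module can be found, without importing it."""
    return importlib.util.find_spec(module_name) is not None


def check_optional_modules():
    """Report which optional features have their module installed."""
    return {feature: is_available(name)
            for feature, name in OPTIONAL_MODULES.items()}


for feature, found in check_optional_modules().items():
    print(f"{feature}: {'available' if found else 'missing'}")
```

The check only confirms that a module can be located. It does not verify version requirements such as iris 2.0 or pygeode 1.2.2.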