Python 如何从NetCDF文件生成的数据填补列表中的缺失数据

Question

我正在尝试写一些Python代码，目的是读取澳大利亚气象局的雨量计NetCDF文件，并提取一个流域内一组雨量计的降雨数据。这个文件的格式有点奇怪。他们选择将每个时间点的所有记录值放在一个文件里，覆盖整个澳大利亚。但是，当某个雨量计没有记录值，或者某个站点缺失时，我想找到这些缺失的站点，并简单地为它们创建一个降雨量为零的记录。我已经找到了站点的ID，但我该如何把零的记录添加到我的列表中呢？

这是代码的一部分：

# First LOOP through all files for the day and accumulate data.
for timestep, datafile in enumerate(stationdata):
    print datafile[-16:-3]
    data = netcdf.NetCDFFile(datafile, 'r')
try:
    precip = data.variables['precipitation'].data
except:
    precip = data.variables['precip'].data
try:
    stid = data.variables['station_id'].data
except:
    stid = data.variables['stid'].data
# create np array of indices of the gauge id present in the current file (Note not ALL required ones may be present!!)
idx = np.where(np.in1d(stid, gauge_ids))[0]
print 'index len = '+str(len(idx))+' Gauges: '+str(ngauges)
# This process DOES NOT SEEM to Capture Missing Gauge Data
# If a Gauge ID is not present how to we set its value to Zero for this time step ?
for i in idx:
    print i,stid[i],precip[i], timestep
    timeseries_per_station[str(stid[i])][timestep] = precip[i] # This adds the rainfall to the time series for the Station ID in the found set from its index
data.close()
# Now go through the list of Gauges ngauges with IDs gauge_ids, and fill missing ones with zero
# For stid not in gauge_ids set to Zero ... How ???
# create a Zero list and remove ID's that already have values ??
# Try    [i for i in a if i not in b]
print [k for k in gauge_ids if k not in stid]
for l in [k for k in gauge_ids if k not in stid]:
    print l, timestep
    timeseries_per_station[l][timestep] = 0.0
raw_input('check..')

代码中的这一行 for l in [k for k in gauge_ids if k not in stid]: 能够正确识别缺失的站点，但 timeseries_per_station[l][timestep] = -1.0 却导致了一个 IndexError: index out of bounds 的错误。这正是我想把缺失的数据设置为一个可识别的值的地方。

看起来当代码处理的数据段中的站点数量少于原来的数量（26个），比如只读取到25个或24个等时，就会出现这个错误？

任何线索都会非常有帮助……

另一种选择是使用不同的结构来读取数据：这个结构应该是这样的：对于每个时间片数据文件，里面有雨量计站点的数据，比如ID、纬度、经度和降水量。我想绘制降水量的空间变化以及每个时间片的空间变化。时间片数据包含在时间片文件的文件名中。

谢谢

数据结构 netcdf 数据填补缺失数据气象数据降雨量空间变化记录值

Python 如何从NetCDF文件生成的数据填补列表中的缺失数据

1 个回答

撰写回答