Python netCDF: copy all variables and attributes except one

18 votes

6 answers

20892 views

数据工程师

Asked on 2025-04-17 17:28

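I need to process a single variable in a netCDF file that also contains many attributes and other variables. I don't think a netCDF file can be updated (see the question "How to delete a variable in a Scientific.IO.NetCDF.NetCDFFile?").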

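My processing steps are:

1. Get the variable to process from the original file
2. Process the variable
3. Copy all data from the original netCDF file, except the processed variable, to the final file
4. Copy the processed variable to the final file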

My problem is how to write the code for step 3. I started with the following:

# NetCDFFile is assumed to come from Scientific.IO.NetCDF, as in the linked question
from Scientific.IO.NetCDF import NetCDFFile

def processing(infile, variable, outfile):
        # open the input first, so fileH exists before it is read
        fileH = NetCDFFile(infile, mode="r")
        data = fileH.variables[variable][:]

        # do processing on data...

        # and now save the result
        outH = NetCDFFile(outfile, mode="w")
        # build a list of variables without the processed variable
        listOfVariables = [name for name in fileH.variables.keys() if name != variable]
        for ivar in listOfVariables:
             # here I need to write each variable and each attribute
             pass

How can I save all the data and attributes with only a little code, without having to rebuild the whole data structure?


6 Answers

This answer builds on the one from Xavier Ho (https://stackoverflow.com/a/32002401/7666), but includes the fixes I needed:

import netCDF4 as nc

toexclude = ["TO_REMOVE"]
with nc.Dataset("orig.nc") as src, nc.Dataset("filtered.nc", "w") as dst:
    # copy global attributes
    for name in src.ncattrs():
        dst.setncattr(name, src.getncattr(name))
    # copy dimensions (a size of None creates an unlimited dimension)
    for name, dimension in src.dimensions.items():
        dst.createDimension(
            name, (len(dimension) if not dimension.isunlimited() else None))
    # copy all file data except for the excluded variables
    for name, variable in src.variables.items():
        if name not in toexclude:
            x = dst.createVariable(name, variable.datatype, variable.dimensions)
            x[:] = variable[:]

Answered on 2025-04-17 by Python大师


If you just want to copy the file while selecting some of its variables, nccopy is a great tool, as @rewfuss mentioned.
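Since nccopy selects the variables to keep rather than the ones to drop, the exclusion has to be turned into a keep-list. The sketch below only builds the command line; the helper name `nccopy_keep_command` is hypothetical, and it assumes the `-V` option behaves as described in the nccopy man page (copy only the listed variables, plus all dimensions and global attributes).

```python
def nccopy_keep_command(all_names, exclude, infile, outfile):
    """Build an nccopy argument list that keeps every variable not in `exclude`.

    `all_names` is the list of variable names in the input file, for example
    obtained with `ncdump -h` or from netCDF4 (assumption: the caller has it).
    """
    keep = [name for name in all_names if name not in exclude]
    if not keep:
        raise ValueError("every variable is excluded; nothing left to copy")
    # -V takes a comma-separated list of variables to include in the output
    return ["nccopy", "-V", ",".join(keep), infile, outfile]


cmd = nccopy_keep_command(["time", "temp", "TO_REMOVE"], {"TO_REMOVE"},
                          "orig.nc", "filtered.nc")
print(" ".join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` on a machine where nccopy is installed.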

Here is a more flexible solution using python-netcdf4. It lets you open the file and do processing and other calculations before writing the output.

import netCDF4

# file1 and file2 are the input and output paths; the output must be opened with "w"
with netCDF4.Dataset(file1) as src, netCDF4.Dataset(file2, "w") as dst:

  for name, dimension in src.dimensions.items():
    dst.createDimension(name, len(dimension) if not dimension.isunlimited() else None)

  for name, variable in src.variables.items():

    # take out the variable you don't want
    if name == 'some_variable':
      continue

    x = dst.createVariable(name, variable.datatype, variable.dimensions)
    # index by name; x is the new Variable object, not a dictionary key
    dst.variables[name][:] = src.variables[name][:]

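However, this approach does not take variable attributes, such as fill values, into account. You can handle them easily by following the documentation.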
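As a rough sketch of what handling the attributes could look like: netCDF4 expects the fill value at creation time (the `fill_value` keyword of `createVariable`), so it is separated from the other attributes first. The helper names `split_fill_value` and `copy_variable` are hypothetical; `src_var` is assumed to be a netCDF4 Variable and `dst` an output Dataset opened for writing.

```python
def split_fill_value(attrs):
    """Return (fill_value, other_attrs) without modifying `attrs`."""
    others = dict(attrs)
    fill_value = others.pop("_FillValue", None)
    return fill_value, others


def copy_variable(src_var, dst, name):
    """Copy one variable, its attributes and its data into `dst`."""
    attrs = {key: src_var.getncattr(key) for key in src_var.ncattrs()}
    fill_value, others = split_fill_value(attrs)
    # fill_value=None keeps the library default
    dst_var = dst.createVariable(name, src_var.datatype, src_var.dimensions,
                                 fill_value=fill_value)
    # Set attributes before the data so scale_factor/add_offset apply on write.
    dst_var.setncatts(others)
    dst_var[:] = src_var[:]
    return dst_var
```

Inside the copy loop, the `createVariable` call and the data assignment would then be replaced by `copy_variable(variable, dst, name)`.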

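Note that once a netCDF4 file has been written or created this way, it cannot be undone. As soon as you modify a variable, the change is written to the file at the end of the with statement, or when you call .close() on the Dataset.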
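Because the write cannot be undone, one cautious pattern (not part of the original answer) is to write to a temporary file and move it over the target only when writing succeeded. The sketch below is generic: `write_then_replace` is a hypothetical helper, and `write` is any callable that creates the file at the path it is given, for example a function that opens `netCDF4.Dataset(path, "w")` and runs the copy loop.

```python
import os
import tempfile


def write_then_replace(target, write):
    """Call write(tmp_path); replace `target` with the result only on success."""
    directory = os.path.dirname(os.path.abspath(target))
    # Create the temporary file next to the target so the final move stays
    # on the same filesystem.
    fd, tmp_path = tempfile.mkstemp(suffix=".nc", dir=directory)
    os.close(fd)  # the writer reopens the path itself
    try:
        write(tmp_path)
        os.replace(tmp_path, target)
    except BaseException:
        # Leave any existing target untouched and clean up the partial file.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

If `write` raises, an existing target file keeps its previous contents.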

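Of course, if you want to process variables before writing them, you have to be careful about which dimensions you create. In a new file, never write data without first creating the variable. Likewise, never create a variable without first defining its dimensions, as noted in the python-netcdf4 documentation.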
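To make the "dimensions first" rule concrete, here is a small sketch that works out which dimensions the kept variables actually need, and with which size, before anything is created. The function name `dimension_sizes` is hypothetical; it only relies on `isunlimited()`, `len()` and the `.dimensions` tuple that netCDF4 dimension and variable objects provide. Dropping unused dimensions is optional; copying all of them also works.

```python
def dimension_sizes(dimensions, variables, exclude=()):
    """Map each dimension used by a kept variable to its size (None = unlimited)."""
    needed = set()
    for name, var in variables.items():
        if name not in exclude:
            needed.update(var.dimensions)
    return {
        name: (None if dim.isunlimited() else len(dim))
        for name, dim in dimensions.items()
        if name in needed
    }
```

With open datasets this could be used as `for name, size in dimension_sizes(src.dimensions, src.variables, exclude).items(): dst.createDimension(name, size)`, followed by the variable creation and only then the data writes.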

Answered on 2025-04-17 by Python大师


Here is what I just used, and it worked. It is @arne's answer updated for Python 3, and it also copies the variable attributes:

import netCDF4 as nc
toexclude = ['ExcludeVar1', 'ExcludeVar2']

with nc.Dataset("in.nc") as src, nc.Dataset("out.nc", "w") as dst:
    # copy global attributes all at once via dictionary
    dst.setncatts(src.__dict__)
    # copy dimensions
    for name, dimension in src.dimensions.items():
        dst.createDimension(
            name, (len(dimension) if not dimension.isunlimited() else None))
    # copy all file data except for the excluded
    for name, variable in src.variables.items():
        if name not in toexclude:
            x = dst.createVariable(name, variable.datatype, variable.dimensions)
            # copy variable attributes all at once via dictionary, before the
            # data, so that scale_factor/add_offset apply when it is written
            dst[name].setncatts(src[name].__dict__)
            dst[name][:] = src[name][:]
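Step 4 of the question, writing the processed variable to the output file, is not covered by the loop above, which only skips the excluded names. A minimal sketch of that step follows; `write_processed` and the `process` callable are hypothetical, and `process` is assumed to return a NumPy array (anything with a `.dtype`) whose shape still matches the original dimensions.

```python
def write_processed(src, dst, name, process):
    """Create `name` in `dst` and store process(original data) in it."""
    src_var = src.variables[name]
    data = process(src_var[:])
    # Use the dtype of the processed data, since processing may change it.
    dst_var = dst.createVariable(name, data.dtype, src_var.dimensions)
    # _FillValue is skipped in this sketch: netCDF4 expects it at creation
    # time and the old value may not suit the new dtype.
    attrs = {key: src_var.getncattr(key)
             for key in src_var.ncattrs() if key != "_FillValue"}
    dst_var.setncatts(attrs)
    dst_var[:] = data
    return dst_var
```

Called inside the same with block after the copy loop, for example `write_processed(src, dst, 'ExcludeVar1', lambda a: a * 2)`, this adds the processed variable alongside the copied ones.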

Answered on 2025-04-17 by Python大师
