How to create an optimizer in TensorFlow

2 answers

User

#1 · Edited 2024-06-08 09:30:22

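The simplest example of an optimizer is probably the gradient descent optimizer. It shows how to create an instance of the basic optimizer class. The optimizer base class documentation explains what the methods do.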

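The Python side of an optimizer adds new nodes to the graph that compute and apply the back-propagated gradients. It supplies the parameters that are passed to the ops and handles some of the high-level management of the optimizer. After that, you need the actual "apply" op.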

Ops have both a Python and a C++ component. Writing a training op follows the same (but specialized) steps as the general process of adding an Op to TensorFlow.

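For an example set of training ops that compute and apply gradients, see python/training/training_ops.py, which is the Python glue for the actual training ops. Note that the code there is mostly about shape inference; the computation itself happens in C++.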

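The actual math for applying the gradients is handled by an Op (recall that, in general, ops are written in C++). In this case, the apply-gradients ops are defined in core/kernels/training_ops.cc. There you can see, for example, the implementation of ApplyGradientDescentOp, which references the functor ApplyGradientDescent: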

var.device(d) -= grad * lr();

The implementation of the Op itself follows the same pattern as any other op, as described in the adding-an-op documentation.
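To make the update concrete, here is a minimal framework-free Python sketch of what `var.device(d) -= grad * lr();` computes element by element. The function name `apply_gradient_descent` and the use of plain lists in place of tensors are assumptions for illustration only; the real kernel runs in C++ on tensors.

```python
def apply_gradient_descent(var, grad, lr):
    """Return var - lr * grad, element by element (lists stand in for tensors)."""
    if len(var) != len(grad):
        raise ValueError("var and grad must have the same length")
    return [v - lr * g for v, g in zip(var, grad)]


var = [1.0, 2.0, 3.0]
grad = [0.5, -1.0, 0.0]
updated = apply_gradient_descent(var, grad, lr=0.1)
print(updated)
```

A positive gradient lowers the value, a negative gradient raises it, and an entry with zero gradient is left unchanged.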

User

#2 · Edited 2024-06-08 09:30:22

Before running the TensorFlow session, you should set up an optimizer as shown below:

# Gradient Descent
optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)

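tf.train.GradientDescentOptimizer is a class that, as its name says, implements the gradient descent algorithm.

The method minimize() is called with "cost" as its argument and consists of two steps: compute_gradients() followed by apply_gradients().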
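The two-step structure of minimize() can be sketched without TensorFlow. The `ToyGradientDescent` class below is hypothetical: it borrows the method names `compute_gradients`, `apply_gradients`, and `minimize`, but it works on a dict of Python floats and estimates gradients by central differences. That estimate is an assumption made for illustration and is not how TensorFlow computes gradients.

```python
class ToyGradientDescent:
    """Hypothetical stand-in that mirrors the two-step structure of minimize()."""

    def __init__(self, learning_rate, eps=1e-6):
        self.learning_rate = learning_rate
        self.eps = eps  # step size for the finite-difference estimate

    def compute_gradients(self, loss_fn, variables):
        # Step 1: return (gradient, name) pairs, one per variable.
        grads_and_vars = []
        for name in variables:
            plus = dict(variables, **{name: variables[name] + self.eps})
            minus = dict(variables, **{name: variables[name] - self.eps})
            grad = (loss_fn(plus) - loss_fn(minus)) / (2 * self.eps)
            grads_and_vars.append((grad, name))
        return grads_and_vars

    def apply_gradients(self, grads_and_vars, variables):
        # Step 2: update each variable in place using its gradient.
        for grad, name in grads_and_vars:
            variables[name] -= self.learning_rate * grad

    def minimize(self, loss_fn, variables):
        # minimize() is the two steps chained together.
        self.apply_gradients(self.compute_gradients(loss_fn, variables), variables)


def cost(v):
    return (v["x"] - 3.0) ** 2


variables = {"x": 0.0}
optimizer = ToyGradientDescent(learning_rate=0.1)
for _ in range(200):
    optimizer.minimize(cost, variables)
print(variables["x"])
```

Keeping the two steps separate is what lets a caller inspect or modify the gradients (for example, clip them) before they are applied.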

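For most (custom) optimizer implementations, the method apply_gradients() is the one that needs to be adapted.

This method relies on the (new) optimizer class that we will create to implement the following methods: _create_slots(), _prepare(), _apply_dense(), and _apply_sparse().

_create_slots() and _prepare() create and initialise additional variables, such as momentum.
_apply_dense() and _apply_sparse() implement the actual ops that update the variables.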
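As a rough picture of how these hooks divide the work, the sketch below keeps one extra "slot" value per variable and uses it for a momentum update. `ToyMomentum` is a hypothetical pure-Python class: only the hook names come from the text above, while the dict-based storage and the rule `m = momentum * m + grad` are assumptions chosen for illustration. `_apply_sparse()` is left out here.

```python
class ToyMomentum:
    """Hypothetical optimizer skeleton; plain floats stand in for tensors."""

    def __init__(self, learning_rate, momentum):
        self.learning_rate = learning_rate
        self.momentum = momentum
        self._slots = {}

    def _create_slots(self, var_names):
        # One accumulator per trained variable, initialised to zero once.
        for name in var_names:
            self._slots.setdefault(name, 0.0)

    def _prepare(self):
        # Convert hyperparameters before the updates are applied.
        self._lr = float(self.learning_rate)
        self._mu = float(self.momentum)

    def _apply_dense(self, grad, name, variables):
        # Update the slot first, then use it to update the variable.
        m = self._mu * self._slots[name] + grad
        self._slots[name] = m
        variables[name] -= self._lr * m

    def apply_gradients(self, grads_and_vars, variables):
        self._create_slots([name for _, name in grads_and_vars])
        self._prepare()
        for grad, name in grads_and_vars:
            self._apply_dense(grad, name, variables)


variables = {"w": 1.0}
opt = ToyMomentum(learning_rate=0.1, momentum=0.9)
opt.apply_gradients([(2.0, "w")], variables)  # slot becomes 2.0
opt.apply_gradients([(2.0, "w")], variables)  # slot becomes 0.9 * 2.0 + 2.0
print(variables["w"], opt._slots["w"])
```

The slot persists between calls, which is why the second step moves the variable further than the first even though the gradient is the same.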

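Ops are generally written in C++. Without having to change the C++ header yourself, you can still return a Python wrapper of some ops through these methods. This is done as follows: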

from tensorflow.python.ops import control_flow_ops, state_ops

# Both functions are methods of your optimizer subclass.

def _create_slots(self, var_list):
    # Create slots to allocate and later manage additional variables
    # associated with the variables to train, for example the first
    # and second moments:
    #
    #   for v in var_list:
    #       self._zeros_slot(v, "m", self._name)
    #       self._zeros_slot(v, "v", self._name)
    pass

def _apply_dense(self, grad, var):
    # Define your favourite variable update. For example, plain gradient
    # descent subtracts the gradient times the learning rate (set in
    # __init__) from the variable:
    var_update = state_ops.assign_sub(var, self.learning_rate * grad)
    # If you keep slots, compute their updates here as well:
    #
    #   m_t = ...m...  # do something with m and grad
    #   v_t = ...v...  # do something with v and grad
    #
    # The trick is to group every update op into one op and return it,
    # e.g. control_flow_ops.group(*[var_update, m_t, v_t]).
    return control_flow_ops.group(var_update)
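The code above does not show `_apply_sparse()`. In the sparse case the gradient arrives as a set of row indices plus the values for those rows (TensorFlow represents this as `tf.IndexedSlices`), and only those rows are touched. The framework-free sketch below illustrates the idea; the function name `apply_sparse_gradient_descent` and the list-of-rows representation are assumptions, not TensorFlow API.

```python
def apply_sparse_gradient_descent(var, indices, values, lr):
    """Subtract lr * values[k] from row indices[k] of var, in place.

    Repeated indices accumulate, so a row that appears twice is updated twice.
    """
    for row, grad_row in zip(indices, values):
        var[row] = [v - lr * g for v, g in zip(var[row], grad_row)]
    return var


# A 4 x 2 "embedding table"; only rows 0 and 2 receive a gradient.
table = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0], [4.0, 4.0]]
apply_sparse_gradient_descent(
    table, indices=[0, 2], values=[[1.0, 0.0], [0.0, 2.0]], lr=0.5
)
print(table)
```

Rows 1 and 3 are never read or written, which is the point of a sparse update when most rows of a large variable have no gradient in a given step.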

For a more detailed explanation with an example, see this blog post: https://www.bigdatarepublic.nl/custom-optimizer-in-tensorflow/
