Slow scipy double integral

Question

I am trying to obtain a function, called expected_W or H, that is defined as the result of an integration:

$H(p, \theta_0, \theta_1) = \int_{-\infty}^\infty \int_{-\infty}^\infty w(p, \theta, \epsilon, \beta) f(\beta | \theta) q(\epsilon) \; d \beta \; d \epsilon$

where:

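theta is a vector with two elements: theta_0 and theta_1
f(beta | theta) is a normal distribution for beta with mean theta_0 and variance theta_1
q(epsilon) is a normal distribution for epsilon with mean zero and variance sigma_epsilon (set to 1 by default).
w(p, theta, eps, beta) is a function I take as an input, so I can't predict exactly what it looks like. It will probably be nonlinear, but not especially nasty.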
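
For concreteness, the two densities in the integrand can be written out as follows. This is only a restatement of the definitions above, with $\sigma_\epsilon$ denoting a variance (not a standard deviation), as in the list:

$$f(\beta \mid \theta) = \frac{1}{\sqrt{2\pi\theta_1}} \exp\left(-\frac{(\beta - \theta_0)^2}{2\theta_1}\right), \qquad q(\epsilon) = \frac{1}{\sqrt{2\pi\sigma_\epsilon}} \exp\left(-\frac{\epsilon^2}{2\sigma_\epsilon}\right)$$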

This is how I implemented the problem. I know the wrapper functions I wrote may be messy, so I would appreciate any help cleaning them up as well.

from __future__ import division
from scipy import integrate
from scipy.stats import norm
import math
import numpy as np


def exp_w(w_B, sigma_eps = 1, **kwargs):
    '''
    Integrates the w_B function

    Input:
    + w_B : the function to be integrated. 
    + sigma_eps : variance of the epsilon term. Set to 1 by default
    '''

    #The integrand function gives everything under the integral:
    # w(B(p, \theta, \epsilon, \beta)) f(\beta | \theta ) q(\epsilon)
    def integrand(eps, beta, p, theta_0, theta_1, sigma_eps=sigma_eps):
        q_e = norm.pdf(eps, loc=0, scale=math.sqrt(sigma_eps))
        f_beta = norm.pdf(beta, loc=theta_0, scale=math.sqrt(theta_1))

        return w_B(p = p, 
                   theta_0 = theta_0, theta_1 = theta_1,
                   eps = eps, beta=beta)* q_e *f_beta

    #limits of integration. Using limited support for now.
    eps_inf = lambda beta : -10 # otherwise: -np.inf
    eps_sup = lambda beta : 10  # otherwise: np.inf
    beta_inf = -10
    beta_sup = 10

    def integrated_f(p, theta_0, theta_1):
        return integrate.dblquad(integrand, beta_inf, beta_sup,
            eps_inf, eps_sup,
            args = (p, theta_0, theta_1))
    # this integrated_f is the H referenced at the top of the question
    return integrated_f

I tested this with a simple w function for which I know the analytic solution (which will not usually be the case).

def test_exp_w():
    def w_B(p, theta_0, theta_1, eps, beta):
        return 3*(p*eps + p*(theta_0 + theta_1) - beta)

    # Function that I get
    integrated = exp_w(w_B, sigma_eps = 1)

    # Function that I should get
    def exp_result(p, theta_0, theta_1):
        return 3*p*(theta_0 + theta_1) - 3*theta_0

    args = np.random.rand(3)
    d_args = {'p' : args[0], 'theta_0' : args[1], 'theta_1' : args[2]}

    if not (np.allclose(
    integrated(**d_args)[0], exp_result(**d_args)) ):
        raise Exception("Integration procedure isn't working!")
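
For reference, the value in exp_result follows from linearity of expectation, using $E[\epsilon] = 0$ and $E[\beta] = \theta_0$. It ignores the truncation of both integrals to $[-10, 10]$, which should be negligible for the small variances drawn in the test:

$$H(p, \theta_0, \theta_1) = 3p\,E[\epsilon] + 3p(\theta_0 + \theta_1) - 3\,E[\beta] = 3p(\theta_0 + \theta_1) - 3\theta_0$$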

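So my implementation seems to work, but it is very slow for my purposes. I need to repeat this process tens or hundreds of thousands of times (it is one step in a value function iteration; I can give more information if people think it is relevant).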

With scipy version 0.14.0 and numpy version 1.8.1, this integral takes 15 seconds to compute.

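Does anyone have suggestions on how to approach this problem? To begin with, it would probably help to work with a bounded domain of integration, but I haven't figured out how to do that, or whether the Gaussian quadrature in SciPy handles this case well (does it use Gauss-Hermite?).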
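
To make the Gauss-Hermite idea concrete, here is a minimal sketch of what such a version might look like. It is not what dblquad does; it assumes that w_B is smooth and accepts NumPy arrays for eps and beta, and the names exp_w_gauss_hermite and n_nodes are hypothetical. The nodes are rescaled with the change of variables z = mean + sqrt(2 * variance) * x, so each normal density is absorbed into the quadrature weights.

```python
import numpy as np
from numpy.polynomial.hermite import hermgauss


def exp_w_gauss_hermite(w_B, sigma_eps=1.0, n_nodes=20):
    """Approximate H with a tensor-product Gauss-Hermite rule (sketch).

    Assumes w_B is vectorized over eps and beta (NumPy arrays).
    sigma_eps and theta_1 are variances, as in the question.
    """
    # Nodes x_i and weights a_i for integrals of exp(-x**2) * g(x).
    x, a = hermgauss(n_nodes)
    # Two-dimensional grid: axis 0 indexes eps nodes, axis 1 beta nodes.
    x_eps, x_beta = np.meshgrid(x, x, indexing="ij")
    # Product weights; each dimension contributes a factor 1/sqrt(pi).
    weights = np.outer(a, a) / np.pi

    def integrated_f(p, theta_0, theta_1):
        # Map the standard nodes to N(0, sigma_eps) and N(theta_0, theta_1).
        eps = np.sqrt(2.0 * sigma_eps) * x_eps
        beta = theta_0 + np.sqrt(2.0 * theta_1) * x_beta
        values = w_B(p=p, theta_0=theta_0, theta_1=theta_1,
                     eps=eps, beta=beta)
        return float(np.sum(weights * values))

    return integrated_f


# Usage with the linear test function from the question.
def w_B(p, theta_0, theta_1, eps, beta):
    return 3 * (p * eps + p * (theta_0 + theta_1) - beta)


H = exp_w_gauss_hermite(w_B, sigma_eps=1.0)
value = H(0.3, 0.5, 0.7)
```

Each evaluation of H then costs n_nodes**2 evaluations of w_B in a single vectorized call. How many nodes are needed depends on how nonlinear w_B is, so n_nodes would have to be checked against a trusted result.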

Thanks for your time.

---- Edit: adding profiling times -----

The %lprun results show that most of the time is spent in _distn_infrastructure.py:1529(pdf) and _continuous_distns.py:97(_norm_pdf), each of which is called 83244 times.
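
Since the profile points at norm.pdf, one thing that could be tried is replacing it with a plain scalar density built on the math module, which skips the argument checking in scipy.stats. The sketch below is untimed; normal_pdf and make_integrand are hypothetical names, and the returned integrand keeps the same argument order as the one passed to dblquad above.

```python
import math

_SQRT_2PI = math.sqrt(2.0 * math.pi)


def normal_pdf(x, mean, variance):
    """Density of N(mean, variance) at a scalar x, using only math."""
    sd = math.sqrt(variance)
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * _SQRT_2PI)


def make_integrand(w_B, sigma_eps=1.0):
    """Build an integrand(eps, beta, p, theta_0, theta_1) without norm.pdf."""
    def integrand(eps, beta, p, theta_0, theta_1):
        q_e = normal_pdf(eps, 0.0, sigma_eps)
        f_beta = normal_pdf(beta, theta_0, theta_1)
        return w_B(p=p, theta_0=theta_0, theta_1=theta_1,
                   eps=eps, beta=beta) * q_e * f_beta
    return integrand
```

The result of make_integrand(w_B) could then be passed to integrate.dblquad in place of the inner integrand defined in exp_w.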
