从已实现的函数或rv_continuous构造ContinuousRV

2024-03-28 18:10:36 发布

您现在位置:Python中文网/ 问答频道 /正文

我想在给定python实现的概率密度函数(pdf)的情况下构造一个^{}。下面是一个最小的工作示例,其最后一个语句生成一个ValueError

import numpy as np
from scipy.stats import gaussian_kde, norm
from sympy import Interval, oo, symbols
from sympy.stats import ContinuousRV
from sympy.utilities.lambdify import implemented_function

# Example Data
measures = np.concatenate([norm.rvs(loc=-2, size=64), norm.rvs(loc=3, size=32)])

# Definition of the PDF
pdf_kde = gaussian_kde(measures)
pdf_sym = implemented_function("pdf", pdf_kde)

# Create the symbolic variable
XName, x = symbols('X x')
X = ContinuousRV(XName, pdf_sym(x), set=Interval(-oo, oo))

该示例失败,出现以下错误:

.../lib/python3.8/site-packages/sympy/stats/crv_types.py in check(pdf, set)
    149         x = Dummy('x')
    150         val = integrate(pdf(x), (x, set))
--> 151         _value_check(val == S.One, "The pdf on the given set is incorrect.")
    152 
    153 

.../lib/python3.8/site-packages/sympy/stats/rv.py in _value_check(condition, message)
   1450     truth = fuzzy_and(condition)
   1451     if truth == False:
-> 1452         raise ValueError(message)
   1453     return truth == True
   1454 

ValueError: The pdf on the given set is incorrect.

我已经确认pdf是一个很好的近似值

from scipy import integrate
value, err = integrate.quad(pdf_kde, -np.inf, np.inf)
print(value, err)
>>> 0.9999999999999996 2.318795975521764e-09

我目前正在使用Python3.8.0、Sympy1.6、Scipy1.4.1和Numpy1.18.5,如果相关的话


Tags: thefromimportnormpdfvaluecheckstats
1条回答
网友
1楼 · 发布于 2024-03-28 18:10:36

^{}方法构造了^{}的一个实例,并在此过程中调用了一个检查方法,该方法失败是因为Symphy不会自动执行数值计算。可以构造一个产生所需分布并以数字方式执行所述检查的包装器

from sympy.stats.crv import SingleContinuousDistribution

def _check_dist(pdf, set):
    from sympy import Dummy, integrate, N
    x = Dummy('x')
    integrand = pdf(x)
    integral = integrate(integrand, (x, set.start, set.end))
    v = float(N(integral))
    assert np.isclose(float(v), 1.0)

def EmpiricalRV(name: str, m: np.ndarray) -> SingleContinuousDistribution:
    from scipy.stats import gaussian_kde
    from sympy import Interval, oo
    from sympy.stats import ContinuousDistributionHandmade
    from sympy.stats.crv import SingleContinuousPSpace
    from sympy.utilities.lambdify import implemented_function

    pdf_kde = gaussian_kde(m)
    pdf_sym = implemented_function(f"f_{name}", lambda y: pdf_kde(float(y)))
    domain  = Interval(-oo, oo)

    _check_dist(pdf_sym, domain)

    dist = ContinuousDistributionHandmade(pdf_sym, domain)
    pspace = SingleContinuousPSpace(name, dist)
    return pspace.value

下面的测试代码演示了随机变量的作用

from scipy.stats import norm
data = np.concatenate([norm.rvs(loc=-2, size=64), norm.rvs(loc=3, size=32)])

from sympy import N
from sympy.stats import density, Normal, E, P, median
Y = EmpiricalRV('Y',data)

ev = E(Y)
evfloat = float(N(ev))

print("Y          :", Y)
print("density(Y) :", density(Y))
print("E(Y)       :", ev)
print(f"N(E(Y))    : {evfloat:.4f}")
print(f"data.mean(): {data.mean():.4f}")

>>> Y          : Y
>>> density(Y) : ContinuousDistributionHandmade(f_Y, Interval(-oo, oo))
>>> E(Y)       : Integral(Y*f_Y(Y), (Y, -oo, oo))
>>> N(E(Y))    : -0.3882
>>> data.mean(): -0.3882

子类化^{}^{}可能还有其他价值

相关问题 更多 >