Generating a joint distribution from a copula

Posted 2024-05-15 12:33:30

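I have two lists of different probability density functions (pdfs), p(x) and p(y). I know the two variables are correlated, and I want to generate the joint distribution p(x,y) so that I can compute their mutual information.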

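I did some research and found copula theory in statistics, which is apparently the way to solve my problem. However, even with a copula, I don't know how to generate p(x,y). I tried packages such as "copulalib" and "copula", and the best result I have obtained so far is with "ambhas" (although I had to modify a few things in its class). Here is my code: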

import scipy as sp
import scipy.interpolate
import numpy as np
from matplotlib import pyplot as plt
from matplotlib import rcParams
plt.rcParams.update({'font.size': 10})
rcParams.update({'figure.autolayout': True})
from ambhas.errlib import rmse, correlation
from ambhas.copula import Copula
import seaborn as sns

def log_interp1d(xx, yy, kind='linear'):
    logx = np.log10(xx)
    logy = np.log10(yy)
    lin_interp = sp.interpolate.interp1d(logx, logy, kind=kind)
    log_interp = lambda zz: np.power(10.0, lin_interp(np.log10(zz)))
    return log_interp

def derivada(f, x):
    # Backward finite difference on a uniform grid; returns len(f) - 1 values
    dx = x[1] - x[0]
    return np.diff(f) / dx

# Interpolate the data in log-log space to build the cumulative distribution (CDF)

xx = [1e-10, 0.00014, 0.00042, 0.0014, 0.0042, 0.014,0.07]
yy = [1e-20, 0.125, 0.275, 0.4711, 0.775, 0.875,1]
f = log_interp1d(xx,yy)
xnew = np.linspace(xx[0], xx[-1], num=10000, endpoint=True)
fda_cerc = f(xnew) #cdf
x_cerc = xnew

xx = [1e-10, 2.1e-6,2.1e-5,2.1e-4,2.1e-3, 0.007, 0.021, 0.049, 0.07]
yy = [1e-20, 0.0583, 0.1, 0.1083, 0.5083, 0.7583, 0.9917, 0.9917, 1]
f = log_interp1d(xx,yy)
fda_imida = f(xnew) #cdf
x_imida = xnew

# Generate p(x) and p(y) by differentiating the CDFs
pdf_imida = np.array(derivada(fda_imida,x_imida)) #p(x)
pdf_cerconil = np.array(derivada(fda_cerc,x_cerc)) #p(y)

c = correlation(pdf_imida,pdf_cerconil)


foo = Copula(pdf_imida, pdf_cerconil, 'frank')

u,v = foo.generate_uv(9999)
plt.plot(v)

# Recent seaborn releases require x and y to be passed as keywords
h = sns.jointplot(x=u, y=v, kind='kde')
plt.savefig('copulas.jpg', dpi = 400)

The resulting plot is consistent with copula theory, but what can I do to generate p(x,y)? Is there a simple way?
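
For reference, one route from a copula to p(x,y) is Sklar's theorem, which gives the joint density as p(x,y) = c(F(x), G(y)) p(x) p(y), where F and G are the marginal CDFs and c is the copula density. The sketch below is a minimal illustration under stated assumptions and does not use the ambhas API: it hard-codes the Frank copula density, uses toy exponential marginals in place of the interpolated ones above, and takes theta = 5.0 as a placeholder that would have to be estimated from paired data. The names frank_copula_density, joint_density, and mutual_information are hypothetical. Because p(x,y) / (p(x) p(y)) = c(u,v), the mutual information reduces to the integral of c log c over the unit square, so it depends only on the copula.

```python
import numpy as np


def frank_copula_density(u, v, theta):
    """Density c(u, v) of the Frank copula (theta != 0)."""
    a = 1.0 - np.exp(-theta)
    num = theta * a * np.exp(-theta * (u + v))
    den = (a - (1.0 - np.exp(-theta * u)) * (1.0 - np.exp(-theta * v))) ** 2
    return num / den


def joint_density(cdf_x, pdf_x, cdf_y, pdf_y, theta):
    """Sklar's theorem on a grid: p[i, j] = c(F(x_i), G(y_j)) * p(x_i) * p(y_j)."""
    U, V = np.meshgrid(cdf_x, cdf_y, indexing="ij")
    return frank_copula_density(U, V, theta) * np.outer(pdf_x, pdf_y)


def mutual_information(theta, n=400):
    """I(X;Y) in nats: midpoint-rule integral of c*log(c) over the unit square."""
    t = (np.arange(n) + 0.5) / n
    U, V = np.meshgrid(t, t, indexing="ij")
    c = frank_copula_density(U, V, theta)
    return np.sum(c * np.log(c)) / n**2


# Assumption: theta is a placeholder, not a value fitted to the data above.
theta = 5.0

# Assumption: toy exponential marginals stand in for the interpolated CDFs/PDFs.
x = np.linspace(0.0, 20.0, 2001)
dx = x[1] - x[0]
cdf = 1.0 - np.exp(-x)
pdf = np.exp(-x)

p_xy = joint_density(cdf, pdf, cdf, pdf, theta)  # shape (len(x), len(x))
mi = mutual_information(theta)

print("joint density sums to about", p_xy.sum() * dx * dx)
print("mutual information (nats):", mi)
```

To apply this to the data in the question, the CDF and PDF arrays would need to be evaluated on the same grid points (the finite-difference PDFs are one element shorter than the CDFs), and theta would need to come from a fit to paired observations of the two variables.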

