The total variation distance between two probability distributions is given.
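For reference, the formula the question refers to did not survive on this page. The standard textbook definition is shown below as an assumption about what was meant. The symbols P, Q, p_i and q_i are introduced here for illustration: P and Q are the two distributions, and p_i, q_i are their probabilities on a shared set of outcomes or bins.

```latex
\delta(P, Q) \;=\; \sup_{A} \bigl| P(A) - Q(A) \bigr|
\;=\; \frac{1}{2} \sum_{i} \bigl| p_i - q_i \bigr|
\qquad \text{(second equality: discrete case)}
```

Under this definition the distance always lies between 0 and 1.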
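I am trying to compute it in Python. I have two datasets, and I first estimate their probability distributions from histograms. Then I take the maximum difference between the two distributions, but it returns a very small value. It seems I am doing something wrong. Could you help me fix it?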
import numpy as np
import scipy.stats as st

# original: NumPy array of shape (45222, 1)
# synthetic: NumPy array of shape (45222, 1)
summation = 0

# Shared range so both histograms use the same bin edges
minOriginal = original.min()
minGenerated = synthetic.min()
maxOriginal = original.max()
maxGenerated = synthetic.max()
minHist = min(minOriginal, minGenerated)
maxHist = max(maxOriginal, maxGenerated)

# Histogram-based distributions (np.histogram defaults to 10 bins)
originalHist = np.histogram(original, range=(minHist, maxHist))
hist_dist1 = st.rv_histogram(originalHist)
generatedHist = np.histogram(synthetic, range=(minHist, maxHist))
hist_dist2 = st.rv_histogram(generatedHist)

# Largest pointwise gap between the two densities on a fine grid
x = np.linspace(minHist, maxHist, 45000)
summation += max(abs(hist_dist1.pdf(x) - hist_dist2.pdf(x)))
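One possible reason for the small value: `rv_histogram.pdf` returns densities (bin probability divided by bin width), so when the data span a wide range the densities themselves are small, and the largest pointwise gap between two densities is not the total variation distance. A common histogram estimate instead takes half the sum of absolute differences between the bin probabilities. The sketch below assumes that definition is the intended one. The function name `tv_distance_hist`, the bin count, and the random sample data are all made up for illustration and are not from the question.

```python
import numpy as np


def tv_distance_hist(a, b, bins=10):
    """Histogram estimate of total variation distance: 0.5 * sum |p_i - q_i|."""
    a = np.ravel(a)  # flatten (n, 1) arrays to 1-D
    b = np.ravel(b)
    # Shared range so both histograms use identical bin edges
    lo = min(a.min(), b.min())
    hi = max(a.max(), b.max())
    counts_a, edges = np.histogram(a, bins=bins, range=(lo, hi))
    counts_b, _ = np.histogram(b, bins=edges)
    # Convert counts to bin probabilities (each vector sums to 1)
    p = counts_a / counts_a.sum()
    q = counts_b / counts_b.sum()
    return 0.5 * np.abs(p - q).sum()


# Hypothetical stand-ins for the question's arrays
rng = np.random.default_rng(0)
original = rng.normal(0.0, 1.0, size=(45222, 1))
synthetic = rng.normal(0.5, 1.0, size=(45222, 1))
tv = tv_distance_hist(original, synthetic, bins=50)
print(tv)
```

The result depends on the number of bins, so it is an estimate rather than an exact distance. Because it works with probabilities and not densities, it does not shrink when the data are rescaled.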