Sum of the density function (from a histogram) does not equal 1

Posted 2024-04-27 09:09:20


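I am trying to generate a density function, but the values of the resulting histogram do not sum to anything close to 1.

What is the reason for this? How can I make the density function sum to something close to (even if not exactly equal to) 1?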

Minimal example:

import numpy as np
x = np.random.normal(0, 0.5, 1000) # mu, sigma, num
bins = np.linspace(min(x), max(x), num=50) # 50 edges from min to max, i.e. 49 bins
hist, hist_bins = np.histogram(x, bins=bins, density=True)

print(np.sum(hist))
#  > 10.4614

If the bin edges are not specified, the output is smaller but still greater than 1:

import numpy as np
x = np.random.normal(0, 0.5, 1000) # mu, sigma, num
hist, hist_bins = np.histogram(x, density=True) # default: 10 equal-width bins

print(np.sum(hist))
#  > 3.1332

1 answer
User
#1 · Posted 2024-04-27 09:09:20

The reason for this behavior is stated in the docs for np.histogram:

density: bool, optional

If False, the result will contain the number of samples in each bin. If True, the result is the value of the probability density function at the bin, normalized such that the integral over the range is 1. Note that the sum of the histogram values will not be equal to 1 unless bins of unity width are chosen; it is not a probability mass function.
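
To make the note in the docs concrete, here is a minimal sketch (the seed and the choice of 10 bins are arbitrary assumptions for illustration). With equal-width bins, each density value times the bin width gives that bin's share of the samples, so the plain sum of the density values equals 1 / width and exceeds 1 whenever the bins are narrower than 1:

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, only for reproducibility
x = rng.normal(0, 0.5, 1000)    # mu, sigma, num

hist, edges = np.histogram(x, bins=10, density=True)
width = edges[1] - edges[0]     # all 10 bins share this width

# sum(hist) * width == 1, hence sum(hist) == 1 / width
print(hist.sum(), 1 / width)
```

The narrower the bins, the larger the sum, which is likely why the 49-bin version in the question gives a larger value than the default 10-bin version.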

The docs also provide an example showing that the histogram values do not sum to 1.0, while the integral over the bins does:

import numpy as np

a = np.arange(5)
hist, bin_edges = np.histogram(a, density=True)

print(hist)
# hist  > [0.5, 0. , 0.5, 0. , 0. , 0.5, 0. , 0.5, 0. , 0.5]

print(hist.sum())
#  > 2.4999999999999996

print(np.sum(hist * np.diff(bin_edges)))
#  > 1.0

So we can apply the same idea to your code snippet:

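import numpy as np

x = np.random.normal(0, 0.5, 1000) # mu, sigma, num
bins = np.linspace(min(x), max(x), num=50) # lower and upper bounds
hist, hist_bins = np.histogram(x, bins=bins, density=True)

print(hist)

print(np.sum(hist)) # sum of density values, not 1 in general

# density * bin width, summed over all bins, is the integral of the density
print(np.sum(hist * np.diff(hist_bins)))
#  > 1.0 (up to floating-point rounding)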
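
If what you actually want is a set of per-bin probabilities that sum to 1, one option is to normalize the raw counts rather than use density=True. The following is a sketch under that assumption; the names pmf and pmf_from_density are illustrative and the seed is arbitrary:

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, only for reproducibility
x = rng.normal(0, 0.5, 1000)
bins = np.linspace(x.min(), x.max(), num=50)

# Raw counts per bin, divided by the total count: fraction of samples per bin
counts, edges = np.histogram(x, bins=bins)
pmf = counts / counts.sum()

# Equivalent route: density values multiplied by the bin widths
density, _ = np.histogram(x, bins=bins, density=True)
pmf_from_density = density * np.diff(edges)

print(pmf.sum())
```

Both arrays hold the fraction of samples per bin, so they depend on the bin widths, unlike the density values themselves.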

Also, you should think about how you choose the bins, and make sure that using .linspace() is a reasonable approach.
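
The answer does not say which binning strategy to prefer. As one possible alternative to hand-built edges, NumPy can pick the edges itself through the bins='auto' estimator; this is only a sketch, and whether 'auto' suits your data is an assumption to check:

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, only for reproducibility
x = rng.normal(0, 0.5, 1000)

# Let NumPy choose the bin edges from the data
edges = np.histogram_bin_edges(x, bins="auto")
hist, _ = np.histogram(x, bins=edges, density=True)

print(len(edges) - 1)                    # number of bins chosen
print(np.sum(hist * np.diff(edges)))     # integral of the density over the bins
```

Whatever edges you use, the integral check with np.diff stays the same, since it does not assume equal-width bins.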
