截断正态分布与未截断正态分布不一致?
我正在使用下面定义的函数 v_in
生成符合截断正态分布的随机数。
import numpy as np
from scipy.stats import truncnorm
from scipy.stats import norm
import matplotlib.pyplot as plt
# important parameters:
m = 9.1093837e-31 # electron mass (kg)
q = 1.6e-19 # electron charge (C)
k = 1.380649e-23 # boltzmann const (J/K)
phi = 4 * 1.6e-19 # tungsten cathode work function (J)
eps = 8.85e-12 # vacuum permittivity
d = 0.3 # tube length (m)
T_cat = 4000 # cathode temperature (K)
def v_in(T_cat, U, N):
v_min = np.sqrt(2 * phi / m)
std_dev = np.sqrt(k*T_cat/m)
return truncnorm.rvs(v_min/std_dev, np.inf, loc = 0, scale = std_dev, size = N)
我原本希望未截断的正态分布的概率密度函数(PDF)能和截断分布的随机数的直方图一致,但是:
x_axis = np.linspace(-1e6, 1e6, 10000)
plt.plot(x_axis, norm.pdf(x_axis, 0, np.sqrt(k*T_cat/m)))
plt.hist(v_in(T_cat, 0, 100000), bins = 150, histtype = 'step',
density = True, color = 'red', linewidth=1.2)
plt.show()
为什么它们看起来不一致,我该如何让它们一致呢?
1 个回答
1
听起来你想让这两条曲线看起来像是相互一致的:
你在图中看不到它们一致的原因是:
- 你的
x_axis
没有延伸得足够远,所以你计算的未截断正态分布的最大值小于截断正态分布的左截断点。 - 截断正态分布经过标准化处理,使得曲线下的面积为1,因此在相同的x坐标下,它的y值会比正态分布的y值高很多。
下面是显示一致性的代码,并附有关于更改内容的注释:
import numpy as np
from scipy.stats import truncnorm
from scipy.stats import norm
import matplotlib.pyplot as plt
# important parameters:
m = 9.1093837e-31 # electron mass (kg)
q = 1.6e-19 # electron charge (C)
k = 1.380649e-23 # boltzmann const (J/K)
phi = 4 * 1.6e-19 # tungsten cathode work function (J)
eps = 8.85e-12 # vacuum permittivity
d = 0.3 # tube length (m)
T_cat = 4000 # cathode temperature (K)
def v_in(T_cat, U, N):
v_min = np.sqrt(2 * phi / m)
std_dev = np.sqrt(k*T_cat/m)
return truncnorm.rvs(v_min/std_dev, np.inf, loc = 0, scale = std_dev, size = N)
# Calculate normalizing constant that relates the magnitudes of the two PDFs
v_min = np.sqrt(2 * phi / m)
std_dev = np.sqrt(k*T_cat/m)
normalizing_constant = 1/norm.sf(v_min/std_dev)
# Extend right limit of `x_axis`
x_axis = np.linspace(1e6, 1e7, 10000)
# Multiply the un-truncated normal PDF by the constant
plt.plot(x_axis, normalizing_constant*norm.pdf(x_axis, 0, np.sqrt(k*T_cat/m)))
# You named the function v_in, not v_in0
plt.hist(v_in(T_cat, 0, 100000), bins = 150, histtype = 'step',
density = True, color = 'red', linewidth=1.2)
# Zoom in on the part you're interested in
plt.xlim(1.1e6, 1.5e6)
plt.ylim(0, 3e-5)
plt.show()