scipy.stats.binom库能否针对特定的"k"和"p"返回"N"值
我有一个我希望是简单的问题。我正在尝试学习SciPy,作为一种爱好/闲暇活动(是的,我真的应该多出去走走),目前正在玩弄scipy.stats.binom这个二项概率函数(这也没帮助,因为我已经有20多年没学过统计了)。
所以我给自己设定了下面这个问题。
"""
Binomial probability problem
The planet Klom is famous for two things. The superabundance of
dilithium crystals which are so common they are simply lying all
over the ground; and their notoriously poor quality as only 10%
of them are any good. Good and bad crystals are indistinguishable
to the naked eye.
Captain Kirk requires four good crystals to power his spaceship
and beams down to the surface. He can only carry 12 crystals back
to his ship.
What is the probability of him returning with 0, 1, 2, .. 12
good crystals.
Plot the probability
How many crystals should he bring back if he wants a better than 50,
90, 95 and 99% probability of having at least four working ones.
"""
import scipy.stats
import numpy
import matplotlib.pyplot
[N, p] = [12, 0.1]
rv = scipy.stats.binom(N, p)
x = numpy.arange(0, N+1)
pmf = rv.pmf(x)
for k in range(len(pmf)):
print("{:3d} {:10.6f}".format(k, pmf[k]))
fig = matplotlib.pyplot.figure()
ax = fig.add_subplot(111)
ax.plot(x, pmf, 'bo')
ax.vlines(x, 0, pmf, lw=2)
ax.set_xlabel('Number of Good crystals')
ax.set_ylabel('Binomial PDF')
matplotlib.pyplot.show()
显然,我可以解决这个问题的第一部分。但是调用 scipy.stats.binom(N, p)
会创建一个固定的概率质量函数(PMF),在这个函数中,N和p的值是固定的,而对于给定的k计算出的概率也是固定的。
除了使用暴力循环,我怎么才能算出给Kirk提供50%、90%等几率的N值呢?(也就是说,实际上是为给定的k和p计算N)。我相信一定有某个函数可以做到这一点,但我就是找不到。
谢谢。
Tim。
1 个回答
1
因为我们这里处理的是离散分布,所以需要用到 scipy.optimize
这个库。
第一部分比较简单:
In [131]:
import scipy.stats as ss
import scipy.optimize as so
In [132]:
ss.binom.pmf(range(12), n=12, p=0.1)
Out[132]:
array([ 2.82429536e-01, 3.76572715e-01, 2.30127770e-01,
8.52325076e-02, 2.13081269e-02, 3.78811145e-03,
4.91051484e-04, 4.67668080e-05, 3.24769500e-06,
1.60380000e-07, 5.34600000e-09, 1.08000000e-10])
对于第二部分,我们需要通过方程 PMF(4, N, 0.1)=0.5/0.1/0.05/0.01 来求解 N。这里我们会用到 scipy.optimize
,并且需要使用 Nelder-Mead
这个优化器,因为它不需要导数的信息,所以很适合我们现在面对的离散问题。
In [133]:
for P in [0.5, 0.1, 0.05, 0.01]:
N=int(so.fmin(lambda n, p: (ss.binom.pmf(4, int(n), 0.1)-p)**2, 40, args=(P,), disp=0))
#the initial guess is 40, because 40*0.1=4
print N,
print ss.binom.pmf(4, N, p=0.1)
39 0.205887043441
67 0.100410451946
81 0.0498607360095
107 0.00999262321 #need to pick 107 crystals in order to be sure 99% of time that there are at least 4 usable crystals.