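<p>If you want to sample from an arbitrary distribution, you need the inverse of the cumulative distribution function (the cdf), not the pdf.</p>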
<p>You then draw a probability uniformly from the range [0,1] and feed it into the inverse of the cdf to get the corresponding value.</p>
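<p>For a case where the inverse cdf has a closed form, here is a minimal sketch using the exponential distribution. The function name <code>sample_exponential</code> and the <code>rate</code> parameter are illustrative choices, not part of the original answer.</p>

```python
import numpy as np

def sample_exponential(rate, size, rng):
    """Inverse transform sampling for the exponential distribution.

    The cdf is F(x) = 1 - exp(-rate * x), so F^{-1}(u) = -log(1 - u) / rate.
    """
    u = rng.uniform(0.0, 1.0, size=size)  # uniform probabilities in [0, 1)
    return -np.log1p(-u) / rate           # push them through the inverse cdf

rng = np.random.default_rng(0)
samples = sample_exponential(rate=2.0, size=100_000, rng=rng)
print(samples.mean())  # should be close to 1 / rate = 0.5
```

<p>The same two steps (draw a uniform value, apply the inverse cdf) carry over to any distribution; only the inverse changes.</p>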
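<p>It is often not possible to obtain the cdf from the pdf analytically.
However, if you are happy to approximate the distribution, you can evaluate f(x) at regular intervals over its domain, take a cumulative sum over that vector (weighting each value by the grid spacing) to get an approximation of the cdf, and then approximate the inverse from that.</p>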
<p>Rough code snippet:</p>
<pre><code>import matplotlib.pyplot as plt
import numpy as np
import scipy.interpolate

def f(x):
    """
    Substitute this function with your arbitrary distribution.
    It must be positive over the domain (it does not need to be normalized).
    """
    return 1 / float(x)

# You should vary inputVals to cover the domain of f (for better accuracy you can
# be clever about the spacing of values as well). Here I space them logarithmically
# up to 1, then at regular intervals, but you could definitely do better.
inputVals = np.hstack([np.logspace(-6, 0, 600, endpoint=False), np.arange(1, 10000)])

# Everything else should just work.
funcVals = np.array([f(x) for x in inputVals])
# Left Riemann sum: each step adds f(x[i-1]) * (x[i] - x[i-1]).
cdf = np.zeros(len(funcVals))
cdf[1:] = np.cumsum(funcVals[:-1] * np.diff(inputVals))
cdf /= cdf[-1]

# You could also improve the approximation by choosing an appropriate interpolator.
inverseCdf = scipy.interpolate.interp1d(cdf, inputVals)

# Grab 100k samples from the distribution.
samples = inverseCdf(np.random.uniform(0, 1, size=100000))
plt.hist(samples, bins=500)
plt.show()
</code></pre>
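<p>One way to check an approximation like this is to compare a sample statistic against a value you can work out by hand. The sketch below is an assumption-laden variant, not the snippet above: it uses the trapezoidal rule and <code>np.interp</code>, the helper name <code>make_sampler</code> is hypothetical, and the density 1/x is restricted to [1, 100], where the exact median is 10.</p>

```python
import numpy as np

def make_sampler(pdf, grid):
    """Build a sampler from an unnormalized pdf tabulated on an increasing grid."""
    values = pdf(grid)
    # Trapezoidal rule: average of neighbouring heights times the spacing.
    steps = 0.5 * (values[1:] + values[:-1]) * np.diff(grid)
    cdf = np.concatenate([[0.0], np.cumsum(steps)])
    cdf /= cdf[-1]  # normalize so the cdf ends at exactly 1

    def sample(size, rng):
        u = rng.uniform(0.0, 1.0, size=size)
        # Interpolating grid as a function of cdf evaluates the inverse cdf.
        return np.interp(u, cdf, grid)

    return sample

grid = np.geomspace(1.0, 100.0, 2001)  # log-spaced grid on [1, 100]
sample = make_sampler(lambda x: 1.0 / x, grid)
draws = sample(200_000, np.random.default_rng(0))
print(np.median(draws))  # the exact median of 1/x on [1, 100] is 10
```

<p>If the printed median drifts away from the known value, the grid is probably too coarse where the density changes quickly.</p>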