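<p>When deciding whether two distributions differ, the statistic of the <a href="http://en.wikipedia.org/wiki/Kolmogorov%E2%80%93Smirnov_test" rel="nofollow noreferrer">KS test</a> is simply the largest difference between their cumulative distribution functions:</p>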
<p><img src="https://i.stack.imgur.com/IfzdO.png" alt="enter image description here"/></p>
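<p>This is simple enough to compute yourself. The program below computes the KS statistic for samples drawn from two Poisson distributions with different parameters:</p>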
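<pre><code>import numpy as np
import matplotlib.pyplot as plt

N = 10**6
X = np.random.poisson(10, size=N)
X2 = np.random.poisson(7, size=N)

# Unit-width bins; density=True normalizes each histogram to unit area
bins = np.arange(0, 30, 1)
H1, _ = np.histogram(X, bins=bins, density=True)
H2, _ = np.histogram(X2, bins=bins, density=True)

# Accumulate the histograms into empirical CDFs,
# because the KS statistic compares CDFs rather than densities
F1 = np.cumsum(H1)
F2 = np.cumsum(H2)

# Largest absolute gap between the two CDFs and where it occurs
D = np.abs(F1 - F2)
idx = np.argmax(D)
KS = D[idx]

# Plot the results
x = bins[:-1]
plt.plot(x, F1, lw=2, label="$F_1$")
plt.plot(x, F2, lw=2, label="$F_2$")
text = r"KS statistic, $\sup_x |F_1(x) - F_2(x)| = {KS:.4f}$"
plt.plot(x, D, '--k', label=text.format(KS=KS), alpha=.8)
plt.scatter([x[idx]], [D[idx]], s=200, lw=0, alpha=.8, color='k')
plt.axis('tight')
plt.legend()
plt.show()
</code></pre>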
<p><img src="https://i.stack.imgur.com/lx2OS.png" alt="enter image description here"/></p>
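<p>If you would rather avoid choosing bins, the same statistic can be computed directly from the sorted samples. The following is a minimal sketch, not part of the program above: the helper name <code>ks_statistic</code> is made up for illustration, and it assumes two one-dimensional samples. It evaluates both empirical CDFs at every observed value and takes the largest gap.</p>

```python
import numpy as np


def ks_statistic(a, b):
    """Two-sample KS statistic: max |F_a(x) - F_b(x)| over all observed x.

    Hypothetical helper for illustration; works on raw samples, no binning.
    """
    a = np.sort(np.asarray(a))
    b = np.sort(np.asarray(b))
    # The gap between two step functions can only change at observed values,
    # so it is enough to evaluate both empirical CDFs on the pooled sample.
    grid = np.concatenate([a, b])
    # searchsorted(..., side="right") counts how many samples are <= each point
    cdf_a = np.searchsorted(a, grid, side="right") / a.size
    cdf_b = np.searchsorted(b, grid, side="right") / b.size
    return np.max(np.abs(cdf_a - cdf_b))


# Same setup as above, with a smaller sample and a seeded generator
rng = np.random.default_rng(0)
x = rng.poisson(10, size=10_000)
x2 = rng.poisson(7, size=10_000)
print(ks_statistic(x, x2))
```

<p>Identical samples give 0 and completely separated samples give 1, which makes for a quick sanity check. If SciPy is available, <code>scipy.stats.ks_2samp</code> reports the same kind of statistic together with a p-value, although that p-value is derived under the assumption of continuous distributions.</p>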