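<p>I wrote a function to do this. The bandwidth is a parameter of the function: a smaller number gives a sharper, more peaked curve, and a larger number gives a smoother one. The default is 0.3.</p>
<p>It works in <code>IPython notebook --pylab=inline</code>.</p>
<p>The number of bins is computed inside the function using the Freedman-Diaconis rule, so it varies with the number of observations in your data and with their spread.</p>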
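<p>As a rough illustration of how the bin count comes about, here is a minimal sketch of the Freedman-Diaconis rule on its own. The helper name <code>fd_bin_count</code> and the sample data are made up for this example. It rounds up, as NumPy's built-in <code>bins="fd"</code> estimator does, so the result can differ by one from a version that truncates.</p>

```python
import numpy as np

def fd_bin_count(data):
    """Bin count suggested by the Freedman-Diaconis rule (hypothetical helper)."""
    data = np.asarray(data, dtype=float)
    q1, q3 = np.percentile(data, [25, 75])
    iqr = q3 - q1
    if iqr == 0:
        return 1  # degenerate data: fall back to a single bin
    # Bin width h = 2 * IQR / n^(1/3); bin count = data range / h
    width = 2.0 * iqr / len(data) ** (1.0 / 3.0)
    return max(1, int(np.ceil((data.max() - data.min()) / width)))

rng = np.random.default_rng(0)       # seeded so the example is repeatable
sample = rng.standard_normal(500)    # illustrative data only

n_bins = fd_bin_count(sample)
# NumPy's own estimator, for comparison
np_edges = np.histogram_bin_edges(sample, bins="fd")
print(n_bins, len(np_edges) - 1)
```

<p>More observations or a narrower interquartile range both shrink the bin width, which is why the bin count changes from one data set to the next.</p>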
<pre><code>import scipy.stats as stats
import matplotlib.pyplot as plt
import numpy as np

def hist_with_kde(data, bandwidth=0.3):
    # Set the number of bins using the Freedman-Diaconis rule:
    # bin width = 2 * IQR / n**(1/3), bins = data range / bin width
    q1 = np.percentile(data, 25)
    q3 = np.percentile(data, 75)
    n = len(data) ** (1.0 / 3.0)
    rng = max(data) - min(data)
    iqr = 2 * (q3 - q1)
    bins = max(1, int((n * rng) / iqr))

    x = np.linspace(min(data), max(data), 200)

    # Fit the KDE, then override its bandwidth factor and recompute
    kde = stats.gaussian_kde(data)
    kde.covariance_factor = lambda: bandwidth
    kde._compute_covariance()

    plt.plot(x, kde(x), 'r')  # estimated density curve
    # density=True replaces the removed normed=True argument
    plt.hist(data, bins=bins, density=True)  # histogram

data = np.random.randn(500)
hist_with_kde(data, 0.25)
plt.show()  # not needed with inline notebook plotting
</code></pre>
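<p>The function above sets the bandwidth by overriding <code>covariance_factor</code> and calling the private <code>_compute_covariance()</code>. As a possible alternative, <code>scipy.stats.gaussian_kde</code> also accepts a scalar <code>bw_method</code>, which it uses directly as the bandwidth factor. The sketch below is not part of the original function; the variable names and the two factors (0.1 and 0.6) are chosen only to show the sharper-versus-smoother effect.</p>

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)       # seeded so the example is repeatable
sample = rng.standard_normal(500)    # illustrative data only

# A scalar bw_method is used directly as the KDE bandwidth factor
kde_sharp = stats.gaussian_kde(sample, bw_method=0.1)
kde_smooth = stats.gaussian_kde(sample, bw_method=0.6)

x = np.linspace(sample.min(), sample.max(), 200)
sharp_curve = kde_sharp(x)    # follows local bumps in the data
smooth_curve = kde_smooth(x)  # flatter, with a lower peak

print(kde_sharp.factor, kde_smooth.factor)
print(sharp_curve.max(), smooth_curve.max())
```

<p>Plotting both curves over the same histogram is a quick way to pick a bandwidth by eye.</p>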