A more efficient way to compute the standard deviation of a large list in Python

2 answers

User

Answer 1 · edited 2024-04-26 09:29:33

It looks like you need an expanding standard deviation. For that I would use the pandas library and the pandas.Series.expanding method:

In [156]: main()[:5]
Out[156]: 
[0.7071067811865476,
 1.0,
 1.2909944487358056,
 1.5811388300841898,
 1.8708286933869707]

In [157]: pd.Series(range(20000)).expanding().std()[:5]
Out[157]: 
0         NaN
1    0.707107
2    1.000000
3    1.290994
4    1.581139
dtype: float64
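To make "expanding" concrete: entry k is the sample standard deviation of the first k+1 values. The sketch below spells out that definition with the standard library only. The function name `expanding_stdev_naive` is made up for illustration, and the code is deliberately the slow, quadratic version, not what pandas does internally.

```python
from statistics import stdev

def expanding_stdev_naive(values):
    """Sample standard deviation of every prefix values[:k] for k >= 2."""
    # Quadratic overall: each prefix is recomputed from scratch.
    # Prefixes of length 1 are skipped because stdev needs two data points.
    return [stdev(values[:k]) for k in range(2, len(values) + 1)]

print(expanding_stdev_naive(list(range(6))))
```

pandas reports NaN at position 0 for the same reason the sketch starts at k = 2: a single value has no sample standard deviation.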

If needed, you can easily slice off the first element and convert the result to a list:

^{pr2}$
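A minimal sketch of that slicing step, assuming pandas is imported as `pd`. The variable names `expanding_sd` and `sd_list` are illustrative and do not come from the answer.

```python
import pandas as pd

expanding_sd = pd.Series(range(20000)).expanding().std()

# Position 0 is NaN (one value has no sample standard deviation), so skip it
# and turn the remainder into a plain Python list.
sd_list = expanding_sd.iloc[1:].tolist()

print(sd_list[:3])
```

Using `.iloc[1:]` makes the slice explicitly positional, which avoids any ambiguity if the Series ever carries a non-default index.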

A Series is a more useful data type than a list, and this approach is also clearly faster:

In [159]: %timeit pd.Series(range(20000)).expanding().std()
1.07 ms ± 30.5 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

User

Answer 2 · edited 2024-04-26 09:29:33

You can keep running totals of the values and of their squares:

from math import sqrt

a = range(0,20000)

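def sdevs(a):
    # Expanding sample standard deviation of a[:1], a[:2], ..., a[:len(a)].
    # A single value has no sample standard deviation; 0 is a placeholder.
    sds = [0]
    n = 1
    sum_x = a[0]
    sum_x_squared = a[0]**2

    for x in a[1:]:
        # Update the running totals in O(1) per element.
        sum_x += x
        sum_x_squared += x**2
        n += 1
        # as noted by @Andrey Tyukin, statistics.stdev returns
        # the unbiased estimator, hence the n/(n-1)
        sd = sqrt(n/(n-1)*(sum_x_squared/n - (sum_x/n)**2))
        sds.append(sd)
    return sds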

sds = sdevs(a)
print(sds[10000])
# 2887.184355042123
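For reference, the expression inside `sqrt` in `sdevs` follows from expanding the sample variance. Here $x_1, \dots, x_n$ are the first $n$ values and $\bar{x}_n$ is their mean; this notation is introduced for illustration and does not appear in the answer.

```latex
s_n^2 = \frac{1}{n-1}\sum_{i=1}^{n}\left(x_i-\bar{x}_n\right)^2
      = \frac{n}{n-1}\left(\frac{1}{n}\sum_{i=1}^{n}x_i^2
        -\left(\frac{1}{n}\sum_{i=1}^{n}x_i\right)^2\right)
```

The two sums on the right are exactly `sum_x_squared` and `sum_x`, so nothing else has to be carried from one step to the next.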

On a 10-year-old computer, this takes 24 ms.
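One caveat with the sum-of-squares approach: with floats whose mean is large relative to their spread, subtracting two nearly equal numbers can lose precision. A commonly used alternative is Welford's online algorithm, sketched below. This is a swapped-in technique, not the method from the answer, and the name `sdevs_welford` is hypothetical.

```python
from math import sqrt

def sdevs_welford(values):
    """Expanding sample standard deviations via Welford's online update."""
    sds = []
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations from the current mean
    for n, x in enumerate(values, start=1):
        delta = x - mean
        mean += delta / n
        m2 += delta * (x - mean)
        # Mirror the answer's convention of reporting 0 for a single value.
        sds.append(sqrt(m2 / (n - 1)) if n > 1 else 0.0)
    return sds

sds = sdevs_welford(range(20000))
print(sds[10000])
```

It is still a single pass with constant work per element, so the cost should be in the same range as `sdevs`, though that has not been timed here.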
