Statistical bootstrap library in Python?

数据工程师

Asked 2025-04-17 17:48

Is there a statistical bootstrap library for Python?

I'm looking for functionality similar to what the R bootstrap package provides:

http://statistics.ats.ucla.edu/stat/r/library/bootstrap.htm

I searched around and found:

http://mjtokelly.blogspot.com/2006/04/bootstrap-statistics-in-python.html (the code at this link is broken, though)

http://adorio-research.org/wordpress/?p=9048

https://github.com/cgevans/scikits-bootstrap

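However, none of these seem to provide the full feature set (in particular, probability weights).

Any other suggestions?

This functionality was recently added to numpy.random.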

Thanks!
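For reference, weighted resampling through numpy.random can be sketched as below. This is a minimal illustration rather than a full bootstrap library: it assumes NumPy 1.17 or later (for `numpy.random.default_rng`), and the helper name `weighted_bootstrap_means`, the data, and the weights are all made up for the example.

```python
import numpy as np


def weighted_bootstrap_means(data, weights, n_resamples=1000, seed=0):
    """Return the mean of each weighted bootstrap resample of `data`."""
    data = np.asarray(data, dtype=float)
    p = np.asarray(weights, dtype=float)
    p = p / p.sum()  # Generator.choice expects probabilities that sum to 1
    rng = np.random.default_rng(seed)
    # Draw every resample at once: one row per resample, with replacement.
    resamples = rng.choice(data, size=(n_resamples, data.size), replace=True, p=p)
    return resamples.mean(axis=1)


data = [2.1, 3.4, 1.9, 5.6, 4.2, 3.8]   # hypothetical observations
weights = [1, 1, 2, 1, 1, 2]            # hypothetical sampling weights
means = weighted_bootstrap_means(data, weights)
low, high = np.percentile(means, [2.5, 97.5])
print(f"95% percentile interval for the weighted mean: [{low:.2f}, {high:.2f}]")
```

The percentile interval is only one of several bootstrap interval types; it is used here because it needs nothing beyond the resampled statistics.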


1 Answer

If all you want is a Python version of R's sample function, you can try this:

import bisect
import random
from collections.abc import Iterable  # collections.Iterable was removed in Python 3.10


def sample(xs, sample_size=None, replace=False, sample_probabilities=None):
    """Mimics the functionality of http://statistics.ats.ucla.edu/stat/r/library/bootstrap.htm sample()"""

    # An integer n means "sample from range(n)"; anything else is copied to a
    # list so that len() and indexing work for any iterable.
    if not isinstance(xs, Iterable):
        xs = range(xs)
    xs = list(xs)
    if sample_size is None:
        sample_size = len(xs)

    if sample_probabilities is None:
        if replace:
            return [random.choice(xs) for _ in range(sample_size)]
        else:
            return random.sample(xs, sample_size)

    weights = list(sample_probabilities)
    if len(weights) != len(xs):
        raise ValueError("sample_probabilities must have one weight per item")

    if replace:
        # Build the cumulative weights once, then binary-search each draw.
        total, cdf = 0, []
        for p in weights:
            total += p
            cdf.append(total)

        # random.uniform may return `total` itself, so clamp the index.
        last = len(xs) - 1
        return [xs[min(bisect.bisect(cdf, random.uniform(0, total)), last)]
                for _ in range(sample_size)]
    else:
        if sample_size > len(xs):
            raise ValueError("sample_size exceeds population size when replace=False")
        xps = list(zip(xs, weights))
        total = sum(weights)
        result = []
        for _ in range(sample_size):
            # choose an item based on weights, and remove it from future iterations.
            # this is slow (N^2), a tree structure for xps would be better (NlogN)
            target = random.uniform(0, total)
            current_total = 0
            index = len(xps) - 1  # fallback if rounding keeps the loop from breaking
            for i, (x, p) in enumerate(xps):
                current_total += p
                if current_total > target:
                    index = i
                    break
            x, p = xps.pop(index)
            result.append(x)
            total -= p
        return result
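The comment in the without-replacement branch notes that the loop is quadratic. One alternative, sketched here as an illustration and not as part of the answer's code, is the "exponential clocks" form of Efraimidis-Spirakis weighted sampling: each item gets an exponential arrival time whose rate is its weight, and the k earliest arrivals form the sample. This draws from the same distribution as removing one weighted item at a time, in O(N log k). The function name `weighted_sample_without_replacement` is hypothetical, and the sketch assumes non-negative weights.

```python
import heapq
import random


def weighted_sample_without_replacement(xs, weights, k, rng=random):
    """Draw k distinct items; heavier weights are more likely to be picked."""
    xs, weights = list(xs), list(weights)
    if len(xs) != len(weights):
        raise ValueError("need exactly one weight per item")
    # Items with zero weight can never be drawn, so drop them up front.
    candidates = [(x, w) for x, w in zip(xs, weights) if w > 0]
    if k > len(candidates):
        raise ValueError("k exceeds the number of items with positive weight")
    # Arrival time ~ Exponential(rate=w): a larger weight tends to arrive sooner.
    keyed = [(rng.expovariate(1.0) / w, i) for i, (_, w) in enumerate(candidates)]
    # The k smallest arrival times, in order, are the k draws.
    return [candidates[i][0] for _, i in heapq.nsmallest(k, keyed)]


print(weighted_sample_without_replacement(["a", "b", "c", "d"], [1, 2, 3, 4], k=2))
```

Passing a seeded `random.Random` instance as `rng` makes the draws reproducible.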

Answered 2025-04-17 by Python大师
