Python: Sampling from a discrete distribution defined by an n-dimensional array

Asked 2025-04-18 12:10

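Is there a function in Python that draws random samples from the discrete distribution defined by an n-dimensional numpy array and returns the index of each draw? If not, how would one define such a function?

For example: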

>>> probabilities = np.array([[.1, .2, .1], [.05, .5, .05]])  
>>> print(function(probabilities, draws=10))
 ([1,1],[0,2],[1,1],[1,0],[0,1],[0,1],[1,1],[0,0],[1,1],[0,1])

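I know this can be solved in many ways for 1-D arrays. However, I am working with large n-dimensional arrays and cannot afford to reshape them into something else just to draw a single sample.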


2 Answers

If your array is contiguous in memory, you can change its shape in place, which does not copy any data:

import numpy as np

probabilities = np.array([[.1, .2, .1], [.05, .5, .05]])
nrow, ncol = probabilities.shape
idx = np.arange(nrow * ncol)  # 1D index over all cells

probabilities.shape = (nrow * ncol,)  # in place, no copy; OK because the array is contiguous in memory

samples = np.random.choice(idx, 10, p=probabilities)  # sample in 1D
rowIndex = samples // ncol  # convert to 2D (row-major order): integer-divide by the number of columns
colIndex = samples % ncol

probabilities.shape = (nrow, ncol)  # restore the original 2D shape

>>> rowIndex
array([1, 0, 0, 0, 1, 1, 1, 1, 1, 0])
>>> colIndex
array([1, 1, 2, 0, 1, 1, 1, 1, 1, 1])

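Note that because your array is contiguous in memory, reshape also returns a view rather than a copy, so writing to the view changes the original array: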

In [53]:

view = probabilities.reshape( 6, -1 )
view[ 0 ] = 9
probabilities[ 0, 0 ]
Out[53]:
9.0
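
Whether reshape returns a view or a copy depends on the memory layout, so it can be worth checking before relying on it for a large array. The following is a minimal sketch of such a check using the array's flags attribute and np.shares_memory; the variable names are only illustrative.

```python
import numpy as np

probabilities = np.array([[.1, .2, .1], [.05, .5, .05]])

# A freshly created array is C-contiguous, so flattening it yields a view.
flat = probabilities.reshape(-1)
is_contiguous = probabilities.flags["C_CONTIGUOUS"]
flat_is_view = np.shares_memory(flat, probabilities)

# A transposed array is not C-contiguous, so reshape has to copy the data.
transposed = probabilities.T
flat_t = transposed.reshape(-1)
transposed_is_view = np.shares_memory(flat_t, transposed)

print(is_contiguous, flat_is_view, transposed_is_view)
```

If the last check reports a copy, the in-place shape trick above does not apply as is, and the extra memory for the flattened copy has to be budgeted for.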

Answered 2025-04-18 by Python大师


You can use np.unravel_index, which converts flat indices back into n-dimensional indices:

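import numpy as np

a = np.random.rand(3, 4, 5)
a /= a.sum()  # normalize so the entries sum to 1

def sample(a, n=1):
    a = np.asarray(a)
    choices = a.size  # total number of cells
    index = np.random.choice(choices, size=n, p=a.ravel())  # draw flat indices
    return np.unravel_index(index, a.shape)  # convert flat indices to n-d indices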

>>> sample(a, 4)
(array([2, 2, 0, 2]), array([0, 1, 3, 2]), array([2, 4, 2, 1]))

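The function returns a tuple of arrays, one per dimension of a, each with as many entries as the number of samples requested. If you would rather get a single array of shape (number of samples, number of dimensions), change the return statement to: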

return np.column_stack(np.unravel_index(index, a.shape))

Now:

>>> sample(a, 4)
array([[2, 0, 0],
       [2, 2, 4],
       [2, 0, 0],
       [1, 0, 4]])
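
As a sanity check, the same idea can be run on the 2D example from the question and the empirical cell frequencies compared with the probabilities. This is only a sketch under a few assumptions: the function name sample_indices, the rng parameter, and the seed are hypothetical choices made for illustration, and it uses NumPy's newer Generator interface (np.random.default_rng) instead of np.random.choice.

```python
import numpy as np

def sample_indices(p, n=1, rng=None):
    """Draw n index tuples from the discrete distribution stored in p."""
    p = np.asarray(p, dtype=float)
    rng = np.random.default_rng() if rng is None else rng
    flat = rng.choice(p.size, size=n, p=p.ravel())  # flat (row-major) indices
    return np.column_stack(np.unravel_index(flat, p.shape))  # shape (n, p.ndim)

probabilities = np.array([[.1, .2, .1], [.05, .5, .05]])
rng = np.random.default_rng(0)  # arbitrary seed, only for reproducibility
draws = sample_indices(probabilities, n=100_000, rng=rng)

# Count how often each cell was drawn; np.add.at handles repeated indices.
counts = np.zeros_like(probabilities)
np.add.at(counts, tuple(draws.T), 1)
freq = counts / len(draws)
print(np.round(freq, 2))
```

With this many draws the printed frequencies should be close to the entries of probabilities, which confirms that the flat indices are mapped back to the correct cells.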

Answered 2025-04-18 by Python大师

