Randomly distributing DNA sequence reads over genomic features with numpy

import numpy as np
import matplotlib.pyplot as plt

iterations = 1000  # number of times a read needs to be shuffled
featurelength = 1000  # length of the gene
a = np.zeros((iterations,featurelength))  # create a matrix with 1000 rows of the feature length
b = np.arange(iterations)  # a matrix with the number of iterations (0-999)
reads = np.random.randint(10,50,1000)  # a random dataset containing an array of DNA read lengths

1 answer

User

#1 · Posted 2024-06-02 06:47:42

Fancy indexing is a bit tricky here, but it can still be done:

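for i in reads:
    # one random start column per iteration; negative starts overhang the left edge
    r = np.random.randint(-i,featurelength-1,iterations)
    # (i, 1) offsets + (iterations,) starts -> (i, iterations) column indices, clamped to valid columns
    idx = np.clip(np.arange(i)[:,None]+r, 0, featurelength-1)
    # add 1 at every covered column of each row
    a[b,idx] += 1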

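To explain a little, we:

Create a simple index array as a column vector, running from 0 to i: np.arange(i)[:,None]
Add r (a row vector) to it; broadcasting produces a matrix of shape (i, iterations) holding the correct offsets into the columns of a.
Clamp the indices to the range [0, featurelength) with np.clip.
Finally, fancy-index a with each row (b) and its associated columns (idx).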
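
To make the shapes easier to see, here is a minimal sketch of the same steps for a single read on a tiny array. The sizes, the seeded `rng`, and the names `read_len`, `offsets`, and `a_acc` are assumptions chosen for illustration and are not part of the original code. It also shows one behaviour worth knowing: a fancy-indexed `+=` applies repeated indices only once, whereas `np.add.at` accumulates them.

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded generator, assumed here for reproducibility

iterations = 4       # small values so the arrays can be inspected by eye
featurelength = 10
read_len = 3         # one read length, standing in for one element of `reads`

a = np.zeros((iterations, featurelength))
b = np.arange(iterations)

# One random start column per iteration (row); negative starts overhang the left edge.
r = rng.integers(-read_len, featurelength - 1, size=iterations)

# Column vector (read_len, 1) + row vector (iterations,) -> (read_len, iterations)
offsets = np.arange(read_len)[:, None] + r

# Clamp out-of-range columns onto the first or last valid column.
idx = np.clip(offsets, 0, featurelength - 1)

# Buffered update: columns repeated by the clamp are incremented only once.
a[b, idx] += 1

# Unbuffered alternative: every index counts, so clamped overhang piles up at the edges.
a_acc = np.zeros((iterations, featurelength))
np.add.at(a_acc, (np.broadcast_to(b, idx.shape), idx), 1)

print(offsets.shape)
print(a)
print(a_acc)
```

Which of the two updates is appropriate depends on how overhanging reads should be counted. With `a[b, idx] += 1`, a read hanging off an edge contributes at most 1 to the edge column; with `np.add.at`, each overhanging base is added to that column.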
