Implementing a multiprocessor string-evaluation algorithm

    def evaluator(baseStr, listOfStr):
        scoreList = []
        for word in listOfStr:  # PARALLELIZE THIS
            scoreList += [evaluateTwoWords(baseStr, word)]

    def evaluateTwoWords(baseStr, otherStr):
        # SOME WORD-WISE COMPARISON
        i = 0
        j = 0
        while i < len(baseStr) and j < len(otherStr):
            ...
        return someScore

1 answer

User

Reply #1 · Posted 2024-04-25 01:57:35

For the code given above, yes: if each thread/worker on the GPU is assigned one string comparison as its task, a significant speedup can be achieved on a GPU.
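The one-task-per-worker idea can be tried on the CPU first. The following is a minimal sketch, not the original poster's method: the scoring function is a hypothetical stand-in (the question elides the real comparison), and `max_workers` is an assumed parameter. It uses threads to keep the example simple; for CPU-bound pure-Python scoring, `ProcessPoolExecutor` would be the likelier choice because of the GIL.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import partial


def evaluate_two_words(base_str, other_str):
    # Hypothetical stand-in score: number of positions where the
    # characters of the two strings match (the real comparison is elided
    # in the question).
    return sum(1 for a, b in zip(base_str, other_str) if a == b)


def evaluator(base_str, list_of_str, max_workers=4):
    # One independent task per word; map() returns scores in input order.
    score = partial(evaluate_two_words, base_str)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(score, list_of_str))


scores = evaluator("kitten", ["sitting", "mitten", "kitchen"])
print(scores)
```

Because each comparison reads only the base string and its own word, no locking is needed, which is the same property that makes the loop a candidate for a GPU.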

But GPUs have some limitations.

1) If the string list to be loaded into device memory is very large, a lot
   of system bandwidth is spent copying the list from host to device
   memory. This host-to-device transfer overhead is one of the biggest
   drawbacks of using a GPU.

2) A GPU is most effective on algorithms with strong SIMD (Single
   Instruction, Multiple Data) characteristics. See
   https://en.wikipedia.org/wiki/SIMD for more on SIMD. The further an
   algorithm deviates from SIMD, the greater the penalty on speedup.
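To get a feel for the second limitation, here is a rough sketch of how uneven string lengths can waste lockstep execution. It assumes a deliberately simplified model: threads run in groups of 32 (the warp size on NVIDIA GPUs), and each group stays busy for as long as its longest string takes. The function name and the model are illustrative assumptions, and memory effects are ignored.

```python
WARP_SIZE = 32  # threads that execute in lockstep on NVIDIA GPUs


def lockstep_utilization(lengths, warp_size=WARP_SIZE):
    """Fraction of lane-iterations doing useful work, under the simplified
    assumption that every warp runs as long as its longest string."""
    useful = sum(lengths)
    issued = 0
    for start in range(0, len(lengths), warp_size):
        warp = lengths[start:start + warp_size]
        issued += max(warp) * len(warp)
    return useful / issued if issued else 1.0


mixed = [4, 64] * 32  # short and long strings interleaved
print(lockstep_utilization(mixed))          # interleaved lengths
print(lockstep_utilization(sorted(mixed)))  # grouped by length
```

Under this model, grouping strings of similar length into the same warp raises utilization, which is one way to move a workload back toward the SIMD ideal.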

Below is a sample PyCUDA version of the program.

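I have used PyCUDA, but there are other OpenCL Python drivers that can do the job as well. Because of hardware limitations I have not tested the GPU code below, but I wrote it mostly from these examples: http://wiki.tiker.net/PyCuda/Examples.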

Here is what the code does.

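1) Copy the string list to GPU device memory

2) Copy the base string to device memory

3) Call the kernel function, which returns the per-string results

4) Finally, reduce the returned values using a sum or any reduce function of your choice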
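The four steps above can be walked through on the host in plain Python. This is only a CPU emulation under stated assumptions, not PyCUDA code: `pack_strings` and `kernel` are hypothetical names, the score is a stand-in for the elided comparison, and the loop over `tid` replaces the parallel kernel launch. It packs the list into one flat buffer plus offsets so that the list could be moved in a single copy.

```python
def pack_strings(words):
    """Flatten words into one bytes buffer plus offsets, so the whole list
    could be transferred in a single host-to-device copy."""
    encoded = [w.encode("utf-8") for w in words]
    offsets = [0]
    for e in encoded:
        offsets.append(offsets[-1] + len(e))
    return b"".join(encoded), offsets


def kernel(thread_id, base, flat, offsets, out):
    """What one GPU thread would do: score word `thread_id` against `base`."""
    start, end = offsets[thread_id], offsets[thread_id + 1]
    word = flat[start:end]
    # Hypothetical score: count of matching byte positions.
    out[thread_id] = sum(1 for a, b in zip(base, word) if a == b)


words = ["sitting", "mitten", "kitchen"]
flat, offsets = pack_strings(words)   # step 1: string list as one buffer
base = "kitten".encode("utf-8")       # step 2: base string
out = [0] * len(words)
for tid in range(len(words)):         # step 3: serial stand-in for the launch
    kernel(tid, base, flat, offsets, out)
total = sum(out)                      # step 4: reduce the per-thread results
print(out, total)
```

Each call to `kernel` writes only its own slot of `out`, so the calls could run in any order or all at once.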

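The code below is a perfect example of SIMD, in which one thread's result is independent of every other thread's result. But this is an ideal case. You may need to decide whether a given algorithm is a good candidate for the GPU.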

[PyCUDA code listing not preserved in this copy]

Hope this helps!
