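PyCUDA: Querying device status (memory specifically)
The PyCUDA documentation touches on the driver interface, but I'm a bit lost and can't see how to get information such as 'SHARED_SIZE_BYTES' from my code.
Can anyone point me to some examples of querying the device this way?
Is it possible, and if so how, to check the device state (for example between malloc/memcpy calls and kernel launches) in order to implement some machine-dynamic operations? (I'd like to handle devices that support multiple kernels in a "friendly" way.)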
1 Answer
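For anyone else who comes across this: half an hour spent reading the CUDA API and the PyCUDA documentation side by side works wonders. It's a lot simpler than my first experiments suggested.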
Runtime kernel information
A simple example:
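def meminfo(kernel):
    # Per-kernel resource usage, as reported by the driver for a compiled function
    shared = kernel.shared_size_bytes      # static shared memory, bytes per block
    regs = kernel.num_regs                 # registers per thread
    local = kernel.local_size_bytes        # local memory, bytes per thread
    const = kernel.const_size_bytes        # constant memory, bytes
    mbpt = kernel.max_threads_per_block    # launch limit for this particular kernel
    print("=MEM=\nLocal:%d,\nShared:%d,\nRegisters:%d,\nConst:%d,\nMax Threads/B:%d" % (local, shared, regs, const, mbpt))

...
kernel = mod.get_function("foo")
meminfo(kernel)
...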
Sample output
=MEM=
Local:24,
Shared:64,
Registers:18,
Const:0,
Max Threads/B:512
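One way to act on these numbers is to derive the launch configuration from the kernel's own limit instead of hard-coding it. The following is only a sketch: launch_config, preferred_block and fake_kernel are names made up for illustration, and the warp size of 32 is an assumption that matches the device listing in this answer. A stand-in object replaces the real kernel so that the snippet runs without a GPU.

```python
from types import SimpleNamespace


def launch_config(kernel, n_items, preferred_block=256, warp_size=32):
    """Return (block, grid) tuples for a 1-D launch over n_items elements.

    The block size is capped by kernel.max_threads_per_block and, when it is
    at least one warp, rounded down to a whole number of warps.
    """
    limit = kernel.max_threads_per_block
    threads = min(preferred_block, limit)
    if threads >= warp_size:
        threads -= threads % warp_size
    blocks = (n_items + threads - 1) // threads  # ceiling division
    return (threads, 1, 1), (blocks, 1)


# Hypothetical stand-in so the sketch runs anywhere; with PyCUDA you would
# pass the object returned by mod.get_function("foo") instead.
fake_kernel = SimpleNamespace(max_threads_per_block=512)
block, grid = launch_config(fake_kernel, n_items=1000)
print(block, grid)  # (256, 1, 1) (4, 1)
```

With a real kernel the two tuples would go to the block= and grid= keyword arguments of the kernel call.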
Static device information
A simple example:
import pycuda.autoinit  # creates a CUDA context on the default device
import pycuda.driver as cuda

# Free and total global memory (in bytes) for the current context's device
(free, total) = cuda.mem_get_info()
print("Global memory occupancy:%f%% free" % (free * 100 / total))

for devicenum in range(cuda.Device.count()):
    device = cuda.Device(devicenum)
    attrs = device.get_attributes()

    # Beyond this point is just pretty printing
    print("\n===Attributes for device %d" % devicenum)
    for (key, value) in attrs.items():  # items() on Python 3 (was iteritems() on Python 2)
        print("%s:%s" % (str(key), str(value)))
Sample output
Global memory occupancy:70.000000% free
===Attributes for device 0
MAX_THREADS_PER_BLOCK:512
MAX_BLOCK_DIM_X:512
MAX_BLOCK_DIM_Y:512
MAX_BLOCK_DIM_Z:64
MAX_GRID_DIM_X:65535
MAX_GRID_DIM_Y:65535
MAX_GRID_DIM_Z:1
MAX_SHARED_MEMORY_PER_BLOCK:16384
TOTAL_CONSTANT_MEMORY:65536
WARP_SIZE:32
MAX_PITCH:2147483647
MAX_REGISTERS_PER_BLOCK:8192
CLOCK_RATE:1500000
TEXTURE_ALIGNMENT:256
GPU_OVERLAP:1
MULTIPROCESSOR_COUNT:14
KERNEL_EXEC_TIMEOUT:1
INTEGRATED:0
CAN_MAP_HOST_MEMORY:1
COMPUTE_MODE:DEFAULT
MAXIMUM_TEXTURE1D_WIDTH:8192
MAXIMUM_TEXTURE2D_WIDTH:65536
MAXIMUM_TEXTURE2D_HEIGHT:32768
MAXIMUM_TEXTURE3D_WIDTH:2048
MAXIMUM_TEXTURE3D_HEIGHT:2048
MAXIMUM_TEXTURE3D_DEPTH:2048
MAXIMUM_TEXTURE2D_ARRAY_WIDTH:8192
MAXIMUM_TEXTURE2D_ARRAY_HEIGHT:8192
MAXIMUM_TEXTURE2D_ARRAY_NUMSLICES:512
SURFACE_ALIGNMENT:256
CONCURRENT_KERNELS:0
ECC_ENABLED:0
PCI_BUS_ID:1
PCI_DEVICE_ID:0
TCC_DRIVER:0
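For the "between malloc/memcpy and kernel launch" part of the question, the same calls can be wrapped in a small check that runs before each allocation. This is a rough sketch rather than a recommended pattern: fits_on_device, query_device and the 64 MiB reserve are illustrative assumptions, while mem_get_info, get_attribute and device_attribute.MAX_SHARED_MEMORY_PER_BLOCK are the PyCUDA calls behind the listing above. PyCUDA is imported inside query_device so that the pure helper still works on a machine without a GPU.

```python
def fits_on_device(nbytes, free_bytes, reserve_bytes=64 * 1024 ** 2):
    """True if an allocation of nbytes leaves at least reserve_bytes free."""
    return nbytes + reserve_bytes <= free_bytes


def query_device():
    """Return (free, total, max_shared_per_block) for the autoinit device.

    Needs PyCUDA and a CUDA-capable GPU, hence the lazy imports.
    """
    import pycuda.autoinit  # creates the context and exposes .device
    import pycuda.driver as cuda

    free, total = cuda.mem_get_info()
    shared = pycuda.autoinit.device.get_attribute(
        cuda.device_attribute.MAX_SHARED_MEMORY_PER_BLOCK)
    return free, total, shared


# Illustrative numbers only: a 256 MiB buffer against 1 GiB of free memory.
wanted = 256 * 1024 ** 2
print(fits_on_device(wanted, free_bytes=1024 ** 3))
```

On a machine with a GPU, calling query_device() right before cuda.mem_alloc and feeding its first value into fits_on_device gives a cheap guard against over-allocating. Keep in mind that the free figure is a snapshot and may change if other processes share the device.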