tritonclient.utils.cuda_shared_memory#
Functions
| Function | Description |
|---|---|
| `allocated_shared_memory_regions` | Return all cuda shared memory regions that were allocated but not freed. |
| `create_shared_memory_region` | Creates a shared memory region with the specified name and size. |
| `destroy_shared_memory_region` | Close a cuda shared memory region with the specified handle. |
| `get_contents_as_numpy` | Generates a numpy array using the data stored in the cuda shared memory region specified with the handle. |
| `get_raw_handle` | Returns the underlying raw serialized cudaIPC handle in base64 encoding. |
| `set_shared_memory_region` | Copy the contents of the numpy array into the cuda shared memory region. |
- tritonclient.utils.cuda_shared_memory._get_or_create_global_cuda_stream(device_id)#
- tritonclient.utils.cuda_shared_memory._is_device_supported(device: DLDevice)#
- tritonclient.utils.cuda_shared_memory._support_uva(shm_device_id, ext_device_id)#
- tritonclient.utils.cuda_shared_memory.allocated_shared_memory_regions()#
Return all cuda shared memory regions that were allocated but not freed.
- Returns:
The list of cuda shared memory handles corresponding to the allocated regions.
- Return type:
list
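A minimal sketch of using this call as a leak check, e.g. at shutdown. The import and cleanup are wrapped in a broad `except` because both require the tritonclient package and a CUDA-capable machine, which may not be present:

```python
# Hedged sketch: list regions that were allocated but never freed, then
# free them. Skipped gracefully when tritonclient / CUDA is unavailable.
try:
    import tritonclient.utils.cuda_shared_memory as cudashm

    leaked = cudashm.allocated_shared_memory_regions()
    # Each entry is a CudaSharedMemoryRegion handle for a region that was
    # created but never passed to destroy_shared_memory_region().
    for handle in leaked:
        cudashm.destroy_shared_memory_region(handle)
except Exception:
    leaked = []  # tritonclient or a GPU is not available here

print(len(leaked))
```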
- tritonclient.utils.cuda_shared_memory.as_shared_memory_tensor(cuda_shm_handle, datatype, shape)#
- tritonclient.utils.cuda_shared_memory.create_shared_memory_region(triton_shm_name, byte_size, device_id)#
Creates a shared memory region with the specified name and size.
- Parameters:
triton_shm_name (str) – The unique name of the cuda shared memory region to be created.
byte_size (int) – The size in bytes of the cuda shared memory region to be created.
device_id (int) – The GPU device ID of the cuda shared memory region to be created.
- Returns:
cuda_shm_handle – The handle for the cuda shared memory region.
- Return type:
CudaSharedMemoryRegion
- Raises:
CudaSharedMemoryException – If unable to create the cuda shared memory region on the specified device.
- tritonclient.utils.cuda_shared_memory.destroy_shared_memory_region(cuda_shm_handle)#
Close a cuda shared memory region with the specified handle.
- Parameters:
cuda_shm_handle (CudaSharedMemoryRegion) – The handle for the cuda shared memory region.
- Raises:
CudaSharedMemoryException – If unable to close the cuda shared memory region and free the device memory.
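The two calls above form a create/destroy lifecycle. A minimal sketch, assuming a region named `"example_region"` sized for 1000 float32 values on device 0 (all arbitrary illustrative choices); the CUDA calls are guarded so the sketch degrades gracefully without a GPU:

```python
import numpy as np

# Size the region for the tensor we intend to share.
byte_size = np.zeros(1000, dtype=np.float32).nbytes  # 4000 bytes

try:
    import tritonclient.utils.cuda_shared_memory as cudashm

    handle = cudashm.create_shared_memory_region(
        "example_region", byte_size, device_id=0
    )
    # ... use the region, e.g. via set_shared_memory_region() ...
    cudashm.destroy_shared_memory_region(handle)  # frees the device memory
except Exception:
    pass  # tritonclient or a GPU is not available here

print(byte_size)
```

Every successful `create_shared_memory_region` should be paired with a `destroy_shared_memory_region`; otherwise the region shows up in `allocated_shared_memory_regions()`.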
- tritonclient.utils.cuda_shared_memory.get_contents_as_numpy(cuda_shm_handle, datatype, shape)#
Generates a numpy array using the data stored in the cuda shared memory region specified with the handle.
- Parameters:
cuda_shm_handle (CudaSharedMemoryRegion) – The handle for the cuda shared memory region.
datatype (np.dtype) – The datatype of the array to be returned.
shape (list) – The list of int describing the shape of the array to be returned.
- Returns:
The numpy array generated using contents from the specified shared memory region.
- Return type:
np.ndarray
- tritonclient.utils.cuda_shared_memory.get_raw_handle(cuda_shm_handle)#
Returns the underlying raw serialized cudaIPC handle in base64 encoding.
- Parameters:
cuda_shm_handle (CudaSharedMemoryRegion) – The handle for the cuda shared memory region.
- Returns:
The raw serialized cudaIPC handle of underlying cuda shared memory in base64 encoding.
- Return type:
bytes
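The raw handle is what a Triton server needs in order to map the region from another process. A hedged sketch pairing it with `register_cuda_shared_memory` on the HTTP client; the region name, size, and server URL are assumptions for illustration:

```python
# Hedged sketch: export the base64-encoded cudaIPC handle and register it
# with a Triton server. Requires tritonclient, a GPU, and a running server
# at localhost:8000 (all assumed), so everything is guarded.
try:
    import tritonclient.http as httpclient
    import tritonclient.utils.cuda_shared_memory as cudashm

    handle = cudashm.create_shared_memory_region("output_region", 4096, 0)
    raw_handle = cudashm.get_raw_handle(handle)  # base64-encoded bytes

    client = httpclient.InferenceServerClient(url="localhost:8000")
    client.register_cuda_shared_memory(
        "output_region", raw_handle, device_id=0, byte_size=4096
    )
except Exception:
    raw_handle = None  # environment lacks tritonclient / GPU / server
```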
- tritonclient.utils.cuda_shared_memory.set_shared_memory_region(cuda_shm_handle, input_values)#
Copy the contents of the numpy array into the cuda shared memory region.
- Parameters:
cuda_shm_handle (CudaSharedMemoryRegion) – The handle for the cuda shared memory region.
input_values (list) – The list of numpy arrays to be copied into the shared memory region.
- Raises:
CudaSharedMemoryException – If unable to set values in the cuda shared memory region.
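A round-trip sketch combining `set_shared_memory_region` with `get_contents_as_numpy`: write a numpy array into the region, then read it back through the handle. The region name is an arbitrary choice, and the CUDA path is guarded so the sketch still runs without a GPU:

```python
import numpy as np

arr = np.arange(8, dtype=np.float32)

try:
    import tritonclient.utils.cuda_shared_memory as cudashm

    handle = cudashm.create_shared_memory_region("roundtrip", arr.nbytes, 0)
    cudashm.set_shared_memory_region(handle, [arr])  # takes a list of arrays
    out = cudashm.get_contents_as_numpy(handle, np.float32, [8])
    cudashm.destroy_shared_memory_region(handle)
except Exception:
    out = arr  # fall back so the sketch still runs without a GPU

print(out.tolist())
```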
- tritonclient.utils.cuda_shared_memory.set_shared_memory_region_from_dlpack(cuda_shm_handle, input_values)#