CUDA Wrapper
Module: polygraphy.cuda
- class MemcpyKind
Bases:
object
Enumerates different kinds of copy operations.
- HostToHost = c_int(0)
Copies from host memory to host memory
- HostToDevice = c_int(1)
Copies from host memory to device memory
- DeviceToHost = c_int(2)
Copies from device memory to host memory
- DeviceToDevice = c_int(3)
Copies from device memory to device memory
- class Cuda
Bases:
object
NOTE: Do not construct this class manually. Instead, use the wrapper() function to get the global wrapper.
Wrapper that exposes low-level CUDA functionality.
- malloc(nbytes)
Allocates memory on the GPU.
- Parameters:
nbytes (int) – The number of bytes to allocate.
- Returns:
The memory address of the allocated region, i.e. a device pointer.
- Return type:
int
- Raises:
PolygraphyException – If an error was encountered during the allocation.
- free(ptr)
Frees memory allocated on the GPU.
- Parameters:
ptr (int) – The memory address, i.e. a device pointer.
- Raises:
PolygraphyException – If an error was encountered during the free.
- memcpy(dst, src, nbytes, kind, stream_ptr=None)
Copies data between host and device memory.
- Parameters:
dst (int) – The memory address of the destination, i.e. a pointer.
src (int) – The memory address of the source, i.e. a pointer.
nbytes (int) – The number of bytes to copy.
kind (MemcpyKind) – The kind of copy to perform.
stream_ptr (int) – The memory address of a CUDA stream, i.e. a pointer. If this is not provided, a synchronous copy is performed.
- Raises:
PolygraphyException – If an error was encountered during the copy.
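The malloc, memcpy and free calls above can be combined into a host-to-device-to-host round trip. The following is a minimal sketch, not code from the library: the helper name roundtrip_bytes and its cuda_mod parameter are hypothetical, and host pointers are taken from ctypes buffers as one possible way to obtain integer addresses. Running it with the default argument assumes Polygraphy is installed and a CUDA-capable GPU is available.

```python
import ctypes


def roundtrip_bytes(payload, cuda_mod=None):
    """Copy `payload` host -> device -> host and return the bytes read back.

    `cuda_mod` is a hypothetical injection point for this sketch; by default
    the real polygraphy.cuda module is used.
    """
    if cuda_mod is None:
        # Requires Polygraphy and a CUDA-capable GPU.
        from polygraphy import cuda as cuda_mod

    cuda = cuda_mod.wrapper()  # the global wrapper; do not construct Cuda directly
    nbytes = len(payload)

    # Host-side staging buffers. ctypes.addressof yields the integer
    # addresses that memcpy expects for dst and src.
    src = ctypes.create_string_buffer(payload)
    dst = ctypes.create_string_buffer(nbytes)

    dev_ptr = cuda.malloc(nbytes)
    try:
        # No stream_ptr is passed, so both copies are synchronous.
        cuda.memcpy(dev_ptr, ctypes.addressof(src), nbytes,
                    cuda_mod.MemcpyKind.HostToDevice)
        cuda.memcpy(ctypes.addressof(dst), dev_ptr, nbytes,
                    cuda_mod.MemcpyKind.DeviceToHost)
    finally:
        # Release the device allocation even if a copy raised.
        cuda.free(dev_ptr)
    return dst.raw
```

Wrapping the copies in try/finally keeps the device allocation from leaking when a copy raises PolygraphyException.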
- wrapper()
Returns the global Polygraphy CUDA wrapper.
- Returns:
The global CUDA wrapper.
- Return type:
Cuda
- class Stream
Bases:
object
High-level wrapper for a CUDA stream.
- ptr
The memory address of the underlying CUDA stream
- Type:
int
- class DeviceView(ptr, shape, dtype)
Bases:
object
A read-only view of a GPU memory region.
- Parameters:
ptr (int) – A pointer to the region of memory.
shape (Tuple[int]) – The shape of the region.
dtype (DataType) – The data type of the region.
- ptr
The memory address of the underlying GPU memory
- Type:
int
- shape
The shape of the device buffer
- Type:
Tuple[int]
- property nbytes
The number of bytes in the memory region.
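As a rough illustration of how the byte count relates to shape and dtype, the sketch below assumes that nbytes equals the number of elements multiplied by the per-element size. That relationship is an assumption rather than something stated on this page, and nbytes_for is a hypothetical helper, not a library function.

```python
import math


def nbytes_for(shape, itemsize):
    """Bytes spanned by a region of `shape` with `itemsize` bytes per element."""
    return math.prod(shape) * itemsize


print(nbytes_for((4, 4), 4))  # 4-byte elements (e.g. float32): 64
print(nbytes_for((4, 4), 1))  # 1-byte elements: 16
```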
- copy_to(host_buffer, stream=None)
Copies from this device buffer to the provided host buffer.
- Parameters:
host_buffer (Union[numpy.ndarray, torch.Tensor]) – The host buffer to copy into. The buffer must be contiguous in memory (see np.ascontiguousarray or torch.Tensor.contiguous) and large enough to accommodate the device buffer.
stream (Stream) – A Stream instance. Performs a synchronous copy if no stream is provided.
- Returns:
The host buffer
- Return type:
np.ndarray
- class DeviceArray(shape=None, dtype=None)[source]
Bases:
DeviceView
An array on the GPU.
- Parameters:
shape (Tuple[int]) – The initial shape of the buffer.
dtype (DataType) – The data type of the buffer.
- copy_to(host_buffer, stream=None)
Copies from this device buffer to the provided host buffer.
- Parameters:
host_buffer (Union[numpy.ndarray, torch.Tensor]) – The host buffer to copy into. The buffer must be contiguous in memory (see np.ascontiguousarray or torch.Tensor.contiguous) and large enough to accommodate the device buffer.
stream (Stream) – A Stream instance. Performs a synchronous copy if no stream is provided.
- Returns:
The host buffer
- Return type:
np.ndarray
- property dtype
- property nbytes
The number of bytes in the memory region.
- numpy()
Create a new NumPy array containing the contents of this device buffer.
- Returns:
The newly created NumPy array.
- Return type:
np.ndarray
- ptr
The memory address of the underlying GPU memory
- Type:
int
- shape
The shape of the device buffer
- Type:
Tuple[int]
- static raw(shape=None)
Creates an untyped device array of the specified shape.
- Parameters:
shape (Tuple[int]) – The initial shape of the buffer, in units of bytes. For example, a shape of (4, 4) would allocate a 16-byte array.
- Returns:
The raw device array.
- Return type:
DeviceArray
- resize(shape)
Resizes or reshapes the array to the specified shape.
If the allocated memory region is already large enough, no reallocation is performed.
- Parameters:
shape (Tuple[int]) – The new shape.
- Returns:
self
- Return type:
DeviceArray
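The "no reallocation if already large enough" behavior described for resize can be modeled on the host. The class below is a hypothetical stand-in written for illustration only; it is not the library's implementation and performs no GPU work. It tracks how many bytes are reserved and counts how often a new allocation would be needed.

```python
import math


class GrowOnlyBuffer:
    """Host-side model of a buffer that reallocates only when it must grow."""

    def __init__(self, itemsize):
        self.itemsize = itemsize
        self.shape = ()
        self.allocated_nbytes = 0
        self.realloc_count = 0

    def resize(self, shape):
        needed = math.prod(shape) * self.itemsize
        if needed > self.allocated_nbytes:
            # A real device array would allocate a larger region at this point.
            self.allocated_nbytes = needed
            self.realloc_count += 1
        # Otherwise only the logical shape changes; the region is reused.
        self.shape = tuple(shape)
        return self  # mirrors the documented "Returns: self"


buf = GrowOnlyBuffer(itemsize=4)
buf.resize((4, 4))  # needs 64 bytes: allocates
buf.resize((2, 8))  # still 64 bytes: reshape only
buf.resize((2, 2))  # needs 16 bytes: region reused
buf.resize((8, 8))  # needs 256 bytes: allocates again
print(buf.realloc_count, buf.allocated_nbytes)  # 2 256
```

Because resize returns self, calls can be chained, for example buf.resize((2, 2)).shape.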
- free()
Frees the GPU memory associated with this array.
You can also use a context manager to ensure that memory is freed. For example:
    with DeviceArray(...) as arr:
        ...
- copy_from(host_buffer, stream=None)
Copies from the provided host buffer into this device buffer.
- Parameters:
host_buffer (Union[numpy.ndarray, torch.Tensor]) – The host buffer to copy from. The buffer must be contiguous in memory (see np.ascontiguousarray or torch.Tensor.contiguous) and not larger than this device buffer.
stream (Stream) – A Stream instance. Performs a synchronous copy if no stream is provided.
- Returns:
self
- Return type:
DeviceArray
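A typical use of copy_from together with the context manager and numpy() might look like the sketch below. The helper name upload_and_read_back and its device_array_cls parameter are hypothetical. Passing the host array's NumPy dtype directly as the DataType argument is an assumption about what the constructor accepts. With the default argument, Polygraphy, NumPy and a CUDA-capable GPU are required.

```python
def upload_and_read_back(host_array, device_array_cls=None):
    """Copy a contiguous host array to the GPU and return a new host copy.

    `device_array_cls` is a hypothetical injection point for this sketch;
    by default polygraphy.cuda.DeviceArray is used.
    """
    if device_array_cls is None:
        # Requires Polygraphy and a CUDA-capable GPU.
        from polygraphy.cuda import DeviceArray as device_array_cls

    # The context manager frees the GPU memory on exit, even if a copy raises.
    with device_array_cls(shape=host_array.shape, dtype=host_array.dtype) as arr:
        arr.copy_from(host_array)  # synchronous, since no stream is passed
        return arr.numpy()         # newly created array with the device contents
```

Since the host buffer must be contiguous, a caller holding a sliced or transposed array would pass np.ascontiguousarray(x) rather than x itself.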
- view(shape=None, dtype=None)
Creates a read-only DeviceView from this DeviceArray.
- Parameters:
shape (Sequence[int]) – The desired shape of the view. Defaults to the shape of this array or view.
dtype (DataType) – The desired data type of the view. Defaults to the data type of this array or view.
- Returns:
A view of this array's data on the device.
- Return type:
DeviceView