CUDA Wrapper

Module: polygraphy.cuda

class MemcpyKind

Bases: object

Enumerates different kinds of copy operations.

HostToHost = c_int(0): Copies from host memory to host memory

HostToDevice = c_int(1): Copies from host memory to device memory

DeviceToHost = c_int(2): Copies from device memory to host memory

DeviceToDevice = c_int(3): Copies from device memory to device memory
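The members above are listed as `c_int` values. As a minimal sketch, assuming they are plain `ctypes.c_int` objects, the following shows how such a value can be mapped back to its documented name. `KIND_NAMES` and `kind_name` are hypothetical helpers for illustration, not part of Polygraphy.

```python
import ctypes

# Hypothetical lookup table that mirrors the values listed above.
KIND_NAMES = {
    0: "HostToHost",
    1: "HostToDevice",
    2: "DeviceToHost",
    3: "DeviceToDevice",
}


def kind_name(kind: ctypes.c_int) -> str:
    """Return the documented name for a c_int copy kind."""
    # ctypes integers compare by identity, so inspect the wrapped value instead.
    return KIND_NAMES[kind.value]


print(kind_name(ctypes.c_int(2)))                       # DeviceToHost
print(ctypes.c_int(1) == ctypes.c_int(1))               # False: two distinct objects
print(ctypes.c_int(1).value == ctypes.c_int(1).value)   # True
```

Comparing through `.value` avoids the identity comparison that `ctypes` objects fall back to.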

class Cuda

Bases: object

Wrapper that exposes low-level CUDA functionality.

NOTE: Do not construct this class manually. Instead, use the wrapper() function to get the global wrapper.

malloc(nbytes)

Allocates memory on the GPU.

Parameters: nbytes (int) – The number of bytes to allocate.
Returns: The memory address of the allocated region, i.e. a device pointer.
Return type: int
Raises: PolygraphyException – If an error was encountered during the allocation.

free(ptr)

Frees memory allocated on the GPU.

Parameters: ptr (int) – The memory address, i.e. a device pointer.
Raises: PolygraphyException – If an error was encountered during the free.

memcpy(dst, src, nbytes, kind, stream_ptr=None)

Copies nbytes bytes from src to dst, where kind states whether each side is host or device memory.

Parameters

dst (int) – The memory address of the destination, i.e. a pointer.
src (int) – The memory address of the source, i.e. a pointer.
nbytes (int) – The number of bytes to copy.
kind (MemcpyKind) – The kind of copy to perform.
stream_ptr (int) – The memory address of a CUDA stream, i.e. a pointer. If this is not provided, a synchronous copy is performed.

Raises

PolygraphyException – If an error was encountered during the copy.
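A minimal sketch of how malloc, memcpy, and free might be combined for a host-to-device-to-host round trip. `roundtrip_bytes` is a hypothetical helper; it assumes that host addresses obtained with `ctypes.addressof` can be passed as the integer `dst` and `src` arguments, and it needs Polygraphy plus a CUDA-capable GPU to run.

```python
import ctypes


def roundtrip_bytes(payload: bytes) -> bytes:
    """Copy payload to the GPU and back using the low-level wrapper."""
    # Imported lazily so this module still loads without Polygraphy or a GPU.
    from polygraphy import cuda

    nbytes = len(payload)
    src = (ctypes.c_char * nbytes).from_buffer_copy(payload)  # host source
    dst = (ctypes.c_char * nbytes)()                          # host destination

    wrapper = cuda.wrapper()
    dev_ptr = wrapper.malloc(nbytes)
    try:
        # No stream_ptr is given, so both copies are synchronous.
        wrapper.memcpy(dev_ptr, ctypes.addressof(src), nbytes,
                       cuda.MemcpyKind.HostToDevice)
        wrapper.memcpy(ctypes.addressof(dst), dev_ptr, nbytes,
                       cuda.MemcpyKind.DeviceToHost)
    finally:
        # Release the device memory even if a copy raised.
        wrapper.free(dev_ptr)
    return dst.raw
```

On a machine with a working CUDA setup, `roundtrip_bytes(b"abcd")` would be expected to return the same four bytes.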

wrapper()

Returns the global Polygraphy CUDA wrapper.

Returns: The global CUDA wrapper.
Return type: Cuda

class Stream

Bases: object

High-level wrapper for a CUDA stream.

ptr

The memory address of the underlying CUDA stream

Type: int

__exit__(exc_type, exc_value, traceback): Frees the underlying CUDA stream.

free()

Frees the underlying CUDA stream.

You can also use a context manager to manage the stream lifetime. For example:

with Stream() as stream:
    ...

synchronize(): Synchronizes the stream, blocking until all work queued on it has completed.

class DeviceView(ptr, shape, dtype)

Bases: object

A read-only view of a GPU memory region.

Parameters

ptr (int) – A pointer to the region of memory.
shape (Tuple[int]) – The shape of the region.
dtype (numpy.dtype) – The data type of the region.

ptr

The memory address of the underlying GPU memory

Type: int

shape

The shape of the device buffer

Type: Tuple[int]

property dtype

The data type of the device buffer

Type: np.dtype

property nbytes: The number of bytes in the memory region.

copy_to(host_buffer, stream=None)

Copies from this device buffer to the provided host buffer.

Parameters

host_buffer (numpy.ndarray) – The host buffer to copy into. The buffer must be contiguous in memory (see np.ascontiguousarray) and large enough to accommodate the device buffer.
stream (Stream) – A Stream instance. Performs a synchronous copy if no stream is provided.

Returns

The host buffer

Return type

np.ndarray

numpy()

Creates a new NumPy array containing the contents of this device buffer.

Returns: The newly created NumPy array.
Return type: np.ndarray

class DeviceArray(shape=None, dtype=None)

Bases: polygraphy.cuda.cuda.DeviceView

An array on the GPU.

Parameters

shape (Tuple[int]) – The initial shape of the buffer.
dtype (numpy.dtype) – The data type of the buffer.

copy_to(host_buffer, stream=None)

Copies from this device buffer to the provided host buffer.

Parameters

host_buffer (numpy.ndarray) – The host buffer to copy into. The buffer must be contiguous in memory (see np.ascontiguousarray) and large enough to accommodate the device buffer.
stream (Stream) – A Stream instance. Performs a synchronous copy if no stream is provided.

Returns

The host buffer

Return type

np.ndarray

property nbytes: The number of bytes in the memory region.

numpy()

Creates a new NumPy array containing the contents of this device buffer.

Returns: The newly created NumPy array.
Return type: np.ndarray

ptr

The memory address of the underlying GPU memory

Type: int

shape

The shape of the device buffer

Type: Tuple[int]

static raw(shape)

Creates an untyped device array of the specified shape.

Parameters: shape (Tuple[int]) – The initial shape of the buffer, in units of bytes. For example, a shape of (4, 4) would allocate a 16-byte array.
Returns: The raw device array.
Return type: DeviceArray
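The arithmetic behind the (4, 4) example can be sketched as follows. This assumes that the size of a region is the product of its shape multiplied by the size of one element, with one byte per element for a raw array. `region_nbytes` is a hypothetical helper, not a Polygraphy function.

```python
import math
from typing import Sequence


def region_nbytes(shape: Sequence[int], itemsize: int = 1) -> int:
    """Bytes needed for a dense region: element count times bytes per element."""
    return math.prod(shape) * itemsize


print(region_nbytes((4, 4)))              # 16, raw buffer: one byte per element
print(region_nbytes((4, 4), itemsize=4))  # 64, e.g. a 4-byte type such as float32
```

Under the same assumption, a typed DeviceArray of shape (4, 4) with a 4-byte dtype would report 64 for nbytes.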

resize(shape)

Resizes or reshapes the array to the specified shape.

If the allocated memory region is already large enough, no reallocation is performed.

Parameters: shape (Tuple[int]) – The new shape.
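The reallocation rule described above can be illustrated with a host-side stand-in. `GrowOnlyBuffer` is a hypothetical class that uses a `bytearray` in place of device memory; it is a sketch of the policy only, not Polygraphy's implementation, and it makes no claim about whether existing contents survive a reallocation.

```python
import math
from typing import Sequence, Tuple


class GrowOnlyBuffer:
    """Host-side stand-in for a buffer that reallocates only when it must grow."""

    def __init__(self) -> None:
        self._storage = bytearray()   # plays the role of the device allocation
        self.shape: Tuple[int, ...] = ()
        self.reallocations = 0

    def resize(self, shape: Sequence[int], itemsize: int = 1) -> None:
        needed = math.prod(shape) * itemsize
        if needed > len(self._storage):
            # The existing region is too small: allocate a new, larger one.
            self._storage = bytearray(needed)
            self.reallocations += 1
        # Otherwise keep the allocation and only update the logical shape.
        self.shape = tuple(shape)


buf = GrowOnlyBuffer()
buf.resize((4, 4))   # grows from 0 to 16 bytes
buf.resize((2, 8))   # same volume, reshaped in place
buf.resize((2, 2))   # smaller, still fits
buf.resize((8, 8))   # larger, reallocates
print(buf.reallocations)  # 2
```

Only the first and last calls allocate; the two in between reuse the 16-byte region.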

__exit__(exc_type, exc_value, traceback): Frees the underlying memory of this DeviceArray.

free()

Frees the GPU memory associated with this array.

You can also use a context manager to ensure that memory is freed. For example:

with DeviceArray(...) as arr:
    ...

copy_from(host_buffer, stream=None)

Copies from the provided host buffer into this device buffer.

Parameters

host_buffer (numpy.ndarray) – The host buffer to copy from. The buffer must be contiguous in memory (see np.ascontiguousarray) and not larger than this device buffer.
stream (Stream) – A Stream instance. Performs a synchronous copy if no stream is provided.

Returns

self

Return type

DeviceArray
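A minimal sketch of copy_from and copy_to used together with a Stream, based only on the signatures documented here. `roundtrip_async` is a hypothetical helper, and running it requires NumPy, Polygraphy, and a CUDA-capable GPU.

```python
def roundtrip_async(host_array):
    """Upload host_array to the GPU on a stream, then download it again."""
    # Imported lazily so this module loads without NumPy, Polygraphy, or a GPU.
    import numpy as np
    from polygraphy import cuda

    # Both host buffers must be contiguous in memory.
    src = np.ascontiguousarray(host_array)
    dst = np.empty(src.shape, dtype=src.dtype)

    with cuda.Stream() as stream, \
            cuda.DeviceArray(shape=src.shape, dtype=src.dtype) as arr:
        arr.copy_from(src, stream=stream)
        arr.copy_to(dst, stream=stream)
        # Both copies were queued on the stream; wait before reading dst.
        stream.synchronize()
    return dst
```

The two context managers free the device memory and the stream on exit, so no explicit free() calls are needed.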

view(shape=None, dtype=None)

Creates a read-only DeviceView from this DeviceArray.

Parameters

shape (Sequence[int]) – The desired shape of the view. Defaults to the shape of this array or view.
dtype (numpy.dtype) – The desired data type of the view. Defaults to the data type of this array or view.

Returns

A view of this array's data on the device.

Return type

DeviceView
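As a hedged illustration of view, the sketch below reads a device array back through a flattened view. `flattened_copy` is a hypothetical helper. It assumes that a view with the same element count and a different shape is accepted, and it needs NumPy, Polygraphy, and a CUDA-capable GPU to run.

```python
def flattened_copy(host_array):
    """Upload host_array, then read it back through a one-dimensional view."""
    # Imported lazily so this module loads without NumPy, Polygraphy, or a GPU.
    import numpy as np
    from polygraphy import cuda

    src = np.ascontiguousarray(host_array)
    with cuda.DeviceArray(shape=src.shape, dtype=src.dtype) as arr:
        arr.copy_from(src)                  # synchronous: no stream given
        flat = arr.view(shape=(src.size,))  # same memory, flattened shape
        # numpy() copies into a new host array, so the result stays valid
        # after the device memory is freed on exit.
        return flat.numpy()
```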