cuda.tile.gather#

cuda.tile.gather(
array,
indices,
/,
*,
mask=None,
padding_value=0,
check_bounds=True,
latency=None,
)#

Loads a tile from the array elements specified by indices.

indices must be a tuple whose length equals the array rank. All elements of this tuple must be integer tiles or scalars of the same shape, or different shapes that are broadcastable to a common shape.

The result shape will be the same as the broadcasted shape of indices.

For example, consider a 2-dimensional array. In this case, indices must be a tuple of length 2. Suppose that ind0 and ind1 are integer tiles of shapes (M, N, 1) and (M, 1, K). Then the result tile will have the broadcasted shape (M, N, K):

>>> t = ct.gather(array, (ind0, ind1))   # `t` has shape (M, N, K)

The result tile t will be computed according to

t[i, j, k] = array[ind0[i, j, 0], ind1[i, 0, k]]   (for all 0<=i<M, 0<=j<N, 0<=k<K)

If the array is 1-dimensional, indices can be passed as a tile rather than a tuple. This is a convenience notation that is strictly equivalent to passing a tuple of length 1:

>>> ct.gather(array, ind0)   # equivalent to ct.gather(array, (ind0,))

A custom boolean mask can be provided to control which elements are loaded. The mask must be a scalar or a tile whose shape is broadcastable to the common shape of indices. Where the mask is False, padding_value is returned instead of loading from the array.

gather() checks that indices are within the bounds of the array. For indices that are out of bounds, padding_value will be returned (zero by default). It must be a scalar or a tile whose shape is broadcastable to the common shape of indices.

If both mask and check_bounds=True are provided, the effective mask is the logical AND of both the custom mask and the bounds-checking mask. This means an element is only loaded if both the custom mask is True AND the index is within bounds.

To disable bounds checking, set check_bounds to False. In this mode, the caller is responsible for ensuring that all indices are within the bounds of the array, and any out-of-bounds access will result in undefined behavior.

Negative indices are interpreted as out of bounds, i.e. they don’t follow the Python’s negative index convention.