nvidia.dali.fn.coord_transform

nvidia.dali.fn.coord_transform(*inputs, **kwargs)

Applies a linear transformation to points or vectors.

The transformation has the form:

out = M * in + T

where M is an m x n matrix and T is a translation vector with m components. The input must consist of n-element vectors or points, and the output has m components.
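The formula above can be sketched in plain Python to make the shapes concrete. This is an illustration of the arithmetic only, not DALI code: the helper name coord_transform_reference and the sample values are assumptions made for the example.

```python
def coord_transform_reference(points, M, T):
    """Apply out = M * in + T to each n-element point (illustrative helper, not DALI)."""
    out = []
    for p in points:
        # Each output component is the dot product of one row of M with the
        # input point, plus the matching component of T.
        out.append([sum(m_ij * x for m_ij, x in zip(row, p)) + t
                    for row, t in zip(M, T)])
    return out


# M is 2 x 3 (m = 2, n = 3): 3-component inputs become 2-component outputs.
M = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0]]
T = [10.0, 20.0]

result = coord_transform_reference([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]], M, T)
print(result)  # [[11.0, 22.0], [14.0, 25.0]]
```

Here M drops the third input component and T shifts the remaining two, so each 3-element input yields a 2-element output.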

This operator can be used for many operations, including (but not limited to):

  • applying an affine transform to point clouds

  • projecting points onto a subspace

  • some color space conversions, for example RGB to YCbCr or grayscale

  • linear operations on colors, like hue rotation, brightness and contrast adjustment
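As one instance of the color use cases listed above, a grayscale conversion can be written as a 1 x 3 matrix applied to each RGB pixel. The sketch below is plain Python rather than DALI code, and the BT.601 luma weights are an assumed choice for illustration; the helper name apply_matrix is hypothetical.

```python
# A 1 x 3 matrix: three input components (R, G, B) map to one output component.
# The weights are the BT.601 luma coefficients, assumed here for illustration.
GRAY_M = [[0.299, 0.587, 0.114]]


def apply_matrix(pixel, M):
    """Compute M * pixel with no translation (illustrative helper, not DALI)."""
    return [sum(w * c for w, c in zip(row, pixel)) for row in M]


white = apply_matrix([255.0, 255.0, 255.0], GRAY_M)
red = apply_matrix([255.0, 0.0, 0.0], GRAY_M)
```

Because M has a single row, every output has one component, which matches the rule that the number of output components equals the number of rows in M.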

This operator allows sequence inputs.

Supported backends
  • ‘cpu’

  • ‘gpu’

Parameters:

input (TensorList) – Input to the operator.

Keyword Arguments:
  • M (float or list of float or TensorList of float, optional) –

    The matrix used for transforming the input vectors.

    If left unspecified, the identity matrix is used.

    The matrix M does not need to be square. If it is not, the output vectors will have a number of components equal to the number of rows in M.

    If a scalar value is provided, M is assumed to be a square matrix with that value on the diagonal. The size of the matrix is then assumed to match the number of components in the input vectors.

    Supports per-frame inputs.

  • MT (float or list of float or TensorList of float, optional) –

    A block matrix [M T] which combines the arguments M and T.

    Providing a scalar value for this argument is equivalent to providing the same scalar for M and leaving T unspecified.

    The number of columns must be one more than the number of components in the input. This argument is mutually exclusive with M and T.

    Supports per-frame inputs.

  • T (float or list of float or TensorList of float, optional) –

    The translation vector.

    If left unspecified, no translation is applied unless the MT argument is used.

    The number of components of this vector must match the number of rows in matrix M. If a scalar value is provided, that value is broadcast to all components of T and the number of components is chosen to match the number of rows in M.

    Supports per-frame inputs.

  • bytes_per_sample_hint (int or list of int, optional, default = [0]) –

    Output size hint, in bytes per sample.

    If specified, the operator’s outputs residing in GPU or page-locked host memory will be preallocated to accommodate a batch of samples of this size.

  • dtype (nvidia.dali.types.DALIDataType, optional, default = DALIDataType.FLOAT) –

    Data type of the output coordinates.

    If an integral type is used, the output values are rounded to the nearest integer and clamped to the dynamic range of this type.

  • preserve (bool, optional, default = False) – Prevents the operator from being removed from the graph even if its outputs are not used.

  • seed (int, optional, default = -1) –

    Random seed.

    If not provided, it will be populated based on the global seed of the pipeline.
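The rules described above for M, T, MT and dtype can be summarized in a small plain-Python sketch. It is not DALI's implementation: the helpers resolve, transform and to_uint8 are hypothetical names, and the tie-breaking rule used for rounding (halves round up) is an assumption, since the text only says values are rounded to the nearest integer.

```python
import math


def resolve(n, M=None, T=None, MT=None):
    """Expand M, T, MT into an explicit (M, T) pair for n-component inputs."""
    if MT is not None:
        if M is not None or T is not None:
            raise ValueError("MT is mutually exclusive with M and T")
        if isinstance(MT, (int, float)):
            M = MT  # scalar MT acts like scalar M with T left unspecified
        else:
            if any(len(row) != n + 1 for row in MT):
                raise ValueError("MT must have n + 1 columns")
            # Split the block matrix [M T]: the last column is T.
            M = [row[:-1] for row in MT]
            T = [row[-1] for row in MT]
    if M is None:
        M = 1.0  # unspecified M means identity
    if isinstance(M, (int, float)):
        # Scalar M: square n x n matrix with that value on the diagonal.
        M = [[float(M) if i == j else 0.0 for j in range(n)] for i in range(n)]
    if T is None:
        T = 0.0  # unspecified T means no translation
    if isinstance(T, (int, float)):
        # Scalar T: broadcast to one component per row of M.
        T = [float(T)] * len(M)
    if len(T) != len(M):
        raise ValueError("T must have one component per row of M")
    return M, T


def transform(point, M, T):
    """Compute M * point + T for a single point."""
    return [sum(a * x for a, x in zip(row, point)) + t for row, t in zip(M, T)]


def to_uint8(values):
    """Round to nearest and clamp to [0, 255]; half-up tie-breaking is assumed."""
    return [min(255, max(0, math.floor(v + 0.5))) for v in values]


M1, T1 = resolve(2, M=2.0, T=1.0)
M2, T2 = resolve(2, MT=[[1.0, 0.0, 5.0], [0.0, 1.0, -5.0]])
scaled = transform([3.0, 4.0], M1, T1)    # [7.0, 9.0]
shifted = transform([3.0, 4.0], M2, T2)   # [8.0, -1.0]
clamped = to_uint8([-3.2, 100.4, 300.7])  # [0, 100, 255]
```

The scalar M=2.0 and T=1.0 act as a uniform scale with a uniform offset, which is the shape of a simple contrast and brightness adjustment, while the MT form carries the same information as separate M and T arguments in a single block matrix.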