nvidia.dali.fn.crop_mirror_normalize

nvidia.dali.fn.crop_mirror_normalize(__input, /, *, bytes_per_sample_hint=[0], crop=None, crop_d=0.0, crop_h=0.0, crop_pos_x=0.5, crop_pos_y=0.5, crop_pos_z=0.5, crop_w=0.0, dtype=DALIDataType.FLOAT, fill_values=[0.0], mean=[0.0], mirror=0, out_of_bounds_policy='error', output_layout='CHW', pad_output=False, preserve=False, rounding='round', scale=1.0, seed=-1, shift=0.0, std=[1.0], device=None, name=None)

Performs fused cropping, mirroring, normalization, optional format conversion (for example, NHWC to NCHW), and type casting.

Normalization takes the input images and produces the output by using the following formula:

output = scale * (input - mean) / std + shift
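
As a worked illustration of the formula above, the following minimal sketch applies it to a single pixel in plain Python. The helper name `normalize` and the pixel and statistics values are hypothetical, chosen only to keep the arithmetic easy to follow; this is not the operator's implementation.

```python
def normalize(pixel, mean, std, scale=1.0, shift=0.0):
    """Apply output = scale * (input - mean) / std + shift to one pixel.

    `pixel`, `mean`, and `std` are per-channel sequences of equal length.
    """
    return [scale * (x - m) / s + shift for x, m, s in zip(pixel, mean, std)]


# Hypothetical RGB pixel and per-channel statistics (illustrative values only).
pixel = [200.0, 100.0, 50.0]
mean = [100.0, 100.0, 100.0]
std = [50.0, 50.0, 50.0]

normalized = normalize(pixel, mean, std)
print(normalized)  # [2.0, 0.0, -1.0]
```

With the default scale=1.0 and shift=0.0, the result is the usual (input - mean) / std standardization.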

Note

If no cropping arguments are specified, only mirroring and normalization will occur.

This operator allows sequence inputs and supports volumetric data.

Supported backends
  • ‘cpu’

  • ‘gpu’

Parameters:

__input (TensorList ('HWC', 'CHW', 'DHWC', 'CDHW', 'FHWC', 'FCHW', 'CFHW', 'FDHWC', 'FCDHW', 'CFDHW')) – Input to the operator.

Keyword Arguments:
  • bytes_per_sample_hint (int or list of int, optional, default = [0]) –

    Output size hint, in bytes per sample.

    If specified, the operator’s outputs residing in GPU or page-locked host memory will be preallocated to accommodate a batch of samples of this size.

  • crop (float or list of float or TensorList of float, optional) –

    Shape of the cropped image, specified as a list of values (for example, (crop_H, crop_W) for the 2D crop and (crop_D, crop_H, crop_W) for the volumetric crop).

    Providing the crop argument is incompatible with providing the separate arguments crop_d, crop_h, and crop_w.

  • crop_d (float or TensorList of float, optional, default = 0.0) –

    Applies only to volumetric inputs; cropping window depth (in voxels).

    crop_w, crop_h, and crop_d must be specified together. Providing values for crop_w, crop_h, and crop_d is incompatible with providing the fixed crop window dimensions (argument crop).

  • crop_h (float or TensorList of float, optional, default = 0.0) –

    Cropping window height (in pixels).

    Providing values for crop_w and crop_h is incompatible with providing fixed crop window dimensions (argument crop).

  • crop_pos_x (float or TensorList of float, optional, default = 0.5) –

    Normalized (0.0 - 1.0) horizontal position of the cropping window (upper left corner).

    The actual position is calculated as crop_x = crop_x_norm * (W - crop_W), where crop_x_norm is the normalized position, W is the width of the image, and crop_W is the width of the cropping window.

    See rounding argument for more details on how crop_x is converted to an integral value.
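
The position formula can be checked with a short calculation. This is a sketch only: the helper name `crop_start` and the image and window sizes are hypothetical, and the result is left unrounded. The same expression applies to crop_pos_y and crop_pos_z with the height and depth extents.

```python
def crop_start(norm_pos, extent, crop_extent):
    """Return the unrounded window start: norm_pos * (extent - crop_extent)."""
    return norm_pos * (extent - crop_extent)


# Hypothetical 640-pixel-wide image with a 224-pixel-wide cropping window.
left = crop_start(0.0, 640, 224)      # window flush with the left edge
centered = crop_start(0.5, 640, 224)  # default crop_pos_x
right = crop_start(1.0, 640, 224)     # window flush with the right edge
print(left, centered, right)  # 0.0 208.0 416.0
```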

  • crop_pos_y (float or TensorList of float, optional, default = 0.5) –

    Normalized (0.0 - 1.0) vertical position of the start of the cropping window (typically, the upper left corner).

    The actual position is calculated as crop_y = crop_y_norm * (H - crop_H), where crop_y_norm is the normalized position, H is the height of the image, and crop_H is the height of the cropping window.

    See rounding argument for more details on how crop_y is converted to an integral value.

  • crop_pos_z (float or TensorList of float, optional, default = 0.5) –

    Applies only to volumetric inputs.

    Normalized (0.0 - 1.0) position of the cropping window along the depth axis (front plane). The actual position is calculated as crop_z = crop_z_norm * (D - crop_D), where crop_z_norm is the normalized position, D is the depth of the image, and crop_D is the depth of the cropping window.

    See rounding argument for more details on how crop_z is converted to an integral value.

  • crop_w (float or TensorList of float, optional, default = 0.0) –

    Cropping window width (in pixels).

    Providing values for crop_w and crop_h is incompatible with providing fixed crop window dimensions (argument crop).

  • dtype (nvidia.dali.types.DALIDataType, optional, default = DALIDataType.FLOAT) –

    Output data type.

    Supported types: FLOAT, FLOAT16, INT8, UINT8.

  • fill_values (float or list of float, optional, default = [0.0]) –

    Determines the padding values. It is relevant only if out_of_bounds_policy is set to “pad” or if pad_output is enabled.

    If a scalar value is provided, it will be used for all the channels. If multiple values are provided, their number must match the number of channels (the extent of dimension C in the layout) in the output slice.

  • image_type (nvidia.dali.types.DALIImageType) –

    Warning

    The argument image_type is no longer used and will be removed in a future release.

  • mean (float or list of float or TensorList of float, optional, default = [0.0]) – Mean pixel values for image normalization.

  • mirror (int or TensorList of int, optional, default = 0) – If nonzero, the image will be flipped (mirrored) horizontally.

  • out_of_bounds_policy (str, optional, default = ‘error’) –

    Determines the policy applied when the slice window extends beyond the bounds of the input.

    Here is a list of the supported values:

    • "error" (default): Attempting to slice outside of the bounds of the input will produce an error.

    • "pad": The input will be padded as needed with zeros or any other value that is specified with the fill_values argument.

    • "trim_to_shape": The slice window will be cut to the bounds of the input.
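
To make the three policies concrete, here is a minimal one-dimensional sketch in plain Python. The helper `slice_1d` is hypothetical and works on a single list rather than an image; it illustrates the behavior described above under that simplification and is not the operator's implementation.

```python
def slice_1d(data, start, size, policy="error", fill_value=0.0):
    """Take the window [start, start + size) from `data` under a given policy."""
    n = len(data)
    end = start + size
    if policy == "error":
        if start < 0 or end > n:
            raise ValueError("slice window is out of bounds")
        return data[start:end]
    if policy == "pad":
        # Positions outside the input are replaced with fill_value.
        return [data[i] if 0 <= i < n else fill_value for i in range(start, end)]
    if policy == "trim_to_shape":
        # The window is cut to the part that overlaps the input.
        lo = min(max(start, 0), n)
        hi = min(max(end, 0), n)
        return data[lo:hi]
    raise ValueError(f"unknown policy: {policy!r}")


row = [1.0, 2.0, 3.0, 4.0]
padded = slice_1d(row, 2, 4, policy="pad")
trimmed = slice_1d(row, 2, 4, policy="trim_to_shape")
print(padded)   # [3.0, 4.0, 0.0, 0.0]
print(trimmed)  # [3.0, 4.0]
```

Note that "pad" keeps the requested window size, whereas "trim_to_shape" can return a smaller output than requested.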

  • output_layout (layout str, optional, default = ‘CHW’) – Tensor data layout for the output.

  • pad_output (bool, optional, default = False) –

    Determines whether to pad the output so that the number of channels is a power of 2.

    The value used for padding is determined by the fill_values argument.
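
As a rough illustration, the sketch below computes a padded channel count, assuming the padding goes to the smallest power of 2 that is not less than the channel count (the text above only states that the result is a power of 2). The helpers `padded_channels` and `pad_pixel` are hypothetical and use a single scalar fill value.

```python
def padded_channels(num_channels):
    """Smallest power of 2 that is >= num_channels (for num_channels >= 1)."""
    return 1 << (num_channels - 1).bit_length()


def pad_pixel(pixel, fill_value=0.0):
    """Append fill_value until the channel count is a power of 2."""
    missing = padded_channels(len(pixel)) - len(pixel)
    return pixel + [fill_value] * missing


print(padded_channels(3))           # 4
print(pad_pixel([0.2, 0.5, 0.9]))   # [0.2, 0.5, 0.9, 0.0]
```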

  • preserve (bool, optional, default = False) – Prevents the operator from being removed from the graph even if its outputs are not used.

  • rounding (str, optional, default = ‘round’) –

    Determines the rounding function used to convert the starting coordinate of the window to an integral value (see crop_pos_x, crop_pos_y, crop_pos_z).

    Possible values are:

    • "round" - Rounds to the nearest integer value, with halfway cases rounded away from zero.
    • "truncate" - Discards the fractional part of the number (truncates towards zero).
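
The two modes differ only when the computed position has a fractional part. The sketch below mimics both in plain Python; the helper `to_integral` and the image sizes are hypothetical. It also shows that Python's built-in round() is not a drop-in equivalent of "round", because it rounds halfway cases to the nearest even integer.

```python
import math


def to_integral(value, rounding="round"):
    """Convert a window start coordinate to an integer."""
    if rounding == "round":
        # Nearest integer, with halfway cases rounded away from zero.
        return int(math.copysign(math.floor(abs(value) + 0.5), value))
    if rounding == "truncate":
        # Discard the fractional part (towards zero).
        return math.trunc(value)
    raise ValueError(f"unknown rounding mode: {rounding!r}")


# Hypothetical case: W = 101, crop_W = 96, crop_pos_x = 0.5, so crop_x = 2.5.
crop_x = 0.5 * (101 - 96)
print(to_integral(crop_x, "round"))     # 3
print(to_integral(crop_x, "truncate"))  # 2
print(round(crop_x))                    # 2 (built-in round: half to even)
```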

  • scale (float, optional, default = 1.0) –

    The value by which the result is multiplied.

    This argument is useful when using integer outputs to improve dynamic range utilization.

  • seed (int, optional, default = -1) –

    Random seed.

    If not provided, it will be populated based on the global seed of the pipeline.

  • shift (float, optional, default = 0.0) –

    The value added to the (scaled) result.

    This argument is useful when using unsigned integer outputs to improve dynamic range utilization.
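
The effect of scale and shift on integer outputs can be seen in a small sketch. The helper `normalize_to_uint8` is hypothetical, and its conversion step (round to nearest, then clamp to 0..255) is an assumption made for this illustration, not a statement of how the operator casts values.

```python
def normalize_to_uint8(x, mean=0.0, std=1.0, scale=1.0, shift=0.0):
    """Normalize one value, then convert it to the uint8 range."""
    y = scale * (x - mean) / std + shift
    # Assumed conversion for this sketch: round to nearest, clamp to [0, 255].
    return min(255, max(0, round(y)))


# Hypothetical normalized inputs that span roughly [-1, 1].
samples = [-1.0, 0.0, 1.0]

unscaled = [normalize_to_uint8(x) for x in samples]
stretched = [normalize_to_uint8(x, scale=127.0, shift=128.0) for x in samples]
print(unscaled)   # [0, 0, 1]
print(stretched)  # [1, 128, 255]
```

Without scale and shift, the three inputs collapse onto two adjacent output values; with them, the same inputs spread across nearly the full unsigned 8-bit range.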

  • std (float or list of float or TensorList of float, optional, default = [1.0]) – Standard deviation values for image normalization.

  • output_dtype (nvidia.dali.types.DALIDataType) –

    Warning

    The argument output_dtype is a deprecated alias for dtype. Use dtype instead.