nvidia.dali.fn.crop_mirror_normalize

nvidia.dali.fn.crop_mirror_normalize(__input, /, *, bytes_per_sample_hint=[0], crop=None, crop_d=0.0, crop_h=0.0, crop_pos_x=0.5, crop_pos_y=0.5, crop_pos_z=0.5, crop_w=0.0, dtype=DALIDataType.FLOAT, fill_values=[0.0], mean=[0.0], mirror=0, out_of_bounds_policy='error', output_layout='CHW', pad_output=False, preserve=False, rounding='round', scale=1.0, seed=-1, shift=0.0, std=[1.0], device=None, name=None)

Performs fused cropping, mirroring, normalization, optional format conversion (for example, NHWC to NCHW), and type casting.
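To make the format conversion concrete, here is a minimal plain-Python sketch of reordering one image from HWC to CHW. It does not call DALI, and the helper name hwc_to_chw is hypothetical; it only illustrates what the layout change means for a single sample.

```python
def hwc_to_chw(image):
    """Reorder a nested-list image from [H][W][C] to [C][H][W]."""
    height = len(image)
    width = len(image[0])
    channels = len(image[0][0])
    return [
        [[image[y][x][c] for x in range(width)] for y in range(height)]
        for c in range(channels)
    ]


# A 1x2 image with 3 channels per pixel (illustrative values).
hwc = [[[1, 2, 3], [4, 5, 6]]]
chw = hwc_to_chw(hwc)
```

After the conversion, each channel is stored as its own contiguous plane.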

Normalization takes the input images and produces the output by using the following formula:

output = scale * (input - mean) / std + shift
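As an illustration of the formula above, the following plain-Python sketch applies it to a single pixel with per-channel mean and std. It does not call DALI; the helper name normalize_pixel and the sample values are assumptions made for this example.

```python
def normalize_pixel(pixel, mean, std, scale=1.0, shift=0.0):
    """Apply output = scale * (input - mean) / std + shift to each channel."""
    return [scale * (x - m) / s + shift for x, m, s in zip(pixel, mean, std)]


# One RGB pixel with illustrative per-channel statistics.
pixel = [128.0, 64.0, 255.0]
mean = [128.0, 128.0, 128.0]
std = [64.0, 64.0, 64.0]

normalized = normalize_pixel(pixel, mean, std)
```

With the default scale=1.0 and shift=0.0, the formula reduces to the usual (input - mean) / std standardization.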

Note

If no cropping arguments are specified, only mirroring and normalization will occur.

This operator allows sequence inputs and supports volumetric data.

Supported backends

‘cpu’
‘gpu’

Parameters:

__input (TensorList ('HWC', 'CHW', 'DHWC', 'CDHW', 'FHWC', 'FCHW', 'CFHW', 'FDHWC', 'FCDHW', 'CFDHW')) – Input to the operator.

Keyword Arguments:

bytes_per_sample_hint (int or list of int, optional, default = [0]) –
Output size hint, in bytes per sample.

If specified, the operator’s outputs residing in GPU or page-locked host memory will be preallocated to accommodate a batch of samples of this size.
crop (float or list of float or TensorList of float, optional) –
Shape of the cropped image, specified as a list of values (for example, (crop_H, crop_W) for the 2D crop and (crop_D, crop_H, crop_W) for the volumetric crop).

Providing the crop argument is incompatible with providing separate arguments such as crop_d, crop_h, and crop_w.
crop_d (float or TensorList of float, optional, default = 0.0) –
Applies only to volumetric inputs; cropping window depth (in voxels).

crop_w, crop_h, and crop_d must be specified together. Providing values for crop_w, crop_h, and crop_d is incompatible with providing the fixed crop window dimensions (argument crop).
crop_h (float or TensorList of float, optional, default = 0.0) –
Cropping window height (in pixels).

Providing values for crop_w and crop_h is incompatible with providing fixed crop window dimensions (argument crop).
crop_pos_x (float or TensorList of float, optional, default = 0.5) –
Normalized (0.0 - 1.0) horizontal position of the start of the cropping window (typically, the upper left corner).

The actual position is calculated as crop_x = crop_x_norm * (W - crop_W), where crop_x_norm is the normalized position, W is the width of the image, and crop_W is the width of the cropping window.

See rounding argument for more details on how crop_x is converted to an integral value.
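The position formula can be sketched in plain Python as follows. The helper name crop_start and the image sizes are assumptions for illustration; the result is the unrounded start coordinate, before the rounding argument is applied.

```python
def crop_start(norm_pos, image_extent, crop_extent):
    """Return the unrounded window start: norm_pos * (image_extent - crop_extent)."""
    return norm_pos * (image_extent - crop_extent)


W, crop_W = 640, 224
left = crop_start(0.0, W, crop_W)    # window flush with the left edge
center = crop_start(0.5, W, crop_W)  # default position: centered window
right = crop_start(1.0, W, crop_W)   # window flush with the right edge
```

The same expression applies to crop_pos_y and crop_pos_z, with the image height or depth in place of the width.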
crop_pos_y (float or TensorList of float, optional, default = 0.5) –
Normalized (0.0 - 1.0) vertical position of the start of the cropping window (typically, the upper left corner).

The actual position is calculated as crop_y = crop_y_norm * (H - crop_H), where crop_y_norm is the normalized position, H is the height of the image, and crop_H is the height of the cropping window.

See rounding argument for more details on how crop_y is converted to an integral value.
crop_pos_z (float or TensorList of float, optional, default = 0.5) –
Applies only to volumetric inputs.

Normalized (0.0 - 1.0) position of the cropping window along the depth axis (front plane). The actual position is calculated as crop_z = crop_z_norm * (D - crop_D), where crop_z_norm is the normalized position, D is the depth of the image, and crop_D is the depth of the cropping window.

See rounding argument for more details on how crop_z is converted to an integral value.
crop_w (float or TensorList of float, optional, default = 0.0) –
Cropping window width (in pixels).

Providing values for crop_w and crop_h is incompatible with providing fixed crop window dimensions (argument crop).
dtype (nvidia.dali.types.DALIDataType, optional, default = DALIDataType.FLOAT) –
Output data type.

Supported types: FLOAT, FLOAT16, INT8, UINT8.
fill_values (float or list of float, optional, default = [0.0]) –
Determines padding values and is only relevant if out_of_bounds_policy is set to “pad”.

If a scalar value is provided, it will be used for all the channels. If multiple values are provided, their number must match the number of channels (the extent of dimension C in the layout) in the output slice.
image_type (nvidia.dali.types.DALIImageType) –

Warning

The argument image_type is no longer used and will be removed in a future release.
mean (float or list of float or TensorList of float, optional, default = [0.0]) – Mean pixel values for image normalization.
mirror (int or TensorList of int, optional, default = 0) – If nonzero, the image will be flipped (mirrored) horizontally.
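A minimal sketch of the horizontal flip, written in plain Python rather than with DALI. The helper name maybe_mirror is hypothetical, and the image is a single-channel nested list for brevity.

```python
def maybe_mirror(image, mirror=0):
    """Reverse every row (the width axis) when mirror is nonzero."""
    if mirror == 0:
        return [list(row) for row in image]
    return [row[::-1] for row in image]


image = [[1, 2, 3],
         [4, 5, 6]]
flipped = maybe_mirror(image, mirror=1)
unchanged = maybe_mirror(image, mirror=0)
```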
out_of_bounds_policy (str, optional, default = ‘error’) –
Determines the policy applied when the cropping window extends outside the bounds of the input.

Here is a list of the supported values:
- "error" (default): Attempting to slice outside of the bounds of the input will produce an error.
- "pad": The input will be padded as needed with zeros or any other value that is specified with the fill_values argument.
- "trim_to_shape": The slice window will be cut to the bounds of the input.
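The three policies can be illustrated on a single row of pixels. This is a plain-Python sketch under simplifying assumptions (one dimension, one channel); the helper name crop_1d is hypothetical and does not correspond to any DALI function.

```python
def crop_1d(row, start, length, policy="error", fill_value=0):
    """Crop `length` elements starting at `start`, handling out-of-bounds windows."""
    end = start + length
    in_bounds = 0 <= start and end <= len(row)
    if policy == "error":
        if not in_bounds:
            raise ValueError("crop window is out of bounds")
        return row[start:end]
    if policy == "pad":
        # Positions outside the input are filled with fill_value.
        return [row[i] if 0 <= i < len(row) else fill_value
                for i in range(start, end)]
    if policy == "trim_to_shape":
        # The window is clipped to the input, so the output may be shorter.
        return row[max(start, 0):min(end, len(row))]
    raise ValueError(f"unknown policy: {policy}")


row = [10, 20, 30, 40]
padded = crop_1d(row, start=2, length=4, policy="pad", fill_value=0)
trimmed = crop_1d(row, start=2, length=4, policy="trim_to_shape")
```

Note that "pad" keeps the requested output size, while "trim_to_shape" can return fewer elements than requested.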

output_layout (layout str, optional, default = ‘CHW’) – Tensor data layout for the output.
pad_output (bool, optional, default = False) –
Determines whether to pad the output so that the number of channels is a power of 2.

The value used for padding is determined by the fill_values argument.
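As a rough illustration of channel padding, the sketch below extends one pixel's channel list to the next power of two, for example from 3 (RGB) to 4 channels. The helper name pad_channels is hypothetical, and the code is plain Python rather than a DALI call.

```python
def pad_channels(pixel, fill_value=0.0):
    """Pad a pixel's channel list up to the next power of two."""
    target = 1
    while target < len(pixel):
        target *= 2
    return pixel + [fill_value] * (target - len(pixel))


rgb = [0.1, 0.2, 0.3]
padded_pixel = pad_channels(rgb)
```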


preserve (bool, optional, default = False) – Prevents the operator from being removed from the graph even if its outputs are not used.
rounding (str, optional, default = ‘round’) –
Determines the rounding function used to convert the starting coordinate of the window to an integral value (see crop_pos_x, crop_pos_y, crop_pos_z).

Possible values are:
- "round" - Rounds to the nearest integer value, with halfway cases rounded away from zero.
- "truncate" - Discards the fractional part of the number (truncates towards zero).
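The two rounding modes differ only for fractional start coordinates. The plain-Python sketch below uses a hypothetical helper, to_integral; note that Python's built-in round() rounds halfway cases to the nearest even integer, so it is not used for the "round" mode here.

```python
import math


def to_integral(value, rounding="round"):
    """Convert a fractional window start to an integer coordinate."""
    if rounding == "round":
        # Round to nearest, with halfway cases rounded away from zero.
        return int(math.copysign(math.floor(abs(value) + 0.5), value))
    if rounding == "truncate":
        # Discard the fractional part (truncate towards zero).
        return math.trunc(value)
    raise ValueError(f"unknown rounding mode: {rounding}")


# crop_x = 0.5 * (641 - 224) = 208.5 for a centered 224-wide window in a 641-wide image.
crop_x = 0.5 * (641 - 224)
rounded = to_integral(crop_x, "round")
truncated = to_integral(crop_x, "truncate")
```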
scale (float, optional, default = 1.0) –
The value by which the result is multiplied.

This argument is useful when using integer outputs to improve dynamic range utilization.
seed (int, optional, default = -1) –
Random seed.

If not provided, it will be populated based on the global seed of the pipeline.
shift (float, optional, default = 0.0) –
The value added to the (scaled) result.

This argument is useful when using unsigned integer outputs to improve dynamic range utilization.
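To see why scale and shift matter for integer outputs, consider the plain-Python sketch below. The helper names are hypothetical, and the rounding and clamping to [0, 255] in to_uint8 are assumptions of this sketch, not a statement about how DALI converts types.

```python
import math


def to_uint8(value):
    """Round to nearest and clamp to [0, 255] (assumed conversion for this sketch)."""
    return max(0, min(255, int(math.floor(value + 0.5))))


def normalize_to_uint8(x, mean, std, scale, shift):
    """Apply scale * (x - mean) / std + shift, then convert to a uint8 value."""
    return to_uint8(scale * (x - mean) / std + shift)


inputs = (0.0, 128.0, 255.0)

# Without scale and shift, normalized values near [-2, 2] collapse to a few levels.
plain = [normalize_to_uint8(x, 128.0, 64.0, 1.0, 0.0) for x in inputs]

# scale=64 and shift=128 spread the same values across the uint8 range.
spread = [normalize_to_uint8(x, 128.0, 64.0, 64.0, 128.0) for x in inputs]
```

In this example the unscaled results occupy only the values 0 and 2, while the scaled and shifted results span 0 to 255.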
std (float or list of float or TensorList of float, optional, default = [1.0]) – Standard deviation values for image normalization.
output_dtype (nvidia.dali.types.DALIDataType) –

Warning

The argument output_dtype is a deprecated alias for dtype. Use dtype instead.