nvidia.dali.fn.crop_mirror_normalize
- nvidia.dali.fn.crop_mirror_normalize(__input, /, *, bytes_per_sample_hint=[0], crop=None, crop_d=0.0, crop_h=0.0, crop_pos_x=0.5, crop_pos_y=0.5, crop_pos_z=0.5, crop_w=0.0, dtype=DALIDataType.FLOAT, fill_values=[0.0], mean=[0.0], mirror=0, out_of_bounds_policy='error', output_layout='CHW', pad_output=False, preserve=False, rounding='round', scale=1.0, seed=-1, shift=0.0, std=[1.0], device=None, name=None)
Performs fused cropping, normalization, format conversion (NHWC to NCHW) if desired, and type casting.
Normalization takes the input images and produces the output by using the following formula:
output = scale * (input - mean) / std + shift
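As an illustration of the formula above, here is a minimal pure-Python sketch that applies it to a single pixel, channel by channel. The helper name `normalize_pixel` and the sample values are hypothetical and are not part of DALI; the operator itself works on whole batches.

```python
def normalize_pixel(pixel, mean, std, scale=1.0, shift=0.0):
    """Apply output = scale * (input - mean) / std + shift to each channel."""
    return [scale * (x - m) / s + shift for x, m, s in zip(pixel, mean, std)]


# Hypothetical 3-channel pixel with per-channel mean and std.
pixel = [255.0, 128.0, 0.0]
mean = [128.0, 128.0, 128.0]
std = [64.0, 64.0, 64.0]

result = normalize_pixel(pixel, mean, std)
print(result)  # [1.984375, 0.0, -2.0]
```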
Note
If no cropping arguments are specified, only mirroring and normalization will occur.
This operator allows sequence inputs and supports volumetric data.
- Supported backends
‘cpu’
‘gpu’
- Parameters:
__input (TensorList ('HWC', 'CHW', 'DHWC', 'CDHW', 'FHWC', 'FCHW', 'CFHW', 'FDHWC', 'FCDHW', 'CFDHW')) – Input to the operator.
- Keyword Arguments:
bytes_per_sample_hint (int or list of int, optional, default = [0]) –
Output size hint, in bytes per sample.
If specified, the operator’s outputs residing in GPU or page-locked host memory will be preallocated to accommodate a batch of samples of this size.
crop (float or list of float or TensorList of float, optional) –
Shape of the cropped image, specified as a list of values (for example, (crop_H, crop_W) for the 2D crop and (crop_D, crop_H, crop_W) for the volumetric crop).
Providing the crop argument is incompatible with providing separate arguments such as crop_d, crop_h, and crop_w.
crop_d (float or TensorList of float, optional, default = 0.0) –
Applies only to volumetric inputs; cropping window depth (in voxels).
crop_w, crop_h, and crop_d must be specified together. Providing values for crop_w, crop_h, and crop_d is incompatible with providing the fixed crop window dimensions (argument crop).
crop_h (float or TensorList of float, optional, default = 0.0) –
Cropping window height (in pixels).
Providing values for crop_w and crop_h is incompatible with providing fixed crop window dimensions (argument crop).
crop_pos_x (float or TensorList of float, optional, default = 0.5) –
Normalized (0.0 - 1.0) horizontal position of the cropping window (upper left corner).
The actual position is calculated as crop_x = crop_x_norm * (W - crop_W), where crop_x_norm is the normalized position, W is the width of the image, and crop_W is the width of the cropping window.
See the rounding argument for more details on how crop_x is converted to an integral value.
crop_pos_y (float or TensorList of float, optional, default = 0.5) –
Normalized (0.0 - 1.0) vertical position of the start of the cropping window (typically, the upper left corner).
The actual position is calculated as crop_y = crop_y_norm * (H - crop_H), where crop_y_norm is the normalized position, H is the height of the image, and crop_H is the height of the cropping window.
See the rounding argument for more details on how crop_y is converted to an integral value.
crop_pos_z (float or TensorList of float, optional, default = 0.5) –
Applies only to volumetric inputs.
Normalized (0.0 - 1.0) normal position of the cropping window (front plane). The actual position is calculated as crop_z = crop_z_norm * (D - crop_D), where crop_z_norm is the normalized position, D is the depth of the image, and crop_D is the depth of the cropping window.
See the rounding argument for more details on how crop_z is converted to an integral value.
crop_w (float or TensorList of float, optional, default = 0.0) –
Cropping window width (in pixels).
Providing values for crop_w and crop_h is incompatible with providing fixed crop window dimensions (argument crop).
dtype (nvidia.dali.types.DALIDataType, optional, default = DALIDataType.FLOAT) –
Output data type.
Supported types: FLOAT, FLOAT16, INT8, UINT8.
fill_values (float or list of float, optional, default = [0.0]) –
Determines padding values and is only relevant if out_of_bounds_policy is set to “pad”.
If a scalar value is provided, it will be used for all the channels. If multiple values are provided, the number of values must match the number of channels (the extent of dimension C in the layout) in the output slice.
image_type (nvidia.dali.types.DALIImageType) –
Warning
The argument image_type is no longer used and will be removed in a future release.
mean (float or list of float or TensorList of float, optional, default = [0.0]) – Mean pixel values for image normalization.
mirror (int or TensorList of int, optional, default = 0) – If nonzero, the image will be flipped (mirrored) horizontally.
out_of_bounds_policy (str, optional, default = ‘error’) –
Determines the policy when slicing the out of bounds area of the input.
Here is a list of the supported values:
"error" (default): Attempting to slice outside of the bounds of the input will produce an error.
"pad": The input will be padded as needed with zeros or any other value that is specified with the fill_values argument.
"trim_to_shape": The slice window will be cut to the bounds of the input.
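The three out_of_bounds_policy values can be illustrated on a one-dimensional row of samples. This is a simplified sketch of the behavior described above, not DALI code; the helper `slice_1d` and its arguments are hypothetical.

```python
def slice_1d(data, start, length, policy="error", fill_value=0.0):
    """Cut a window of `length` elements starting at `start` from `data`."""
    end = start + length
    if policy == "error":
        # Any window reaching outside the input is rejected.
        if start < 0 or end > len(data):
            raise ValueError("slice window is out of bounds")
        return data[start:end]
    if policy == "trim_to_shape":
        # The window is clipped to the input, so the result may be shorter.
        return data[max(start, 0):min(end, len(data))]
    if policy == "pad":
        # Positions outside the input are filled with fill_value.
        return [data[i] if 0 <= i < len(data) else fill_value
                for i in range(start, end)]
    raise ValueError(f"unknown policy: {policy}")


row = [1.0, 2.0, 3.0, 4.0]
padded = slice_1d(row, 2, 4, policy="pad", fill_value=0.0)
trimmed = slice_1d(row, 2, 4, policy="trim_to_shape")
print(padded)   # [3.0, 4.0, 0.0, 0.0]
print(trimmed)  # [3.0, 4.0]
```

With "pad" the output keeps the requested window size, while "trim_to_shape" can return a smaller output than requested.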
output_layout (layout str, optional, default = ‘CHW’) – Tensor data layout for the output.
pad_output (bool, optional, default = False) –
Determines whether to pad the output so that the number of channels is a power of 2.
The value used for padding is determined by the fill_values argument.
preserve (bool, optional, default = False) – Prevents the operator from being removed from the graph even if its outputs are not used.
rounding (str, optional, default = ‘round’) –
Determines the rounding function used to convert the starting coordinate of the window to an integral value (see crop_pos_x, crop_pos_y, crop_pos_z).
Possible values are:
"round" - Rounds to the nearest integer value, with halfway cases rounded away from zero.
"truncate" - Discards the fractional part of the number (truncates towards zero).
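The effect of the rounding argument on the window position formulas given for crop_pos_x, crop_pos_y, and crop_pos_z can be sketched as follows. The helper `crop_anchor` is hypothetical and only mirrors the documented formula. Note that Python's built-in round() rounds halfway cases to the nearest even integer, so the "round" mode is written out explicitly here.

```python
import math


def crop_anchor(pos_norm, extent, crop_extent, rounding="round"):
    """Return the integral start coordinate of the cropping window.

    Implements crop_x = crop_x_norm * (W - crop_W) followed by rounding.
    """
    pos = pos_norm * (extent - crop_extent)
    if rounding == "round":
        # Nearest integer, halfway cases rounded away from zero.
        return int(math.copysign(math.floor(abs(pos) + 0.5), pos))
    if rounding == "truncate":
        # Discard the fractional part (towards zero).
        return math.trunc(pos)
    raise ValueError(f"unknown rounding mode: {rounding}")


# Centered 224-wide crop from a 225-wide image: the exact position is 0.5.
print(crop_anchor(0.5, 225, 224, rounding="round"))     # 1
print(crop_anchor(0.5, 225, 224, rounding="truncate"))  # 0
# Centered 224-wide crop from a 640-wide image: the exact position is 208.0.
print(crop_anchor(0.5, 640, 224))                       # 208
```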
scale (float, optional, default = 1.0) –
The value by which the result is multiplied.
This argument is useful when using integer outputs to improve dynamic range utilization.
seed (int, optional, default = -1) –
Random seed.
If not provided, it will be populated based on the global seed of the pipeline.
shift (float, optional, default = 0.0) –
The value added to the (scaled) result.
This argument is useful when using unsigned integer outputs to improve dynamic range utilization.
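To see why scale and shift matter for integer outputs, consider how they change the range of values the formula produces. The sketch below uses hypothetical values (mean of 128, std of 64, and a scale and shift chosen to target the 0 to 255 range of UINT8) and only evaluates the formula in floating point; how the operator converts the result to the integer type is not modeled here.

```python
def normalized(x, mean, std, scale=1.0, shift=0.0):
    """Evaluate output = scale * (input - mean) / std + shift for one value."""
    return scale * (x - mean) / std + shift


mean, std = 128.0, 64.0

# Without scale and shift, 8-bit inputs map to a narrow, partly negative range.
plain_range = (normalized(0.0, mean, std), normalized(255.0, mean, std))
print(plain_range)  # (-2.0, 1.984375)

# A scale of 63.75 and a shift of 127.5 spread the same values over roughly 0..255.
scale, shift = 63.75, 127.5
scaled_range = (normalized(0.0, mean, std, scale, shift),
                normalized(255.0, mean, std, scale, shift))
print(scaled_range)  # (0.0, 254.00390625)
```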
std (float or list of float or TensorList of float, optional, default = [1.0]) – Standard deviation values for image normalization.
output_dtype (nvidia.dali.types.DALIDataType) –
Warning
The argument output_dtype is a deprecated alias for dtype. Use dtype instead.