nvidia.dali.fn.resize_crop_mirror

nvidia.dali.fn.resize_crop_mirror(*inputs, **kwargs)

Performs a fused resize, crop, mirror operation. Both fixed and random resizing and cropping are supported.

Supported backends
  • ‘cpu’

Parameters:

input (TensorList ('HWC')) – Input to the operator.

Keyword Arguments:
  • bytes_per_sample_hint (int or list of int, optional, default = [0]) –

    Output size hint, in bytes per sample.

    If specified, the operator’s outputs residing in GPU or page-locked host memory will be preallocated to accommodate a batch of samples of this size.

  • crop (float or list of float or TensorList of float, optional) –

    Shape of the cropped image, specified as a list of values (for example, (crop_H, crop_W) for the 2D crop and (crop_D, crop_H, crop_W) for the volumetric crop).

    Providing crop argument is incompatible with providing separate arguments such as crop_d, crop_h, and crop_w.

  • crop_d (float or TensorList of float, optional, default = 0.0) –

    Applies only to volumetric inputs; cropping window depth (in voxels).

    crop_w, crop_h, and crop_d must be specified together. Providing values for crop_w, crop_h, and crop_d is incompatible with providing the fixed crop window dimensions (argument crop).

  • crop_h (float or TensorList of float, optional, default = 0.0) –

    Cropping window height (in pixels).

    Providing values for crop_w and crop_h is incompatible with providing fixed crop window dimensions (argument crop).

  • crop_pos_x (float or TensorList of float, optional, default = 0.5) –

    Normalized (0.0 - 1.0) horizontal position of the cropping window (upper left corner).

    The actual position is calculated as crop_x = crop_x_norm * (W - crop_W), where crop_x_norm is the normalized position, W is the width of the image, and crop_W is the width of the cropping window.

    See rounding argument for more details on how crop_x is converted to an integral value.

  • crop_pos_y (float or TensorList of float, optional, default = 0.5) –

    Normalized (0.0 - 1.0) vertical position of the start of the cropping window (typically, the upper left corner).

    The actual position is calculated as crop_y = crop_y_norm * (H - crop_H), where crop_y_norm is the normalized position, H is the height of the image, and crop_H is the height of the cropping window.

    See rounding argument for more details on how crop_y is converted to an integral value.

  • crop_pos_z (float or TensorList of float, optional, default = 0.5) –

    Applies only to volumetric inputs.

    Normalized (0.0 - 1.0) depth position of the cropping window (front plane). The actual position is calculated as crop_z = crop_z_norm * (D - crop_D), where crop_z_norm is the normalized position, D is the depth of the image, and crop_D is the depth of the cropping window.

    See rounding argument for more details on how crop_z is converted to an integral value.
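The position formulas for crop_pos_x, crop_pos_y, and crop_pos_z share one shape, so a small per-axis sketch may help. The helper name crop_anchor is hypothetical and not part of DALI; it only reproduces the arithmetic stated above and returns the unrounded start coordinate.

```python
def crop_anchor(norm_pos, image_extent, crop_extent):
    """Unrounded start coordinate of the crop window along one axis."""
    if not 0.0 <= norm_pos <= 1.0:
        raise ValueError("normalized position must be in [0.0, 1.0]")
    # crop_x = crop_x_norm * (W - crop_W), and likewise for y and z.
    return norm_pos * (image_extent - crop_extent)


# A 640x480 (W x H) image with a 224x224 crop at the default position 0.5.
crop_x = crop_anchor(0.5, 640, 224)  # 0.5 * (640 - 224)
crop_y = crop_anchor(0.5, 480, 224)  # 0.5 * (480 - 224)
print(crop_x, crop_y)
```

A normalized position of 0.0 places the window at the start of the axis and 1.0 places it flush with the end, so 0.5 yields a centered crop.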

  • crop_w (float or TensorList of float, optional, default = 0.0) –

    Cropping window width (in pixels).

    Providing values for crop_w and crop_h is incompatible with providing fixed crop window dimensions (argument crop).

  • dtype (nvidia.dali.types.DALIDataType, optional) –

    Output data type.

    Supported types: FLOAT, FLOAT16, and UINT8.

    If not set, the input type is used.

  • fill_values (float or list of float, optional, default = [0.0]) –

    Determines padding values and is only relevant if out_of_bounds_policy is set to “pad”.

    If a scalar value is provided, it is used for all channels. If multiple values are provided, their number must match the number of channels (the extent of dimension C in the layout) in the output slice.
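The scalar-versus-list rule for fill_values can be sketched as follows. The function expand_fill_values is a hypothetical helper, not a DALI API, and treating a one-element list such as the default [0.0] like a scalar is an assumption of this sketch.

```python
def expand_fill_values(fill_values, num_channels):
    """Return one padding value per channel."""
    if isinstance(fill_values, (int, float)):
        # A scalar is used for every channel.
        return [float(fill_values)] * num_channels
    values = [float(v) for v in fill_values]
    if len(values) == 1:
        # Assumption: a single-element list behaves like a scalar.
        return values * num_channels
    if len(values) != num_channels:
        raise ValueError(
            f"got {len(values)} fill values for {num_channels} channels"
        )
    return values


print(expand_fill_values(0.0, 3))
print(expand_fill_values([255, 0, 0], 3))
```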

  • image_type (nvidia.dali.types.DALIImageType) –

    Warning

    The argument image_type is no longer used and will be removed in a future release.

  • interp_type (nvidia.dali.types.DALIInterpType, optional, default = DALIInterpType.INTERP_LINEAR) – Type of interpolation used.

  • max_size (float or list of float, optional) –

    Limit of the output size.

    When the operator is configured to keep aspect ratio and only the smaller dimension is specified, the other(s) can grow very large. This can happen when using resize_shorter argument or “not_smaller” mode or when some extents are left unspecified.

    This parameter puts a limit to how big the output can become. This value can be specified per-axis or uniformly for all axes.

    Note

    When used with “not_smaller” mode or resize_shorter argument, max_size takes precedence and the aspect ratio is kept. For example, with mode="not_smaller", size=800, and max_size=1400, a 1200x600 image is resized to 1400x700.
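The arithmetic behind the note can be reproduced with a short sketch. The function name is hypothetical, it handles only 2D sizes with a uniform size and max_size, and rounding the result with Python's round is an assumption; the operator's internal computation may differ in detail.

```python
def not_smaller_with_max(in_w, in_h, size, max_size):
    """Output (W, H) for mode="not_smaller" with a uniform size and max_size."""
    # "not_smaller": pick the scale at which no extent falls below `size`.
    scale = max(size / in_w, size / in_h)
    # max_size takes precedence: reduce the scale if any extent would exceed it.
    scale = min(scale, max_size / in_w, max_size / in_h)
    # The same scale is applied to both axes, so the aspect ratio is kept.
    return round(in_w * scale), round(in_h * scale)


print(not_smaller_with_max(1200, 600, size=800, max_size=1400))
```

Without the limit, the 1200x600 image would scale by 800/600 and become 1600x800; the limit caps the longer side at 1400, which leaves the shorter side at 700, below the requested 800.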

  • mirror (int or TensorList of int, optional, default = 0) –

    Mask for the horizontal flip.

    Supported values:

    • 0 - Do not perform horizontal flip for this image.

    • 1 - Performs horizontal flip for this image.
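Because mirror may be given per sample, each image in a batch can carry its own flag. The following plain-Python sketch illustrates that behavior on nested lists; apply_mirror is a hypothetical helper and does not reflect how DALI implements the flip.

```python
def apply_mirror(image, mirror):
    """Return a copy of `image` (a list of rows), flipped horizontally if mirror is 1."""
    if mirror not in (0, 1):
        raise ValueError("mirror must be 0 or 1")
    # A horizontal flip reverses the order of the columns within each row.
    return [row[::-1] if mirror else list(row) for row in image]


batch = [
    [[1, 2, 3], [4, 5, 6]],
    [[7, 8, 9], [10, 11, 12]],
]
masks = [1, 0]  # flip the first image, leave the second unchanged
flipped = [apply_mirror(img, m) for img, m in zip(batch, masks)]
print(flipped)
```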

  • mode (str, optional, default = ‘default’) –

    Resize mode.

    Here is a list of supported modes:

    • "default" - image is resized to the specified size.
      Missing extents are scaled with the average scale of the provided ones.
    • "stretch" - image is resized to the specified size.
      Missing extents are not scaled at all.
    • "not_larger" - image is resized, keeping the aspect ratio, so that no extent of the output image exceeds the specified size.
      For example, a 1280x720 image with a desired output size of 640x480 actually produces a 640x360 output.
    • "not_smaller" - image is resized, keeping the aspect ratio, so that no extent of the output image is smaller than specified.
      For example, a 640x480 image with a desired output size of 1920x1080, actually produces a 1920x1440 output.

    This argument is mutually exclusive with resize_longer and resize_shorter.
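The four modes can be compared with a small sketch of the size arithmetic. It is restricted to 2D sizes, where at most one extent can be missing and the "average scale" is therefore just the one provided scale. The function output_size is hypothetical, a requested extent of 0 stands for "missing", and rounding with Python's round is an assumption.

```python
def output_size(in_size, requested, mode="default"):
    """Compute the 2D output size; a requested extent of 0 means "missing"."""
    provided = [r / i for r, i in zip(requested, in_size) if r > 0]
    if not provided:
        raise ValueError("at least one extent must be specified")
    if mode == "default":
        # Missing extents follow the average scale of the provided ones.
        fill = sum(provided) / len(provided)
        scales = [r / i if r > 0 else fill for r, i in zip(requested, in_size)]
    elif mode == "stretch":
        # Missing extents are not scaled at all.
        scales = [r / i if r > 0 else 1.0 for r, i in zip(requested, in_size)]
    elif mode == "not_larger":
        # One common scale, small enough that no extent exceeds the request.
        scales = [min(provided)] * len(in_size)
    elif mode == "not_smaller":
        # One common scale, large enough that no extent is below the request.
        scales = [max(provided)] * len(in_size)
    else:
        raise ValueError(f"unknown mode: {mode}")
    return tuple(round(i * s) for i, s in zip(in_size, scales))


print(output_size((1280, 720), (640, 480), "not_larger"))
print(output_size((640, 480), (1920, 1080), "not_smaller"))
print(output_size((1280, 720), (640, 0), "default"))
print(output_size((1280, 720), (640, 0), "stretch"))
```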

  • out_of_bounds_policy (str, optional, default = ‘error’) –

    Determines the policy applied when the slice window extends beyond the bounds of the input.

    Here is a list of the supported values:

    • "error" (default): Attempting to slice outside of the bounds of the input will produce an error.

    • "pad": The input will be padded as needed with zeros or any other value that is specified with the fill_values argument.

    • "trim_to_shape": The slice window will be cut to the bounds of the input.
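The three policies are easiest to compare on a single axis. The sketch below slices a plain Python list; slice_1d is a hypothetical helper, and the operator itself applies the same idea to every cropped dimension.

```python
def slice_1d(data, start, length, policy="error", fill_value=0):
    """Slice `length` elements starting at `start`, applying an out-of-bounds policy."""
    end = start + length
    if 0 <= start and end <= len(data):
        return data[start:end]
    if policy == "error":
        raise ValueError("slice window is out of bounds")
    if policy == "pad":
        # Positions outside the input are filled with fill_value.
        return [data[i] if 0 <= i < len(data) else fill_value
                for i in range(start, end)]
    if policy == "trim_to_shape":
        # The window is cut down to the part that overlaps the input.
        return data[max(start, 0):min(end, len(data))]
    raise ValueError(f"unknown policy: {policy}")


row = [10, 20, 30, 40]
print(slice_1d(row, 2, 4, "pad"))
print(slice_1d(row, 2, 4, "trim_to_shape"))
```

Note that "pad" preserves the requested output shape, while "trim_to_shape" can return fewer elements than requested.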

  • preserve (bool, optional, default = False) – Prevents the operator from being removed from the graph even if its outputs are not used.

  • resize_longer (float or TensorList of float, optional, default = 0.0) –

    The length of the longer dimension of the resized image.

    This option is mutually exclusive with resize_shorter and explicit size arguments, and the operator keeps the aspect ratio of the original image. This option is equivalent to specifying the same size for all dimensions and mode="not_larger".

  • resize_shorter (float or TensorList of float, optional, default = 0.0) –

    The length of the shorter dimension of the resized image.

    This option is mutually exclusive with resize_longer and explicit size arguments, and the operator keeps the aspect ratio of the original image. This option is equivalent to specifying the same size for all dimensions and mode="not_smaller". The longer dimension can be bounded by setting the max_size argument. See max_size argument doc for more info.
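The resize_longer and resize_shorter arguments both reduce to picking one scale from a single target length. A minimal 2D sketch, with hypothetical helper names and Python's round as an assumed rounding step:

```python
def resize_longer(in_w, in_h, target):
    """Scale so that the longer side equals `target`, keeping the aspect ratio."""
    scale = target / max(in_w, in_h)
    return round(in_w * scale), round(in_h * scale)


def resize_shorter(in_w, in_h, target):
    """Scale so that the shorter side equals `target`, keeping the aspect ratio."""
    scale = target / min(in_w, in_h)
    return round(in_w * scale), round(in_h * scale)


print(resize_longer(1280, 720, 640))
print(resize_shorter(1280, 720, 360))
```

For a very elongated input, resize_shorter can produce a large longer side, which is the case the max_size argument is meant to bound.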

  • resize_x (float or TensorList of float, optional, default = 0.0) –

    The length of the X dimension of the resized image.

    This option is mutually exclusive with resize_shorter, resize_longer and size. If the resize_y is unspecified or 0, the operator keeps the aspect ratio of the original image. A negative value flips the image.

  • resize_y (float or TensorList of float, optional, default = 0.0) –

    The length of the Y dimension of the resized image.

    This option is mutually exclusive with resize_shorter, resize_longer and size. If the resize_x is unspecified or 0, the operator keeps the aspect ratio of the original image. A negative value flips the image.

  • resize_z (float or TensorList of float, optional, default = 0.0) –

    The length of the Z dimension of the resized volume.

    This option is mutually exclusive with resize_shorter, resize_longer and size. If resize_x and resize_y are left unspecified or 0, the operator keeps the aspect ratio of the original volume. A negative value flips the volume.
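How the per-axis arguments interact can be sketched for the 2D case. The helper resolve_xy is hypothetical; it treats 0 as "unspecified", derives the missing extent from the aspect ratio, and reports a negative value as a flip along that axis, following the descriptions above.

```python
def resolve_xy(in_w, in_h, resize_x=0.0, resize_y=0.0):
    """Return ((out_w, out_h), (flip_x, flip_y)) for a 2D input."""
    if resize_x == 0 and resize_y == 0:
        raise ValueError("specify resize_x, resize_y, or both")
    if resize_y == 0:
        # Only X given: derive Y so that the aspect ratio is kept.
        resize_y = abs(resize_x) * in_h / in_w
    elif resize_x == 0:
        # Only Y given: derive X so that the aspect ratio is kept.
        resize_x = abs(resize_y) * in_w / in_h
    flips = (resize_x < 0, resize_y < 0)
    return (round(abs(resize_x)), round(abs(resize_y))), flips


print(resolve_xy(1280, 720, resize_x=640))
print(resolve_xy(1280, 720, resize_x=-640, resize_y=480))
```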

  • roi_end (float or list of float or TensorList of float, optional) –

    End of the input region of interest (ROI).

    Must be specified together with roi_start. The coordinates follow the tensor shape order, which is the same as size. The coordinates can be either absolute (in pixels, which is the default) or relative (0..1), depending on the value of the roi_relative argument. If the ROI origin is greater than the ROI end in any dimension, the region is flipped in that dimension.

  • roi_relative (bool, optional, default = False) – If True, ROI coordinates are relative to the input size, where 0 denotes top/left and 1 denotes bottom/right.

  • roi_start (float or list of float or TensorList of float, optional) –

    Origin of the input region of interest (ROI).

    Must be specified together with roi_end. The coordinates follow the tensor shape order, which is the same as size. The coordinates can be either absolute (in pixels, which is the default) or relative (0..1), depending on the value of the roi_relative argument. If the ROI origin is greater than the ROI end in any dimension, the region is flipped in that dimension.
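The relationship between roi_start, roi_end, and roi_relative can be illustrated with a sketch that converts the ROI to absolute coordinates and detects flipped axes. The function resolve_roi is hypothetical, and coordinates are given in tensor shape order, here (H, W).

```python
def resolve_roi(roi_start, roi_end, in_shape, roi_relative=False):
    """Return one (lo, hi, flipped) triple per axis, in absolute pixel coordinates."""
    result = []
    for start, end, extent in zip(roi_start, roi_end, in_shape):
        if roi_relative:
            # Relative coordinates: 0 is top/left, 1 is bottom/right.
            start, end = start * extent, end * extent
        # An origin greater than the end means the axis is flipped.
        flipped = start > end
        lo, hi = (end, start) if flipped else (start, end)
        result.append((lo, hi, flipped))
    return result


# 480x640 (H x W) input: middle half vertically, full width mirrored horizontally.
print(resolve_roi((0.25, 1.0), (0.75, 0.0), (480, 640), roi_relative=True))
```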

  • rounding (str, optional, default = ‘round’) –

    Determines the rounding function used to convert the starting coordinate of the window to an integral value (see crop_pos_x, crop_pos_y, crop_pos_z).

    Possible values are:

    • "round" - Rounds to the nearest integer value, with halfway cases rounded away from zero.
    • "truncate" - Discards the fractional part of the number (truncates towards zero).
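The two rounding options differ only for fractional coordinates, and "round" as described here is not the same as Python's built-in round, which sends halfway cases to the nearest even integer. A sketch with a hypothetical helper name:

```python
import math


def to_integral(value, rounding="round"):
    """Convert a window start coordinate to an integer."""
    if rounding == "round":
        # Nearest integer, with halfway cases rounded away from zero.
        return int(math.copysign(math.floor(abs(value) + 0.5), value))
    if rounding == "truncate":
        # Drop the fractional part (towards zero).
        return math.trunc(value)
    raise ValueError(f"unknown rounding: {rounding}")


# crop_x = 0.5 * (229 - 224) = 2.5 for a 229-pixel-wide image and a 224-pixel crop.
crop_x = 0.5 * (229 - 224)
print(to_integral(crop_x, "round"), to_integral(crop_x, "truncate"))
```

In this example the two options place the crop window one pixel apart, which can matter when comparing outputs against another framework's center crop.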

  • seed (int, optional, default = -1) –

    Random seed.

    If not provided, it will be populated based on the global seed of the pipeline.

  • size (float or list of float or TensorList of float, optional) –

    The desired output size.

    Must be a list/tuple with one entry per spatial dimension, excluding video frames and channels. Dimensions with a 0 extent are treated as absent, and the output size will be calculated based on other extents and mode argument.

  • subpixel_scale (bool, optional, default = True) –

    If True, fractional sizes, directly specified or calculated, will cause the input ROI to be adjusted to keep the scale factor.

    Otherwise, the scale factor will be adjusted so that the source image maps to the rounded output size.
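The effect of subpixel_scale on one axis can be sketched as follows. This is an illustration of the description above, not the operator's code: the helper resolve_axis is hypothetical, the ROI is assumed to start as the full input extent, and keeping the adjusted ROI centered on the original one is an assumption of this sketch.

```python
def resolve_axis(in_extent, requested_out, subpixel_scale=True):
    """Return (output_extent, scale, (roi_lo, roi_hi)) for one axis."""
    out = max(1, round(requested_out))  # the output extent must be integral
    roi_lo, roi_hi = 0.0, float(in_extent)
    if subpixel_scale:
        # Keep the requested scale and adjust the ROI to match the rounded size.
        scale = requested_out / in_extent
        roi_extent = out / scale
        center = in_extent / 2
        roi_lo, roi_hi = center - roi_extent / 2, center + roi_extent / 2
    else:
        # Keep the ROI and adjust the scale so the source maps to the rounded size.
        scale = out / in_extent
    return out, scale, (roi_lo, roi_hi)


# A 15-pixel axis with a requested (fractional) output extent of 3.75.
print(resolve_axis(15, 3.75, subpixel_scale=True))
print(resolve_axis(15, 3.75, subpixel_scale=False))
```

With subpixel_scale enabled, the scale stays at 0.25 and the ROI grows slightly past the input; with it disabled, the ROI is unchanged and the scale becomes 4/15.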

  • output_dtype (nvidia.dali.types.DALIDataType) –

    Warning

    The argument output_dtype is a deprecated alias for dtype. Use dtype instead.