nvidia.dali.fn.resize_crop_mirror#

nvidia.dali.fn.resize_crop_mirror(
__input,
/,
*,
antialias=True,
bytes_per_sample_hint=[0],
crop=None,
crop_d=0.0,
crop_h=0.0,
crop_pos_x=0.5,
crop_pos_y=0.5,
crop_pos_z=0.5,
crop_w=0.0,
dtype=None,
interp_type=DALIInterpType.INTERP_LINEAR,
mag_filter=DALIInterpType.INTERP_LINEAR,
max_size=None,
min_filter=DALIInterpType.INTERP_LINEAR,
minibatch_size=32,
mirror=0,
mode='default',
preserve=False,
resize_longer=0.0,
resize_shorter=0.0,
resize_x=0.0,
resize_y=0.0,
resize_z=0.0,
roi_end=None,
roi_relative=False,
roi_start=None,
rounding='round',
seed=-1,
size=None,
subpixel_scale=True,
temp_buffer_hint=0,
device=None,
name=None,
)#

Performs a fused resize, crop, mirror operation.

The result of the operation is equivalent to applying resize, followed by crop and flip. Internally, the operator calculates the relevant region of interest and performs a single resizing operation on that region. .

This operator allows sequence inputs and supports volumetric data.

Supported backends
  • ‘cpu’

  • ‘gpu’

Parameters:

__input (TensorList ('HWC', 'FHWC', 'CHW', 'FCHW', 'CFHW', 'DHWC', 'FDHWC', 'CDHW', 'FCDHW', 'CFDHW')) – Input to the operator.

Keyword Arguments:
  • antialias (bool, optional, default = True) –

    If enabled, it applies an antialiasing filter when scaling down.

    Note

    Nearest neighbor interpolation does not support antialiasing.

  • bytes_per_sample_hint (int or list of int, optional, default = [0]) –

    Output size hint, in bytes per sample.

    If specified, the operator’s outputs residing in GPU or page-locked host memory will be preallocated to accommodate a batch of samples of this size.

  • crop (float or list of float or TensorList of float, optional) –

    Shape of the cropped image, specified as a list of values (for example, (crop_H, crop_W) for the 2D crop and (crop_D, crop_H, crop_W) for the volumetric crop).

    Providing crop argument is incompatible with providing separate arguments such as crop_d, crop_h, and crop_w.

  • crop_d (float or TensorList of float, optional, default = 0.0) –

    Applies only to volumetric inputs; cropping window depth (in voxels).

    crop_w, crop_h, and crop_d must be specified together. Providing values for crop_w, crop_h, and crop_d is incompatible with providing the fixed crop window dimensions (argument crop).

  • crop_h (float or TensorList of float, optional, default = 0.0) –

    Cropping the window height (in pixels).

    Providing values for crop_w and crop_h is incompatible with providing fixed crop window dimensions (argument crop).

  • crop_pos_x (float or TensorList of float, optional, default = 0.5) –

    Normalized (0.0 - 1.0) horizontal position of the cropping window (upper left corner).

    The actual position is calculated as crop_x = crop_x_norm * (W - crop_W), where crop_x_norm is the normalized position, W is the width of the image, and crop_W is the width of the cropping window.

    See rounding argument for more details on how crop_x is converted to an integral value.

  • crop_pos_y (float or TensorList of float, optional, default = 0.5) –

    Normalized (0.0 - 1.0) vertical position of the start of the cropping window (typically, the upper left corner).

    The actual position is calculated as crop_y = crop_y_norm * (H - crop_H), where crop_y_norm is the normalized position, H is the height of the image, and crop_H is the height of the cropping window.

    See rounding argument for more details on how crop_y is converted to an integral value.

  • crop_pos_z (float or TensorList of float, optional, default = 0.5) –

    Applies only to volumetric inputs.

    Normalized (0.0 - 1.0) normal position of the cropping window (front plane). The actual position is calculated as crop_z = crop_z_norm * (D - crop_D), where crop_z_norm is the normalized position, D is the depth of the image and crop_D is the depth of the cropping window.

    See rounding argument for more details on how crop_z is converted to an integral value.

  • crop_w (float or TensorList of float, optional, default = 0.0) –

    Cropping window width (in pixels).

    Providing values for crop_w and crop_h is incompatible with providing fixed crop window dimensions (argument crop).

  • dtype (nvidia.dali.types.DALIDataType, optional) –

    Output data type.

    Must be same as input type or float. If not set, input type is used.

  • interp_type (nvidia.dali.types.DALIInterpType or TensorList of nvidia.dali.types.DALIInterpType, optional, default = DALIInterpType.INTERP_LINEAR) –

    Type of interpolation to be used.

    Use min_filter and mag_filter to specify different filtering for downscaling and upscaling.

    Note

    Usage of INTERP_TRIANGULAR is now deprecated and it should be replaced by a combination of

    INTERP_LINEAR with antialias enabled.

  • mag_filter (nvidia.dali.types.DALIInterpType or TensorList of nvidia.dali.types.DALIInterpType, optional, default = DALIInterpType.INTERP_LINEAR) – Filter used when scaling up.

  • max_size (float or list of float, optional) –

    Limit of the output size.

    When the operator is configured to keep aspect ratio and only the smaller dimension is specified, the other(s) can grow very large. This can happen when using resize_shorter argument or “not_smaller” mode or when some extents are left unspecified.

    This parameter puts a limit to how big the output can become. This value can be specified per-axis or uniformly for all axes.

    Note

    When used with “not_smaller” mode or resize_shorter argument, max_size takes precedence and the aspect ratio is kept - for example, resizing with mode="not_smaller", size=800, max_size=1400 an image of size 1200x600 would be resized to 1400x700.

  • min_filter (nvidia.dali.types.DALIInterpType or TensorList of nvidia.dali.types.DALIInterpType, optional, default = DALIInterpType.INTERP_LINEAR) – Filter used when scaling down.

  • minibatch_size (int, optional, default = 32) – Maximum number of images that are processed in a kernel call.

  • mirror (int or TensorList of int, optional, default = 0) –

    Mask for flipping

    Supported values:

    • 0 - No flip

    • 1 - Horizontal flip

    • 2 - Vertical flip

    • 4 - Depthwise flip

    • any bitwise combination of the above

  • mode (str, optional, default = ‘default’) –

    Resize mode.

    Here is a list of supported modes:

    • "default" - image is resized to the specified size.
      Missing extents are scaled with the average scale of the provided ones.
    • "stretch" - image is resized to the specified size.
      Missing extents are not scaled at all.
    • "not_larger" - image is resized, keeping the aspect ratio, so that no extent of the output image exceeds the specified size.
      For example, a 1280x720, with a desired output size of 640x480, actually produces a 640x360 output.
    • "not_smaller" - image is resized, keeping the aspect ratio, so that no extent of the output image is smaller than specified.
      For example, a 640x480 image with a desired output size of 1920x1080, actually produces a 1920x1440 output.

      This argument is mutually exclusive with resize_longer and resize_shorter

  • preserve (bool, optional, default = False) – Prevents the operator from being removed from the graph even if its outputs are not used.

  • resize_longer (float or TensorList of float, optional, default = 0.0) –

    The length of the longer dimension of the resized image.

    This option is mutually exclusive with resize_shorter and explicit size arguments, and the operator keeps the aspect ratio of the original image. This option is equivalent to specifying the same size for all dimensions and mode="not_larger".

  • resize_shorter (float or TensorList of float, optional, default = 0.0) –

    The length of the shorter dimension of the resized image.

    This option is mutually exclusive with resize_longer and explicit size arguments, and the operator keeps the aspect ratio of the original image. This option is equivalent to specifying the same size for all dimensions and mode="not_smaller". The longer dimension can be bounded by setting the max_size argument. See max_size argument doc for more info.

  • resize_x (float or TensorList of float, optional, default = 0.0) –

    The length of the X dimension of the resized image.

    This option is mutually exclusive with resize_shorter, resize_longer and size. If the resize_y is unspecified or 0, the operator keeps the aspect ratio of the original image. A negative value flips the image.

  • resize_y (float or TensorList of float, optional, default = 0.0) –

    The length of the Y dimension of the resized image.

    This option is mutually exclusive with resize_shorter, resize_longer and size. If the resize_x is unspecified or 0, the operator keeps the aspect ratio of the original image. A negative value flips the image.

  • resize_z (float or TensorList of float, optional, default = 0.0) –

    The length of the Z dimension of the resized volume.

    This option is mutually exclusive with resize_shorter, resize_longer and size. If the resize_x and resize_y are left unspecified or 0, then the op will keep the aspect ratio of the original volume. Negative value flips the volume.

  • roi_end (float or list of float or TensorList of float, optional) –

    End of the input region of interest (ROI).

    Must be specified together with roi_start. The coordinates follow the tensor shape order, which is the same as size. The coordinates can be either absolute (in pixels, which is the default) or relative (0..1), depending on the value of relative_roi argument. If the ROI origin is greater than the ROI end in any dimension, the region is flipped in that dimension.

  • roi_relative (bool, optional, default = False) – If true, ROI coordinates are relative to the input size, where 0 denotes top/left and 1 denotes bottom/right

  • roi_start (float or list of float or TensorList of float, optional) –

    Origin of the input region of interest (ROI).

    Must be specified together with roi_end. The coordinates follow the tensor shape order, which is the same as size. The coordinates can be either absolute (in pixels, which is the default) or relative (0..1), depending on the value of relative_roi argument. If the ROI origin is greater than the ROI end in any dimension, the region is flipped in that dimension.

  • rounding (str, optional, default = ‘round’) –

    Determines the rounding function used to convert the starting coordinate of the window to an integral value (see crop_pos_x, crop_pos_y, crop_pos_z).

    Possible values are:

    • "round" - Rounds to the nearest integer value, with halfway cases rounded away from zero.
    • "truncate" - Discards the fractional part of the number (truncates towards zero).

  • seed (int, optional, default = -1) –

    Random seed.

    If not provided, it will be populated based on the global seed of the pipeline.

  • size (float or list of float or TensorList of float, optional) –

    The desired output size.

    Must be a list/tuple with one entry per spatial dimension, excluding video frames and channels. Dimensions with a 0 extent are treated as absent, and the output size will be calculated based on other extents and mode argument.

  • subpixel_scale (bool, optional, default = True) –

    If True, fractional sizes, directly specified or calculated, will cause the input ROI to be adjusted to keep the scale factor.

    Otherwise, the scale factor will be adjusted so that the source image maps to the rounded output size.

  • temp_buffer_hint (int, optional, default = 0) –

    Initial size in bytes, of a temporary buffer for resampling.

    Note

    This argument is ignored for the CPU variant.