nvidia.dali.fn.decoders.image_crop
nvidia.dali.fn.decoders.image_crop(*inputs, **kwargs)
- Decodes images and extracts regions of interest (ROI) that are specified by fixed window dimensions and variable anchors.
- When possible, the operator uses the ROI decoding APIs (for example, libjpeg-turbo and nvJPEG) to reduce the decoding time and memory usage. When ROI decoding is not supported for a given image format, the operator decodes the entire image and then crops the selected ROI.
- The output of the decoder is in HWC layout.
- Supported formats: JPG, BMP, PNG, TIFF, PNM, PPM, PGM, PBM, JPEG 2000, WebP.
- Note - JPEG 2000 ROI decoding is not accelerated on the GPU and uses a CPU implementation regardless of the selected backend. For a GPU-accelerated implementation, consider using separate decoders.image and crop operators.
- Note - EXIF orientation metadata is disregarded.
- Supported backends
- ‘cpu’ 
- ‘mixed’ 
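A minimal usage sketch of the operator described above, assuming a directory of image files laid out for fn.readers.file. The helper name build_crop_pipeline, the image_dir argument, the batch and thread settings, and the 224 x 224 window are illustrative assumptions, not part of this reference. It pairs a fixed window (crop) with per-sample random anchors (crop_pos_x, crop_pos_y):

```python
def build_crop_pipeline(image_dir, batch_size=8, num_threads=4, device_id=0):
    """Hypothetical helper: returns a DALI pipeline that decodes and crops images."""
    # Imported lazily so this module can be loaded on machines without DALI.
    from nvidia.dali import fn, pipeline_def, types

    @pipeline_def(batch_size=batch_size, num_threads=num_threads, device_id=device_id)
    def pipeline():
        # Encoded files and their labels; image_dir is an assumed dataset location.
        encoded, labels = fn.readers.file(file_root=image_dir, random_shuffle=True)

        # Variable anchors: one normalized position per sample, drawn from [0, 1).
        pos_x = fn.random.uniform(range=[0.0, 1.0])
        pos_y = fn.random.uniform(range=[0.0, 1.0])

        # Fixed window dimensions given as (crop_H, crop_W).
        images = fn.decoders.image_crop(
            encoded,
            device="mixed",          # use device="cpu" for the CPU backend
            output_type=types.RGB,
            crop=(224, 224),
            crop_pos_x=pos_x,
            crop_pos_y=pos_y,
        )
        return images, labels

    return pipeline()
```

Calling build() and then run() on the returned pipeline should yield one batch of HWC images and labels. With device="mixed" the images are placed in GPU memory, so this variant needs a CUDA-capable device.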
 
 - Parameters
- input (TensorList) – Input to the operator. 
- Keyword Arguments
- affine (bool, optional, default = True) – - Applies only to the mixed backend type. - If set to True, each thread in the internal thread pool will be tied to a specific CPU core. Otherwise, the threads can be reassigned to any CPU core by the operating system.
- bytes_per_sample_hint (int or list of int, optional, default = [0]) – - Output size hint, in bytes per sample. - If specified, the operator’s outputs residing in GPU or page-locked host memory will be preallocated to accommodate a batch of samples of this size. 
- crop (float or list of float, optional, default = [0.0, 0.0]) – - Shape of the cropped image, specified as a list of values (for example, (crop_H, crop_W) for the 2D crop and (crop_D, crop_H, crop_W) for the volumetric crop). - Providing the crop argument is incompatible with providing separate arguments such as crop_d, crop_h, and crop_w.
- crop_d (float or TensorList of float, optional, default = 0.0) – - Applies only to volumetric inputs; cropping window depth (in voxels). - crop_w, crop_h, and crop_d must be specified together. Providing values for crop_w, crop_h, and crop_d is incompatible with providing the fixed crop window dimensions (argument crop).
- crop_h (float or TensorList of float, optional, default = 0.0) – - Cropping window height (in pixels). - Providing values for crop_w and crop_h is incompatible with providing fixed crop window dimensions (argument crop).
- crop_pos_x (float or TensorList of float, optional, default = 0.5) – - Normalized (0.0 - 1.0) horizontal position of the cropping window (upper left corner). - The actual position is calculated as crop_x = crop_x_norm * (W - crop_W), where crop_x_norm is the normalized position, W is the width of the image, and crop_W is the width of the cropping window.
- crop_pos_y (float or TensorList of float, optional, default = 0.5) – - Normalized (0.0 - 1.0) vertical position of the start of the cropping window (typically, the upper left corner). - The actual position is calculated as crop_y = crop_y_norm * (H - crop_H), where crop_y_norm is the normalized position, H is the height of the image, and crop_H is the height of the cropping window.
- crop_pos_z (float or TensorList of float, optional, default = 0.5) – - Applies only to volumetric inputs. - Normalized (0.0 - 1.0) normal position of the cropping window (front plane). The actual position is calculated as crop_z = crop_z_norm * (D - crop_D), where crop_z_norm is the normalized position, D is the depth of the image, and crop_D is the depth of the cropping window.
- crop_w (float or TensorList of float, optional, default = 0.0) – - Cropping window width (in pixels). - Providing values for crop_w and crop_h is incompatible with providing fixed crop window dimensions (argument crop).
- device_memory_padding (int, optional, default = 16777216) – - Applies only to the mixed backend type. - The padding for nvJPEG’s device memory allocations, in bytes. This parameter helps to avoid reallocation in nvJPEG when a larger image is encountered and the internal buffer needs to be reallocated to decode the image. - If a value greater than 0 is provided, the operator preallocates one device buffer of the requested size per thread. If the value is correctly selected, no additional allocations will occur during the pipeline execution. One way to find the ideal value is to do a complete run over the dataset with the memory_stats argument set to True and then copy the largest allocation value that is printed in the statistics.
- device_memory_padding_jpeg2k (int, optional, default = 0) – - Applies only to the mixed backend type. - The padding for nvJPEG2k’s device memory allocations, in bytes. This parameter helps to avoid reallocation in nvJPEG2k when a larger image is encountered and the internal buffer needs to be reallocated to decode the image. - If a value greater than 0 is provided, the operator preallocates the necessary number of buffers according to the hint provided. If the value is correctly selected, no additional allocations will occur during the pipeline execution. One way to find the ideal value is to do a complete run over the dataset with the memory_stats argument set to True and then copy the largest allocation value that is printed in the statistics.
- host_memory_padding (int, optional, default = 8388608) – - Applies only to the mixed backend type. - The padding for nvJPEG’s host memory allocations, in bytes. This parameter helps to avoid reallocation in nvJPEG when a larger image is encountered and the internal buffer needs to be reallocated to decode the image. - If a value greater than 0 is provided, the operator preallocates two (because of double-buffering) host-pinned buffers of the requested size per thread. If the value is correctly selected, no additional allocations will occur during the pipeline execution. One way to find the ideal value is to do a complete run over the dataset with the memory_stats argument set to True and then copy the largest allocation value that is printed in the statistics.
- host_memory_padding_jpeg2k (int, optional, default = 0) – - Applies only to the mixed backend type. - The padding for nvJPEG2k’s host memory allocations, in bytes. This parameter helps to avoid reallocation in nvJPEG2k when a larger image is encountered and the internal buffer needs to be reallocated to decode the image. - If a value greater than 0 is provided, the operator preallocates the necessary number of buffers according to the hint provided. If the value is correctly selected, no additional allocations will occur during the pipeline execution. One way to find the ideal value is to do a complete run over the dataset with the memory_stats argument set to True and then copy the largest allocation value that is printed in the statistics.
- hw_decoder_load (float, optional, default = 0.65) – - The percentage of the image data to be processed by the HW JPEG decoder. - Applies only to the mixed backend type in NVIDIA Ampere GPU architecture. - Determines the percentage of the workload that will be offloaded to the hardware decoder, if available. The optimal workload depends on the number of threads that are provided to the DALI pipeline and should be found empirically. More details can be found at https://developer.nvidia.com/blog/loading-data-fast-with-dali-and-new-jpeg-decoder-in-a100
- hybrid_huffman_threshold (int, optional, default = 1000000) – - Applies only to the mixed backend type. - Images with a total number of pixels (height * width) that is higher than this threshold will use the nvJPEG hybrid Huffman decoder. Images that have fewer pixels will use the nvJPEG host-side Huffman decoder. - Note - The hybrid Huffman decoder still largely uses the CPU.
- memory_stats (bool, optional, default = False) – - Applies only to the mixed backend type. - Prints debug information about nvJPEG allocations. The information about the largest allocation might be useful to determine suitable values for device_memory_padding and host_memory_padding for a dataset. - Note - The statistics are global for the entire process, not per operator instance, and include the allocations made during construction if the padding hints are non-zero.
- output_type (nvidia.dali.types.DALIImageType, optional, default = DALIImageType.RGB) – - The color space of the output image. - Note: When decoding to YCbCr, the image will be decoded to RGB and then converted to YCbCr, following the YCbCr definition from ITU-R BT.601. 
- preallocate_height_hint (int, optional, default = 0) – - Image height hint. - Applies only to the mixed backend type in NVIDIA Ampere GPU architecture. - The hint is used to preallocate memory for the HW JPEG decoder.
- preallocate_width_hint (int, optional, default = 0) – - Image width hint. - Applies only to the mixed backend type in NVIDIA Ampere GPU architecture. - The hint is used to preallocate memory for the HW JPEG decoder.
- preserve (bool, optional, default = False) – Prevents the operator from being removed from the graph even if its outputs are not used. 
- seed (int, optional, default = -1) – - Random seed. - If not provided, it will be populated based on the global seed of the pipeline. 
- split_stages (bool) – - Warning - The argument split_stages is no longer used and will be removed in a future release.
- use_chunk_allocator (bool) – - Warning - The argument use_chunk_allocator is no longer used and will be removed in a future release.
- use_fast_idct (bool, optional, default = False) – - Enables fast IDCT in the libjpeg-turbo based CPU decoder, used when device is set to “cpu” or when it is set to “mixed” but the particular image cannot be handled by the GPU implementation. - According to the libjpeg-turbo documentation, decompression performance is improved by up to 14% with little reduction in quality.
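The anchor formulas given for crop_pos_x, crop_pos_y, and crop_pos_z can be checked with a small, library-free sketch. The function name crop_anchor and its error handling are assumptions made for illustration; how the operator itself rounds the result or treats a window larger than the image is not covered here.

```python
def crop_anchor(image_shape, crop_shape, crop_pos):
    """Apply anchor = pos_norm * (extent - crop_extent) along each axis.

    image_shape and crop_shape use the same axis order, for example
    (H, W) or (D, H, W); crop_pos holds the normalized positions.
    """
    if not (len(image_shape) == len(crop_shape) == len(crop_pos)):
        raise ValueError("all arguments must have the same number of axes")

    anchors = []
    for extent, window, pos in zip(image_shape, crop_shape, crop_pos):
        if not 0.0 <= pos <= 1.0:
            raise ValueError("normalized positions must lie in [0.0, 1.0]")
        if window > extent:
            # Assumption of this sketch only: reject windows larger than the image.
            raise ValueError("cropping window exceeds the image extent")
        anchors.append(pos * (extent - window))
    return tuple(anchors)


# A 480 x 640 image with a 224 x 224 window at the default position (0.5, 0.5):
# crop_y = 0.5 * (480 - 224) = 128.0 and crop_x = 0.5 * (640 - 224) = 208.0
print(crop_anchor((480, 640), (224, 224), (0.5, 0.5)))
```

A position of 0.0 places the window at the start of the axis, 1.0 places it flush against the far edge, and the default of 0.5 centers it.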