Release Notes for NVIDIA NIM for Image OCR (NeMo Retriever OCR)

This documentation contains the release notes for NVIDIA NIM for Image OCR (NeMo Retriever OCR).

Release 1.3.0

Highlights

  • Renamed existing models to the new Nemotron brand. The impacted models are the following:

    • The nemoretriever-ocr-v1 model is now named nemotron-ocr-v1.

  • Added fixes for high-severity and critical-severity vulnerabilities.

  • Added performance optimizations.

  • Added the following new environment variables. For details, refer to environment variables.

    • NIM_TRITON_DATA_MAX_BATCH_SIZE

    • NIM_TRITON_DYNAMIC_BATCHING_ENABLED

    • NIM_TRITON_ENABLE_ASYNC_MODEL_EXECUTION

    • NIM_TRITON_ENABLE_PIPELINE_TIMING

    • NIM_TRITON_GPU_DECODING_BATCH_THRESHOLD

    • NIM_TRITON_MODEL_MAX_BATCH_SIZE

    • NIM_TRITON_MODEL_MAX_QUEUE_DELAY_MICROSECONDS

    • NIM_TRITON_PIPELINE_MAX_BATCH_SIZE

    • NIM_TRITON_PIPELINE_MAX_QUEUE_DELAY_MICROSECONDS

    • NIM_TRITON_PIPELINE_TIMING_INTERVAL

    • NIM_TRITON_WORKER_INSTANCE_COUNT
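Client code that still refers to the old model name can be updated with a small lookup table built from the rename listed above. The following is a minimal sketch, not part of the NIM itself: the helper names `migrate_model_name` and `migrate_payload` are hypothetical, and the assumption that a request carries the model name in a `model` field is not confirmed by these notes.

```python
# Hypothetical client-side helper; not shipped with the NIM.
# Maps pre-1.3.0 model names to the Nemotron names listed in these notes.
RENAMED_MODELS = {
    "nemoretriever-ocr-v1": "nemotron-ocr-v1",
}


def migrate_model_name(name: str) -> str:
    """Return the 1.3.0 name for a model, or the name unchanged if it was not renamed."""
    return RENAMED_MODELS.get(name, name)


def migrate_payload(payload: dict) -> dict:
    """Return a copy of a request payload with its "model" field updated.

    Assumption: the payload stores the model name under the key "model".
    Payloads without that key are returned unchanged (as a shallow copy).
    """
    updated = dict(payload)
    if "model" in updated:
        updated["model"] = migrate_model_name(updated["model"])
    return updated
```

Keeping the mapping in one dictionary means any future renames can be added in a single place without touching the calling code.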

Fixed Known Issues

The following are the known issues that are fixed in this version:

  • Fixed an issue with the persistence.enabled Helm chart value. Persistent storage options (persistence.storageClass, persistence.existingClaim, hostPath.enabled) are now fully functional.

  • Memory is now freed after inference requests complete.

  • NIM_TRITON_MODEL_INSTANCE_COUNT now controls only the number of model instances, not the number of pipeline workers.

Known Issues

  • Setting NIM_TRITON_ENABLE_MODEL_CONTROL=true causes a race condition in which model warmup is attempted before the models are loaded.

Release 1.2.1

Summary

  • This is a patch release of the Image OCR NIM (NeMo Retriever OCR).

  • Image OCR NIM (NeMo Retriever OCR) now selects the smallest sufficient TensorRT profile for the configured NIM_TRITON_MAX_BATCH_SIZE instead of loading all TensorRT profiles simultaneously.

  • HTTP responses with code 422 now have body formats that comply with the OpenAPI standard.

  • Added the NIM_TRITON_MAX_QUEUE_DELAY_MICROSECONDS alias for backward compatibility with the EA release of Image OCR NIM (NeMo Retriever OCR). For details, refer to environment variables.

  • Changed the NIM_TRITON_MAX_QUEUE_DELAY_MICROSECONDS default value from 0 to 100. For details, refer to environment variables.
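The profile selection rule described above ("smallest sufficient" for the configured NIM_TRITON_MAX_BATCH_SIZE) can be illustrated with a short sketch. This is not the NIM's implementation: the function name `select_profile` is hypothetical, each profile is represented only by the maximum batch size it supports (an assumption), and returning `None` when no profile is large enough is a placeholder, since these notes do not say what the NIM does in that case.

```python
from typing import Optional, Sequence


def select_profile(
    profile_max_batch_sizes: Sequence[int],
    configured_max_batch_size: int,
) -> Optional[int]:
    """Pick the smallest profile that can serve the configured batch size.

    Each entry in profile_max_batch_sizes is the largest batch a profile
    supports (an assumed representation). Returns that value for the chosen
    profile, or None if no profile is large enough.
    """
    # Keep only profiles that can hold a batch of the configured size.
    sufficient = [
        size for size in profile_max_batch_sizes
        if size >= configured_max_batch_size
    ]
    # Among those, the smallest one is selected.
    return min(sufficient) if sufficient else None
```

For example, with hypothetical profiles supporting batch sizes 1, 8, and 32, a configured value of 4 selects the profile for 8 rather than loading all three.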

Known Issues

  • The persistence.enabled value and all dependent configuration values are currently non-functional in the NIM Helm chart.

Release 1.2.0

Summary

  • This is the first General Access release of the NVIDIA NIM for Image OCR (NeMo Retriever OCR).

  • Added TensorRT-optimized engines for GPUs with the following CUDA compute capabilities: 12.0, 10.0, 9.0, 8.9, 8.6, and 8.0.

  • The NIM_TRITON_OPTIMIZATION_MODE environment variable is no longer supported.
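To check whether a given GPU falls in the compute capability list above, a lookup like the following can be used. It is a minimal sketch under stated assumptions: the function names are hypothetical, the compute capability is supplied as a string such as "8.9", and an exact match against the listed values is assumed. These notes do not state what happens on GPUs outside the list.

```python
from typing import Tuple

# Compute capabilities listed in the 1.2.0 release notes.
SUPPORTED_COMPUTE_CAPABILITIES = {
    (12, 0), (10, 0), (9, 0), (8, 9), (8, 6), (8, 0),
}


def parse_compute_capability(text: str) -> Tuple[int, int]:
    """Parse a string such as "8.9" into a (major, minor) tuple."""
    major, minor = text.strip().split(".")
    return int(major), int(minor)


def has_optimized_engine(compute_capability: str) -> bool:
    """Return True if the capability exactly matches one listed above.

    Assumption: matching is exact; the notes do not describe any
    compatibility between neighboring compute capabilities.
    """
    return parse_compute_capability(compute_capability) in SUPPORTED_COMPUTE_CAPABILITIES
```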

Known Issues

  • The persistence.enabled value and all dependent configuration values are currently non-functional in the NIM Helm chart.

Release 1.1.0

Summary

Known Issues

  • The persistence.enabled value and all dependent configuration values are currently non-functional in the NIM Helm chart.

Release 1.0.0

Summary

This is the first Early Access release of the NVIDIA NIM for Image OCR (NeMo Retriever OCR).

Known Issues

  • This release only supports a single GPU-agnostic PyTorch backend profile.