Release Notes for NVIDIA NIM for Image OCR (NeMo Retriever OCR)#
This documentation contains the release notes for NVIDIA NIM for Image OCR (NeMo Retriever OCR).
Release 1.3.0#
Highlights#
Renamed existing models to the new Nemotron brand. The following model is impacted:
The nemoretriever-ocr-v1 model is now named nemotron-ocr-v1.
Added fixes for high-severity and critical vulnerabilities.
Added performance optimizations.
Added the following new environment variables. For details, refer to environment variables.
NIM_TRITON_DATA_MAX_BATCH_SIZE
NIM_TRITON_DYNAMIC_BATCHING_ENABLED
NIM_TRITON_ENABLE_ASYNC_MODEL_EXECUTION
NIM_TRITON_ENABLE_PIPELINE_TIMING
NIM_TRITON_GPU_DECODING_BATCH_THRESHOLD
NIM_TRITON_MODEL_MAX_BATCH_SIZE
NIM_TRITON_MODEL_MAX_QUEUE_DELAY_MICROSECONDS
NIM_TRITON_PIPELINE_MAX_BATCH_SIZE
NIM_TRITON_PIPELINE_MAX_QUEUE_DELAY_MICROSECONDS
NIM_TRITON_PIPELINE_TIMING_INTERVAL
NIM_TRITON_WORKER_INSTANCE_COUNT
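As an illustration, the new variables can be passed to the container at launch time. Only the environment-variable names below come from these release notes; the image path, tag, port, and the specific values are assumptions and should be adapted to your deployment.

```shell
# Illustrative launch sketch: variable names are from this release;
# image name, tag, port, and values are assumptions.
docker run --rm --gpus all \
  -p 8000:8000 \
  -e NGC_API_KEY \
  -e NIM_TRITON_MODEL_MAX_BATCH_SIZE=32 \
  -e NIM_TRITON_DYNAMIC_BATCHING_ENABLED=true \
  -e NIM_TRITON_WORKER_INSTANCE_COUNT=2 \
  nvcr.io/nim/nvidia/nemotron-ocr-v1:1.3.0
```

Refer to the environment variables documentation for the accepted values and defaults of each variable.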
Fixed Known Issues#
The following are the known issues that are fixed in this version:
Fixed an issue with the persistence.enabled helm chart value. Persistent storage options (persistence.storageClass, persistence.existingClaim, hostPath.enabled) are now fully functional.
Memory is now freed after inference requests complete.
NIM_TRITON_MODEL_INSTANCE_COUNT now controls only the number of model instances, not the number of pipeline workers.
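With the persistence fix, persistent storage can be configured through the chart values. The following values override is a sketch: the key names come from these notes, while the storage class name and claim name are assumptions for your cluster.

```yaml
# Illustrative helm values override; key names are from these notes,
# storageClass and claim names are assumptions.
persistence:
  enabled: true
  storageClass: standard
  # existingClaim: my-nim-cache   # alternatively, reuse an existing PVC
hostPath:
  enabled: false
```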
Known Issues#
Setting NIM_TRITON_ENABLE_MODEL_CONTROL=true causes a race condition in which model warmup is attempted before the models are loaded.
Release 1.2.1#
Summary#
This is a patch release of the Image OCR NIM (NeMo Retriever OCR).
Image OCR NIM (NeMo Retriever OCR) now selects the smallest sufficient TensorRT profile for the configured NIM_TRITON_MAX_BATCH_SIZE instead of loading all TensorRT profiles simultaneously.
HTTP responses with code 422 now have body formats that comply with the OpenAPI standard.
Added the NIM_TRITON_MAX_QUEUE_DELAY_MICROSECONDS alias for backward compatibility with the EA release of Image OCR NIM (NeMo Retriever OCR). For details, refer to environment variables.
Changed the NIM_TRITON_MAX_QUEUE_DELAY_MICROSECONDS default value from 0 to 100. For details, refer to environment variables.
Known Issues#
The persistence.enabled value and all related dependent configuration flags are currently non-functional in the NIM helm chart.
Release 1.2.0#
Summary#
This is the first General Access release of the NVIDIA NIM for Image OCR (NeMo Retriever OCR).
Added TensorRT (TRT) optimized engines for CUDA GPU Compute Capabilities 12.0, 10.0, 9.0, 8.9, 8.6, and 8.0.
The NIM_TRITON_OPTIMIZATION_MODE environment variable is no longer supported.
Known Issues#
The persistence.enabled value and all related dependent configuration flags are currently non-functional in the NIM helm chart.
Release 1.1.0#
Summary#
Upgraded to use Triton Inference Server 25.08 to address CVEs.
Added Triton Ensemble Configuration, which supports configuring the underlying Triton Ensemble model pipeline.
Added the NIM_TRITON_PINNED_MEMORY_POOL_MB environment variable.
Added the NIM_TRITON_ENABLE_MODEL_CONTROL environment variable.
Added the NIM_TRITON_IDLE_BYTES_LIMIT environment variable.
Added the NIM_TRITON_FLUSH_INTERVAL environment variable.
Added the NIM_TRITON_RATE_LIMIT environment variable.
Known Issues#
The persistence.enabled value and all related dependent configuration flags are currently non-functional in the NIM helm chart.
Release 1.0.0#
Summary#
This is the first Early Access release of the NVIDIA NIM for Image OCR (NeMo Retriever OCR).
Known Issues#
This release only supports a single GPU-agnostic PyTorch backend profile.