Release Notes for NeMo Retriever Text Reranking NIM#

This documentation contains the release notes for NeMo Retriever Text Reranking NIM.

Release 1.6.0#

Summary#

Added support for the Llama-3.2-nemoretriever-500m-rerank-v2 reranking model.

Release 1.5.0#

Summary#

Added support for the B200 GPU.

Known Issues#

The list-model-profiles command incorrectly lists compatible model profiles as incompatible. Select the profile that matches your hardware configuration. This bug does not impact automatic profile selection.
A slight performance degradation has been observed since the 1.3.1 release.

Release 1.4.0#

Summary#

Added support for configurable memory footprint by allowing users to set batch size and sequence length.
Added the NIM_TRITON_MAX_BATCH_SIZE environment variable.
Reduced container image sizes.
Removed model profiles for the A100 PCIe 40GB and H100 PCIe 80GB configurations.
Fixed a bug where the list-model-profiles command failed to run on hosts that don’t have an NVIDIA GPU, even when NIM_CPU_ONLY is set.
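
As an illustration of how the NIM_TRITON_MAX_BATCH_SIZE variable might be passed to the container, the following minimal sketch assembles a `docker run` command line. Only the variable name comes from these notes. The image reference, the value 16, the published port, and the helper name `build_docker_run_args` are hypothetical placeholders, so consult the deployment documentation for real values.

```python
import shlex


def build_docker_run_args(image, env, port=8000):
    """Return a `docker run` argument list that passes each variable with -e."""
    # Assumption: the service is published on a single port (8000 here).
    args = ["docker", "run", "--rm", "--gpus", "all", "-p", f"{port}:{port}"]
    for name, value in sorted(env.items()):
        args += ["-e", f"{name}={value}"]
    args.append(image)
    return args


# Hypothetical image reference and batch size; substitute your own values.
IMAGE = "example.invalid/text-reranking-nim:1.4.0"
args = build_docker_run_args(IMAGE, {"NIM_TRITON_MAX_BATCH_SIZE": 16})

# Print the command for review instead of executing it.
print(shlex.join(args))
```

Building the command as a list keeps each argument separate, so it can be handed to `subprocess.run` without shell quoting issues.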

Known Issues#

The list-model-profiles command incorrectly lists compatible model profiles as incompatible. Select the profile that matches your hardware configuration. This bug does not impact automatic profile selection.
A slight performance degradation has been observed since the 1.3.1 release.

Release 1.3.1#

Added the NIM_SERVED_MODEL_NAME environment variable.
Updated the LangChain Playbook to use the Llama-3.2-NV-RerankQA-1B-v2 NIM.

Release 1.3.0#

Added support for the Llama-3.2-NV-RerankQA-1B-v2 reranking model.
Added the NIM_NUM_MODEL_INSTANCES and NIM_NUM_TOKENIZERS environment variables.
Added support for dynamic batching in the underlying Triton Inference Server process.

Known Issues#

The current version of langchain-nvidia-ai-endpoints used in the LangChain playbook is not compatible with the Llama-3.2-NV-RerankQA-1B-v2 NIM.

Release 1.0.2#

Improved accuracy for the model when running on A100 and A10G GPUs.

Release 1.0.1#

Added support for NGC Personal/Service API keys in addition to the NGC API Key (Original).
NGC_API_KEY is no longer required when running a container with a pre-populated cache (NIM_CACHE_PATH).
Updated the list-model-profiles command to check the correct location for model artifacts.
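
The relaxed NGC_API_KEY requirement can be mirrored in a deployment script with a small preflight check. The sketch below is not the NIM's own logic. It assumes that "pre-populated" means the directory named by NIM_CACHE_PATH exists and is non-empty, and the function names `needs_ngc_api_key` and `check_env` are hypothetical.

```python
from pathlib import Path


def needs_ngc_api_key(env):
    """Return True when no pre-populated cache is found in the given environment."""
    cache_path = env.get("NIM_CACHE_PATH")
    if not cache_path:
        return True
    cache_dir = Path(cache_path)
    # Assumption: a cache counts as pre-populated if it is a non-empty directory.
    return not (cache_dir.is_dir() and any(cache_dir.iterdir()))


def check_env(env):
    """Raise an error if a key is needed for download but none is set."""
    if needs_ngc_api_key(env) and not env.get("NGC_API_KEY"):
        raise RuntimeError(
            "NGC_API_KEY is not set and no pre-populated NIM_CACHE_PATH was found"
        )


# With neither variable set, a key would be needed to download the model.
print(needs_ngc_api_key({}))
```

In a launch script, calling `check_env(os.environ)` before starting the container would surface a missing key early (this requires `import os`).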

Release 1.0.0#

Summary#

This is the first general release of the NeMo Retriever Text Reranking NIM.

Reranking Models#

NV-RerankQA-Mistral4B-v3