Release Notes for NeMo Retriever Text Reranking NIM

This documentation contains the release notes for NeMo Retriever Text Reranking NIM.

Release 1.6.0

Summary

  • Added support for the Llama-3.2-nemoretriever-500m-rerank-v2 reranking model.

Release 1.5.0

Summary

  • Added support for the B200 GPU.

Known Issues

  • The list-model-profiles command incorrectly lists compatible model profiles as incompatible. Select the profile that matches your hardware configuration. This bug does not impact automatic profile selection.

  • A slight performance degradation has been observed since the 1.3.1 release.

Release 1.4.0

Summary

  • Added support for configurable memory footprint by allowing users to set batch size and sequence length.

  • Added the NIM_TRITON_MAX_BATCH_SIZE environment variable.

  • Reduced container image sizes.

  • Removed model profiles for the A100 PCIe 40GB and H100 PCIe 80GB configurations.

  • Fixed a bug where the list-model-profiles command failed to run on hosts that don’t have an NVIDIA GPU, even when NIM_CPU_ONLY was set.
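
As an illustration of how the NIM_TRITON_MAX_BATCH_SIZE variable might be passed to a container, the following Python sketch assembles a `docker run` argument list. The helper name build_run_args, the image placeholder, and the value 8 are hypothetical and not taken from these notes; check the configuration documentation for the values the NIM actually accepts.

```python
from typing import Dict, List


def build_run_args(image: str, env: Dict[str, str]) -> List[str]:
    """Assemble a `docker run` argument list, passing each variable with -e."""
    args = ["docker", "run", "--rm", "--gpus", "all"]
    # Sort the variables so the resulting command is deterministic.
    for name, value in sorted(env.items()):
        args += ["-e", f"{name}={value}"]
    args.append(image)
    return args


# Hypothetical image placeholder and batch size, for illustration only.
args = build_run_args("<reranking-nim-image>", {"NIM_TRITON_MAX_BATCH_SIZE": "8"})
print(" ".join(args))
```

The list form can be handed directly to `subprocess.run` without shell quoting concerns.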

Known Issues

  • The list-model-profiles command incorrectly lists compatible model profiles as incompatible. Select the profile that matches your hardware configuration. This bug does not impact automatic profile selection.

  • A slight performance degradation has been observed since the 1.3.1 release.

Release 1.3.1

Release 1.3.0

  • Added support for the Llama-3.2-NV-RerankQA-1B-v2 reranking model.

  • Added the NIM_NUM_MODEL_INSTANCES and NIM_NUM_TOKENIZERS environment variables.

  • Added support for dynamic batching in the underlying Triton Inference Server process.

Known Issues

  • The current version of langchain-nvidia-ai-endpoints used in the LangChain playbook is not compatible with the Llama-3.2-NV-RerankQA-1B-v2 NIM.

Release 1.0.2

  • Improved accuracy for the model when running on A100 and A10G GPUs.

Release 1.0.1

  • Added support for NGC Personal/Service API keys in addition to the NGC API Key (Original).

  • NGC_API_KEY is no longer required when running a container with a pre-populated cache (NIM_CACHE_PATH).

  • Updated the list-model-profiles command to check the correct location for model artifacts.
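
The NGC_API_KEY change above can be reflected in a launch script. The sketch below is a minimal illustration and not part of the NIM itself: it assumes that a non-empty host directory counts as a pre-populated cache, and the helper name needs_api_key is hypothetical.

```python
import os


def needs_api_key(host_cache_dir: str) -> bool:
    """Return True when NGC_API_KEY should be supplied to the container.

    Assumption: a host cache directory that exists and contains at least
    one entry is treated as pre-populated.
    """
    populated = os.path.isdir(host_cache_dir) and bool(os.listdir(host_cache_dir))
    return not populated


# Example: only forward the key when the cache is missing or empty.
cache_dir = os.path.expanduser("~/.cache/nim")  # hypothetical location
if needs_api_key(cache_dir) and "NGC_API_KEY" not in os.environ:
    print("Set NGC_API_KEY before starting the container.")
```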

Release 1.0.0

Summary

This is the first general release of the NeMo Retriever Text Reranking NIM.

Reranking Models

  • NV-RerankQA-Mistral4B-v3