Release Notes for NeMo Retriever Text Reranking NIM#
This documentation contains the release notes for NeMo Retriever Text Reranking NIM.
Release 1.6.0#
Summary#
Added support for Llama-3.2-nemoretriever-500m-rerank-v2 reranking model.
Release 1.5.0#
Summary#
Added support for B200 GPU.
Known Issues#
The list-model-profiles command incorrectly lists compatible model profiles as incompatible. Select the profile that matches your hardware configuration. This bug does not impact automatic profile selection.
A slight performance degradation has been observed since release 1.3.1.
Release 1.4.0#
Summary#
Added support for configurable memory footprint by allowing users to set batch size and sequence length.
Added the NIM_TRITON_MAX_BATCH_SIZE environment variable.
Reduced container image sizes.
Removed model profiles for A100 PCIe 40GB & H100 PCIe 80GB configurations.
Fixed a bug where the list-model-profiles command failed to run on hosts without an NVIDIA GPU, even when NIM_CPU_ONLY was set.
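As an illustration of how the new batch-size setting might be passed to a container, the following is a minimal sketch that assembles a docker run command line from a dictionary of environment variables. The helper build_docker_run, the placeholder image reference, and the value 16 are assumptions made for this example and do not come from the NIM documentation. Only NIM_TRITON_MAX_BATCH_SIZE is taken from the notes above. The variable that controls sequence length is not named in these notes, so it is left out.

```python
import shlex

# Hypothetical placeholder: substitute the image reference for your deployment.
IMAGE = "<reranking-nim-image>:<tag>"


def build_docker_run(image, env, gpus="all"):
    """Assemble a `docker run` argument list, passing each setting as -e NAME=VALUE."""
    args = ["docker", "run", "--rm"]
    if gpus is not None:
        # --gpus is a standard Docker flag; pass gpus=None to omit it.
        args += ["--gpus", gpus]
    for name, value in sorted(env.items()):
        args += ["-e", f"{name}={value}"]
    args.append(image)
    return args


env = {
    "NIM_TRITON_MAX_BATCH_SIZE": 16,  # illustrative value only, not a recommendation
}
cmd = build_docker_run(IMAGE, env)

# Print a shell-quoted command for inspection instead of executing it.
print(shlex.join(cmd))
```

Building the command as a list and printing it keeps the sketch free of side effects. A real launch would also need the ports, credentials, and cache mounts described in the deployment guide.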
Known Issues#
The list-model-profiles command incorrectly lists compatible model profiles as incompatible. Select the profile that matches your hardware configuration. This bug does not impact automatic profile selection.
A slight performance degradation has been observed since release 1.3.1.
Release 1.3.1#
Added the NIM_SERVED_MODEL_NAME environment variable.
Updated the LangChain Playbook to use the Llama-3.2-NV-RerankQA-1B-v2 NIM.
Release 1.3.0#
Added support for Llama-3.2-NV-RerankQA-1B-v2 reranking model.
Added the NIM_NUM_MODEL_INSTANCES and NIM_NUM_TOKENIZERS environment variables.
Added support for dynamic batching in the underlying Triton Inference Server process.
Known Issues#
The current version of langchain-nvidia-ai-endpoints used in the LangChain playbook is not compatible with the Llama-3.2-NV-RerankQA-1B-v2 NIM.
Release 1.0.2#
Improved accuracy for the model when running on A100 and A10G GPUs.
Release 1.0.1#
Added support for NGC Personal/Service API keys in addition to the NGC API Key (Original).
NGC_API_KEY is no longer required when running a container with a pre-populated cache (NIM_CACHE_PATH).
list-model-profiles command updated to check the correct location for model artifacts.
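A launcher script could use the relaxed NGC_API_KEY requirement to decide whether to insist on a key before starting a container. The sketch below is one hypothetical way to do that. It treats "pre-populated" as "the cache directory exists and contains at least one entry", which is an assumption for illustration and not the NIM's documented check. The function names are also invented for this example.

```python
import os
from pathlib import Path


def needs_ngc_api_key(cache_dir):
    """Return True when the cache directory is missing or empty.

    Assumption: a non-empty directory counts as a pre-populated cache.
    """
    path = Path(cache_dir)
    if not path.is_dir():
        return True
    return not any(path.iterdir())


def check_launch_env(cache_dir, environ=os.environ):
    """Raise early if a download would be needed but NGC_API_KEY is unset."""
    if needs_ngc_api_key(cache_dir) and not environ.get("NGC_API_KEY"):
        raise RuntimeError(
            "NGC_API_KEY is required because the cache is missing or empty"
        )
```

Calling check_launch_env with the host directory that will back NIM_CACHE_PATH fails fast in the no-cache, no-key case and returns silently otherwise.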
Release 1.0.0#
Summary#
This is the first general release of the NeMo Retriever Text Reranking NIM.
Reranking Models#
NV-RerankQA-Mistral4B-v3