Release Notes#
Release 1.3.1#
Added the NIM_SERVED_MODEL_NAME environment variable.
Updated the LangChain Playbook to use the Llama-3.2-NV-RerankQA-1B-v2 NIM.
Release 1.3.0#
Added support for the Llama-3.2-NV-RerankQA-1B-v2 reranking model.
Added the NIM_NUM_MODEL_INSTANCES and NIM_NUM_TOKENIZERS environment variables.
Added support for dynamic batching in the underlying Triton Inference Server process.
Known Issues#
The current version of langchain-nvidia-ai-endpoints used in the LangChain playbook is not compatible with the Llama-3.2-NV-RerankQA-1B-v2 NIM.
Release 1.0.2#
Improved accuracy for the model when running on A100 and A10G GPUs.
Release 1.0.1#
Added support for NGC Personal/Service API keys in addition to the NGC API Key (Original).
NGC_API_KEY is no longer required when running a container with a pre-populated cache (NIM_CACHE_PATH).
The list-model-profiles command was updated to check the correct location for model artifacts.
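As an illustration of the NGC_API_KEY change in this release, a launcher script could decide whether the key is needed before starting the container. The sketch below is a hypothetical helper, not part of the NIM itself: the function name `launch_env`, the `host_cache_dir` parameter, and the rule that "a non-empty directory counts as a pre-populated cache" are all assumptions made for this example. It also assumes that directory is the one mounted at the container's NIM_CACHE_PATH.

```python
import os
from pathlib import Path


def launch_env(host_cache_dir, environ=None):
    """Return the environment variables to pass to the container.

    Hypothetical helper: `host_cache_dir` is assumed to be the host
    directory mounted at the container's NIM_CACHE_PATH.
    """
    environ = os.environ if environ is None else environ
    cache = Path(host_cache_dir)

    # Assumption for this sketch: any non-empty directory is treated as a
    # pre-populated cache. The real container may apply stricter checks.
    cache_populated = cache.is_dir() and any(cache.iterdir())

    env = {}
    if not cache_populated:
        # Without a populated cache, model artifacts must be downloaded,
        # so the API key is still required.
        key = environ.get("NGC_API_KEY")
        if not key:
            raise RuntimeError("NGC_API_KEY is required when the cache is empty")
        env["NGC_API_KEY"] = key
    return env
```

With a populated cache directory the returned mapping is empty, so no key is forwarded to the container. With an empty cache the function either forwards the key or fails early instead of letting the container start without credentials.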
Release 1.0.0#
Summary#
This is the first general release of the NeMo Retriever Text Reranking NIM.
Reranking Models#
NV-RerankQA-Mistral4B-v3