Troubleshoot NeMo Retriever Text Reranking NIM#
Use this documentation to troubleshoot issues that arise when you use NeMo Retriever Text Reranking NIM.
list-model-profiles command fails#
Some older NIMs don’t support the list-model-profiles command, including the following:
nv-rerankqa-mistral-4b-v3
nv-embedqa-mistral-7b-v2
arctic-embed-l
NIM fails to start with out-of-memory error#
In some cases, a NIM fails to start and reports an out-of-memory error.
TensorRT pre-allocates memory according to the maximum input size defined by the loaded TensorRT profiles. Each model instance multiplies the VRAM requirement, and different NIMs require widely different amounts of VRAM.
To resolve this issue, use one of the following options:
When you run a TensorRT profile on a small VRAM card, adjust the NIM_TRITON_MAX_BATCH_SIZE and NIM_TRITON_MAX_SEQ_LENGTH environment variables.
On GPUs without enough VRAM for multiple model instances, run only a single instance of the reranker.
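As a minimal sketch of the first option, the following Python snippet assembles a `docker run` command that passes both environment variables to the container. The image name, the values 8 and 512, and the helper `build_nim_command` are hypothetical placeholders chosen for illustration. The sketch also assumes that lowering the limits is the adjustment you need, and it omits any other options your deployment requires, such as credentials, ports, and cache mounts.

```python
import shlex


def build_nim_command(image, max_batch_size, max_seq_length):
    """Return a `docker run` argument list that sets the two Triton limits.

    `image` is a placeholder; substitute the NIM image you actually deploy.
    """
    if max_batch_size < 1 or max_seq_length < 1:
        raise ValueError("limits must be positive integers")

    # Variable names come from the text above; the values are assumptions
    # that you would tune to the VRAM available on your GPU.
    env = {
        "NIM_TRITON_MAX_BATCH_SIZE": str(max_batch_size),
        "NIM_TRITON_MAX_SEQ_LENGTH": str(max_seq_length),
    }

    cmd = ["docker", "run", "--rm", "--gpus", "all"]
    for name, value in env.items():
        cmd += ["-e", f"{name}={value}"]
    cmd.append(image)
    return cmd


# Hypothetical image name and limits, for illustration only.
command = build_nim_command("reranker-nim:placeholder", 8, 512)
print(shlex.join(command))
```

The snippet only prints the command so that you can review it before you run it. If the NIM still runs out of memory, try smaller values for either limit.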