Troubleshoot NeMo Retriever Text Reranking NIM

Use this documentation to troubleshoot issues that arise when you use NeMo Retriever Text Reranking NIM.

list-model-profiles command fails

Some older NIMs don’t support the list-model-profiles command, including the following:

  • nv-rerankqa-mistral-4b-v3

  • nv-embedqa-mistral-7b-v2

  • arctic-embed-l

NIM fails to start with out-of-memory error

In some cases, a NIM fails to start and reports an out-of-memory error.

TensorRT pre-allocates memory for the maximum input size defined by the loaded TensorRT profiles. Each additional model instance multiplies the VRAM requirement, and different NIMs require widely different amounts of VRAM.

To resolve this issue, use one of the following options:

  • When you run a TensorRT profile on a GPU with limited VRAM, lower the NIM_TRITON_MAX_BATCH_SIZE and NIM_TRITON_MAX_SEQ_LENGTH environment variables so that less memory is pre-allocated.

  • On GPUs without enough VRAM for multiple model instances, run only a single instance of the reranker.
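As a minimal sketch of the first option, the following Python helper assembles a docker run command that passes both environment variables to the container. The function name build_docker_command, the image placeholder, and the values 8 and 512 are illustrative assumptions, not recommended settings. A real launch will likely need further options (credentials, ports, cache volumes) from the deployment documentation.

```python
import shlex


def build_docker_command(image, max_batch_size, max_seq_length):
    """Return a `docker run` argument list that sets both Triton limits.

    `image` is a placeholder for the reranking NIM image you deploy;
    this helper is hypothetical and not part of any NIM tooling.
    """
    if max_batch_size < 1 or max_seq_length < 1:
        raise ValueError("limits must be positive integers")

    # The two variables named in the troubleshooting step above.
    env = {
        "NIM_TRITON_MAX_BATCH_SIZE": str(max_batch_size),
        "NIM_TRITON_MAX_SEQ_LENGTH": str(max_seq_length),
    }

    cmd = ["docker", "run", "--rm", "--gpus", "all"]
    for name, value in env.items():
        cmd += ["-e", f"{name}={value}"]
    cmd.append(image)
    return cmd


if __name__ == "__main__":
    # Illustrative values only; tune them to your GPU and workload.
    cmd = build_docker_command("<reranking-nim-image>", 8, 512)
    print(shlex.join(cmd))
```

If the NIM still runs out of memory, lower the values further and try again.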