Troubleshoot NVIDIA NeMo Retriever Embedding NIM

Use this documentation to troubleshoot issues that arise when you use NVIDIA NeMo Retriever Embedding NIM.

list-model-profiles command is not available

Starting in version 2.0.0, NeMo Retriever Embedding NIM automatically selects an optimized inference pipeline at startup. No user action is required to select profiles, and the list-model-profiles command is not available. For details, refer to Automatic Pipeline Selection.

To request a specific precision, use NIM_PRECISION. For details, refer to Precision Override.
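As a minimal sketch of passing NIM_PRECISION to the container, the following Python snippet assembles a docker run argument list. The helper name precision_args, the value "fp16", and the image placeholder are assumptions for illustration only; refer to Precision Override for the values that your NIM version accepts.

```python
import shlex


def precision_args(precision: str) -> list[str]:
    """Return docker run arguments that set NIM_PRECISION (hypothetical helper)."""
    if not precision:
        raise ValueError("precision must be a non-empty string")
    return ["-e", f"NIM_PRECISION={precision}"]


# "fp16" is an assumed example value, and IMAGE_PLACEHOLDER stands in for
# the container image you normally run. Neither comes from this page.
precision_command = [
    "docker", "run", "--rm", "--gpus", "all",
    *precision_args("fp16"),
    "IMAGE_PLACEHOLDER",
]

# Print the command for review instead of running it.
print(shlex.join(precision_command))
```

The snippet only prints the command, so you can check it against your usual docker run invocation before you start the container.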

NIM fails to start

In some cases, a NIM fails to start when you run the docker run command. Some NIM models require that you accept the license terms on NGC before you can pull the container image and model assets. To resolve this issue, browse to the model page on the NGC Catalog, read the license terms, and then click Accept Terms. For details, refer to Get Started.

NIM fails to start with out-of-memory error

In some cases, a NIM fails to start with an out-of-memory error.

NIMs pre-allocate memory based on the maximum input size that they are configured to process. Each additional engine instance multiplies the VRAM requirement, and VRAM requirements vary widely between NIMs.

To resolve this issue, use one of the following options:

When you run a NIM on a GPU with limited VRAM, lower the NIM_MAX_BATCH_SIZE and NIM_MAX_SEQ_LEN environment variables.
If you configured multiple engine instances, set NIM_ENGINE_COUNT=1.
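The options above can be sketched as a small Python helper that builds a docker run command with reduced memory settings. The function name build_docker_command, the values 8 and 512, and the image placeholder are illustrative assumptions, not recommended settings; choose values that match your GPU and workload, and keep any other flags that Get Started tells you to pass.

```python
import shlex


def build_docker_command(image, max_batch_size, max_seq_len, engine_count=1):
    """Assemble a docker run command with lower memory settings (hypothetical helper)."""
    env = {
        # Smaller maximum inputs reduce the memory the NIM pre-allocates.
        "NIM_MAX_BATCH_SIZE": str(max_batch_size),
        "NIM_MAX_SEQ_LEN": str(max_seq_len),
        # Each engine instance multiplies the VRAM requirement.
        "NIM_ENGINE_COUNT": str(engine_count),
    }
    command = ["docker", "run", "--rm", "--gpus", "all"]
    for name, value in env.items():
        command += ["-e", f"{name}={value}"]
    command.append(image)
    return command


# 8 and 512 are assumed example values; IMAGE_PLACEHOLDER stands in for
# the container image you normally run.
oom_command = build_docker_command("IMAGE_PLACEHOLDER", max_batch_size=8, max_seq_len=512)

# Print the command for review instead of running it.
print(shlex.join(oom_command))
```

If the NIM still runs out of memory, lowering the two maximums further is the natural next step, because pre-allocation scales with the maximum input size.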
