NVIDIA NIM is currently in limited availability. Sign up here to be notified when the latest NIMs are available to download.

NVIDIA NIM, part of NVIDIA AI Enterprise, is a set of easy-to-use microservices designed to speed up generative AI deployment in enterprises. Supporting a wide range of AI models, including NVIDIA AI foundation and custom models, it ensures seamless, scalable AI inferencing, on-premises or in the cloud, leveraging industry-standard APIs.

NIMs are containers that provide interactive APIs for running inference on an AI model. In general, a NIM has:

  • An API layer

  • A server layer

  • A runtime layer

  • A model “engine”

A NIM is distributed as two components: the Docker container and the model (weights and biases). The Docker containers are pulled from the NVIDIA Docker Registry on NGC, while the models may come from NGC or other sources. Some NIMs with small model files ship the models inside the container itself.


The following requirements apply to all NIMs. Requirements specific to an individual NIM are listed on that NIM's documentation page.

Hardware and Operating System

  • Linux with an x86_64/AMD64 processor. ARM processor support is available for select NIMs. See the individual NIM documentation for details.

  • At least one NVIDIA GPU. NIMs with large models (e.g., LLMs) are optimized with pre-compiled TensorRT engines and therefore have specific GPU model requirements. See the individual documentation for details.
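The hardware requirements above can be checked with a short script. This is a minimal sketch rather than an official NVIDIA tool: the function names `classify_arch` and `count_gpus` are hypothetical, and the script assumes `nvidia-smi` is installed on the host. It only reports the CPU architecture and the number of GPUs; it does not check the GPU model requirements of a specific NIM.

```shell
#!/usr/bin/env bash
# Classify a CPU architecture string (as printed by `uname -m`).
# Prints "supported", "select-nims-only", or "unsupported".
classify_arch() {
  case "$1" in
    x86_64|amd64)  echo "supported" ;;
    aarch64|arm64) echo "select-nims-only" ;;
    *)             echo "unsupported" ;;
  esac
}

# Count GPUs from `nvidia-smi -L` output read on stdin
# (one "GPU <n>: ..." line per device). Prints 0 when there are none.
count_gpus() {
  grep -c '^GPU [0-9]' || true
}

echo "CPU architecture: $(uname -m) -> $(classify_arch "$(uname -m)")"
if command -v nvidia-smi >/dev/null 2>&1; then
  echo "NVIDIA GPUs detected: $(nvidia-smi -L | count_gpus)"
else
  echo "nvidia-smi not found; cannot confirm that an NVIDIA GPU is present"
fi
```

A result of "select-nims-only" means the individual NIM documentation still has to confirm ARM support for the NIM you plan to run.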

Prerequisite Software

  • Install Docker

  • Install the NVIDIA Container Toolkit

    • Verify your container runtime supports NVIDIA GPUs by running

      docker run --rm --runtime=nvidia --gpus all ubuntu nvidia-smi

      Example output:

      | NVIDIA-SMI 525.78.01    Driver Version: 525.78.01    CUDA Version: 12.0     |
      | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
      | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
      |                               |                      |               MIG M. |
      |   0  NVIDIA GeForce ...  Off  | 00000000:01:00.0 Off |                  N/A |
      | 41%   30C    P8     1W / 260W |   2244MiB / 11264MiB |      0%      Default |
      |                               |                      |                  N/A |
      | Processes:                                                                  |
      |  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
      |        ID   ID                                                   Usage      |

    • For more information on enumerating multi-GPU systems, please see the NVIDIA Container Toolkit’s GPU Enumeration Docs
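Before running the GPU verification command, it can help to confirm that the prerequisite tools are installed at all. The sketch below is illustrative only: `missing_commands` is a hypothetical helper, and it assumes the NVIDIA Container Toolkit was installed in a way that puts its `nvidia-ctk` CLI on your PATH. Finding both commands does not prove the runtime is configured correctly, so the `docker run` check above is still needed.

```shell
#!/usr/bin/env bash
# Print the name of each given command that is not found on PATH.
missing_commands() {
  for _cmd in "$@"; do
    command -v "$_cmd" >/dev/null 2>&1 || echo "$_cmd"
  done
}

missing="$(missing_commands docker nvidia-ctk)"
if [ -n "$missing" ]; then
  # Unquoted on purpose so multiple names print on one line.
  echo "Missing prerequisites:" $missing
else
  echo "docker and nvidia-ctk are both on PATH"
fi
```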

NGC (NVIDIA GPU Cloud) Account

  1. Create an account on NGC

  2. Generate an API Key

  3. Log in to the NGC registry with Docker, using your NGC API key: docker login nvcr.io --username='$oauthtoken' --password=${NGC_CLI_API_KEY}
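Passing the key with `--password=` can leave it visible in the process list, so some users prefer Docker's `--password-stdin` flag. The following is a sketch of that variant, not a replacement for the documented command: the wrapper name `ngc_docker_login` is hypothetical, and it assumes the key is already exported as `NGC_CLI_API_KEY`.

```shell
#!/usr/bin/env bash
# Log in to nvcr.io, passing the key from NGC_CLI_API_KEY on stdin.
# Returns 1 without calling docker if the variable is empty or unset.
ngc_docker_login() {
  if [ -z "${NGC_CLI_API_KEY:-}" ]; then
    echo "NGC_CLI_API_KEY is not set" >&2
    return 1
  fi
  # '$oauthtoken' is a literal username, so it stays in single quotes.
  printf '%s' "$NGC_CLI_API_KEY" |
    docker login nvcr.io --username '$oauthtoken' --password-stdin
}
```

After exporting the key, call `ngc_docker_login` once per machine; Docker then stores the credentials according to its own configuration.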


  1. Download the NGC CLI tool for your OS.


    Use NGC CLI version 3.41.1 or newer. The following command installs version 3.41.3 on AMD64 Linux in your home directory:

    wget --content-disposition https://api.ngc.nvidia.com/v2/resources/nvidia/ngc-apps/ngc_cli/versions/3.41.3/files/ngccli_linux.zip -O ~/ngccli_linux.zip && \
    unzip ~/ngccli_linux.zip -d ~/ngc && \
    chmod u+x ~/ngc/ngc-cli/ngc && \
    echo "export PATH=\"\$PATH:~/ngc/ngc-cli\"" >> ~/.bash_profile && source ~/.bash_profile

  2. Set up your NGC CLI Tool locally (You’ll need your API key for this!):

    ngc config set


    After you enter your API key, you may see multiple options for the org and team. Select as desired or hit enter to accept the default.
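To confirm that the installed NGC CLI meets the 3.41.1 minimum, a version comparison along these lines may be useful. It is a sketch under assumptions: `version_ge` is a hypothetical helper that relies on `sort -V` from GNU coreutils, and the script assumes `ngc --version` prints a line containing a dotted version number.

```shell
#!/usr/bin/env bash
# Succeeds when version $1 is greater than or equal to version $2.
# Relies on `sort -V` (version sort) from GNU coreutils.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

if command -v ngc >/dev/null 2>&1; then
  # Assumption: the first dotted number in the output is the CLI version.
  installed="$(ngc --version 2>/dev/null | grep -oE '[0-9]+(\.[0-9]+)+' | head -n 1)"
  if [ -n "$installed" ] && version_ge "$installed" "3.41.1"; then
    echo "NGC CLI $installed meets the 3.41.1 minimum"
  else
    echo "NGC CLI version '${installed:-unknown}' is older than 3.41.1 or could not be parsed"
  fi
else
  echo "ngc not found on PATH"
fi
```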

Individual NIM Documentation



Required GPUs                      Minimum GPU Memory  Model Source        CPU Architecture Support
Single Hopper, Ampere, or Ada GPU  24 GB               Shipped with image
Single Hopper, Ampere, or Ada GPU  24 GB               Shipped with image