Getting Started

Prerequisites

Check the support matrix to make sure that you have a supported hardware and software stack.

NGC Authentication

Generate an API key

An NGC API key is required to access NGC resources. You can generate a key at https://org.ngc.nvidia.com/setup/personal-keys.

When creating an NGC API personal key, ensure that at least “NGC Catalog” is selected from the “Services Included” dropdown. More services can be included if this key is to be reused for other purposes.

Note

Personal keys allow you to configure an expiration date, revoke or delete the key using an action button, and rotate the key as needed. For more information about key types, refer to the NGC User Guide.

Export the API key

Pass the value of the API key to the docker run command below as the NGC_API_KEY environment variable so that the NIM can download the appropriate models and resources at startup.

If you’re not familiar with how to create the NGC_API_KEY environment variable, the simplest way is to export it in your terminal:

export NGC_API_KEY=<value>

To make the key available in future sessions, run one of the following commands to export it at shell startup:

# If using bash
echo "export NGC_API_KEY=<value>" >> ~/.bashrc

# If using zsh
echo "export NGC_API_KEY=<value>" >> ~/.zshrc

Note

Other, more secure options include saving the value in a file so that you can retrieve it with cat $NGC_API_KEY_FILE, or using a password manager.
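For example, assuming the key is stored in a file whose path is held in NGC_API_KEY_FILE, you could load it into the environment without typing the value inline:

# Read the key from a file instead of pasting it into the terminal
# (NGC_API_KEY_FILE is an illustrative variable name)
export NGC_API_KEY="$(cat "$NGC_API_KEY_FILE")"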

Docker Login to NGC

To pull the NIM container image from NGC, first authenticate with the NVIDIA Container Registry using the following command:

echo "$NGC_API_KEY" | docker login nvcr.io --username '$oauthtoken' --password-stdin

Use $oauthtoken as the username and the value of NGC_API_KEY as the password. The $oauthtoken username is a special name that indicates that you will authenticate with an API key rather than a username and password.
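If authentication succeeds, Docker prints Login Succeeded.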

Launching the NIM

The following commands launch a Docker container for the PaddleOCR NIM.

# Choose a container name for bookkeeping
export CONTAINER_NAME=paddleocr

# Choose a NIM Image from NGC
export IMG_NAME="nvcr.io/ohlfw0olaadg/ea-participants/paddleocr:0.2.0"

# Choose a path on your system to cache the downloaded models.
# The container must be able to write to this directory, so the
# permissions below allow others to write to it.
export LOCAL_NIM_CACHE=~/.cache/nim
mkdir -p "$LOCAL_NIM_CACHE"
chmod o+w "$LOCAL_NIM_CACHE"

# Start the NIM
docker run -it --rm \
    --runtime=nvidia \
    --name=$CONTAINER_NAME \
    --gpus all \
    -p 8000:8000 \
    -e NGC_API_KEY=$NGC_API_KEY \
    -v "$LOCAL_NIM_CACHE:/opt/nim/.cache" \
    $IMG_NAME

The flags in the docker run command are described below:

-it
    --interactive + --tty (see the Docker docs).

--rm
    Delete the container after it stops (see the Docker docs).

--name=paddleocr
    Give the NIM container a name for bookkeeping (here paddleocr). Use any preferred value.

--runtime=nvidia
    Ensure NVIDIA drivers are accessible in the container.

--gpus all
    Expose all NVIDIA GPUs inside the container. See the configuration page for mounting specific GPUs.

-e NGC_API_KEY
    Provide the container with the token it needs to download the appropriate models and resources from NGC. See the section above.

-v "$LOCAL_NIM_CACHE:/opt/nim/.cache"
    Mount a cache directory from your system (~/.cache/nim here) inside the NIM container (at /opt/nim/.cache, the default location), allowing downloaded models and artifacts to be reused by subsequent runs.

-p 8000:8000
    Forward the port where the NIM server is published inside the container so it can be reached from the host. The left-hand side of the colon is the host port (8000 here); the right-hand side is the container port where the NIM server is published (defaults to 8000).

$IMG_NAME
    Name and version of the NIM container image from NGC. The NIM server starts automatically if no command is provided after the image name.
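Once the container is running, you can confirm from another terminal that the server has finished starting up. This is a minimal sketch that assumes the NIM exposes the standard health endpoint /v1/health/ready on the forwarded port 8000:

# Poll the readiness endpoint (assumed to be /v1/health/ready, the standard NIM route)
until [ "$(curl -s -o /dev/null -w '%{http_code}' localhost:8000/v1/health/ready)" = "200" ]; do
    echo "Waiting for the NIM to become ready..."
    sleep 5
done
echo "NIM is ready."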

API Calls

Once the NIM is up and running, you can make API calls. Refer to the API Reference section for sample API calls and the API specification.
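As a quick smoke test, a request shaped like the following may work. The /v1/infer route and the payload fields below are assumptions based on the common NIM inference schema; the authoritative route and schema are in the API Reference:

# Hypothetical request shape -- confirm the actual route and schema in the API Reference
IMAGE_B64="$(base64 -w 0 sample.png)"   # sample.png is any local test image; -w 0 disables line wrapping (GNU coreutils)
curl -X POST http://localhost:8000/v1/infer \
    -H "Content-Type: application/json" \
    -d "{\"input\": [{\"type\": \"image_url\", \"url\": \"data:image/png;base64,${IMAGE_B64}\"}]}"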

Stopping the Container

The following commands stop and then remove the running Docker container.

docker stop $CONTAINER_NAME
docker rm $CONTAINER_NAME
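Because the container was started with --rm, Docker removes it automatically once it stops, so the docker rm command may report that the container no longer exists. That message is harmless.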

See Also

Model Warmup

The first few inference requests (the exact number depends on the number of GPUs) may take longer than subsequent ones because the model is loaded into memory and initialized for the first time. To avoid this delay, you can send a warm-up request to the NIM server before sending actual inference requests.
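A warm-up can be as simple as sending one small request (or one per GPU) right after the server reports ready and discarding the response. This sketch reuses the hypothetical /v1/infer route and payload shape from the example above:

# Hypothetical warm-up: send one tiny request and ignore the result
WARMUP_B64="$(base64 -w 0 tiny.png)"   # tiny.png is any small local image
curl -s -o /dev/null -X POST http://localhost:8000/v1/infer \
    -H "Content-Type: application/json" \
    -d "{\"input\": [{\"type\": \"image_url\", \"url\": \"data:image/png;base64,${WARMUP_B64}\"}]}"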