Getting Started#

Prerequisites#

Check the Support Matrix to make sure that you have the supported hardware and software stack.

NGC Authentication#

Generate an API Key#

An NGC API key is required to access NGC resources. You can generate a key at NGC API Keys.

When creating an NGC API Personal key, ensure that at least NGC Catalog is selected from the Services Included dropdown. If this key is to be reused for other purposes, you can include more services.

Note

Personal keys allow you to configure an expiration date, revoke or delete the key using an action button, and rotate the key as needed. For more information about key types, refer to NGC API Keys in the NGC User Guide.

Export the API Key#

Pass the value of the API key to the docker run command in the next section as the NGC_API_KEY environment variable to download the appropriate models and resources when starting the NIM.

If you are not familiar with how to create the NGC_API_KEY environment variable, the simplest way is to export it in your terminal:

export NGC_API_KEY=<value>

Run one of the following commands to make the key available in future shell sessions:

# If using bash
echo "export NGC_API_KEY=<value>" >> ~/.bashrc

# If using zsh
echo "export NGC_API_KEY=<value>" >> ~/.zshrc

Note

Other, more secure options include saving the value in a file, so that you can retrieve it with cat $NGC_API_KEY_FILE, or using a password manager.
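
The file-based approach can be sketched as follows. The path ~/.ngc/api_key is an example location, not a requirement; choose any path you prefer:

```shell
# Store the key in a file only your user can read, then load it into the environment
KEY_FILE="$HOME/.ngc/api_key"
mkdir -p "$(dirname "$KEY_FILE")"
printf '%s\n' '<value>' > "$KEY_FILE"   # replace <value> with your actual key
chmod 600 "$KEY_FILE"                   # restrict access to your user
export NGC_API_KEY="$(cat "$KEY_FILE")"
```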

Docker Login to NGC#

To pull the NIM container image from NGC, first authenticate with the NVIDIA Container Registry with the following command:

echo "$NGC_API_KEY" | docker login nvcr.io --username '$oauthtoken' --password-stdin

Use $oauthtoken as the username and the value of NGC_API_KEY as the password. The $oauthtoken username is a special name that indicates that you will authenticate with an API key rather than a username and password.

Launching the NIM Container#

The following command launches the Background Noise Removal NIM container with the gRPC service. For a list of parameters, see Runtime Parameters for the Container.

Transactional NIM:

docker run -it --rm --name=bnr \
    --runtime=nvidia \
    --gpus all \
    --shm-size=8GB \
    -e NGC_API_KEY=$NGC_API_KEY \
    -e NIM_MODEL_PROFILE=<nim_model_profile> \
    -e MAXINE_MAX_CONCURRENCY_PER_GPU=1 \
    -e FILE_SIZE_LIMIT=36700160 \
    -e STREAMING=false \
    -p 8000:8000 \
    -p 8001:8001 \
    nvcr.io/nim/nvidia/maxine-bnr:latest

Streaming NIM:

docker run -it --rm --name=bnr \
    --runtime=nvidia \
    --gpus all \
    --shm-size=8GB \
    -e NGC_API_KEY=$NGC_API_KEY \
    -e NIM_MODEL_PROFILE=<nim_model_profile> \
    -e MAXINE_MAX_CONCURRENCY_PER_GPU=1 \
    -e STREAMING=true \
    -p 8000:8000 \
    -p 8001:8001 \
    nvcr.io/nim/nvidia/maxine-bnr:latest

Note

The --gpus all flag assigns all available GPUs to the Docker container. On a host with multiple GPUs, this fails unless all GPUs are the same model. To assign specific GPUs to the Docker container (when multiple GPUs are available in your machine), use --gpus '"device=0,1,2..."'.

If the command runs successfully, the output ends with lines similar to the following:

I1126 09:22:21.048202 31 grpc_server.cc:2558] "Started GRPCInferenceService at 127.0.0.1:9001"
I1126 09:22:21.048377 31 http_server.cc:4704] "Started HTTPService at 127.0.0.1:9000"
I1126 09:22:21.089295 31 http_server.cc:362] "Started Metrics Service at 127.0.0.1:9002"
Maxine GRPC Service: Listening to 0.0.0.0:8001
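
If you launch the container in detached mode, one way to wait for the service to come up is to follow the container logs until the listening line above appears. This assumes the container is named bnr, as in the commands above:

```shell
# Block until the gRPC service reports that it is listening
docker logs -f bnr 2>&1 | grep -m1 'Maxine GRPC Service: Listening'
```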

Note

By default, the Background Noise Removal gRPC service is hosted on port 8001. You must use this port for inferencing requests.
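
A quick way to confirm that something is accepting connections on that port is a plain TCP check. This sketch uses bash's /dev/tcp redirection and the coreutils timeout command; it only verifies TCP reachability, not that the gRPC service is healthy:

```shell
# Succeeds if a listener accepts a TCP connection on localhost:8001
if timeout 2 bash -c '</dev/tcp/localhost/8001' 2>/dev/null; then
  echo "port 8001 is reachable"
else
  echo "port 8001 is not reachable" >&2
fi
```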

Selecting a Model Profile#

You can select a model profile by using the optional NIM_MODEL_PROFILE parameter. If you don’t provide NIM_MODEL_PROFILE, the NIM automatically selects a matching profile for Model Type v1-48k based on the target hardware architecture.

However, if you specify NIM_MODEL_PROFILE, ensure that the associated GPU architecture is compatible with the target hardware. An incorrect NIM_MODEL_PROFILE produces a deserialization error on inference.

If you launch the NIM on an unsupported GPU, you get the following error: nimlib.exceptions.NIMProfileIDNotFound: Could not match a profile in manifest at /opt/nim/etc/default/model_manifest.yaml.

For more information about NIM_MODEL_PROFILE, refer to the NIM Model Profile Table.

Environment Variables#

The following table describes the environment variables that can be passed to the NIM as -e arguments to the docker run command:

| ENV | Required? | Default | Notes |
| --- | --- | --- | --- |
| NGC_API_KEY | Yes | None | You must set this variable to the value of your personal NGC API key. |
| NIM_CACHE_PATH | No | /opt/nim/.cache | Location (in container) where the container caches model artifacts. |
| NIM_MODEL_PROFILE | No | None | Set this variable to download a specific model type that is supported on your GPU. For more information about NIM_MODEL_PROFILE, refer to the NIM Model Profile Table. |
| FILE_SIZE_LIMIT | No | 36700160 | Maximum size of the input audio file in bytes. Applicable only in transactional mode. Defaults to 36700160 bytes (35 MiB). |
| STREAMING | No | false | Set to true to enable audio streaming mode on the gRPC endpoint. |
| NIM_SSL_MODE | No | disabled | Set to tls or mtls to enable SSL on the endpoints. Defaults to unsecured endpoints. |
| NIM_SSL_CA_PATH | No | None | Path to the CA root certificate inside the NIM. Required only when NIM_SSL_MODE is mtls. For example, if the SSL certificates are mounted at /opt/nim/crt in the NIM, set this to /opt/nim/crt/ssl_ca_cert.pem. |
| NIM_SSL_CERT_PATH | No | None | Path to the server’s public SSL certificate inside the NIM. Required only when an SSL mode is enabled. For example, if the SSL certificates are mounted at /opt/nim/crt in the NIM, set this to /opt/nim/crt/ssl_cert_server.pem. |
| NIM_SSL_KEY_PATH | No | None | Path to the server’s private key inside the NIM. Required only when an SSL mode is enabled. For example, if the SSL certificates are mounted at /opt/nim/crt in the NIM, set this to /opt/nim/crt/ssl_key_server.pem. |
| MAXINE_MAX_CONCURRENCY_PER_GPU | No | 128 | Number of concurrent inference requests supported by the NIM server per GPU. If you encounter a “Resource unavailable” error, reduce this value when launching the NIM container. |
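
FILE_SIZE_LIMIT is expressed in bytes. The default value corresponds to 35 × 1024 × 1024 bytes (35 MiB), which you can compute in the shell when choosing your own limit:

```shell
# 35 MiB in bytes matches the FILE_SIZE_LIMIT default
echo $(( 35 * 1024 * 1024 ))   # prints 36700160
```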

Runtime Parameters for the Container#

| Flags | Description |
| --- | --- |
| -it | --interactive + --tty (see docker container run). |
| --rm | Delete the container after it stops (see docker container run). |
| --name=<container_name> | Give a name to the NIM container. Use any preferred value. |
| --runtime=nvidia | Ensure NVIDIA drivers are accessible in the container. |
| --gpus all | Expose NVIDIA GPUs inside the container. On a host with multiple GPUs, you need to specify which GPU to use; you can also specify multiple GPUs. For more information about mounting specific GPUs, see GPU Enumeration. |
| --shm-size=8GB | Allocate host memory for multi-process communication. |
| -e NGC_API_KEY=$NGC_API_KEY | Provide the container with the token necessary to download the appropriate models and resources from NGC. See NGC Authentication. |
| -p <host_port>:<container_port> | Publish a container port so that it is directly accessible on the host port. |

Stopping the Container#

Use the following commands to stop and remove the container, where $CONTAINER_NAME is the name set with --name (bnr in the examples above):

docker stop $CONTAINER_NAME
docker rm $CONTAINER_NAME

Because the launch commands above include --rm, the container is removed automatically when it stops; docker rm is needed only if you launched without --rm.