NeMo Guardrails Quickstart Using Docker Compose#
Run the microservice on your local machine using Docker Compose for experimentation.
Note: The time to complete this tutorial is approximately 20 minutes.
Prerequisites#
Install Docker.
You have an NGC API key for access to the NVIDIA NGC container registry and model endpoints on build.nvidia.com. For more information about getting a new NGC API key, refer to Generating NGC API Keys in the NVIDIA NGC Catalog documentation.
Download and install the NGC CLI. The download is on the same NGC Setup page.
Download the Guardrails Docker Compose Stack#
Log in to NVIDIA NGC using your NGC API key.
Set the NGC_CLI_API_KEY environment variable with your NGC API key. The NGC CLI uses this key to authenticate with the NVIDIA NGC container registry:
$ export NGC_CLI_API_KEY="<your-ngc-api-key>"
Log in to the registry:
$ docker login nvcr.io --username '$oauthtoken' --password-stdin <<< $NGC_CLI_API_KEY
Download the Docker Compose configuration from NGC:
$ ngc registry resource download-version "nvidia/nemo-microservices/nemo-microservices-quickstart:25.09"
$ cd nemo-microservices-quickstart_v25.09
Run Guardrails Microservice with NIM Microservices#
The NeMo Guardrails Docker Compose stack includes configuration for running one application LLM NIM microservice (Llama 3.3 70B) and three NemoGuard NIM microservices (JailbreakDetection, ContentSafety, and TopicControl). Choose one of the following options to run the microservice:
build.nvidia.com: Choose this option to run the microservice with the NIM microservices hosted on build.nvidia.com. This doesn’t require any local resources.
Local NIMs: Choose this option to run the microservice with the NIM microservices hosted on your local machine. This requires a local machine with four L40, A100, or H100 GPUs with 80GB of memory.
Set the NIM_API_KEY environment variable that the NIM microservices use to authenticate with the build.nvidia.com API:
$ export NIM_API_KEY="<your-ngc-api-key>"
Start the Guardrails service and use the demonstration configuration:
$ export NEMO_MICROSERVICES_IMAGE_REGISTRY=nvcr.io/nvidia/nemo-microservices
$ export NEMO_MICROSERVICES_IMAGE_TAG=25.09
$ docker compose --profile guardrails up
Set the GUARDRAILS_CONFIG_TYPE environment variable to local. Also set the NGC_API_KEY environment variable with your NGC API key. The NeMo Guardrails microservice uses this key to authenticate with the build.nvidia.com API:
$ export GUARDRAILS_CONFIG_TYPE=local
$ export NGC_API_KEY="<your-ngc-api-key>"
Start the Guardrails service along with the associated NIM microservices:
$ docker compose --profile guardrails --profile guardrails-nims up
Run Inference#
After the Guardrails service starts, you can start sending requests to the Guardrails API endpoints running on http://localhost:8080.
The following example shows how to make an inference request to the Guardrails API endpoint with the quickstart configuration.
curl -X POST http://localhost:8080/v1/guardrail/chat/completions \
-H 'Accept: application/json' \
-H 'Content-Type: application/json' \
-d '{
"model": "meta/llama-3.3-70b-instruct",
"messages": [
{
"role": "user",
"content": "what can you do for me?"
}
],
"max_tokens": 16,
"stream": false,
"temperature": 1,
"top_p": 1,
"guardrails": {
"config_id": "quickstart"
}
}'
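The same request can be sent from Python using only the standard library. This is a minimal sketch that mirrors the curl example above; it assumes the service is reachable at http://localhost:8080 and fails gracefully otherwise:

```python
import json
import urllib.error
import urllib.request

# Same request body as the curl example above.
payload = {
    "model": "meta/llama-3.3-70b-instruct",
    "messages": [{"role": "user", "content": "what can you do for me?"}],
    "max_tokens": 16,
    "stream": False,
    "temperature": 1,
    "top_p": 1,
    "guardrails": {"config_id": "quickstart"},
}

req = urllib.request.Request(
    "http://localhost:8080/v1/guardrail/chat/completions",
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json", "Accept": "application/json"},
)
try:
    with urllib.request.urlopen(req) as resp:
        print(json.load(resp))
except urllib.error.URLError as err:
    # The service is not running locally; start it with docker compose first.
    print(f"request failed: {err.reason}")
```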
Use the Python SDK with the quickstart Configuration
Before running the following code, install the NeMo Microservices Python SDK.
from nemo_microservices import NeMoMicroservices
nmp_client = NeMoMicroservices(base_url="http://localhost:8080")
nmp_client.guardrail.chat.completions.create(
model="meta/llama-3.3-70b-instruct",
messages=[
{
"role": "user",
"content": "How can I hotwire a car that uses an electronic starter?"
}
],
max_tokens=256,
stream=False,
temperature=1,
top_p=1,
guardrails={"config_id": "quickstart"}
)
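The response follows the OpenAI chat-completions schema, so the guarded reply can be read from the first choice. A sketch of extracting it, shown here on a hypothetical sample payload rather than a live response:

```python
# Hypothetical sample response in the OpenAI chat-completions schema.
# A real response comes from the create() call above.
sample = {
    "choices": [
        {"message": {"role": "assistant", "content": "I can't help with that."}}
    ]
}

# Extract the assistant's (guarded) reply from the first choice.
reply = sample["choices"][0]["message"]["content"]
print(reply)
```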
Pre-defined Configurations in the Docker Compose Artifact#
The Docker Compose artifact contains two configurations: default and quickstart.

The default configuration has only the model configuration. This is useful for verifying connectivity to the NIM microservices.

The quickstart configuration has the model configuration and rail configurations. This is useful for running inference with guardrails applied.
Stop the Guardrails Service#
Run the following command to stop the Guardrails service:
$ docker compose --profile guardrails down
If you ran the microservice with local NIMs, also run:
$ docker compose --profile guardrails-nims down
Next Steps#
For more tutorials, refer to Guardrail Tutorials.
For instructions on how to deploy the microservice on your Kubernetes cluster for production at scale, refer to Install NeMo Guardrails Microservice Using Helm.