# Multinode Examples
For general TensorRT-LLM features and engine configuration, see the Reference Guide.
## Recommended Path
For multinode TensorRT-LLM deployments, start from the checked-in Kubernetes recipes under `recipes/`. Those manifests are the supported entrypoints for launching multi-node workers, frontend services, and related routing components.
The main TRT-LLM recipe entrypoints are:
- DeepSeek-R1 WideEP on GB200
- Qwen3-235B-A22B-FP8 aggregated
- Qwen3-235B-A22B-FP8 disaggregated
- Qwen3-32B-FP8 aggregated
- Qwen3-32B-FP8 disaggregated
- GPT-OSS-120B aggregated
- GPT-OSS-120B disaggregated
- Nemotron-3-Super-FP8 disaggregated
For model-level setup, prerequisites, and hardware notes, use the recipe README files:
- DeepSeek-R1 recipes
- Qwen3-235B-A22B-FP8 recipes
- Qwen3-32B-FP8 recipes
- GPT-OSS-120B recipes
- Kimi-K2.5 recipes
## Quick Start
At a high level, the Kubernetes workflow is:
- Install the Dynamo platform on Kubernetes. See the Kubernetes Deployment Guide.
- Create a namespace and any required secrets such as a Hugging Face token.
- Apply the recipe’s model cache and model download manifests when the recipe includes them.
- Apply the recipe's `deploy.yaml`.
- Port-forward the frontend service and send test requests to `/v1/models` or `/v1/chat/completions`.
Example flow:
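A minimal sketch of these steps, assuming the namespace `dynamo` and a Qwen3-32B-FP8 aggregated recipe; the namespace, secret name, and directory paths below are placeholders, so substitute the values from the recipe README you are following:

```shell
# Create a namespace for the deployment (name is a placeholder).
kubectl create namespace dynamo

# Store a Hugging Face token as a secret (needed for gated model downloads).
kubectl create secret generic hf-token-secret \
  --from-literal=HF_TOKEN=<your-hf-token> \
  -n dynamo

# Apply the recipe's model cache / download manifests, if the recipe
# includes them, then the deployment itself (paths are illustrative).
kubectl apply -f recipes/<model>/model-cache/ -n dynamo
kubectl apply -f recipes/<model>/agg/deploy.yaml -n dynamo

# Wait for the pods to become ready before sending traffic.
kubectl get pods -n dynamo -w
```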
After the deployment is ready, port-forward the frontend service named by the recipe and send a test request:
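For example, assuming the frontend service is named `frontend` and listens on port 8000 (check the recipe manifest for the actual service name, port, and served model name):

```shell
# Forward the frontend service to localhost (name/port are placeholders).
kubectl port-forward svc/frontend 8000:8000 -n dynamo

# In another terminal: list the served models...
curl http://localhost:8000/v1/models

# ...then send a test chat completion request.
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "Qwen/Qwen3-32B-FP8",
        "messages": [{"role": "user", "content": "Hello!"}],
        "max_tokens": 32
      }'
```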
## Notes
- The TRT-LLM engine config files used by launch and deploy flows live under `examples/backends/trtllm/engine_configs/`.
- If you need to customize model parallelism, replica counts, or routing mode, edit the recipe-local manifest rather than introducing a separate scheduler-specific guide.
- For the current catalog of supported recipes, see `recipes/README.md`.
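To illustrate the kind of recipe-local edit meant above, a trimmed deployment fragment might look like the following. The resource kind, service names, and field layout here are assumptions for illustration; always check the actual `deploy.yaml` shipped with the recipe before editing:

```yaml
# Illustrative fragment only -- names and structure follow a typical
# recipe deploy.yaml, but your recipe's manifest is the source of truth.
apiVersion: nvidia.com/v1alpha1
kind: DynamoGraphDeployment
metadata:
  name: trtllm-agg          # placeholder deployment name
spec:
  services:
    Frontend:
      replicas: 1           # scale the frontend/routing tier here
    TRTLLMWorker:
      replicas: 2           # scale engine workers here, not in a separate guide
```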