Building a Custom TensorRT-LLM Container
For the prebuilt container, see the TensorRT-LLM Quick Start.
How the Image Is Composed
The Dynamo TensorRT-LLM image layers Dynamo on top of the upstream NVIDIA TensorRT-LLM release container — it does not build TensorRT-LLM from source. The base used by the rendered Dockerfile is set in container/context.yaml:
For --target=runtime (the focus of this guide), container/render.py emits a stage that:
- Starts from ${RUNTIME_IMAGE}:${RUNTIME_IMAGE_TAG} (the upstream TRT-LLM image),
- Builds the Dynamo wheels (ai-dynamo, ai-dynamo-runtime, optionally kvbm) in a separate wheel_builder manylinux stage, and
- Installs those wheels into a /opt/dynamo/venv virtual environment created with --system-site-packages so the upstream Python solve stays importable.
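To illustrate why --system-site-packages keeps the upstream packages importable, here is a minimal sketch that creates such a venv and checks its configuration. It assumes python3 -m venv is used; the actual Dockerfile may create the venv with a different tool, and both helper names are hypothetical.

```shell
# Hypothetical helpers; the real image may create /opt/dynamo/venv differently.

make_overlay_venv() {
  # $1 = directory for the new venv (the guide uses /opt/dynamo/venv)
  # --system-site-packages lets the venv fall back to the system packages,
  # --without-pip keeps the sketch independent of ensurepip.
  python3 -m venv --system-site-packages --without-pip "$1"
}

venv_sees_system_packages() {
  # Succeeds when the venv's pyvenv.cfg enables system site-packages.
  grep -qi '^include-system-site-packages *= *true' "$1/pyvenv.cfg"
}
```

Packages installed into a venv like this take precedence, while anything only present in the system site-packages (such as the upstream tensorrt_llm) still resolves.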
The dev and local-dev targets create /opt/dynamo/venv in container/templates/dev.Dockerfile and lay down the build toolchain (maturin, uv), so you can run maturin develop against a mounted workspace at runtime for an editable install. In these targets the Dynamo wheels in /opt/dynamo/wheelhouse/ are left uninstalled.
The upstream nvcr.io/nvidia/tensorrt-llm/release image ships a multi-arch manifest (linux/amd64 and linux/arm64), so the Dynamo TensorRT-LLM image can be built for either architecture.
Building the Default Image
Run the resulting image:
Pinning a Different Upstream TensorRT-LLM Release
To pick up a different upstream TRT-LLM release (a newer rc tag, a hotfix tag, etc.) without editing context.yaml, override the runtime image ARGs at docker build time:
--pull is recommended when changing only the upstream tag: without it, Docker may reuse a previously-cached layer that resolved against the old tag’s manifest, producing a half-stale image that boots but breaks at NIXL init if upstream moved bundled libraries between tags.
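As a rough sketch of such an override, the helper below prints a docker build command that pins an upstream tag through the RUNTIME_IMAGE and RUNTIME_IMAGE_TAG build args and includes --pull. The helper name and the default output tag are hypothetical, and the -f path and "." build context assume you build from the repository root against an already rendered container/rendered.Dockerfile.

```shell
# Sketch only: prints the command so it can be reviewed before it is run.
# Assumes the build runs from the repository root against
# container/rendered.Dockerfile; adjust if your layout differs.

trtllm_pinned_build_cmd() {
  upstream_tag="$1"                        # upstream release tag to pin
  output_tag="${2:-dynamo:trtllm-pinned}"  # hypothetical output image name
  printf 'docker build --pull -f container/rendered.Dockerfile'
  printf ' --build-arg RUNTIME_IMAGE=%s' 'nvcr.io/nvidia/tensorrt-llm/release'
  printf ' --build-arg RUNTIME_IMAGE_TAG=%s' "$upstream_tag"
  printf ' -t %s .\n' "$output_tag"
}

# Usage: inspect the printed command, then execute it, for example:
#   sh -c "$(trtllm_pinned_build_cmd "$MY_UPSTREAM_TAG")"
```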
If a tag move changes where the upstream image installs libnixl.so or the NIXL plugin directory, the runtime stage’s test -f / test -d guards fail the build instead of producing a silently broken image. If that happens, update the LD_PRELOAD / NIXL_PLUGIN_DIR paths in container/templates/trtllm_runtime.Dockerfile, then re-run render.py; otherwise container/rendered.Dockerfile is stale and the build silently uses the old paths.
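The guard pattern can be reproduced outside the build to check a candidate base image before rendering. This is a minimal sketch, not the Dockerfile's actual guard: the helper name is hypothetical, and the paths are the Python 3.12 locations this guide lists for the upstream layout.

```shell
# Assumed layout, relative to the image's filesystem root.
NIXL_REL="usr/local/lib/python3.12/dist-packages/tensorrt_llm/libs/nixl"

check_nixl_layout() {
  # $1 = filesystem root to inspect ("/" inside a running container,
  #      or a directory holding an exported image filesystem).
  root="${1%/}"
  test -f "$root/$NIXL_REL/libnixl.so" || {
    echo "missing $root/$NIXL_REL/libnixl.so" >&2
    return 1
  }
  test -d "$root/$NIXL_REL/plugins" || {
    echo "missing $root/$NIXL_REL/plugins" >&2
    return 1
  }
}
```

A non-zero return here suggests the Dynamo build would stop at its own guards for the same reason.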
Building TensorRT-LLM From Source
Dynamo no longer builds TensorRT-LLM itself. If you need a custom TRT-LLM build (your own patch, a non-released commit, etc.), the supported path is to produce an upstream-equivalent image yourself and point Dynamo at it via the same RUNTIME_IMAGE / RUNTIME_IMAGE_TAG build args.
- Build a TensorRT-LLM container following the upstream instructions (TensorRT-LLM: Build from Source on Linux). Use the “Building a TensorRT LLM Docker Image” section (specifically make -C docker release_build); this produces a container with tensorrt_llm in the system site-packages and the bundled libnixl.so at the canonical path. A bare build_wheel.py invocation only produces a wheel and won’t have the NIXL bits.

  Three constraints the resulting image must satisfy or the Dynamo build will fail its sanity guards:

  - Python 3.12 in the system Python (/usr/local/lib/python3.12/dist-packages/...): the LD_PRELOAD and NIXL_PLUGIN_DIR paths in the runtime Dockerfile are hardcoded to 3.12. If your custom build switches to a different Python minor version, edit those env vars in container/templates/trtllm_runtime.Dockerfile and re-render.
  - tensorrt_llm installed into system site-packages (not a venv), matching upstream layout.
  - libnixl.so present at /usr/local/lib/python3.12/dist-packages/tensorrt_llm/libs/nixl/libnixl.so and plugins under nixl/plugins/.

- Tag it locally, e.g. my-registry/tensorrt-llm:<commit-sha> (use the source commit you built so the tag carries provenance; pasting a literal :custom makes the image untraceable later).

- Render and build the Dynamo image against your custom base:

  Do not add --pull here: your custom image only exists locally, and --pull will make Docker try to fetch it from docker.io and fail. --pull is only useful when the base image lives in a remote registry (see the previous section on pinning an upstream tag).
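One way to derive the provenance-carrying tag is sketched below. The helper names and the my-registry/tensorrt-llm repository name are illustrative, and the name of the image produced by your upstream build is left for you to supply.

```shell
# Hypothetical helpers for building a commit-based image reference.

custom_base_ref() {
  repo="$1"     # local repository name, e.g. my-registry/tensorrt-llm
  src_dir="$2"  # path to the TensorRT-LLM checkout the image was built from
  sha="$(git -C "$src_dir" rev-parse --short HEAD)" || return 1
  printf '%s:%s\n' "$repo" "$sha"
}

# Split a reference at its last colon so a registry port is preserved.
split_ref_repo() { printf '%s\n' "${1%:*}"; }
split_ref_tag()  { printf '%s\n' "${1##*:}"; }

# Usage (not executed here):
#   ref="$(custom_base_ref my-registry/tensorrt-llm "$TRTLLM_SRC")"
#   docker tag "$IMAGE_FROM_RELEASE_BUILD" "$ref"
#   then pass, without --pull:
#     --build-arg RUNTIME_IMAGE="$(split_ref_repo "$ref")"
#     --build-arg RUNTIME_IMAGE_TAG="$(split_ref_tag "$ref")"
```

Keeping the repository and tag separate matches the two build args, so the same reference can be reused for both the docker tag step and the Dynamo build.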
If your custom build places TRT-LLM’s bundled NIXL at a different path (or uses a non-3.12 Python), edit the LD_PRELOAD and NIXL_PLUGIN_DIR env vars in container/templates/trtllm_runtime.Dockerfile, along with the matching test -f / test -d guards. Then re-run python container/render.py ... to regenerate container/rendered.Dockerfile before docker build; otherwise the build silently uses the previously rendered file.

Those env vars exist to work around ai-dynamo/nixl#1668, in which nixl-cu13’s bundled UCX 1.20.0 hangs under multi-agent init. They force every process in the image to load TRT-LLM’s 0.9.0 libnixl.so instead.