
Copyright © 2026, NVIDIA Corporation.

Building a Custom TensorRT-LLM Container


For the prebuilt container, see the TensorRT-LLM Quick Start.

Building a Custom Container

If you need to build a container from source (e.g., for custom modifications or a different CUDA version):

# TensorRT-LLM uses git-lfs, which needs to be installed in advance.
apt-get update && apt-get -y install git git-lfs

# On an x86 machine:
python container/render.py --framework=trtllm --target=runtime --output-short-filename --cuda-version=13.1
docker build -t dynamo:trtllm-latest -f container/rendered.Dockerfile .

# On an ARM machine:
python container/render.py --framework=trtllm --target=runtime --platform=arm64 --output-short-filename --cuda-version=13.1
docker build -t dynamo:trtllm-latest -f container/rendered.Dockerfile .
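
The x86 and ARM commands above differ only in the `--platform=arm64` flag. As a minimal sketch, a small helper could pick the flags from the host architecture. The function name `render_args` and its optional second argument are illustrative assumptions, not part of Dynamo. Every flag is copied from the commands above. The script prints the commands instead of running them.

```shell
#!/usr/bin/env bash
# Sketch only: render_args is a hypothetical helper, not a Dynamo tool.
# It assembles the render.py flags shown above for a given architecture.
# Usage: render_args <arch> [cuda_version]
render_args() {
  arch="$1"
  cuda_version="${2:-13.1}"   # default taken from the commands above
  args="--framework=trtllm --target=runtime"
  case "$arch" in
    aarch64|arm64) args="$args --platform=arm64" ;;
    x86_64|amd64) ;;           # the x86 command above passes no --platform flag
    *) echo "unsupported architecture: $arch" >&2; return 1 ;;
  esac
  echo "$args --output-short-filename --cuda-version=$cuda_version"
}

# Print the commands for this host; review them, then run them from the repository root.
if flags="$(render_args "$(uname -m)")"; then
  echo "python container/render.py $flags"
  echo "docker build -t dynamo:trtllm-latest -f container/rendered.Dockerfile ."
fi
```

Printing rather than executing keeps the sketch safe to try on a machine without Docker or the Dynamo checkout. Whether other CUDA versions are accepted by `render.py` is not stated here, so treat any value other than 13.1 as untested.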

Run the custom container:

./container/run.sh --framework trtllm -it