Getting Started#

Choose the components that fit your needs. Here are common adoption paths:

Full Stack Deployment#

  1. Infrastructure: Deploy GPU Operator and Network Operator to your Kubernetes cluster

  2. Containers: Pull optimized containers from nvcr.io

  3. Optimize: Use Model Optimizer with TensorRT or TensorRT-LLM

  4. Plan: Use AIConfigurator to estimate performance and plan deployment topology

  5. Deploy: Use KAI Scheduler (add Grove for multi-node deployments) to deploy Triton or Dynamo

  6. Tune: Use AIPerf for benchmarking and Planner for runtime optimization
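Step 4 (Plan) is, at its core, capacity math. A minimal sketch of the kind of back-of-envelope estimate involved, assuming decode is memory-bandwidth-bound; the numbers below are illustrative assumptions, not AIConfigurator output:

```python
# Back-of-envelope decode throughput: each generated token must stream the
# full model weights from GPU memory, so per-sequence decode speed is roughly
# memory_bandwidth / weight_bytes. All figures here are illustrative.

def decode_tokens_per_s(params_billion: float, bytes_per_param: float,
                        hbm_bandwidth_gbs: float) -> float:
    weight_gb = params_billion * bytes_per_param  # total weight bytes in GB
    return hbm_bandwidth_gbs / weight_gb

# e.g. a 70B-parameter model in FP8 (1 byte/param) on a GPU with ~3350 GB/s:
print(round(decode_tokens_per_s(70, 1.0, 3350), 1))  # ~47.9 tokens/s per sequence
```

A tool like AIConfigurator refines this with batching, parallelism topology, and KV-cache effects, but the bandwidth bound is the right first-order sanity check before choosing a deployment shape.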

Traditional ML Inference Only#

  1. Optimize: Use TensorRT to optimize your models

  2. Serve: Deploy with Triton Inference Server

  3. Optional: Add DALI for GPU-accelerated preprocessing
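Once Triton is serving your optimized model, clients talk to it over the KServe v2 inference protocol. A minimal sketch of such a request built with only the standard library; the model name, input name, shape, and endpoint are placeholders for your deployment:

```python
import json
import urllib.request

# KServe v2 inference request body: one FP32 input tensor of shape [1, 4].
# "my_model" and "INPUT0" are placeholders; use your model's actual names.
payload = {
    "inputs": [
        {"name": "INPUT0", "shape": [1, 4], "datatype": "FP32",
         "data": [0.1, 0.2, 0.3, 0.4]}
    ]
}
req = urllib.request.Request(
    "http://localhost:8000/v2/models/my_model/infer",  # Triton's default HTTP port
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)
# with urllib.request.urlopen(req) as resp:            # uncomment with a live server
#     print(json.load(resp)["outputs"])
```

In practice you would use the `tritonclient` Python package rather than raw HTTP, but the wire format above is what every client ultimately sends.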

GenAI/LLM Inference Only#

  1. Optimize: Use TensorRT-LLM to optimize your LLM

  2. Serve: Deploy with Dynamo

  3. Scale: Add KV Block Manager, NIXL, and Router for distributed inference
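The Router's job in step 3 is to send requests whose prompts share a prefix to the worker that already holds that prefix's KV blocks, so the prefill can be skipped. A toy sketch of the idea; the hashing scheme and worker model are illustrative assumptions, not Dynamo's implementation:

```python
import hashlib

BLOCK = 16  # toy KV block size, in tokens

def block_hashes(tokens):
    """Chained hashes of each full token block, as in prefix caching:
    a block's hash depends on all blocks before it."""
    hashes, h = [], hashlib.sha256()
    for i in range(0, len(tokens) - len(tokens) % BLOCK, BLOCK):
        h.update(repr(tokens[i:i + BLOCK]).encode())
        hashes.append(h.copy().hexdigest())
    return hashes

class ToyRouter:
    def __init__(self, workers):
        self.cached = {w: set() for w in workers}  # block hashes each worker holds

    def route(self, tokens):
        hashes = block_hashes(tokens)
        # Pick the worker with the most cached blocks from this prompt.
        best = max(self.cached,
                   key=lambda w: sum(h in self.cached[w] for h in hashes))
        self.cached[best].update(hashes)  # that worker now caches these blocks
        return best

router = ToyRouter(["worker-0", "worker-1"])
first = router.route(list(range(64)))          # cold start: ties broken arbitrarily
again = router.route(list(range(64)) + [99])   # shares a 64-token prefix
print(first == again)  # True: routed to the worker with the warm cache
```

A production router also weighs load and KV memory pressure against cache overlap; this sketch only shows the affinity signal.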

Kubernetes Integration Only#

  1. Deploy: GPU Operator + Network Operator for infrastructure management

  2. Schedule: KAI Scheduler for GPU-aware scheduling

  3. Scale: Add Grove for gang scheduling if needed
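Once KAI Scheduler is installed, workloads opt in per pod. A minimal sketch of what that looks like; the queue name and image tag are placeholders, and the exact label key may vary by release, so verify against the KAI Scheduler documentation:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: gpu-job
  labels:
    kai.scheduler/queue: team-a        # queue name is a placeholder
spec:
  schedulerName: kai-scheduler         # hand this pod to KAI, not the default scheduler
  containers:
    - name: main
      image: nvcr.io/nvidia/tritonserver:<yy.mm>-py3  # substitute a current tag
      resources:
        limits:
          nvidia.com/gpu: 1            # exposed by the GPU Operator's device plugin
```

The `nvidia.com/gpu` resource only exists once the GPU Operator from step 1 has deployed its device plugin, which is why the infrastructure step comes first.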