Common Component Combinations

These are typical adoption patterns showing how components complement each other:

| Use Case | Core Components | Optional Additions |
| --- | --- | --- |
| Basic ML Inference | TensorRT + Triton | DALI, GPU Operator |
| Speech/NLP Pipeline | Riva SDK + Triton | DALI, TensorRT |
| Single-Node LLM | TensorRT-LLM + Dynamo | Model Optimizer |
| Distributed LLM | Dynamo + KV Block Manager + NIXL + Router + TensorRT-LLM | Planner, Model Express, Model Optimizer |
| Kubernetes Deployment | GPU Operator + KAI Scheduler | Network Operator, Grove |
| Full GenAI Stack | Dynamo + NIXL + KV Block Manager + Router + Grove + KAI Scheduler + Planner | AIConfigurator, AIPerf |
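The combinations listed above can also be kept as plain data, for example to drive a deployment checklist. The following is a minimal sketch, not an NVIDIA API: the `Stack` class, the `COMBINATIONS` mapping, and the helpers `components_for` and `use_cases_with` are hypothetical names introduced here for illustration. Only the component and use-case names come from the list above.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Stack:
    """Hypothetical record: one adoption pattern from the list above."""
    core: tuple[str, ...]
    optional: tuple[str, ...] = ()


# Use case -> components, transcribed from the combinations above.
COMBINATIONS: dict[str, Stack] = {
    "Basic ML Inference": Stack(("TensorRT", "Triton"), ("DALI", "GPU Operator")),
    "Speech/NLP Pipeline": Stack(("Riva SDK", "Triton"), ("DALI", "TensorRT")),
    "Single-Node LLM": Stack(("TensorRT-LLM", "Dynamo"), ("Model Optimizer",)),
    "Distributed LLM": Stack(
        ("Dynamo", "KV Block Manager", "NIXL", "Router", "TensorRT-LLM"),
        ("Planner", "Model Express", "Model Optimizer"),
    ),
    "Kubernetes Deployment": Stack(
        ("GPU Operator", "KAI Scheduler"), ("Network Operator", "Grove")
    ),
    "Full GenAI Stack": Stack(
        ("Dynamo", "NIXL", "KV Block Manager", "Router", "Grove",
         "KAI Scheduler", "Planner"),
        ("AIConfigurator", "AIPerf"),
    ),
}


def components_for(use_case: str, include_optional: bool = False) -> list[str]:
    """Return the core components for a use case, optionally with additions."""
    stack = COMBINATIONS[use_case]
    extras = list(stack.optional) if include_optional else []
    return list(stack.core) + extras


def use_cases_with(component: str) -> list[str]:
    """Return the use cases that list `component` as a core component."""
    return [name for name, stack in COMBINATIONS.items() if component in stack.core]
```

For instance, `components_for("Single-Node LLM", include_optional=True)` returns the two core components followed by Model Optimizer, and `use_cases_with("Triton")` returns the two patterns that are built on Triton.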

Architecture Overview

Figure: NVIDIA Inference Architecture Overview