Backends#

NVIDIA Dynamo supports multiple inference backends, giving you the flexibility to match the engine to your use case, model architecture, and hardware. A backend is the underlying engine that executes model inference; each one is optimized for particular scenarios, hardware configurations, and performance requirements.

Overview#

Dynamo’s multi-backend architecture allows you to:

  • Choose the optimal engine for your specific workload and hardware

  • Switch between backends without changing your application code (see the sketch after this list)

  • Leverage specialized optimizations from each backend

  • Scale flexibly across different deployment scenarios

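Because requests go through Dynamo’s OpenAI-compatible frontend, which backend serves the model (for example, vLLM or TensorRT-LLM) is a deployment-time decision rather than a client-side one. The sketch below illustrates this under assumptions: the frontend address (http://localhost:8000) and the model name are placeholders, not fixed values, and should be replaced with the details of your own deployment.

```python
# Minimal sketch: a client calling Dynamo's OpenAI-compatible frontend.
# The endpoint URL and model name are assumptions for illustration only.
import requests

DYNAMO_FRONTEND = "http://localhost:8000"            # assumed frontend address
MODEL = "deepseek-ai/DeepSeek-R1-Distill-Llama-8B"   # placeholder model name

def chat(prompt: str) -> str:
    """Send a chat completion request. The backend serving the model is
    chosen at deployment time and never appears in this client code."""
    resp = requests.post(
        f"{DYNAMO_FRONTEND}/v1/chat/completions",
        json={
            "model": MODEL,
            "messages": [{"role": "user", "content": prompt}],
            "max_tokens": 128,
        },
        timeout=60,
    )
    resp.raise_for_status()
    return resp.json()["choices"][0]["message"]["content"]

if __name__ == "__main__":
    print(chat("Briefly explain what an inference backend is."))
```

The same client works unchanged whether the deployment is redeployed with a different backend, a different parallelism layout, or different hardware; only the frontend URL and model name matter to the application.
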
Supported Backends#

Dynamo currently supports the following high-performance inference backends: