Deploy NeMo 2.0 MMs by Exporting to Inference Optimized Libraries

The NeMo Export-Deploy library offers scripts and APIs to export models to inference-optimized libraries such as TensorRT-LLM, and to deploy the exported models with the NVIDIA Triton Inference Server.
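
The snippet below is a minimal sketch of this export-then-deploy flow, assuming the `nemo.export` and `nemo.deploy` namespaces used by the NeMo framework; the exporter class, argument names, `model_type` value, and checkpoint paths shown here are illustrative and may differ across NeMo and Export-Deploy versions, so check the API reference for your installed release.

```python
from nemo.deploy import DeployPyTriton
from nemo.export.tensorrt_mm_exporter import TensorRTMMExporter

# Export a multimodal NeMo checkpoint to a TensorRT-LLM engine directory.
exporter = TensorRTMMExporter(model_dir="/tmp/mm_trt_engine")  # hypothetical engine output dir
exporter.export(
    visual_checkpoint_path="/checkpoints/neva.nemo",  # hypothetical .nemo checkpoint path
    model_type="neva",                                # multimodal model family
)

# Deploy the exported engine behind NVIDIA Triton Inference Server.
server = DeployPyTriton(model=exporter, triton_model_name="neva")
server.deploy()  # register and load the model in Triton
server.serve()   # block and serve inference requests
```

Once the server is running, clients can send inference requests to the Triton endpoint under the chosen `triton_model_name`; the exported engine directory can also be reloaded later without re-running the export step.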