Model Export to TensorRT-LLM

User Guide (Latest Version)

For text-to-image models, the export stage generates three optimized inference models: the VAE decoder, the UNet, and the CLIP encoder.

  1. In the defaults section of conf/config.yaml, update the export field to point to the desired Stable Diffusion inference configuration file. For example, if you want to use the stable_diffusion/export_stable_diffusion.yaml configuration, change the export field to stable_diffusion/export_stable_diffusion.


    defaults:
      - export: stable_diffusion/export_stable_diffusion
      ...

  2. In the stages field of conf/config.yaml, make sure the export stage is included. For example,


    stages:
      - export
      ...

  3. In conf/export/stable_diffusion/export_stable_diffusion.yaml, set infer.num_images_per_prompt to the batch size to use for the ONNX and NVIDIA TensorRT models.
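For example, the batch size could be configured as follows (the value 4 is illustrative; only the infer.num_images_per_prompt field is described in this guide):

    infer:
      num_images_per_prompt: 4  # batch size used when building the ONNX and TensorRT models
      ...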


  4. To load a pretrained checkpoint for inference, set the restore_from_path field in the model section of conf/export/stable_diffusion/export_stable_diffusion.yaml to the path of the pretrained checkpoint in .nemo format.
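A minimal sketch of this setting, assuming the checkpoint path shown is a placeholder for your own .nemo file:

    model:
      restore_from_path: /path/to/checkpoint.nemo  # pretrained checkpoint in .nemo format
      ...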

Last updated on May 30, 2024.