Model Export to TensorRT

For ControlNet, the export script generates four optimized inference models: the VAE Decoder, the UNet, the CLIP Encoder, and the control model.

  1. In the defaults section of conf/config.yaml, update the export field to point to the desired ControlNet inference configuration file. For example, if you want to use the controlnet/export_controlnet.yaml configuration, change the export field to controlnet/export_controlnet.

    defaults:
      - export: controlnet/export_controlnet
      ...

  2. In the stages field of conf/config.yaml, make sure the export stage is included. For example,

    stages:
      - export
      ...

  3. Configure infer.num_images_per_prompt in the conf/export/controlnet/export_controlnet.yaml file to set the batch size used for the ONNX and NVIDIA TensorRT models (see the sketch after this list).
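
A minimal sketch of the infer section that step 3 refers to is shown below. Only num_images_per_prompt is taken from this page; the surrounding layout is assumed rather than quoted from the shipped export_controlnet.yaml.

    infer:
      num_images_per_prompt: 1  # used as the batch size for the exported ONNX and TensorRT models; only 1 is supported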

Remarks:

  1. To load a pretrained checkpoint for inference, set the restore_from_path field in the model section of conf/export/controlnet/export_controlnet.yaml to the path of the pretrained checkpoint in .nemo format (a sketch follows these remarks).

  2. Only num_images_per_prompt: 1 is supported for now.
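
As a hedged illustration of the first remark, the model section of conf/export/controlnet/export_controlnet.yaml could be filled in as below; the checkpoint path is a placeholder, and only the restore_from_path field is taken from this page.

    model:
      restore_from_path: /path/to/controlnet.nemo  # pretrained ControlNet checkpoint in .nemo format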
