Model Export to TensorRT

For InstructPix2Pix models, the export stage generates four optimized inference models: the VAE Decoder, the UNet, the CLIP Encoder, and the VAE Encoder.

  1. In the defaults section of conf/config.yaml, update the export field to point to the desired export configuration file. For example, to use the instruct_pix2pix/export_instruct_pix2pix.yaml configuration, change the export field to instruct_pix2pix/export_instruct_pix2pix:


    defaults:
      - export: instruct_pix2pix/export_instruct_pix2pix
      ...

  2. In the stages field of conf/config.yaml, make sure the export stage is included. For example:


    stages:
      - export
      ...

  3. In conf/export/instruct_pix2pix/export_instruct_pix2pix.yaml, set edit.num_images_per_prompt to the batch size to use for the ONNX and NVIDIA TensorRT models (see the sketch after this list).

  4. Set edit.input to the path of an example image to use during export (see the sketch after this list).
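
A minimal sketch of the relevant edit fields in conf/export/instruct_pix2pix/export_instruct_pix2pix.yaml, covering steps 3 and 4; the batch size and image path shown are illustrative placeholders, not defaults:

    edit:
      num_images_per_prompt: 4              # batch size for the exported ONNX and TensorRT models (illustrative value)
      input: /path/to/example_image.jpg     # example image consumed during export (placeholder path)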

Remarks:

  1. To load a pretrained checkpoint for inference, set the restore_from_path field in the model section of conf/export/instruct_pix2pix/export_instruct_pix2pix.yaml to the path of the pretrained checkpoint in .nemo format.
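
A minimal sketch of this setting (the checkpoint path is a placeholder):

    model:
      restore_from_path: /results/checkpoints/instruct_pix2pix.nemo   # placeholder path to the pretrained .nemo checkpoint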
