For text-to-image models, the export script generates three optimized inference models: the VAE decoder, the UNet, and the CLIP encoder.
In the `defaults` section of `conf/config.yaml`, update the `export` field to point to the desired Stable Diffusion inference configuration file. For example, if you want to use the `stable_diffusion/export_stable_diffusion.yaml` configuration, change the `export` field to `stable_diffusion/export_stable_diffusion`:

```yaml
defaults:
  - export: stable_diffusion/export_stable_diffusion
  ...
```
In the `stages` field of `conf/config.yaml`, make sure the `export` stage is included. For example:

```yaml
stages:
  - export
  ...
```
Configure `infer.num_images_per_prompt` in the `conf/export/stable_diffusion/export_stable_diffusion.yaml` file to set the batch size to use for the ONNX and NVIDIA TensorRT models.
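As an illustrative sketch of this setting (only `infer.num_images_per_prompt` is taken from the text above; the value shown is an arbitrary example), the relevant fragment of `conf/export/stable_diffusion/export_stable_diffusion.yaml` might look like:

```yaml
infer:
  # Batch size used when building the ONNX and TensorRT inference models
  num_images_per_prompt: 4
```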
Remarks:

To load a pretrained checkpoint for inference, set the `restore_from_path` field in the `model` section of `conf/export/stable_diffusion/export_stable_diffusion.yaml` to the path of the pretrained checkpoint in `.nemo` format.
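A hedged sketch of that `model` section (the checkpoint path is a placeholder, not an actual file shipped with the framework):

```yaml
model:
  # Path to the pretrained checkpoint in .nemo format (placeholder path)
  restore_from_path: /path/to/checkpoints/stable_diffusion.nemo
```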