For text-to-image models, the export script generates two different optimized inference models: the UNet and the T5 text encoder. The script generates a separate UNet model for each target resolution (e.g. 64x64, 256x256, 1024x1024).
In the `defaults` section of `conf/config.yaml`, update the `export` field to point to the desired Imagen inference configuration file. For example, if you want to use the `imagen/export_imagen.yaml` configuration, change the `export` field to `imagen/export_imagen`:

```yaml
defaults:
  - export: imagen/export_imagen
  ...
```
In the `stages` field of `conf/config.yaml`, make sure the `export` stage is included:

```yaml
stages:
  - export
  ...
```
Configure `infer.num_images_per_prompt` in the `conf/export/imagen/export_imagen.yaml` file to set the `batch_size` to use for the ONNX and NVIDIA TensorRT models.
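As an illustration, the relevant setting might look like the following in `conf/export/imagen/export_imagen.yaml` (the surrounding layout and the value `4` are assumptions for this sketch; only the `infer.num_images_per_prompt` field name comes from this guide):

```yaml
infer:
  num_images_per_prompt: 4  # assumed example value; used as the batch_size for the ONNX/TensorRT engines
```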
Remarks:
To load pretrained checkpoints for inference, set the `base_ckpt`, `sr256_ckpt`, and `sr1024_ckpt` fields in the `model.customized_model` section of `conf/export/imagen/export_imagen.yaml` to the paths of the pretrained checkpoints in `.nemo` format. Make sure `model.target_resolution` is set to the desired resolution.
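A sketch of how these fields might be arranged, assuming hypothetical checkpoint paths (only the field names `model.target_resolution`, `model.customized_model`, `base_ckpt`, `sr256_ckpt`, and `sr1024_ckpt` come from this guide):

```yaml
model:
  target_resolution: 256  # assumed example; one of the exported resolutions (64, 256, or 1024)
  customized_model:
    base_ckpt: /path/to/base.nemo      # hypothetical path: 64x64 base model checkpoint
    sr256_ckpt: /path/to/sr256.nemo    # hypothetical path: 256x256 super-resolution checkpoint
    sr1024_ckpt: /path/to/sr1024.nemo  # hypothetical path: 1024x1024 super-resolution checkpoint
```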