Important
NeMo 2.0 is an experimental feature and currently released in the dev container only: nvcr.io/nvidia/nemo:dev. Please refer to NeMo 2.0 overview for information on getting started.
Framework Inference
For text-to-image models, the inference script generates images from text prompts defined in the config file.
To enable the inference stage with Imagen, update the configuration files as follows:
In the defaults section of conf/config.yaml, update the fw_inference field to point to the desired Imagen inference configuration file. For example, to use the imagen/text2img.yaml configuration, set the fw_inference field to imagen/text2img.

    defaults:
      - fw_inference: imagen/text2img
      ...
In the stages field of conf/config.yaml, make sure the fw_inference stage is included. For example:

    stages:
      - fw_inference
      ...
Configure the infer.texts and infer.num_images_per_prompt fields of conf/fw_inference/imagen/text2img.yaml. Set the base_ckpt, sr256_ckpt, and sr1024_ckpt fields under model.customized_model to the .nemo checkpoints you want to generate images with. Set infer.target_resolution to the desired resolution.
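As a rough illustration of the fields named in this step, the relevant part of conf/fw_inference/imagen/text2img.yaml might look like the sketch below. Only the field names come from the text above. The nesting is inferred from the dotted names, and the prompt, the image count, the resolution value, the checkpoint paths, and the choice to write infer.texts as a list are placeholders and assumptions rather than values taken from the shipped file.

```yaml
# Sketch only: shows the fields named above, not the full shipped config.
# Every value is a placeholder chosen for illustration.
infer:
  texts:                           # assumed to be a list of prompts
    - "a photograph of an astronaut riding a horse"
  num_images_per_prompt: 2         # placeholder count
  target_resolution: 256           # placeholder; set to the desired resolution

model:
  customized_model:
    base_ckpt: /path/to/base.nemo        # hypothetical checkpoint path
    sr256_ckpt: /path/to/sr256.nemo      # hypothetical checkpoint path
    sr1024_ckpt: /path/to/sr1024.nemo    # hypothetical checkpoint path
```

Compare this against the actual text2img.yaml in your container before relying on it, since the real file contains additional fields not shown here.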
Remarks:
We provide both DDPM and EDM samplers. For models trained with EDM, we recommend at least 30 inference steps; for models trained with DDPM, we recommend at least 250 inference steps.