Important
You are viewing the NeMo 2.0 documentation. This release introduces significant changes to the API and a new library, NeMo Run. We are currently porting all features from NeMo 1.0 to 2.0. For documentation on previous versions or features not yet available in 2.0, please refer to the NeMo 24.07 documentation.
Framework Inference
For text-to-image models, the inference script generates images from text prompts defined in the config file.
To enable the inference stage with Imagen, update the configuration files as follows:
In the defaults section of conf/config.yaml, update the fw_inference field to point to the desired Imagen inference configuration file. For example, to use the imagen/text2img.yaml configuration, set the fw_inference field to imagen/text2img:
    defaults:
      - fw_inference: imagen/text2img
      ...
In the stages field of conf/config.yaml, make sure the fw_inference stage is included. For example:
    stages:
      - fw_inference
      ...
Configure the infer.texts and infer.num_images_per_prompt fields of conf/fw_inference/imagen/text2img.yaml. Set model.customized_model.base_ckpt, sr256_ckpt, and sr1024_ckpt to the .nemo checkpoints you want to generate images with. Set infer.target_resolution to the desired resolution.
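The fields from the last step can be sanity-checked before launching the stage. The following is a minimal sketch, not part of NeMo: check_text2img_config is a hypothetical helper, the dictionary stands in for the parsed contents of conf/fw_inference/imagen/text2img.yaml, and all prompts, values, and paths are placeholders. It assumes that all three checkpoint fields sit under model.customized_model and that all three must be set.

```python
# Hypothetical pre-flight check; not part of NeMo. The dict mirrors the
# fields of conf/fw_inference/imagen/text2img.yaml named in the steps above.

CKPT_KEYS = ("base_ckpt", "sr256_ckpt", "sr1024_ckpt")


def check_text2img_config(cfg):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    infer = cfg.get("infer", {})
    # Assumption: all three checkpoint fields sit under model.customized_model.
    custom = cfg.get("model", {}).get("customized_model", {})

    if not infer.get("texts"):
        problems.append("infer.texts must list at least one prompt")
    n = infer.get("num_images_per_prompt")
    if not isinstance(n, int) or n < 1:
        problems.append("infer.num_images_per_prompt must be a positive integer")
    if infer.get("target_resolution") is None:
        problems.append("infer.target_resolution is not set")
    for key in CKPT_KEYS:
        path = custom.get(key)
        if not isinstance(path, str) or not path.endswith(".nemo"):
            problems.append(f"model.customized_model.{key} must point to a .nemo file")
    return problems


example = {
    "infer": {
        "texts": ["a photo of a corgi on a beach"],  # placeholder prompt
        "num_images_per_prompt": 2,                   # placeholder value
        "target_resolution": 1024,                    # placeholder value
    },
    "model": {
        "customized_model": {
            "base_ckpt": "/path/to/base.nemo",        # placeholder paths
            "sr256_ckpt": "/path/to/sr256.nemo",
            "sr1024_ckpt": None,                      # deliberately left unset
        }
    },
}

for problem in check_text2img_config(example):
    print(problem)
```

With the example above, the only reported problem is the unset sr1024_ckpt field.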
Remarks:
We provide both DDPM and EDM samplers. We recommend at least 30 inference steps for models trained with EDM and at least 250 inference steps for models trained with DDPM.
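The step-count guidance in the remark can be written as a small lookup. This is only an illustrative sketch: meets_minimum_steps and the "EDM"/"DDPM" labels are hypothetical names chosen here and do not correspond to a NeMo function or configuration field.

```python
# Minimum step counts taken from the remark above; the function and the
# sampler labels are hypothetical, not a NeMo API.
MIN_INFERENCE_STEPS = {"EDM": 30, "DDPM": 250}


def meets_minimum_steps(training_sampler, steps):
    """Return True if `steps` reaches the minimum for the given training type."""
    # Raises KeyError for any label other than EDM or DDPM.
    return steps >= MIN_INFERENCE_STEPS[training_sampler.upper()]


print(meets_minimum_steps("EDM", 30))    # at the EDM minimum
print(meets_minimum_steps("DDPM", 100))  # below the DDPM minimum
```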