Important

NeMo 2.0 is an experimental feature and currently released in the dev container only: nvcr.io/nvidia/nemo:dev. Please refer to NeMo 2.0 overview for information on getting started.

Framework Inference

For DreamBooth, the inference script generates images from text prompts defined in the config file, similar to section 5.7.3. DreamBooth is a fine-tuning method built on diffusion models that links a special token to a specific subject, so make sure the text prompt includes the special token you trained on. For example: a photo of sks dog sleeping.

To enable the inference stage with DreamBooth, update the configuration files as follows:

  1. In the defaults section of conf/config.yaml, update the fw_inference field to point to the desired DreamBooth inference configuration file. For example, if you want to use the dreambooth/text2img.yaml configuration, change the fw_inference field to dreambooth/text2img.

    defaults:
      - fw_inference: dreambooth/text2img
      ...
    
  2. In the stages field of conf/config.yaml, make sure the fw_inference stage is included. For example:

    stages:
      - fw_inference
      ...
    
  3. Configure the prompts and num_images_per_prompt fields in conf/fw_inference/dreambooth/text2img.yaml. Set model.restore_from_path to the checkpoint generated by DreamBooth training.
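
As a rough illustration of step 3, the fields might look like the sketch below. The field names prompts, num_images_per_prompt, and model.restore_from_path come from the text above. The nesting under an infer key, the value 4, and the checkpoint path are assumptions made for illustration, so check them against the text2img.yaml shipped with your release.

    # Hypothetical excerpt of conf/fw_inference/dreambooth/text2img.yaml
    infer:                          # assumed parent key; verify in your file
      num_images_per_prompt: 4      # illustrative value
      prompts:
        # The prompt must contain the special token used during training
        - 'a photo of sks dog sleeping'

    model:
      # Placeholder path; point this at your own DreamBooth checkpoint
      restore_from_path: /path/to/dreambooth_checkpoint.nemo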

Remarks:

Please refer to DreamBooth Training. The DreamBooth inference stage should be run after the DreamBooth conversion process. The conversion transforms the DreamBooth ckpt into the ‘.nemo’ format and remaps the parameter keys to the Stable Diffusion style, allowing for a consistent inference pipeline.