Important

NeMo 2.0 is an experimental feature and currently released in the dev container only: nvcr.io/nvidia/nemo:dev. Please refer to NeMo 2.0 overview for information on getting started.

Training with Predefined Configurations

ControlNet fine-tunes on top of an existing Stable Diffusion checkpoint. The recommended configuration is in the conf/training/controlnet directory. You can edit the hyperparameters there to suit your specific training requirements.

To enable the training stage with a ControlNet model, update the configuration files as follows:

  1. In the defaults section of conf/config.yaml, update the training field to point to the desired ControlNet configuration file. For example, to use controlnet_v1-5.yaml, set the training field to controlnet/controlnet_v1-5.yaml.

    defaults:
      - _self_
      - cluster: bcm
      - data_preparation: null
      - training: controlnet/controlnet_v1-5.yaml
      ...
    
  2. In the stages field of conf/config.yaml, make sure the training stage is included. For example:

    stages:
      - data_preparation
      - training
      ...
    
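The two steps above can be sanity-checked before launching a job. The following is a minimal sketch, not part of the launcher: it assumes conf/config.yaml has already been parsed into a plain Python dictionary (for example with a YAML loader), and the function name training_stage_problems and the inline example dictionary are hypothetical, introduced only for illustration.

```python
def training_stage_problems(cfg, expected="controlnet/controlnet_v1-5.yaml"):
    """Return a list of problems; an empty list means both steps look done.

    `cfg` is assumed to be conf/config.yaml parsed into a dict (hypothetical helper).
    """
    problems = []

    # Hydra-style defaults mix plain strings ("_self_") and one-key mappings,
    # so only the mapping entries are inspected for a "training" key.
    training = None
    for entry in cfg.get("defaults", []):
        if isinstance(entry, dict) and "training" in entry:
            training = entry["training"]
    if training != expected:
        problems.append(f"defaults.training is {training!r}, expected {expected!r}")

    # Step 2: the training stage must appear in the stages list.
    if "training" not in cfg.get("stages", []):
        problems.append("stages does not include 'training'")

    return problems


# Illustrative stand-in for the parsed config shown above (elided entries omitted).
example_config = {
    "defaults": [
        "_self_",
        {"cluster": "bcm"},
        {"data_preparation": None},
        {"training": "controlnet/controlnet_v1-5.yaml"},
    ],
    "stages": ["data_preparation", "training"],
}

print(training_stage_problems(example_config))
```

Returning a list of messages rather than raising on the first mismatch lets both steps be reported in a single pass.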

Remarks:

ControlNet copies the encoder and middle blocks from Stable Diffusion and fine-tunes the copy. A pretrained Stable Diffusion checkpoint must therefore be provided in the config file for both control_stage_config.from_pretrained and unet_config.from_pretrained.
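
A quick way to catch a forgotten checkpoint path is to look for both fields in the parsed training configuration. The sketch below is illustrative only: it assumes the training config has been loaded into a plain dictionary, it searches for the two sections by name rather than assuming where they are nested, and the helper names, the `model` parent key, and the checkpoint path in the example are all hypothetical placeholders.

```python
def find_section(node, name):
    """Depth-first search for the first mapping stored under key `name`."""
    if isinstance(node, dict):
        if isinstance(node.get(name), dict):
            return node[name]
        for value in node.values():
            found = find_section(value, name)
            if found is not None:
                return found
    elif isinstance(node, list):
        for item in node:
            found = find_section(item, name)
            if found is not None:
                return found
    return None


def missing_pretrained(cfg):
    """List the from_pretrained fields that are absent or empty."""
    missing = []
    for section in ("control_stage_config", "unet_config"):
        block = find_section(cfg, section) or {}
        if not block.get("from_pretrained"):
            missing.append(f"{section}.from_pretrained")
    return missing


# Hypothetical parsed config: the "model" parent key and the path are assumptions.
example_training_cfg = {
    "model": {
        "control_stage_config": {"from_pretrained": "/path/to/stable_diffusion.ckpt"},
        "unet_config": {"from_pretrained": None},  # left unset on purpose
    }
}

print(missing_pretrained(example_training_cfg))
```

In this example only the unet_config field is reported, since the control_stage_config field already points at a checkpoint.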