Important
NeMo 2.0 is an experimental feature and currently released in the dev container only: nvcr.io/nvidia/nemo:dev. Please refer to NeMo 2.0 overview for information on getting started.
Framework Inference
For InstructPix2Pix models, the inference script takes an original image and an edit prompt, applies the requested edit to the image, and saves the edited image as a new file.
To enable the inference stage with an InstructPix2Pix model, update the configuration files as follows:
1. In the `defaults` section of `conf/config.yaml`, update the `fw_inference` field to point to the desired InstructPix2Pix configuration file. For example, to use the `instruct_pix2pix/edit_cli` configuration, change the `fw_inference` field to `instruct_pix2pix/edit_cli`:

   ```yaml
   defaults:
     - fw_inference: instruct_pix2pix/edit_cli
     ...
   ```
2. In the `stages` field of `conf/config.yaml`, make sure the `fw_inference` stage is included. For example:

   ```yaml
   stages:
     - fw_inference
     ...
   ```
3. Configure the `edit` section in `conf/fw_inference/instruct_pix2pix/edit_cli.yaml`. Most importantly, set the `input` field to the path of the original image for inference, and provide an edit prompt in the `prompt` field. The script will generate `num_images_per_prompt` images at once based on the provided prompt:

   ```yaml
   edit:
     resolution: 512
     steps: 100
     input: ??? # path/to/input/picture
     outpath: ${fw_inference.run.results_dir}
     prompt: ""
     cfg_text: 7.5
     cfg_image: 1.2
     num_images_per_prompt: 8
     combine_images: [2, 4] # [row, column], set to null if you don't want to combine
     seed: 1234
   ```
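The `combine_images: [2, 4]` option tiles the generated images into a single rows-by-columns grid image. A minimal NumPy sketch of that tiling (illustrative only, not the launcher's actual implementation):

```python
import numpy as np

def combine_images(images, rows, cols):
    """Tile same-sized (H, W, C) images into one (rows*H, cols*W, C) grid."""
    assert len(images) == rows * cols, "need exactly rows * cols images"
    # Concatenate each row of images horizontally, then stack the rows vertically.
    row_strips = [np.concatenate(images[r * cols:(r + 1) * cols], axis=1)
                  for r in range(rows)]
    return np.concatenate(row_strips, axis=0)

# Eight 512x512 RGB images combined with [2, 4] yield one 1024x2048 image.
imgs = [np.zeros((512, 512, 3), dtype=np.uint8) for _ in range(8)]
grid = combine_images(imgs, rows=2, cols=4)
# grid.shape == (1024, 2048, 3)
```

With `combine_images: null`, each of the `num_images_per_prompt` outputs would instead be saved individually.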
4. Execute the launcher pipeline: `python3 main.py`.
Remarks:
- To load a pretrained checkpoint for inference, set the `restore_from_path` field in the `model` section of `conf/fw_inference/instruct_pix2pix/edit_cli.yaml` to the path of the pretrained checkpoint in `.nemo` format. By default, this field links to the `.nemo` format checkpoint located in the fine-tuning checkpoints folder.
- We highly recommend using the same precision (i.e., `trainer.precision`) for inference as was used during training.
- Tips for getting better quality results: https://github.com/timothybrooks/instruct-pix2pix#tips
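For reference, the `cfg_text` and `cfg_image` scales in the configuration above correspond to the dual classifier-free guidance described in the InstructPix2Pix paper: the denoising estimate combines an unconditional score, an image-conditioned score, and a fully conditioned score. A minimal numeric sketch of that combination (function and variable names are illustrative, not the NeMo API):

```python
import numpy as np

def combine_guidance(e_uncond, e_img, e_full, cfg_image=1.2, cfg_text=7.5):
    """Dual classifier-free guidance as in the InstructPix2Pix paper.

    e_uncond: score with neither image nor text conditioning
    e_img:    score conditioned on the input image only
    e_full:   score conditioned on both the image and the edit prompt
    """
    return (e_uncond
            + cfg_image * (e_img - e_uncond)   # pull toward the input image
            + cfg_text * (e_full - e_img))     # pull toward the edit prompt

# With both scales set to 1.0, the result reduces to the fully conditioned score.
e_uncond = np.zeros(4)
e_img = np.ones(4)
e_full = np.full(4, 2.0)
out = combine_guidance(e_uncond, e_img, e_full, cfg_image=1.0, cfg_text=1.0)
```

Raising `cfg_text` makes the output follow the edit prompt more aggressively, while raising `cfg_image` keeps the output closer to the original picture.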