Important
You are viewing the NeMo 2.0 documentation. This release introduces significant changes to the API and a new library, NeMo Run. We are currently porting all features from NeMo 1.0 to 2.0. For documentation on previous versions or features not yet available in 2.0, please refer to the NeMo 24.07 documentation.
Framework Inference
For InstructPix2Pix models, our inference script processes an original image based on a provided edit prompt, modifies the image accordingly, and saves the edited image as a new file.
To enable the inference stage with an InstructPix2Pix model, update the configuration files as follows:
In the defaults section of conf/config.yaml, update the fw_inference field to point to the desired InstructPix2Pix configuration file. For example, if you want to use the instruct_pix2pix/edit_cli configuration, change the fw_inference field to instruct_pix2pix/edit_cli:

    defaults:
      - fw_inference: instruct_pix2pix/edit_cli
      ...
In the stages field of conf/config.yaml, make sure the fw_inference stage is included. For example:

    stages:
      - fw_inference
      ...
Configure the edit section in conf/fw_inference/instruct_pix2pix/edit_cli.yaml. Most importantly, set the input field to the path of the original image for inference, and provide an edit prompt in the prompt field. The script generates num_images_per_prompt images at once for the provided prompt.

    edit:
      resolution: 512
      steps: 100
      input: ??? # path/to/input/picture
      outpath: ${fw_inference.run.results_dir}
      prompt: ""
      cfg_text: 7.5
      cfg_image: 1.2
      num_images_per_prompt: 8
      combine_images: [2, 4] # [row, column], set to null if you don't want to combine
      seed: 1234
Execute the launcher pipeline:

    python3 main.py
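Before launching, it can help to sanity-check the values in the edit section. The following is a minimal sketch, not part of the launcher: check_edit_config is a hypothetical helper, the section is represented as a plain Python dict, and the rule that the combine_images grid must hold exactly num_images_per_prompt tiles is an assumption inferred from the default values (2 x 4 = 8).

```python
from math import prod

# "???" is the marker OmegaConf uses for a mandatory value that is still unset.
MISSING = "???"


def check_edit_config(edit):
    """Return a list of problems found in an `edit` section (hypothetical helper)."""
    problems = []

    # `input` ships as ??? and has to be replaced with a real image path.
    if edit.get("input") in (None, "", MISSING):
        problems.append("input: set the path of the original image")

    # An empty prompt gives the model no edit instruction to follow.
    if not str(edit.get("prompt", "")).strip():
        problems.append("prompt: provide an edit instruction")

    # Assumption: a [row, column] grid should hold exactly
    # num_images_per_prompt tiles; null (None) skips combining.
    grid = edit.get("combine_images")
    if grid is not None:
        count = edit.get("num_images_per_prompt", 1)
        if len(grid) != 2 or prod(grid) != count:
            problems.append(
                f"combine_images: {grid} does not hold {count} images"
            )

    return problems


# The shipped defaults from the edit section, copied into a dict.
example_edit = {
    "resolution": 512,
    "steps": 100,
    "input": "???",
    "prompt": "",
    "cfg_text": 7.5,
    "cfg_image": 1.2,
    "num_images_per_prompt": 8,
    "combine_images": [2, 4],
    "seed": 1234,
}

for problem in check_edit_config(example_edit):
    print(problem)
```

With the shipped defaults, the helper reports the unset input and the empty prompt, which are the two fields the steps above ask you to fill in.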
Remarks:
To load a pretrained checkpoint for inference, set the restore_from_path field in the model section of conf/fw_inference/vit/imagenet1k.yaml to the path of the pretrained checkpoint in .nemo format. By default, this field links to the .nemo format checkpoint located in the ImageNet 1K fine-tuning checkpoints folder.
We highly recommend using the same precision (i.e., trainer.precision) for inference as was used during training.
Tips for getting better quality results: timothybrooks/instruct-pix2pix
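When tuning for quality, the cfg_text and cfg_image fields are the natural knobs. They appear to correspond to the text and image classifier-free guidance scales from the InstructPix2Pix paper; that mapping is an assumption here, since this page does not define them. Under that assumption, the sketch below shows how the two scales blend three noise predictions. combine_guidance is a hypothetical function operating on plain lists of floats, not the NeMo implementation.

```python
def combine_guidance(eps_uncond, eps_image, eps_full, cfg_text=7.5, cfg_image=1.2):
    """Blend three noise predictions with separate text and image scales.

    eps_uncond: prediction with neither the input image nor the prompt
    eps_image:  prediction conditioned on the input image only
    eps_full:   prediction conditioned on the input image and the prompt
    """
    return [
        # Start from the unconditional prediction, move toward the
        # image-conditioned one by cfg_image, then toward the fully
        # conditioned one by cfg_text.
        u + cfg_image * (i - u) + cfg_text * (f - i)
        for u, i, f in zip(eps_uncond, eps_image, eps_full)
    ]


# Toy one-element "predictions" to show the effect of each scale.
uncond, image_only, full = [0.0], [1.0], [2.0]

# With both scales at 1 the result is the fully conditioned prediction.
print(combine_guidance(uncond, image_only, full, cfg_text=1.0, cfg_image=1.0))

# The defaults from the edit section weight the text direction far more heavily.
print(combine_guidance(uncond, image_only, full))
```

In this formulation, a larger cfg_text pushes the result further along the direction added by the prompt, while a larger cfg_image pushes it further along the direction added by the input image, so the two values trade off how strongly the edit is applied against how closely the output follows the original picture.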