Performance

Training Performance Results

We measured the end-to-end training time of the DreamFusion models on RTX A6000 and H100 GPUs, using the following parameters:

  • Automatic Mixed Precision (AMP) for FP16 computation (a minimal sketch of this setting follows the list).

  • The DreamFusion model was trained for 10,000 iterations: 2,000 on the latent space and 8,000 on the RGB space.

  • DreamFusion-DMTet was fine-tuned for 5,000 iterations.
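For reference, the sketch below shows what an FP16 training step with PyTorch Automatic Mixed Precision looks like. This is a generic illustration of the AMP setting used for these measurements, not NeMo's actual training loop; the model, loss, and data are placeholders.

```python
# Minimal AMP (FP16) training-step sketch with PyTorch. Placeholder model,
# loss, and data; illustrates the AMP setting only, not the NeMo code path.
import torch

model = torch.nn.Linear(64, 64).cuda()           # placeholder model
optimizer = torch.optim.Adam(model.parameters())
scaler = torch.cuda.amp.GradScaler()             # scales losses to avoid FP16 underflow

def train_step(batch):
    optimizer.zero_grad(set_to_none=True)
    # Forward pass runs in FP16 where safe, FP32 elsewhere.
    with torch.autocast(device_type="cuda", dtype=torch.float16):
        loss = model(batch).square().mean()      # placeholder loss
    scaler.scale(loss).backward()
    scaler.step(optimizer)
    scaler.update()
    return loss.item()

for step in range(10_000):                       # 2,000 latent + 8,000 RGB iterations
    train_step(torch.randn(1, 64, device="cuda"))
```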

Note that the code provides multiple backends for NeRF, Stable Diffusion, and the renderer that are not covered in this table.

Model              GPU Model  Num GPUs  Batch Size Per GPU  NeRF Backend  Rendering Backend  Stable Diffusion Backend  Train Time [sec]
DreamFusion        H100       1         1                   TorchNGP      TorchNGP           NeMo                      1327 (*)
DreamFusion        RTX A6000  1         1                   TorchNGP      TorchNGP           NeMo                      990
DreamFusion-DMTet  H100       1         1                   TorchNGP      TorchNGP           NeMo                      699 (*)
DreamFusion-DMTet  RTX A6000  1         1                   TorchNGP      TorchNGP           NeMo                      503

Note

There is a performance bug in the UNet attention layers that degrades H100 performance; the affected train times are marked with (*) in the table above. This issue will be resolved in an upcoming release.
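For completeness, here is a minimal sketch of how an end-to-end train time such as the values in the table above could be measured: wall-clock seconds around the full run, with a CUDA synchronize so GPU work still in flight is counted. The run_training function is a placeholder, not a NeMo API.

```python
# Wall-clock timing sketch for an end-to-end training run.
import time
import torch

def run_training():
    pass  # placeholder for the full training run (e.g., 10,000 iterations)

if torch.cuda.is_available():
    torch.cuda.synchronize()  # drain queued GPU work before starting the clock
start = time.perf_counter()
run_training()
if torch.cuda.is_available():
    torch.cuda.synchronize()  # include GPU work still in flight
print(f"Train time: {time.perf_counter() - start:.0f} sec")
```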