We measured the end-to-end training time of DreamFusion models on RTX A6000 and H100 GPUs, using the following configuration:
  * Automatic Mixed Precision (AMP) for FP16 computation (see the sketch below).
  * DreamFusion was trained for 10,000 iterations: 2,000 iterations in the latent space and 8,000 iterations in the RGB space.
  * DreamFusion-DMTet was fine-tuned for 5,000 iterations.
Note that the code provides multiple backends for NeRF, Stable Diffusion, and rendering that are not covered in this table.
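For reference, the AMP setting above corresponds to the standard PyTorch FP16 mixed-precision pattern sketched below. This is a minimal illustration with a placeholder model, loss, and iteration count, not the actual DreamFusion training loop:

```python
import torch

# Placeholder stand-ins for the real DreamFusion model and optimizer;
# only the AMP pattern itself is the point of this sketch.
model = torch.nn.Linear(64, 64).cuda()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
scaler = torch.cuda.amp.GradScaler()

for step in range(100):  # stand-in for the 10,000-iteration schedule
    x = torch.randn(8, 64, device="cuda")
    optimizer.zero_grad(set_to_none=True)
    # autocast runs eligible ops in FP16 and keeps precision-sensitive ops in FP32.
    with torch.autocast(device_type="cuda", dtype=torch.float16):
        loss = model(x).square().mean()
    # GradScaler scales the loss so FP16 gradients do not underflow.
    scaler.scale(loss).backward()
    scaler.step(optimizer)
    scaler.update()
```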
Model | GPU Model | Num GPUs | Batch Size per GPU | NeRF Backend | Rendering Backend | Stable Diffusion Backend | Train Time [sec] |
---|---|---|---|---|---|---|---|
DreamFusion | H100 | 1 | 1 | TorchNGP | TorchNGP | NeMo | 1327 (*) |
DreamFusion | RTX A6000 | 1 | 1 | TorchNGP | TorchNGP | NeMo | 990 |
DreamFusion-DMTet | H100 | 1 | 1 | TorchNGP | TorchNGP | NeMo | 699 (*) |
DreamFusion-DMTet | RTX A6000 | 1 | 1 | TorchNGP | TorchNGP | NeMo | 503 |
Note
(*) A performance bug in the UNet attention layers is currently degrading H100 performance, which is why the H100 times above exceed the RTX A6000 times. This issue will be fixed in an upcoming release.
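The train times above are end-to-end wall-clock measurements. One minimal way to take such a measurement, assuming a hypothetical `train()` entry point that runs the full schedule, is sketched below; the `torch.cuda.synchronize()` calls ensure queued GPU kernels are included in the measured interval:

```python
import time
import torch

def train():
    # Hypothetical placeholder for the full training run
    # (e.g., the 10,000 DreamFusion iterations benchmarked above).
    pass

torch.cuda.synchronize()   # make sure no prior GPU work is pending
start = time.perf_counter()
train()
torch.cuda.synchronize()   # wait for all queued GPU kernels to finish
print(f"Train time: {time.perf_counter() - start:.0f} sec")
```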