Automatic mixed precision

Deep neural network training has traditionally relied on the IEEE single-precision format (32 bits). With mixed precision, however, you can train with half precision while maintaining the network accuracy achieved with single precision. This technique of using both single- and half-precision representations is referred to as mixed precision.

Training with automatic mixed precision can reduce the memory required to train models and can speed up training on some models by up to 3x.

For additional details, see: https://developer.nvidia.com/automatic-mixed-precision

Note

GPUs with Tensor Cores are needed to get the full benefit from AMP. Therefore, Volta or Turing GPUs are recommended for AMP-enabled training tasks.

Enabling Automatic Mixed Precision in Clara

The underlying framework and container handle everything needed, so the only setting required is the use_amp variable in config.json:

{
    "epochs": 1240,
    "num_training_epoch_per_valid": 20,
    "learning_rate": 1e-4,
    "use_amp": true,
    ...
}

When use_amp is not present in config.json, the environment variable TF_ENABLE_AUTO_MIXED_PRECISION determines whether AMP is enabled: setting it to 1 enables AMP, and setting it to 0 disables AMP.

When exporting checkpoint files to frozen graphs, neither the config_train nor the config_validation JSON file is used. In this case, you can control AMP with the TF_ENABLE_AUTO_MIXED_PRECISION environment variable as described above.
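As a minimal sketch, the environment variable described above can be set in the shell before launching a training or export command inside the container (the variable name comes from the text above; no Clara-specific command is assumed here):

```shell
# Enable AMP for runs where "use_amp" is absent from config.json,
# and for checkpoint-to-frozen-graph export, which reads no config file.
export TF_ENABLE_AUTO_MIXED_PRECISION=1

# To explicitly disable AMP instead, set the variable to 0:
# export TF_ENABLE_AUTO_MIXED_PRECISION=0

# Confirm the setting that child processes (training/export) will inherit.
echo "TF_ENABLE_AUTO_MIXED_PRECISION=$TF_ENABLE_AUTO_MIXED_PRECISION"
```

Because the variable is exported, it is inherited by any training or export process started from the same shell session.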

The following use cases are not supported with AMP:

  • Saving checkpoint files with AMP enabled, then fine-tuning them with AMP disabled.

  • Saving checkpoint files with AMP disabled, then fine-tuning them with AMP enabled.

  • Saving checkpoint files in single-GPU training with AMP enabled, then fine-tuning them with multi-GPU training.