# Automatic Mixed Precision

Deep neural network training has traditionally relied on the IEEE single-precision (32-bit) floating-point format. With mixed precision, however, you can train with half precision while maintaining the accuracy achieved with single precision. This technique of using both single- and half-precision representations is referred to as mixed precision.

Training with Automatic Mixed Precision (AMP) can reduce the memory required to train a model and can speed up training by as much as 3x on some models.

Note

GPUs with Tensor Cores are required to get the full benefit of AMP; Volta or Turing GPUs are therefore recommended for AMP-enabled training.

## Enabling Automatic Mixed Precision in Clara

To enable AMP in Clara, use one of the following options:

1. Set the `use_amp` variable in `config.json`:


```json
{
    "epochs": 1240,
    "num_training_epoch_per_valid": 20,
    "learning_rate": 1e-4,
    "use_amp": true,
    ...
}
```


2. Set the `TF_ENABLE_AUTO_MIXED_PRECISION` environment variable:


```shell
export TF_ENABLE_AUTO_MIXED_PRECISION=1
```


Note

When `use_amp` is not set in `config.json`, the `TF_ENABLE_AUTO_MIXED_PRECISION` environment variable determines whether AMP is enabled: a value of `1` enables AMP, and a value of `0` disables it.
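The precedence described above can be sketched as a small helper. This is an illustrative sketch only, not Clara's actual implementation; the function name `amp_enabled` is hypothetical:

```python
import os

def amp_enabled(config: dict) -> bool:
    """Resolve whether AMP is on, mirroring the precedence described above:
    a use_amp key in config.json takes precedence; otherwise fall back to
    the TF_ENABLE_AUTO_MIXED_PRECISION environment variable ("1" enables).
    Hypothetical helper for illustration, not Clara's actual code."""
    if "use_amp" in config:
        return bool(config["use_amp"])
    return os.environ.get("TF_ENABLE_AUTO_MIXED_PRECISION", "0") == "1"

# The config value, when present, wins over the environment variable.
os.environ["TF_ENABLE_AUTO_MIXED_PRECISION"] = "1"
print(amp_enabled({"use_amp": False}))  # False: config takes precedence
print(amp_enabled({}))                  # True: env var fallback applies
```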

Note

When exporting checkpoint files to frozen graphs, you can enable or disable AMP using `TF_ENABLE_AUTO_MIXED_PRECISION`.
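One way to apply the variable to the export step alone is to scope it to a single command rather than exporting it for the whole shell session. The `env | grep` below is a stand-in that merely confirms the variable is visible to that one command; substitute your actual Clara export command:

```shell
# Scope the variable to a single command; the rest of the shell session
# is unaffected. Replace "env | grep ..." with your export command.
TF_ENABLE_AUTO_MIXED_PRECISION=1 env | grep TF_ENABLE_AUTO_MIXED_PRECISION
```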

Attention

The following use cases are not supported with AMP:

- Saving checkpoint files with AMP enabled, then fine-tuning them with AMP disabled.

- Saving checkpoint files with AMP disabled, then fine-tuning them with AMP enabled.

- Saving checkpoint files in single-GPU training with AMP enabled, then fine-tuning them with multiple GPUs.