Automatic Mixed Precision
TAO now supports Automatic Mixed Precision (AMP) training. Deep neural network (DNN) training has traditionally stored its tensors in the IEEE single-precision (FP32) format. With mixed-precision training, you can instead use a mixture of FP16 and FP32 operations in the training graph to speed up training without compromising accuracy. AMP offers several benefits:
Speed up math-intensive operations such as linear and convolution layers
Speed up memory-limited operations by accessing half as many bytes as single precision
Reduce memory requirements for training models, enabling larger models or larger minibatches
In TAO, enable AMP by setting train.use_amp: true in your experiment specification. The agent surfaces this flag and dispatches training with FP16 Tensor Cores enabled. AMP is supported only on GPUs with the Volta architecture or newer.
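As a minimal sketch, the flag might appear in the experiment specification as shown here. This assumes the specification is a YAML file and that the dotted path train.use_amp maps to a use_amp key nested under a train section. Any other keys your model needs in that section are omitted.

```yaml
# Assumed layout: the dotted path train.use_amp is written as a nested key.
train:
  # Enable Automatic Mixed Precision (FP16 and FP32 operations mixed).
  # Requires a GPU with the Volta architecture or newer.
  use_amp: true
```

Setting use_amp back to false should return training to full single precision, which is useful when comparing accuracy or speed between the two modes.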