core.post_training.modelopt#
Integrations with NVIDIA Model Optimizer (referred to as ModelOpt).
ModelOpt is a library of state-of-the-art model optimization techniques, including quantization and sparsity, that compress models for efficient inference on NVIDIA GPUs. ModelOpt is integrated with Megatron-Core so that users can optimize their Megatron-Core models for inference with minimal friction. More details on ModelOpt, including installation and usage, can be found at https://github.com/NVIDIA/Model-Optimizer.
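As a rough illustration of the workflow, the sketch below applies ModelOpt's post-training quantization API (`modelopt.torch.quantization`) to a PyTorch module. The calibration loop, the choice of `FP8_DEFAULT_CFG`, and the dummy inputs are assumptions for illustration only; in practice they would be replaced with a real Megatron-Core model, its actual forward signature, and a real calibration dataset, and exact names may vary across ModelOpt versions.

```python
# Minimal sketch of post-training quantization with ModelOpt
# (the nvidia-modelopt package). Illustrative only.
import torch
import modelopt.torch.quantization as mtq


def calibrate(model: torch.nn.Module) -> None:
    """Run a few forward passes so ModelOpt can collect activation statistics."""
    model.eval()
    with torch.no_grad():
        for _ in range(8):
            # Hypothetical token IDs; substitute batches from a real calibration set.
            dummy_input = torch.randint(0, 1000, (1, 128))
            model(dummy_input)


def quantize_for_inference(model: torch.nn.Module) -> torch.nn.Module:
    # mtq.quantize inserts quantizers into the model and calibrates them via the
    # supplied forward loop; FP8_DEFAULT_CFG targets FP8 inference on recent
    # NVIDIA GPUs (other configs cover INT8, INT4 AWQ, and more).
    return mtq.quantize(model, mtq.FP8_DEFAULT_CFG, forward_loop=calibrate)
```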