Installation#

System Requirements#

Hardware#

  • Recommended: NVIDIA Turing architecture or later

  • FP8 Support: Requires NVIDIA Hopper, Ada, or Blackwell GPUs

Software#

  • Python: >= 3.10 (3.12 recommended)

  • PyTorch: >= 2.6.0

  • CUDA Toolkit: Latest stable version

Prerequisites#

Install uv, a fast Python package installer:

curl -LsSf https://astral.sh/uv/install.sh | sh
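After the installer finishes (you may need to restart your shell so the PATH update takes effect), you can confirm uv is available. This guarded check is a sketch assuming only a POSIX shell:

```shell
# Print the uv version if installed, otherwise report that it is missing
command -v uv >/dev/null 2>&1 && uv --version || echo "uv not found on PATH"
```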

Option B: Install from Source#

For development or to run the latest unreleased code:

git clone https://github.com/NVIDIA/Megatron-LM.git
cd Megatron-LM
uv pip install -e .
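To verify that the editable install took effect, you can check whether Python resolves the package. The `megatron` module name comes from the repository above; the check itself is a generic sketch:

```shell
# Report whether the megatron package is importable in the current environment
python3 -c "import importlib.util; spec = importlib.util.find_spec('megatron'); print('megatron importable' if spec else 'megatron not importable')"
```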

To install Megatron Core with all development dependencies (including Transformer Engine), first install the build dependencies, then perform the editable install without build isolation:

uv pip install --group build
uv pip install --no-build-isolation -e ".[training,dev]"

Tip

If the build runs out of memory, limit parallel compilation jobs by setting MAX_JOBS:

MAX_JOBS=4 uv pip install --no-build-isolation -e ".[training,dev]"

Option C: NGC Container#

For a pre-configured environment with all dependencies already installed (PyTorch, CUDA, cuDNN, NCCL, Transformer Engine), use the PyTorch NGC Container.

We recommend using the previous month’s NGC container rather than the latest one to ensure compatibility with the current Megatron Core release and testing matrix.

docker run --gpus all -it --rm \
  -v /path/to/dataset:/workspace/dataset \
  -v /path/to/checkpoints:/workspace/checkpoints \
  -e PIP_CONSTRAINT= \
  nvcr.io/nvidia/pytorch:26.01-py3

Note

The NGC PyTorch container constrains the Python environment globally via PIP_CONSTRAINT. The -e PIP_CONSTRAINT= flag above unsets this so that Megatron Core and its dependencies install correctly.
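If you started the container without the -e PIP_CONSTRAINT= flag, you can still clear the constraint from a shell inside the running container. A hedged sketch:

```shell
# Show whether PIP_CONSTRAINT is set, and unset it so package installs are not constrained
if [ -n "${PIP_CONSTRAINT:-}" ]; then
  echo "PIP_CONSTRAINT was set to: $PIP_CONSTRAINT"
  unset PIP_CONSTRAINT
else
  echo "PIP_CONSTRAINT is unset or empty"
fi
```

Note that `unset` only affects the current shell session; new `docker exec` shells will see the container's original environment again.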

Then install Megatron Core inside the container (torch is already available in the NGC image):

pip install uv
uv pip install --no-build-isolation "megatron-core[training,dev]"

You are now ready to run training. See Your First Training Run for next steps.