User Guide

Comprehensive guides for using Megatron Core and Megatron-LM.

- Quick Start
  - Set Up Your Environment
  - Write Your First Training Loop
  - Review Advanced Examples
- Multi-Storage Client (MSC) Integration
  - Installation
  - Configuration File
  - MSC URL Format
  - Train from Object Storage
  - Save and Load Checkpoints from Object Storage
  - Disable MSC
  - Performance Considerations
  - Additional Resources and Advanced Configuration
- Data Preparation
  - Data Format
  - Preprocessing Data
  - Output Files
  - Using Preprocessed Data
  - Common Tokenizers
- Training Examples
  - Simple Training Example
  - LLaMA-3 Training Examples
  - GPT-3 Training Example
  - Key Training Arguments
  - Next Steps
- Parallelism Strategies Guide
  - Overview
  - Data Parallelism (DP)
  - Tensor Parallelism (TP)
  - Pipeline Parallelism (PP)
  - Context Parallelism (CP)
  - Expert Parallelism (EP)
  - Parallelism Selection Guide
  - Combining Strategies
  - Performance Optimizations
  - Choosing the Right Strategy
  - Next Steps
- Advanced Features
  - Mixture of Experts
  - Megatron Core MoE User Guide
  - Performance Best Practice
  - context_parallel package
  - Megatron FSDP
  - Distributed Optimizer
  - Optimizer CPU Offload
  - Custom Pipeline Model Parallel Layout
  - Tokenizers
  - Megatron Energon
  - Megatron RL