nemo_rl.utils

Submodules

  • nemo_rl.utils.config
  • nemo_rl.utils.checkpoint
  • nemo_rl.utils.native_checkpoint
  • nemo_rl.utils.nvml
  • nemo_rl.utils.venvs
  • nemo_rl.utils.logger
  • nemo_rl.utils.timer

Copyright © 2025, NVIDIA Corporation.