`nemo_rl.models.megatron.community_import`

Module Contents

Functions

`to_torch_dtype`
`_prefer_nvrx_for_dist_ckpt_save`	Prefer NVRx async strategy for torch_dist save in HF->Megatron import.
`import_model_from_hf_name`	Import a Hugging Face model into Megatron checkpoint format and save the Megatron checkpoint to the output path.
`export_model_from_megatron`

API

nemo_rl.models.megatron.community_import.to_torch_dtype(dtype: str | torch.dtype) → torch.dtype

nemo_rl.models.megatron.community_import._prefer_nvrx_for_dist_ckpt_save()

Prefer NVRx async strategy for torch_dist save in HF->Megatron import.

Megatron-LM's torch_dist sync save currently routes through the MCore async finalize path. That path can fail during gather_object when the write results contain non-picklable objects (e.g., code objects).

nemo_rl.models.megatron.community_import.import_model_from_hf_name(hf_model_name: str, output_path: str, megatron_config: Optional[nemo_rl.models.policy.MegatronConfig] = None, model_post_wrap_hook: Optional[Callable] = None, transformer_layer_spec: Optional[megatron.core.transformer.ModuleSpec | Callable] = None, mamba_stack_spec: Optional[megatron.core.transformer.ModuleSpec | Callable] = None, **config_overrides: Any)

Import a Hugging Face model into Megatron checkpoint format and save the Megatron checkpoint to the output path.

Parameters:

hf_model_name – Hugging Face model ID or local path (e.g., `meta-llama/Llama-3.1-8B-Instruct`).
output_path – Directory to write the Megatron checkpoint (e.g., /tmp/megatron_ckpt).
megatron_config – Optional Megatron config whose parallelism settings are used when the model is imported in a distributed setup.
model_post_wrap_hook – Optional callable invoked on each Megatron model chunk after it is built (and before DDP wrapping). Forwarded to provide_distributed_model(post_wrap_hook=...).
transformer_layer_spec – Optional Megatron ModuleSpec (or callable returning one) overriding the default layer spec selected by the model provider.
mamba_stack_spec – Optional Megatron ModuleSpec (or callable returning one) overriding the default Mamba stack spec selected by Mamba model providers.
**config_overrides – Extra keyword arguments forwarded to AutoBridge.from_hf_pretrained.
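A minimal usage sketch of the parameters above, assuming `nemo_rl` and its Megatron dependencies are installed in the target environment. The helpers `build_import_kwargs` and `run_import` are hypothetical and exist only for illustration; the model ID and output directory are the examples given in the parameter descriptions. The library import is deferred so the argument-building part can be read and run on its own.

```python
from typing import Any, Optional


def build_import_kwargs(
    hf_model_name: str,
    output_path: str,
    megatron_config: Optional[Any] = None,
    **config_overrides: Any,
) -> dict[str, Any]:
    """Hypothetical helper: collect arguments for import_model_from_hf_name."""
    kwargs: dict[str, Any] = {
        "hf_model_name": hf_model_name,
        "output_path": output_path,
    }
    # Only pass a Megatron config when one is supplied; the documented default is None.
    if megatron_config is not None:
        kwargs["megatron_config"] = megatron_config
    # Extra keyword arguments are documented as forwarded to AutoBridge.from_hf_pretrained.
    kwargs.update(config_overrides)
    return kwargs


def run_import(kwargs: dict[str, Any]) -> None:
    """Hypothetical wrapper: requires nemo_rl and enough resources to load the model."""
    from nemo_rl.models.megatron.community_import import import_model_from_hf_name

    import_model_from_hf_name(**kwargs)


import_kwargs = build_import_kwargs(
    "meta-llama/Llama-3.1-8B-Instruct",
    "/tmp/megatron_ckpt",
)
# run_import(import_kwargs)  # uncomment where nemo_rl is installed
```

Leaving `megatron_config`, the hook, and the spec arguments unset keeps their documented defaults of None.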

nemo_rl.models.megatron.community_import.export_model_from_megatron(hf_model_name: str, input_path: str, output_path: str, hf_tokenizer_path: str, overwrite: bool = False, hf_overrides: Optional[dict[str, Any]] = {}, strict: bool = True)
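This page does not describe the export parameters, so the following is only a sketch of how a call might be assembled; the meaning of each argument is inferred from its name, and the paths and the helper `build_export_kwargs` are hypothetical. The signature's default for `hf_overrides` is a dict literal, which Python evaluates once at function definition time, so the sketch always passes a fresh dict rather than relying on that shared default. Whether the function ever mutates the argument is not stated here.

```python
from typing import Any, Optional


def build_export_kwargs(
    hf_model_name: str,
    input_path: str,
    output_path: str,
    hf_tokenizer_path: str,
    hf_overrides: Optional[dict[str, Any]] = None,
) -> dict[str, Any]:
    """Hypothetical helper: collect arguments for export_model_from_megatron."""
    return {
        "hf_model_name": hf_model_name,
        "input_path": input_path,          # assumed: Megatron checkpoint directory
        "output_path": output_path,        # assumed: destination for the exported model
        "hf_tokenizer_path": hf_tokenizer_path,
        "overwrite": False,                # documented default
        "hf_overrides": dict(hf_overrides or {}),  # fresh dict on every call
        "strict": True,                    # documented default
    }


export_kwargs = build_export_kwargs(
    "meta-llama/Llama-3.1-8B-Instruct",
    "/tmp/megatron_ckpt",
    "/tmp/hf_export",
    "meta-llama/Llama-3.1-8B-Instruct",
)

# In an environment with nemo_rl installed:
# from nemo_rl.models.megatron.community_import import export_model_from_megatron
# export_model_from_megatron(**export_kwargs)
```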
