nemo_rl.models.megatron.community_import

Module Contents

Functions

to_torch_dtype

import_model_from_hf_name

Import a Hugging Face model into Megatron checkpoint format and save the Megatron checkpoint to the output path.

export_model_from_megatron

API

nemo_rl.models.megatron.community_import.to_torch_dtype(dtype: str | torch.dtype) → torch.dtype
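
The page gives only the signature of to_torch_dtype, so the strings it accepts are not documented here. The sketch below is a hypothetical stand-in, not the library's implementation: it shows one way a `str | torch.dtype` argument could be normalized. The alias table and the helper names `canonical_dtype_name` and `resolve_dtype` are assumptions made for illustration.

```python
from __future__ import annotations

# Assumed shorthand spellings; the real function may accept a different set.
_ALIASES = {
    "bf16": "bfloat16",
    "fp16": "float16",
    "half": "float16",
    "fp32": "float32",
    "float": "float32",
}


def canonical_dtype_name(dtype: object) -> str:
    """Return a lowercase attribute name such as 'bfloat16'."""
    # str(torch.bfloat16) is "torch.bfloat16", so a dtype object and its
    # string form go through the same path.
    name = str(dtype).strip().lower()
    if name.startswith("torch."):
        name = name[len("torch."):]
    return _ALIASES.get(name, name)


def resolve_dtype(dtype: object):
    """Hypothetical equivalent of to_torch_dtype; requires PyTorch."""
    import torch  # imported lazily so the name helper works without PyTorch

    if isinstance(dtype, torch.dtype):
        return dtype
    resolved = getattr(torch, canonical_dtype_name(dtype), None)
    if not isinstance(resolved, torch.dtype):
        raise ValueError(f"Unsupported dtype: {dtype!r}")
    return resolved
```

With PyTorch installed, `resolve_dtype("bf16")` would look up `torch.bfloat16`. Check the source of to_torch_dtype before relying on any particular alias.
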
nemo_rl.models.megatron.community_import.import_model_from_hf_name(
    hf_model_name: str,
    output_path: str,
    megatron_config: Optional[nemo_rl.models.policy.MegatronConfig] = None,
    model_post_wrap_hook: Optional[Callable] = None,
    transformer_layer_spec: Optional[megatron.core.transformer.ModuleSpec | Callable] = None,
    **config_overrides: Any,
)

Import a Hugging Face model into Megatron checkpoint format and save the Megatron checkpoint to the output path.

Parameters:
  • hf_model_name – Hugging Face model ID or local path (e.g., 'meta-llama/Llama-3.1-8B-Instruct').

  • output_path – Directory to write the Megatron checkpoint (e.g., /tmp/megatron_ckpt).

  • megatron_config – Optional Megatron config whose parallelism settings are used when the model is imported in a distributed setup.

  • model_post_wrap_hook – Optional callable invoked on each Megatron model chunk after it is built (and before DDP wrapping). Forwarded to provide_distributed_model(post_wrap_hook=...).

  • transformer_layer_spec – Optional Megatron ModuleSpec (or callable returning one) overriding the default layer spec selected by the model provider.

  • **config_overrides – Extra keyword arguments forwarded to AutoBridge.from_hf_pretrained.
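
A minimal usage sketch for the parameters above, assuming NeMo RL and its Megatron dependencies are installed. The wrapper `import_checkpoint`, the helper `checkpoint_dir`, and the one-directory-per-model layout are hypothetical conveniences, not part of the library. Only the call to import_model_from_hf_name and its keyword names come from the signature above.

```python
from __future__ import annotations

from pathlib import PurePosixPath
from typing import Any


def checkpoint_dir(root: str, hf_model_name: str) -> str:
    """Hypothetical layout: one subdirectory per model, with '/' flattened."""
    return str(PurePosixPath(root) / hf_model_name.replace("/", "--"))


def import_checkpoint(
    hf_model_name: str,
    root: str = "/tmp/megatron_ckpt",
    **config_overrides: Any,
) -> str:
    """Convert a Hugging Face model and return the Megatron checkpoint path."""
    # Imported lazily so this module loads on machines without NeMo RL.
    from nemo_rl.models.megatron.community_import import (
        import_model_from_hf_name,
    )

    output_path = checkpoint_dir(root, hf_model_name)
    import_model_from_hf_name(
        hf_model_name=hf_model_name,
        output_path=output_path,
        # megatron_config, model_post_wrap_hook and transformer_layer_spec
        # keep their default of None in this sketch.
        **config_overrides,
    )
    return output_path
```

Calling `import_checkpoint("meta-llama/Llama-3.1-8B-Instruct")` would write to `/tmp/megatron_ckpt/meta-llama--Llama-3.1-8B-Instruct` under this assumed layout. Any extra keyword arguments are passed through as config_overrides.
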

nemo_rl.models.megatron.community_import.export_model_from_megatron(
    hf_model_name: str,
    input_path: str,
    output_path: str,
    hf_tokenizer_path: str,
    overwrite: bool = False,
    hf_overrides: Optional[dict[str, Any]] = {},
    strict: bool = True,
)
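
This page does not describe the parameters of export_model_from_megatron, so the sketch below relies only on their names and defaults, and the meanings suggested in its comments are assumptions. Because the documented default for hf_overrides is a mutable `{}`, the hypothetical helper `build_export_kwargs` always passes a fresh dictionary. Falling back to the model name for hf_tokenizer_path is likewise an illustrative choice, not documented behavior.

```python
from __future__ import annotations

from typing import Any, Optional


def build_export_kwargs(
    hf_model_name: str,
    input_path: str,
    output_path: str,
    hf_tokenizer_path: Optional[str] = None,
    hf_overrides: Optional[dict[str, Any]] = None,
    overwrite: bool = False,
    strict: bool = True,
) -> dict[str, Any]:
    """Assemble keyword arguments matching the signature shown above."""
    return {
        "hf_model_name": hf_model_name,
        "input_path": input_path,    # assumed: existing Megatron checkpoint
        "output_path": output_path,  # assumed: destination for the export
        # Assumption: reuse the model ID when no tokenizer path is given.
        "hf_tokenizer_path": hf_tokenizer_path or hf_model_name,
        "overwrite": overwrite,
        # Copy so the caller's dict and the shared default are never mutated.
        "hf_overrides": dict(hf_overrides or {}),
        "strict": strict,
    }


def export_checkpoint(**kwargs: Any) -> None:
    """Hypothetical wrapper; requires NeMo RL to be installed."""
    from nemo_rl.models.megatron.community_import import (
        export_model_from_megatron,
    )

    export_model_from_megatron(**build_export_kwargs(**kwargs))
```

Keeping argument assembly separate from the call lets you inspect the arguments without loading Megatron.
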