nemo_rl.models.megatron.community_import
Module Contents
Functions
import_model_from_hf_name – Import a Hugging Face model into Megatron checkpoint format and save the Megatron checkpoint to the output path.
API
- nemo_rl.models.megatron.community_import.to_torch_dtype(dtype: str | torch.dtype) → torch.dtype
- nemo_rl.models.megatron.community_import.import_model_from_hf_name(
- hf_model_name: str,
- output_path: str,
- megatron_config: Optional[nemo_rl.models.policy.MegatronConfig] = None,
- model_post_wrap_hook: Optional[Callable] = None,
- transformer_layer_spec: Optional[megatron.core.transformer.ModuleSpec | Callable] = None,
- **config_overrides: Any,
- )
Import a Hugging Face model into Megatron checkpoint format and save the Megatron checkpoint to the output path.
- Parameters:
hf_model_name – Hugging Face model ID or local path (e.g., 'meta-llama/Llama-3.1-8B-Instruct').
output_path – Directory to write the Megatron checkpoint to (e.g., /tmp/megatron_ckpt).
megatron_config – Optional Megatron config with parallelism settings for distributed Megatron model import.
model_post_wrap_hook – Optional callable invoked on each Megatron model chunk after it is built (and before DDP wrapping). Forwarded to provide_distributed_model(post_wrap_hook=...).
transformer_layer_spec – Optional Megatron ModuleSpec (or callable returning one) overriding the default layer spec selected by the model provider.
**config_overrides – Extra keyword arguments forwarded to AutoBridge.from_hf_pretrained.
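A minimal sketch of how the two required parameters might be passed, using only the signature documented above. The wrapper name `convert_hf_to_megatron` and its `importer` argument are hypothetical and exist only for illustration; `importer` allows the call to be exercised without NeMo RL installed. Any distributed or GPU setup the real function may require is not covered here.

```python
from typing import Any, Callable, Optional


def convert_hf_to_megatron(
    hf_model_name: str,
    output_path: str,
    importer: Optional[Callable[..., Any]] = None,
) -> str:
    """Import a Hugging Face model and return the checkpoint directory.

    Hypothetical helper: `convert_hf_to_megatron` and `importer` are not
    part of NeMo RL.
    """
    if importer is None:
        # Lazy import so this module loads even where NeMo RL is absent.
        from nemo_rl.models.megatron.community_import import (
            import_model_from_hf_name as importer,
        )
    # megatron_config, model_post_wrap_hook and transformer_layer_spec are
    # left at their documented default of None.
    importer(hf_model_name=hf_model_name, output_path=output_path)
    return output_path
```

With NeMo RL installed, `convert_hf_to_megatron("meta-llama/Llama-3.1-8B-Instruct", "/tmp/megatron_ckpt")` would delegate to `import_model_from_hf_name` with the example values from the parameter list.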
- nemo_rl.models.megatron.community_import.export_model_from_megatron(
- hf_model_name: str,
- input_path: str,
- output_path: str,
- hf_tokenizer_path: str,
- overwrite: bool = False,
- hf_overrides: Optional[dict[str, Any]] = {},
- strict: bool = True,