nemo_automodel.checkpoint.checkpointing#
Checkpoint management utilities for HF models.
Module Contents#
Classes#
CheckpointingConfig – Configuration for checkpointing.
Functions#
save_model – Save a model state dictionary to a weights path.
load_model – Load a model state dictionary from a weights path.
save_optimizer – Save an optimizer state dictionary to a weights path.
load_optimizer – Load an optimizer state dictionary from a weights path.
_get_safetensors_index_path – Return the directory containing the first model.safetensors.index.json found for a given model.
API#
- class nemo_automodel.checkpoint.checkpointing.CheckpointingConfig[source]#
Configuration for checkpointing.
- enabled: bool#
- checkpoint_dir: str | pathlib.Path#
- model_save_format: nemo_automodel.checkpoint._backports.filesystem.SerializationFormat | str#
- model_cache_dir: str | pathlib.Path#
- model_repo_id: str#
- save_consolidated: bool#
- is_peft: bool#
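The fields above can be pictured with a stand-in dataclass. This is a hypothetical sketch for illustration only: the real class is the CheckpointingConfig documented above, and the default values shown here (cache path, repo id, format string) are assumptions, not the library's defaults.

```python
from __future__ import annotations

from dataclasses import dataclass
from pathlib import Path


# Hypothetical stand-in mirroring the documented fields of
# CheckpointingConfig; defaults are illustrative assumptions.
@dataclass
class CheckpointingConfigSketch:
    enabled: bool = True
    checkpoint_dir: str | Path = "checkpoints"
    model_save_format: str = "safetensors"  # or "torch_save"
    model_cache_dir: str | Path = Path.home() / ".cache" / "huggingface"
    model_repo_id: str = "meta-llama/Llama-3.2-3B"
    save_consolidated: bool = False
    is_peft: bool = False


cfg = CheckpointingConfigSketch()
print(cfg.model_save_format)  # safetensors
```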
- nemo_automodel.checkpoint.checkpointing.save_model(
- model: torch.nn.Module | transformers.PreTrainedModel,
- weights_path: str,
- checkpoint_config: nemo_automodel.checkpoint.checkpointing.CheckpointingConfig,
)[source]#
Save a model state dictionary to a weights path.
This function can save a model in the following formats:
- safetensors (in HF format)
- torch_save (in DCP format)
- Parameters:
model – Model to save
weights_path – Path to save model weights
checkpoint_config – Checkpointing configuration
- nemo_automodel.checkpoint.checkpointing.load_model(
- model: torch.nn.Module | transformers.PreTrainedModel,
- weights_path: str,
- checkpoint_config: nemo_automodel.checkpoint.checkpointing.CheckpointingConfig,
)[source]#
Load a model state dictionary from a weights path.
- Parameters:
model – Model to load state into
weights_path – Path to load model weights from
checkpoint_config – Checkpointing configuration
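A round trip through the save_model / load_model pair can be sketched with plain dicts and pickle standing in for tensor state dicts and the safetensors / DCP backends. The helper names here are hypothetical, not the library's API; the real functions take the model, the weights path, and a CheckpointingConfig.

```python
import pickle
import tempfile
from pathlib import Path


# Hypothetical stand-ins: the real save_model/load_model serialize a
# torch state dict via safetensors or DCP, not pickle on a plain dict.
def save_state(state: dict, weights_path: str) -> None:
    path = Path(weights_path)
    path.mkdir(parents=True, exist_ok=True)
    (path / "model.pkl").write_bytes(pickle.dumps(state))


def load_state(weights_path: str) -> dict:
    return pickle.loads((Path(weights_path) / "model.pkl").read_bytes())


with tempfile.TemporaryDirectory() as d:
    save_state({"layer.weight": [0.1, 0.2]}, d)
    restored = load_state(d)
    print(restored["layer.weight"])  # [0.1, 0.2]
```

Note that the real load_model restores state into an existing module in place rather than returning a fresh dict.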
- nemo_automodel.checkpoint.checkpointing.save_optimizer(
- optimizer: torch.optim.Optimizer,
- model: torch.nn.Module,
- weights_path: str,
- scheduler: Optional[Any] = None,
)[source]#
Save an optimizer state dictionary to a weights path.
- Parameters:
optimizer – Optimizer to save
model – Model to save optimizer state for
weights_path – Path to save optimizer weights
scheduler – Optional scheduler to save
- nemo_automodel.checkpoint.checkpointing.load_optimizer(
- optimizer: torch.optim.Optimizer,
- model: torch.nn.Module,
- weights_path: str,
- scheduler: Optional[Any] = None,
)[source]#
Load an optimizer state dictionary from a weights path.
- Parameters:
optimizer – Optimizer to load state into
model – Model to load optimizer state for
weights_path – Path to load optimizer weights from
scheduler – Optional scheduler to load state into
- nemo_automodel.checkpoint.checkpointing._get_safetensors_index_path(cache_dir: str, repo_id: str) -> str[source]#
Return the directory containing the first model.safetensors.index.json found for the given model. If no model.safetensors.index.json is found, it returns None.
For example, if the file located is /opt/models/models--meta-llama--Llama-3.2-3B/snapshots/13afe.../model.safetensors.index.json, this function will return the directory path /opt/models/models--meta-llama--Llama-3.2-3B/snapshots/13afe...
This will error if the model hasn't been downloaded or if the cache directory is incorrect.
- Parameters:
cache_dir – Path to cache directory
repo_id – Hugging Face repository ID
- Returns:
Path to the directory containing the index file.
- Raises:
FileNotFoundError – If the index file is not found.
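The lookup can be sketched with a stdlib-only re-implementation. The models--{org}--{name}/snapshots/<hash>/ layout matches the example path above; the docstring mentions both a None return and a FileNotFoundError, and this sketch follows the Raises section and raises.

```python
import tempfile
from pathlib import Path


# Hypothetical re-implementation of the lookup described above: walk the
# Hugging Face cache layout and return the first snapshot directory that
# holds model.safetensors.index.json.
def find_safetensors_index_dir(cache_dir: str, repo_id: str) -> str:
    model_dir = Path(cache_dir) / f"models--{repo_id.replace('/', '--')}"
    for index_file in sorted(
        model_dir.glob("snapshots/*/model.safetensors.index.json")
    ):
        return str(index_file.parent)
    raise FileNotFoundError(
        f"no model.safetensors.index.json for {repo_id} in {cache_dir}"
    )


with tempfile.TemporaryDirectory() as cache:
    snap = Path(cache) / "models--meta-llama--Llama-3.2-3B" / "snapshots" / "13afe"
    snap.mkdir(parents=True)
    (snap / "model.safetensors.index.json").write_text("{}")
    result = find_safetensors_index_dir(cache, "meta-llama/Llama-3.2-3B")
    print(result == str(snap))  # True
```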