nemo_automodel.checkpoint._backports._fsspec_filesystem
Module Contents
Classes
FsspecWriter | Basic implementation of StorageWriter using fsspec.
Data
API
- nemo_automodel.checkpoint._backports._fsspec_filesystem.__all__
['FsspecWriter', 'FsspecReader']
- class nemo_automodel.checkpoint._backports._fsspec_filesystem.FileSystem
Bases:
nemo_automodel.checkpoint._backports.filesystem.FileSystemBase
- class nemo_automodel.checkpoint._backports._fsspec_filesystem.FsspecWriter(
- path: Union[str, os.PathLike],
- single_file_per_rank: bool = True,
- sync_files: bool = True,
- thread_count: int = 1,
- per_thread_copy_ahead: int = 10000000,
- overwrite: bool = True,
- _extensions: Optional[collections.abc.Sequence[torch.distributed.checkpoint._extension.StreamTransformExtension]] = None,
- serialization_format: nemo_automodel.checkpoint._backports.filesystem.SerializationFormat = SerializationFormat.TORCH_SAVE,
- **kwargs,
)
Bases:
nemo_automodel.checkpoint._backports.filesystem.FileSystemWriter
Basic implementation of StorageWriter using fsspec.
This implementation makes the following assumptions and simplifications:
- The checkpoint path is an empty or non-existing directory.
- File creation is atomic.
- The checkpoint consists of one file per write request plus a .metadata file with the serialized metadata.
Initialization
Initialize the writer pointing to path.
- Parameters:
path – directory where the checkpoint will be written to.
single_file_per_rank – Produce one file per rank instead of one file per tensor/blob. Defaults to True.
sync_files – Force files to be synced to permanent storage. Defaults to True.
thread_count – Number of I/O threads to use for writing. Defaults to 1.
per_thread_copy_ahead – How many bytes to copy from the GPU ahead of saving them. Defaults to 10 MB (10000000 bytes).
overwrite – Whether to allow overwriting existing checkpoints. Defaults to True.
_extensions – Extensions to apply to output streams (EXPERIMENTAL).
N.B. If sync_files is disabled, there is no guarantee that the checkpoint will be consistent in the case of a failure.
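The first assumption listed for FsspecWriter (the checkpoint path is an empty or non-existing directory) can be checked before a writer is created. The sketch below uses only the standard library and a local path. The helper name is_fresh_checkpoint_dir is hypothetical and is not part of nemo_automodel; a remote fsspec URL would need the matching filesystem calls instead of os.

```python
import os
import tempfile


def is_fresh_checkpoint_dir(path):
    """Return True if `path` does not exist or is an empty directory.

    Hypothetical helper, not part of nemo_automodel: it only checks the
    documented path assumption on a local filesystem.
    """
    if not os.path.exists(path):
        return True
    return os.path.isdir(path) and not os.listdir(path)


with tempfile.TemporaryDirectory() as tmp:
    # An existing but empty directory satisfies the assumption.
    fresh = is_fresh_checkpoint_dir(tmp)
    # So does a directory that has not been created yet.
    missing = is_fresh_checkpoint_dir(os.path.join(tmp, "step_100"))
    # Once a .metadata file is present, the directory is no longer empty.
    with open(os.path.join(tmp, ".metadata"), "wb") as f:
        f.write(b"")
    reused = is_fresh_checkpoint_dir(tmp)
```

A check like this may be useful when overwrite is left at its default of True and reusing a populated directory would be unintended.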
- class nemo_automodel.checkpoint._backports._fsspec_filesystem.FsspecReader(path: Union[str, os.PathLike], **kwargs)
Bases:
nemo_automodel.checkpoint._backports.filesystem.FileSystemReader
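A possible way to pair the two classes is sketched below. It assumes that the backported FsspecWriter and FsspecReader can be passed as storage_writer and storage_reader to torch.distributed.checkpoint.save and load, which this page does not confirm. The function name save_and_reload is hypothetical, and the constructor arguments shown are the documented defaults written out explicitly.

```python
def save_and_reload(state_dict, path):
    """Save `state_dict` under `path`, then load it back in place.

    Hypothetical sketch: assumes the backported classes are compatible
    with torch.distributed.checkpoint.save/load.
    """
    # Imports are deferred so that defining this function does not
    # require torch or nemo_automodel to be installed.
    import torch.distributed.checkpoint as dcp
    from nemo_automodel.checkpoint._backports._fsspec_filesystem import (
        FsspecReader,
        FsspecWriter,
    )

    # Documented defaults, spelled out for readability.
    writer = FsspecWriter(
        path,
        single_file_per_rank=True,
        sync_files=True,
        thread_count=1,
    )
    dcp.save(state_dict, storage_writer=writer)

    # load() fills the tensors already present in state_dict.
    reader = FsspecReader(path)
    dcp.load(state_dict, storage_reader=reader)
    return state_dict
```

Because the module lives under _backports, it is likely intended for internal use, so the import path may change between releases.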