`nemo_automodel.components.checkpoint._backports.consolidate_hf_safetensors`#

Module Contents#

Classes#

`_FqnData`	Dataclass to store information about a tensor (identified by its fully qualified name).
`_OutputFileData`	Dataclass to store information about an output safetensors file.
`_InputFileData`	Dataclass to store information about an input safetensors file.

Functions#

`_parse_input_metadata`	Parse metadata from input safetensors files to determine the full tensor shapes and types.
`_write_metadata`	Write metadata to the beginning of each output safetensors file.
`_read_tensor_data_mmap`	Read tensor data from a safetensors file using memory mapping for efficiency.
`_process_output_file`	Process a single output file by writing tensor data from input files using memory mapping.
`_write_data`	Write tensor data from input files to the output files using memory mapping.
`_write_sub_tensor_to_file_optimized`	Optimized version that writes the maximum number of contiguous bytes possible.
`_calculate_max_contiguous_elements`	Calculate the maximum number of contiguous elements that can be written from current position.
`_write_overall_metadata_file`	Write the overall metadata file that maps tensor names to their file locations.
`_consolidate_safetensors_files`
`consolidate_safetensors_files`	Main function to consolidate sharded safetensors files into one or more output files.
`consolidate_safetensors_files_on_every_rank`	Consolidate sharded safetensors files across multiple ranks, with each rank handling a subset of output files.

Data#

`logger`
`GLOBAL_OUTPUT_FILES_DATA`

API#

nemo_automodel.components.checkpoint._backports.consolidate_hf_safetensors.logger: logging.Logger#: ‘getLogger(…)’

class nemo_automodel.components.checkpoint._backports.consolidate_hf_safetensors._FqnData#

Dataclass to store information about a tensor (identified by its fully qualified name).

Attributes:

offset_in_file – Byte offset where this tensor’s data begins in the output file
shape_in_file – Shape of the tensor in the output file
dtype_size – Size of the tensor’s data type in bytes
dtype_str – String representation of the tensor’s data type

offset_in_file: int#: 0

shape_in_file: list[int]#: ‘field(…)’

dtype_size: int#: 0

dtype_str: str = <Multiline-String>#

class nemo_automodel.components.checkpoint._backports.consolidate_hf_safetensors._OutputFileData#

Dataclass to store information about an output safetensors file.

Attributes:

metadata_size – Size of the metadata section in bytes
fqn_data – Dictionary mapping tensor names to their metadata

metadata_size: int#: 0

fqn_data: dict[str, nemo_automodel.components.checkpoint._backports.consolidate_hf_safetensors._FqnData]#: ‘field(…)’

class nemo_automodel.components.checkpoint._backports.consolidate_hf_safetensors._InputFileData#

Dataclass to store information about an input safetensors file.

Attributes:

metadata_size – Size of the metadata section in bytes
metadata – JSON metadata from the safetensors file

metadata_size: int#: 0

metadata: Any#: None

nemo_automodel.components.checkpoint._backports.consolidate_hf_safetensors.GLOBAL_OUTPUT_FILES_DATA: Optional[dict[str, nemo_automodel.components.checkpoint._backports.consolidate_hf_safetensors._OutputFileData]]#: None

nemo_automodel.components.checkpoint._backports.consolidate_hf_safetensors._parse_input_metadata( input_files_data: dict[str, nemo_automodel.components.checkpoint._backports.consolidate_hf_safetensors._InputFileData], output_files_data: dict[str, nemo_automodel.components.checkpoint._backports.consolidate_hf_safetensors._OutputFileData], ) → None#

Parse metadata from input safetensors files to determine the full tensor shapes and types.

This function analyzes the metadata from all input files to determine the complete shape of each tensor after consolidation. It updates the output_files_data with this information.

Parameters:

input_files_data – Dictionary mapping input file paths to their metadata
output_files_data – Dictionary mapping output file paths to their metadata

Raises:

ValueError – If no DCP custom metadata is found in a safetensors file
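
The shape reconstruction described above can be illustrated with a minimal sketch. It assumes each shard is described only by its starting offsets and its own shape; the helper name `infer_full_shape` and the `(offsets, shape)` input layout are hypothetical and do not reflect the module's actual metadata format.

```python
def infer_full_shape(shards):
    """Return the full tensor shape covered by a set of shards.

    `shards` is an iterable of (offsets, shape) pairs (a hypothetical layout):
    where the shard starts in the full tensor, and how large the shard is.
    """
    full = None
    for offsets, shape in shards:
        # The shard ends at offset + length along every dimension.
        extent = [o + n for o, n in zip(offsets, shape)]
        # The full shape is the largest end position seen per dimension.
        full = extent if full is None else [max(a, b) for a, b in zip(full, extent)]
    return full


# Two row-wise shards of a 4x4 tensor.
row_shards = [([0, 0], [2, 4]), ([2, 0], [2, 4])]
full_shape = infer_full_shape(row_shards)
```

The same helper covers column-wise sharding, because the maximum is taken independently along each dimension.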

nemo_automodel.components.checkpoint._backports.consolidate_hf_safetensors._write_metadata( output_files_data: dict[str, nemo_automodel.components.checkpoint._backports.consolidate_hf_safetensors._OutputFileData], ) → None#

Write metadata to the beginning of each output safetensors file.

This function writes the metadata section to each output file, including information about tensor shapes, data types, and offsets. It also updates the offset_in_file field for each tensor in the output_files_data.

Parameters:

output_files_data – Dictionary mapping output file paths to their metadata
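
For orientation, a safetensors file starts with an 8-byte little-endian header length, followed by a JSON header that records each tensor's dtype, shape, and byte range in the data section. The sketch below builds such a header in memory; `build_header` and its input layout are hypothetical and only illustrate how per-tensor offsets can be derived while the header is assembled.

```python
import io
import json
import struct


def build_header(tensors):
    """Build a safetensors-style header.

    `tensors` maps a tensor name to (dtype_str, shape, nbytes); this input
    layout is an assumption made for the example.
    """
    entries, cursor = {}, 0
    for name, (dtype_str, shape, nbytes) in tensors.items():
        # data_offsets are relative to the start of the data section.
        entries[name] = {
            "dtype": dtype_str,
            "shape": shape,
            "data_offsets": [cursor, cursor + nbytes],
        }
        cursor += nbytes
    payload = json.dumps(entries, separators=(",", ":")).encode("utf-8")
    # 8-byte little-endian length prefix, then the JSON payload.
    return struct.pack("<Q", len(payload)) + payload, entries


header, entries = build_header(
    {"a.weight": ("F32", [2, 2], 16), "a.bias": ("F32", [2], 8)}
)
buf = io.BytesIO()
buf.write(header)  # tensor data would follow immediately after the header
```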

nemo_automodel.components.checkpoint._backports.consolidate_hf_safetensors._read_tensor_data_mmap( file_path: str, start_offset: int, end_offset: int, metadata_size: int, ) → bytes#

Read tensor data from a safetensors file using memory mapping for efficiency.

Parameters:

file_path – Path to the safetensors file
start_offset – Start offset of tensor data within the data section
end_offset – End offset of tensor data within the data section
metadata_size – Size of the metadata header

Returns:

Raw tensor data as bytes
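
A memory-mapped read of this kind might look like the sketch below. It assumes that `metadata_size` counts every byte before the data section, including the length prefix; `read_span_mmap` is a hypothetical stand-in, not the module's implementation.

```python
import mmap
import os
import tempfile


def read_span_mmap(file_path, start_offset, end_offset, metadata_size):
    """Return bytes [start_offset, end_offset) of the data section."""
    with open(file_path, "rb") as f:
        # Map the file read-only; only the pages that are touched get loaded.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            begin = metadata_size + start_offset
            end = metadata_size + end_offset
            return bytes(mm[begin:end])


# Demo file: an 8-byte fake header followed by 16 data bytes.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"HEADER__" + bytes(range(16)))
    demo_path = tmp.name

chunk = read_span_mmap(demo_path, 4, 8, metadata_size=8)
os.remove(demo_path)
```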

nemo_automodel.components.checkpoint._backports.consolidate_hf_safetensors._process_output_file( output_file: str, output_data: nemo_automodel.components.checkpoint._backports.consolidate_hf_safetensors._OutputFileData, input_files_data: dict[str, nemo_automodel.components.checkpoint._backports.consolidate_hf_safetensors._InputFileData], ) → None#

Process a single output file by writing tensor data from input files using memory mapping.

This function is designed to be run in parallel for different output files.

Parameters:

output_file – Path to the output file
output_data – Metadata for the output file
input_files_data – Dictionary mapping input file paths to their metadata

nemo_automodel.components.checkpoint._backports.consolidate_hf_safetensors._write_data( input_files_data: dict[str, nemo_automodel.components.checkpoint._backports.consolidate_hf_safetensors._InputFileData], output_files_data: dict[str, nemo_automodel.components.checkpoint._backports.consolidate_hf_safetensors._OutputFileData], num_threads: int = 1, ) → None#

Write tensor data from input files to the output files using memory mapping.

This function reads tensor data from each input file and writes it to the appropriate position in the output files based on the tensor’s offsets. When num_threads > 1, the work is split across threads with each thread handling a different output file.

Parameters:

input_files_data – Dictionary mapping input file paths to their metadata
output_files_data – Dictionary mapping output file paths to their metadata
num_threads – Number of threads to use for parallel processing
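
The one-thread-per-output-file split can be sketched with `concurrent.futures`. The helper `process_all` and its `process_one` callback are hypothetical; the sketch only shows the dispatch pattern, not the actual tensor copying.

```python
from concurrent.futures import ThreadPoolExecutor


def process_all(output_files, process_one, num_threads=1):
    """Run process_one(path) for every output file, optionally in parallel."""
    if num_threads <= 1:
        for path in output_files:
            process_one(path)
        return
    # Never start more workers than there are files (and at least one).
    workers = max(1, min(num_threads, len(output_files)))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(process_one, path) for path in output_files]
        for future in futures:
            future.result()  # re-raise any exception from a worker thread


done = []
process_all(["out-1", "out-2", "out-3"], done.append, num_threads=2)
```

Because each worker owns a distinct output file, the workers never write to the same file and need no locking around the writes.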

nemo_automodel.components.checkpoint._backports.consolidate_hf_safetensors._write_sub_tensor_to_file_optimized( full_tensor_mv: memoryview, sub_tensor_bytes: bytes, element_size: int, tensor_shape: list[int], sub_tensor_offsets: list[int], sub_tensor_shape: list[int], ) → None#

Optimized version that writes the maximum number of contiguous bytes possible.

Uses a unified algorithm that calculates the maximum contiguous bytes that can be written in each iteration and continues until the entire subtensor is written. Handles all sharding patterns efficiently:

Full sub-tensor at once for row-wise sharding
Row-by-row for column-wise sharding
Optimized chunks for other patterns

Parameters:

full_tensor_mv – Buffer to write the full tensor to
sub_tensor_bytes – Raw tensor data as bytes
element_size – Size of each element in bytes
tensor_shape – Shape of the full tensor
sub_tensor_offsets – Starting offsets of the sub-tensor within the full tensor
sub_tensor_shape – Shape of the sub-tensor
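
To make the placement arithmetic concrete, here is a simplified sketch that copies one last-dimension row at a time. This is deliberately not the optimized routine: it does not merge adjacent rows into larger writes. The name `write_sub_tensor_rows` is hypothetical, and a row-major (C-contiguous) layout is assumed.

```python
import itertools
import math


def write_sub_tensor_rows(full_mv, sub_bytes, element_size,
                          tensor_shape, sub_offsets, sub_shape):
    """Copy a sub-tensor into a row-major full-tensor buffer, row by row."""
    # Row-major strides of the full tensor, measured in elements.
    strides = [math.prod(tensor_shape[d + 1:]) for d in range(len(tensor_shape))]
    row_bytes = sub_shape[-1] * element_size
    src = 0
    # Iterate over every index combination of the leading dimensions.
    for lead in itertools.product(*(range(n) for n in sub_shape[:-1])):
        # Position of this row's first element inside the full tensor.
        idx = [o + i for o, i in zip(sub_offsets, lead)] + [sub_offsets[-1]]
        dst = sum(i * s for i, s in zip(idx, strides)) * element_size
        full_mv[dst:dst + row_bytes] = sub_bytes[src:src + row_bytes]
        src += row_bytes


# Place a 2x2 block at row 1, column 2 of a 4x4 single-byte tensor.
full = bytearray(16)
write_sub_tensor_rows(memoryview(full), bytes([1, 2, 3, 4]), 1,
                      [4, 4], [1, 2], [2, 2])
```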

nemo_automodel.components.checkpoint._backports.consolidate_hf_safetensors._calculate_max_contiguous_elements( indices: list[int], sub_tensor_shape: list[int], tensor_shape: list[int], ) → int#

Calculate the maximum number of contiguous elements that can be written from current position.

This determines the largest chunk by checking how elements are laid out in memory and finding natural boundaries where contiguity breaks.

Parameters:

indices – Current position indices in the sub-tensor
sub_tensor_shape – Shape of the sub-tensor being written
tensor_shape – Shape of the full tensor

Raises:

ValueError – If input lists are empty, have mismatched lengths, or contain invalid values
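
One conservative way to compute such a run length is sketched below, assuming row-major layout. The function `max_contiguous_elements` is a hypothetical illustration and may differ from the module's logic: it starts with what remains in the current row, then extends the run across a dimension boundary only while every trailing dimension starts at index 0 and spans the full tensor's extent.

```python
import math


def max_contiguous_elements(indices, sub_shape, full_shape):
    """Elements that can be copied in one write from `indices` (row-major)."""
    if not indices or not (len(indices) == len(sub_shape) == len(full_shape)):
        raise ValueError("inputs must be non-empty and of equal length")
    # Always possible: the rest of the current last-dimension row.
    run = sub_shape[-1] - indices[-1]
    for d in range(len(sub_shape) - 1, 0, -1):
        # Stop once dimension d is mid-way or narrower than the full tensor.
        if indices[d] != 0 or sub_shape[d] != full_shape[d]:
            break
        # Dimension d is complete, so whole slices of dimension d-1 are adjacent.
        run = (sub_shape[d - 1] - indices[d - 1]) * math.prod(sub_shape[d:])
    return run


row_wise = max_contiguous_elements([0, 0], [2, 4], [8, 4])     # full-width shard
column_wise = max_contiguous_elements([0, 0], [4, 2], [4, 8])  # narrow shard
mid_row = max_contiguous_elements([1, 1], [2, 4], [8, 4])      # started mid-row
```

In this sketch a row-wise shard is copied in one write, while a column-wise shard needs one write per row, matching the sharding patterns listed for `_write_sub_tensor_to_file_optimized`.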

nemo_automodel.components.checkpoint._backports.consolidate_hf_safetensors._write_overall_metadata_file( output_dir: str, output_files_data: dict[str, nemo_automodel.components.checkpoint._backports.consolidate_hf_safetensors._OutputFileData], ) → None#

Write the overall metadata file that maps tensor names to their file locations.

This creates a model.safetensors.index.json file that HuggingFace models use to locate tensors across multiple files.

Parameters:

output_dir – Directory where the metadata file will be written
output_files_data – Dictionary mapping output file paths to their metadata
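
For reference, a Hugging Face index file of this kind holds a `weight_map` from tensor name to file name, and commonly a `metadata.total_size` entry. The sketch below builds that structure from a hypothetical `file_to_tensors` input; whether the module records `total_size` the same way is an assumption.

```python
import json


def build_index(file_to_tensors):
    """Build an index dict from {filename: {tensor_name: nbytes}}."""
    weight_map, total_size = {}, 0
    for filename, tensors in file_to_tensors.items():
        for fqn, nbytes in tensors.items():
            weight_map[fqn] = filename  # which file holds this tensor
            total_size += nbytes
    return {"metadata": {"total_size": total_size}, "weight_map": weight_map}


index = build_index(
    {
        "model-00001-of-00002.safetensors": {"embed.weight": 64},
        "model-00002-of-00002.safetensors": {"lm_head.weight": 32},
    }
)
index_json = json.dumps(index, indent=2)  # content for model.safetensors.index.json
```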

nemo_automodel.components.checkpoint._backports.consolidate_hf_safetensors._consolidate_safetensors_files( input_dir: str, output_dir: str, fqn_to_file_mapping: dict[str, str], num_threads: int, ) → dict[str, nemo_automodel.components.checkpoint._backports.consolidate_hf_safetensors._OutputFileData]#

nemo_automodel.components.checkpoint._backports.consolidate_hf_safetensors.consolidate_safetensors_files( input_dir: str, output_dir: str, fqn_to_index_mapping: dict[str, int], num_threads: int = 1, ) → None#

Main function to consolidate sharded safetensors files into one or more output files.

This function orchestrates the entire consolidation process:

Sets up the output file structure based on the fqn_to_index_mapping
Finds all safetensors files in the input directory
Parses metadata from all input files
Writes metadata to the output files
Writes tensor data from input files to output files
Writes the overall model.safetensors.index.json file with the weight map

Parameters:

input_dir – Directory containing sharded safetensors files
output_dir – Directory where consolidated files will be written
fqn_to_index_mapping – Optional mapping of tensor names to output file indices. If None, all tensors will be consolidated into a single file.
num_threads – Number of threads to use for parallel processing of saving data to output files.
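
One plausible way to obtain `fqn_to_index_mapping` is to derive it from an existing Hugging Face `weight_map`, using the shard number embedded in file names such as `model-00001-of-00002.safetensors`. The helper below is hypothetical, and treating the index as that 1-based shard number is an assumption rather than documented behavior.

```python
import re

_SHARD_RE = re.compile(r"-(\d+)-of-\d+\.safetensors$")


def mapping_from_weight_map(weight_map):
    """Turn {tensor_name: filename} into {tensor_name: output file index}."""
    mapping = {}
    for fqn, filename in weight_map.items():
        match = _SHARD_RE.search(filename)
        # Unsharded names (e.g. "model.safetensors") fall back to index 1.
        mapping[fqn] = int(match.group(1)) if match else 1
    return mapping


fqn_to_index_mapping = mapping_from_weight_map(
    {
        "embed.weight": "model-00001-of-00002.safetensors",
        "lm_head.weight": "model-00002-of-00002.safetensors",
    }
)

# The result could then be passed on, for example:
# consolidate_safetensors_files(
#     input_dir="sharded/", output_dir="consolidated/",
#     fqn_to_index_mapping=fqn_to_index_mapping, num_threads=4,
# )
```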

nemo_automodel.components.checkpoint._backports.consolidate_hf_safetensors.consolidate_safetensors_files_on_every_rank( input_dir: str, output_dir: str, fqn_to_index_mapping: dict[str, int], num_threads: int = 1, process_group: Optional[torch.distributed.ProcessGroup] = None, ) → None#

Consolidate sharded safetensors files across multiple ranks, with each rank handling a subset of output files.

This function distributes the consolidation work by assigning output files to different ranks. All tensors with the same index in fqn_to_index_mapping are processed by the same rank, as they belong to the same output file.

If process_group is provided, rank and world_size will be derived from it. Otherwise, they will be automatically detected from the distributed environment if available.

Parameters:

input_dir – Directory containing sharded safetensors files
output_dir – Directory where consolidated files will be written
fqn_to_index_mapping – Mapping of tensor names to output file indices
num_threads – Number of threads to use for parallel processing on each rank
process_group – PyTorch distributed process group (default: None, will use default group)
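
The per-rank split can be pictured with a small sketch. It assumes a round-robin assignment of the sorted output file indices to ranks; the actual assignment policy is not stated here, and `mapping_for_rank` is a hypothetical helper.

```python
def mapping_for_rank(fqn_to_index_mapping, rank, world_size):
    """Return the part of the mapping whose output files belong to `rank`."""
    unique_indices = sorted(set(fqn_to_index_mapping.values()))
    # Round-robin: the i-th output file goes to rank i % world_size.
    mine = {idx for pos, idx in enumerate(unique_indices) if pos % world_size == rank}
    # Tensors sharing an index stay together, so each file has one writer.
    return {fqn: idx for fqn, idx in fqn_to_index_mapping.items() if idx in mine}


full_mapping = {"a": 1, "b": 1, "c": 2, "d": 3}
rank0 = mapping_for_rank(full_mapping, rank=0, world_size=2)
rank1 = mapping_for_rank(full_mapping, rank=1, world_size=2)
```

Under this assumed policy every output file is written by exactly one rank, so ranks do not need to coordinate while writing.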
