nemo_automodel.components.checkpoint._backports.consolidate_hf_safetensors#

Module Contents#

Classes#

_FqnData

Dataclass to store information about a tensor (identified by its fully qualified name).

_OutputFileData

Dataclass to store information about an output safetensors file.

_InputFileData

Dataclass to store information about an input safetensors file.

Functions#

_parse_input_metadata

Parse metadata from input safetensors files to determine the full tensor shapes and types.

_write_metadata

Write metadata to the beginning of each output safetensors file.

_read_tensor_data_mmap

Read tensor data from a safetensors file using memory mapping for efficiency.

_process_output_file

Process a single output file by writing tensor data from input files using memory mapping.

_write_data

Write tensor data from input files to the output files using memory mapping.

_write_sub_tensor_to_file_optimized

Optimized version that writes the maximum number of contiguous bytes possible.

_calculate_max_contiguous_elements

Calculate the maximum number of contiguous elements that can be written from the current position.

_write_overall_metadata_file

Write the overall metadata file that maps tensor names to their file locations.

_consolidate_safetensors_files

consolidate_safetensors_files

Main function to consolidate sharded safetensors files into one or more output files.

consolidate_safetensors_files_on_every_rank

Consolidate sharded safetensors files across multiple ranks, with each rank handling a subset of output files.

Data#

API#

nemo_automodel.components.checkpoint._backports.consolidate_hf_safetensors.logger: logging.Logger#

‘getLogger(…)’

class nemo_automodel.components.checkpoint._backports.consolidate_hf_safetensors._FqnData#

Dataclass to store information about a tensor (identified by its fully qualified name).

Attributes:
  • offset_in_file – Byte offset where this tensor’s data begins in the output file

  • shape_in_file – Shape of the tensor in the output file

  • dtype_size – Size of the tensor’s data type in bytes

  • dtype_str – String representation of the tensor’s data type

offset_in_file: int#

0

shape_in_file: list[int]#

‘field(…)’

dtype_size: int#

0

dtype_str: str = <Multiline-String>#

class nemo_automodel.components.checkpoint._backports.consolidate_hf_safetensors._OutputFileData#

Dataclass to store information about an output safetensors file.

Attributes:
  • metadata_size – Size of the metadata section in bytes

  • fqn_data – Dictionary mapping tensor names to their metadata

metadata_size: int#

0

fqn_data: dict[str, nemo_automodel.components.checkpoint._backports.consolidate_hf_safetensors._FqnData]#

‘field(…)’

class nemo_automodel.components.checkpoint._backports.consolidate_hf_safetensors._InputFileData#

Dataclass to store information about an input safetensors file.

Attributes:
  • metadata_size – Size of the metadata section in bytes

  • metadata – JSON metadata from the safetensors file

metadata_size: int#

0

metadata: Any#

None

nemo_automodel.components.checkpoint._backports.consolidate_hf_safetensors.GLOBAL_OUTPUT_FILES_DATA: Optional[dict[str, nemo_automodel.components.checkpoint._backports.consolidate_hf_safetensors._OutputFileData]]#

None

nemo_automodel.components.checkpoint._backports.consolidate_hf_safetensors._parse_input_metadata(
input_files_data: dict[str, nemo_automodel.components.checkpoint._backports.consolidate_hf_safetensors._InputFileData],
output_files_data: dict[str, nemo_automodel.components.checkpoint._backports.consolidate_hf_safetensors._OutputFileData],
) None#

Parse metadata from input safetensors files to determine the full tensor shapes and types.

This function analyzes the metadata from all input files to determine the complete shape of each tensor after consolidation. It updates the output_files_data with this information.

Parameters:
  • input_files_data – Dictionary mapping input file paths to their metadata

  • output_files_data – Dictionary mapping output file paths to their metadata

Raises:

ValueError – If no DCP custom metadata is found in a safetensors file
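The shape reconstruction described above can be illustrated with a minimal sketch. It assumes, hypothetically, that each shard is described by a pair of per-dimension offsets and lengths; the actual layout of the DCP custom metadata is not shown in this page, and `full_shape_from_shards` is an illustrative name, not part of the module.

```python
def full_shape_from_shards(shards):
    """Return the full tensor shape covered by a list of (offsets, shape) shards.

    Hypothetical input format: each shard is (offsets, shape), both lists of
    ints with one entry per dimension.
    """
    if not shards:
        raise ValueError("at least one shard is required")
    rank = len(shards[0][0])
    full = [0] * rank
    for offsets, shape in shards:
        for d in range(rank):
            # A shard ends at offset + length along each dimension; the full
            # extent is the furthest end reached by any shard.
            full[d] = max(full[d], offsets[d] + shape[d])
    return full


# Two row-wise shards of a 4 x 6 tensor.
row_sharded = full_shape_from_shards([([0, 0], [2, 6]), ([2, 0], [2, 6])])
# Two column-wise shards of the same tensor.
col_sharded = full_shape_from_shards([([0, 0], [4, 3]), ([0, 3], [4, 3])])
```

Both sharding patterns recover the same full shape, which is what the consolidated output file needs to record.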

nemo_automodel.components.checkpoint._backports.consolidate_hf_safetensors._write_metadata(
output_files_data: dict[str, nemo_automodel.components.checkpoint._backports.consolidate_hf_safetensors._OutputFileData],
) None#

Write metadata to the beginning of each output safetensors file.

This function writes the metadata section to each output file, including information about tensor shapes, data types, and offsets. It also updates the offset_in_file field for each tensor in the output_files_data.

Parameters:

output_files_data – Dictionary mapping output file paths to their metadata
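For orientation, a safetensors file starts with an 8-byte little-endian unsigned integer giving the length of a JSON header, followed by the header itself and then the raw tensor data. The sketch below builds such a header and records where each tensor starts. The helper name `build_header` and its input format are assumptions made for illustration, and offsets here are relative to the start of the data section.

```python
import json
import math
import struct


def build_header(tensors):
    """Build a safetensors-style header and per-tensor data offsets.

    `tensors` maps a name to (dtype_str, shape, dtype_size); this input
    format is hypothetical.
    """
    entries, offsets, cursor = {}, {}, 0
    for name, (dtype_str, shape, dtype_size) in tensors.items():
        nbytes = math.prod(shape) * dtype_size
        entries[name] = {
            "dtype": dtype_str,
            "shape": shape,
            "data_offsets": [cursor, cursor + nbytes],
        }
        offsets[name] = cursor
        cursor += nbytes
    payload = json.dumps(entries).encode("utf-8")
    # 8-byte little-endian length prefix, then the JSON header.
    return struct.pack("<Q", len(payload)) + payload, offsets


header, offsets = build_header(
    {
        "layer.weight": ("F32", [2, 3], 4),
        "layer.bias": ("F32", [3], 4),
    }
)
(json_len,) = struct.unpack("<Q", header[:8])
parsed = json.loads(header[8 : 8 + json_len])
```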

nemo_automodel.components.checkpoint._backports.consolidate_hf_safetensors._read_tensor_data_mmap(
file_path: str,
start_offset: int,
end_offset: int,
metadata_size: int,
) bytes#

Read tensor data from a safetensors file using memory mapping for efficiency.

Parameters:
  • file_path – Path to the safetensors file

  • start_offset – Start offset of tensor data within the data section

  • end_offset – End offset of tensor data within the data section

  • metadata_size – Size of the metadata header

Returns:

Raw tensor data as bytes
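A minimal sketch of a memory-mapped read with the same parameters follows. It assumes that `metadata_size` counts the whole header, including the 8-byte length prefix, so the data section starts at that byte; this page does not confirm that convention. `read_tensor_bytes` is an illustrative name.

```python
import mmap
import os
import struct
import tempfile


def read_tensor_bytes(file_path, start_offset, end_offset, metadata_size):
    """Copy one tensor's bytes out of a file without reading the whole file."""
    with open(file_path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            # Assumption: the data section begins at byte `metadata_size`.
            return bytes(mm[metadata_size + start_offset : metadata_size + end_offset])


# Build a tiny safetensors-like file: length prefix, JSON header, 4 data bytes.
payload = b'{"t":{"dtype":"U8","shape":[4],"data_offsets":[0,4]}}'
with tempfile.NamedTemporaryFile(suffix=".safetensors", delete=False) as tmp:
    tmp.write(struct.pack("<Q", len(payload)) + payload + b"\x01\x02\x03\x04")
    path = tmp.name

result = read_tensor_bytes(path, 1, 3, 8 + len(payload))
os.remove(path)
```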

nemo_automodel.components.checkpoint._backports.consolidate_hf_safetensors._process_output_file(
output_file: str,
output_data: nemo_automodel.components.checkpoint._backports.consolidate_hf_safetensors._OutputFileData,
input_files_data: dict[str, nemo_automodel.components.checkpoint._backports.consolidate_hf_safetensors._InputFileData],
) None#

Process a single output file by writing tensor data from input files using memory mapping.

This function is designed to be run in parallel for different output files.

Parameters:
  • output_file – Path to the output file

  • output_data – Metadata for the output file

  • input_files_data – Dictionary mapping input file paths to their metadata

nemo_automodel.components.checkpoint._backports.consolidate_hf_safetensors._write_data(
input_files_data: dict[str, nemo_automodel.components.checkpoint._backports.consolidate_hf_safetensors._InputFileData],
output_files_data: dict[str, nemo_automodel.components.checkpoint._backports.consolidate_hf_safetensors._OutputFileData],
num_threads: int = 1,
) None#

Write tensor data from input files to the output files using memory mapping.

This function reads tensor data from each input file and writes it to the appropriate position in the output files based on the tensor’s offsets. When num_threads > 1, the work is split across threads with each thread handling a different output file.

Parameters:
  • input_files_data – Dictionary mapping input file paths to their metadata

  • output_files_data – Dictionary mapping output file paths to their metadata

  • num_threads – Number of threads to use for parallel processing
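The one-thread-per-output-file split can be sketched with the standard library as shown below. This is an illustration of the pattern under stated assumptions, not the module's code: `process_outputs` and `fake_process` are hypothetical names, and the real worker would write tensor data rather than record a path.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def process_outputs(output_files, process_one, num_threads=1):
    """Run `process_one` on every output file, optionally in parallel."""
    if num_threads <= 1 or len(output_files) <= 1:
        for path in output_files:
            process_one(path)
        return
    workers = min(num_threads, len(output_files))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(process_one, path) for path in output_files]
        for future in futures:
            future.result()  # re-raise any exception from a worker


processed = []
lock = threading.Lock()


def fake_process(path):
    # Stand-in for the real per-file work.
    with lock:
        processed.append(path)


files = ["model-00001-of-00002.safetensors", "model-00002-of-00002.safetensors"]
process_outputs(files, fake_process, num_threads=2)
```

Because each thread owns a distinct output file, the workers do not need to coordinate their writes.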

nemo_automodel.components.checkpoint._backports.consolidate_hf_safetensors._write_sub_tensor_to_file_optimized(
full_tensor_mv: memoryview,
sub_tensor_bytes: bytes,
element_size: int,
tensor_shape: list[int],
sub_tensor_offsets: list[int],
sub_tensor_shape: list[int],
) None#

Optimized version that writes the maximum number of contiguous bytes possible.

Uses a unified algorithm that calculates the maximum number of contiguous bytes that can be written in each iteration and continues until the entire sub-tensor is written. Handles all sharding patterns efficiently:

  • Full sub-tensor at once for row-wise sharding

  • Row-by-row for column-wise sharding

  • Optimized chunks for other patterns

Parameters:
  • full_tensor_mv – Buffer to write the full tensor to

  • sub_tensor_bytes – Raw tensor data as bytes

  • element_size – Size of each element in bytes

  • tensor_shape – Shape of the full tensor

  • sub_tensor_offsets – Starting offsets of the sub-tensor within the full tensor

  • sub_tensor_shape – Shape of the sub-tensor
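To make the offset arithmetic concrete, here is a simplified and deliberately unoptimized sketch that always copies one innermost-dimension run at a time, assuming a row-major layout. The documented function instead writes the largest contiguous chunk available; `write_sub_tensor_rowwise` is a hypothetical name used only for this illustration.

```python
import itertools
import math


def write_sub_tensor_rowwise(
    full_tensor_mv, sub_tensor_bytes, element_size, tensor_shape, sub_tensor_offsets, sub_tensor_shape
):
    """Copy a sub-tensor into a full-tensor buffer, one innermost run per step."""
    # Row-major strides of the full tensor, measured in elements.
    strides = [math.prod(tensor_shape[d + 1 :]) for d in range(len(tensor_shape))]
    run = sub_tensor_shape[-1] * element_size
    src = 0
    for outer in itertools.product(*(range(n) for n in sub_tensor_shape[:-1])):
        # Destination element index of the first element in this run.
        dst_elem = sub_tensor_offsets[-1]
        dst_elem += sum(
            (off + i) * stride
            for off, i, stride in zip(sub_tensor_offsets[:-1], outer, strides[:-1])
        )
        dst = dst_elem * element_size
        full_tensor_mv[dst : dst + run] = sub_tensor_bytes[src : src + run]
        src += run


# A 2 x 4 uint8 tensor; the shard covers columns 2..3 of both rows.
full = bytearray(8)
write_sub_tensor_rowwise(memoryview(full), b"\x01\x02\x03\x04", 1, [2, 4], [0, 2], [2, 2])
```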

nemo_automodel.components.checkpoint._backports.consolidate_hf_safetensors._calculate_max_contiguous_elements(
indices: list[int],
sub_tensor_shape: list[int],
tensor_shape: list[int],
) int#

Calculate the maximum number of contiguous elements that can be written from the current position.

This determines the largest chunk by checking how elements are laid out in memory and finding natural boundaries where contiguity breaks.

Parameters:
  • indices – Current position indices in the sub-tensor

  • sub_tensor_shape – Shape of the sub-tensor being written

  • tensor_shape – Shape of the full tensor

Raises:

ValueError – If input lists are empty, have mismatched lengths, or contain invalid values
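One way to compute such a chunk size for a row-major layout is sketched below. It is a reconstruction from the description above rather than the module's implementation: it finds the leftmost dimension after which the sub-tensor spans the full tensor, then counts the elements remaining in that block.

```python
import math


def max_contiguous_elements(indices, sub_tensor_shape, tensor_shape):
    """Count elements writable as one contiguous run from `indices` (row-major)."""
    if not indices or not (len(indices) == len(sub_tensor_shape) == len(tensor_shape)):
        raise ValueError("inputs must be non-empty and of equal length")
    for i, sub, full in zip(indices, sub_tensor_shape, tensor_shape):
        if not (0 <= i < sub <= full):
            raise ValueError("index or shape out of range")
    # Walk left while the sub-tensor covers the whole dimension: those
    # trailing dimensions are laid out identically in both tensors.
    k = len(tensor_shape) - 1
    while k > 0 and sub_tensor_shape[k] == tensor_shape[k]:
        k -= 1
    # Linear position of `indices` inside the block spanned by dims k..end.
    position = 0
    for i, n in zip(indices[k:], sub_tensor_shape[k:]):
        position = position * n + i
    return math.prod(sub_tensor_shape[k:]) - position


row_wise = max_contiguous_elements([0, 0], [2, 6], [4, 6])      # whole shard at once
col_wise = max_contiguous_elements([0, 0], [4, 3], [4, 6])      # one row of the shard
mid_row = max_contiguous_elements([0, 1], [4, 3], [4, 6])       # rest of the current row
three_d = max_contiguous_elements([1, 0, 0], [2, 2, 6], [2, 4, 6])
```

The first two calls correspond to the row-wise and column-wise sharding cases listed for `_write_sub_tensor_to_file_optimized`.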

nemo_automodel.components.checkpoint._backports.consolidate_hf_safetensors._write_overall_metadata_file(
output_dir: str,
output_files_data: dict[str, nemo_automodel.components.checkpoint._backports.consolidate_hf_safetensors._OutputFileData],
) None#

Write the overall metadata file that maps tensor names to their file locations.

This creates a model.safetensors.index.json file that HuggingFace models use to locate tensors across multiple files.

Parameters:
  • output_dir – Directory where the metadata file will be written

  • output_files_data – Dictionary mapping output file paths to their metadata
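A Hugging Face index file is a JSON object with a `metadata` entry (holding `total_size`) and a `weight_map` from tensor names to shard file names. The sketch below writes one; the helper name `write_index` and the sample tensor names are illustrative assumptions.

```python
import json
import os
import tempfile


def write_index(output_dir, weight_map, total_size):
    """Write a model.safetensors.index.json file and return its path."""
    index = {"metadata": {"total_size": total_size}, "weight_map": weight_map}
    path = os.path.join(output_dir, "model.safetensors.index.json")
    with open(path, "w", encoding="utf-8") as f:
        json.dump(index, f, indent=2)
    return path


with tempfile.TemporaryDirectory() as tmp_dir:
    index_path = write_index(
        tmp_dir,
        {
            "embed.weight": "model-00001-of-00002.safetensors",
            "lm_head.weight": "model-00002-of-00002.safetensors",
        },
        total_size=1024,
    )
    with open(index_path, encoding="utf-8") as f:
        loaded = json.load(f)
    index_name = os.path.basename(index_path)
```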

nemo_automodel.components.checkpoint._backports.consolidate_hf_safetensors._consolidate_safetensors_files(
input_dir: str,
output_dir: str,
fqn_to_file_mapping: dict[str, str],
num_threads: int,
) dict[str, nemo_automodel.components.checkpoint._backports.consolidate_hf_safetensors._OutputFileData]#

nemo_automodel.components.checkpoint._backports.consolidate_hf_safetensors.consolidate_safetensors_files(
input_dir: str,
output_dir: str,
fqn_to_index_mapping: dict[str, int],
num_threads: int = 1,
) None#

Main function to consolidate sharded safetensors files into one or more output files.

This function orchestrates the entire consolidation process:

  1. Sets up the output file structure based on the fqn_to_index_mapping

  2. Finds all safetensors files in the input directory

  3. Parses metadata from all input files

  4. Writes metadata to the output files

  5. Writes tensor data from input files to output files

  6. Writes the overall model.safetensors.index.json file with the weight map

Parameters:
  • input_dir – Directory containing sharded safetensors files

  • output_dir – Directory where consolidated files will be written

  • fqn_to_index_mapping – Mapping of tensor names to output file indices. The docstring allows None, in which case all tensors are consolidated into a single file, but the signature above annotates the parameter as a required dict[str, int].

  • num_threads – Number of threads to use for parallel processing of saving data to output files.
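A possible way to prepare the arguments is sketched below. The helper `mapping_from_weight_map` is hypothetical: it assumes shard files follow the `model-XXXXX-of-YYYYY.safetensors` naming pattern and that the output file index mirrors the shard number, neither of which is confirmed by this page. The wrapper imports the documented function lazily so the sketch can be read and run without the package installed.

```python
import re


def mapping_from_weight_map(weight_map):
    """Derive fqn -> output file index from a Hugging Face style weight map."""
    pattern = re.compile(r"model-(\d+)-of-\d+\.safetensors$")
    mapping = {}
    for fqn, filename in weight_map.items():
        match = pattern.search(filename)
        if match is None:
            raise ValueError(f"unexpected shard file name: {filename}")
        mapping[fqn] = int(match.group(1))
    return mapping


def consolidate(input_dir, output_dir, mapping, num_threads=4):
    # Imported here so the helper above works without nemo_automodel installed.
    from nemo_automodel.components.checkpoint._backports.consolidate_hf_safetensors import (
        consolidate_safetensors_files,
    )

    consolidate_safetensors_files(
        input_dir=input_dir,
        output_dir=output_dir,
        fqn_to_index_mapping=mapping,
        num_threads=num_threads,
    )


fqn_to_index_mapping = mapping_from_weight_map(
    {
        "embed.weight": "model-00001-of-00002.safetensors",
        "lm_head.weight": "model-00002-of-00002.safetensors",
    }
)
```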

nemo_automodel.components.checkpoint._backports.consolidate_hf_safetensors.consolidate_safetensors_files_on_every_rank(
input_dir: str,
output_dir: str,
fqn_to_index_mapping: dict[str, int],
num_threads: int = 1,
process_group: Optional[torch.distributed.ProcessGroup] = None,
) None#

Consolidate sharded safetensors files across multiple ranks, with each rank handling a subset of output files.

This function distributes the consolidation work by assigning output files to different ranks. All tensors with the same index in fqn_to_index_mapping are processed by the same rank, as they belong to the same output file.

If process_group is provided, rank and world_size will be derived from it. Otherwise, they will be automatically detected from the distributed environment if available.

Parameters:
  • input_dir – Directory containing sharded safetensors files

  • output_dir – Directory where consolidated files will be written

  • fqn_to_index_mapping – Mapping of tensor names to output file indices

  • num_threads – Number of threads to use for parallel processing on each rank

  • process_group – PyTorch distributed process group (default: None, will use default group)
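The per-rank split can be pictured with the sketch below, which assigns the sorted unique output file indices to ranks round-robin. The round-robin policy and the name `indices_for_rank` are assumptions; this page states only that all tensors sharing an index are handled by the same rank.

```python
def indices_for_rank(fqn_to_index_mapping, rank, world_size):
    """Return the output file indices a given rank would handle (round-robin)."""
    unique_indices = sorted(set(fqn_to_index_mapping.values()))
    return [idx for pos, idx in enumerate(unique_indices) if pos % world_size == rank]


mapping = {"a.weight": 1, "a.bias": 1, "b.weight": 2, "c.weight": 3}
rank0 = indices_for_rank(mapping, rank=0, world_size=2)
rank1 = indices_for_rank(mapping, rank=1, world_size=2)
```

Every index lands on exactly one rank, so no output file is written twice.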