nemo_rl.modelopt.models.policy.workers.utils#

Module Contents#

Classes#

Functions#

symlink_pre_quantized_model

Symlink an external pre-quantized checkpoint as <pretrained_path>/iter_0000000.

get_tokenizer

Returns a tokenizer configured for ModelOpt calibration.

get_forward_loop_func

Gets the forward loop function for the model.

quantize_model

Quantizes the model with the provided calibration dataset.

get_modelopt_checkpoint_dir

Gets the default modelopt checkpoint directory.

quantization_layer_spec

Layer specification for quantization with ModelOpt.

Data#

API#

nemo_rl.modelopt.models.policy.workers.utils.MAX_SEQ_LEN#

2048

nemo_rl.modelopt.models.policy.workers.utils.MAX_OUTPUT_LEN#

512

nemo_rl.modelopt.models.policy.workers.utils.symlink_pre_quantized_model#

Symlink an external pre-quantized checkpoint as <pretrained_path>/iter_0000000.
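The symlink step above can be sketched as follows. This is a minimal illustration, not the module's implementation; the helper name `symlink_as_iter_0000000` and its parameters are assumptions for the example.

```python
import os
import tempfile
from pathlib import Path

def symlink_as_iter_0000000(external_ckpt: str, pretrained_path: str) -> Path:
    """Expose an external checkpoint under <pretrained_path>/iter_0000000."""
    link = Path(pretrained_path) / "iter_0000000"
    link.parent.mkdir(parents=True, exist_ok=True)
    if not link.exists():
        # Symlink rather than copy: the checkpoint may be large.
        os.symlink(os.path.abspath(external_ckpt), link)
    return link

# Example: link a pre-quantized checkpoint directory into a fresh workspace.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "quantized_ckpt"
    src.mkdir()
    dst = symlink_as_iter_0000000(str(src), str(Path(tmp) / "pretrained"))
    print(dst.is_symlink())  # True
```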

nemo_rl.modelopt.models.policy.workers.utils.get_tokenizer(ckpt_path, max_seq_len=MAX_SEQ_LEN)#

Returns a tokenizer configured for ModelOpt calibration.

Wraps nemo_rl.algorithms.utils.get_tokenizer and applies the extra configuration needed for batched calibration forward passes: padding_side="left" and model_max_length truncation.
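Left padding matters here because batched calibration forward passes need every sequence's most recent tokens right-aligned. A torch-free sketch of the padding-plus-truncation behavior (the helper name `pad_batch_left` is invented for illustration):

```python
def pad_batch_left(token_batches, pad_id=0, max_len=None):
    """Left-pad variable-length token sequences so their last tokens are
    right-aligned; sequences longer than max_len keep their trailing tokens."""
    max_len = max_len or max(len(s) for s in token_batches)
    return [
        [pad_id] * (max_len - len(s)) + s[-max_len:]
        for s in token_batches
    ]

batch = pad_batch_left([[5, 6, 7], [8]], pad_id=0)
print(batch)  # [[5, 6, 7], [0, 0, 8]]
```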

class nemo_rl.modelopt.models.policy.workers.utils._DictDataset(data)#

Bases: torch.utils.data.Dataset

Initialization

__getitem__(idx)#
__len__()#
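`_DictDataset` wraps a dict of pre-tokenized columns as a map-style dataset. A torch-free sketch of the same idea (the real class subclasses torch.utils.data.Dataset; column names here are assumptions):

```python
class DictDataset:
    """Minimal map-style dataset over a dict of equal-length columns."""
    def __init__(self, data):
        self.data = data
        # All columns are assumed to share the same length.
        self.length = len(next(iter(data.values())))

    def __getitem__(self, idx):
        # Return one row: the idx-th element of every column.
        return {key: col[idx] for key, col in self.data.items()}

    def __len__(self):
        return self.length

ds = DictDataset({"input_ids": [[1, 2], [3, 4]], "attention_mask": [[1, 1], [1, 0]]})
print(len(ds))  # 2
print(ds[1])    # {'input_ids': [3, 4], 'attention_mask': [1, 0]}
```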
nemo_rl.modelopt.models.policy.workers.utils.get_forward_loop_func(
is_megatron: bool,
calib_dataloader: torch.utils.data.DataLoader,
)#

Gets the forward loop function for the model.
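Calibration-style quantization typically expects a callable that takes the model and runs it over every calibration batch, so quantizers can observe activation ranges. A hedged, torch-free sketch of that shape (the factory name `get_forward_loop` is illustrative, not this module's implementation):

```python
def get_forward_loop(calib_batches):
    """Return a callable that feeds every calibration batch through the
    model; outputs are discarded, only the forward passes matter."""
    def forward_loop(model):
        for batch in calib_batches:
            model(batch)  # inference only, for activation observation
    return forward_loop

# Usage with a stand-in "model" that just records what it saw.
seen = []
loop = get_forward_loop([[1, 2], [3, 4]])
loop(seen.append)
print(seen)  # [[1, 2], [3, 4]]
```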

nemo_rl.modelopt.models.policy.workers.utils.quantize_model(
model: torch.nn.Module,
quant_cfg: str,
tokenizer,
calib_size,
is_megatron: bool = False,
batch_size=32,
data='cnn_dailymail',
max_sample_length=1024,
)#

Quantizes the model with the provided calibration dataset.

Parameters:
  • model – the model to be quantized.

  • quant_cfg – the quantization algorithm config name when simple quantization is used, or the list of quantization algorithm config names when auto quantization is used.

  • tokenizer – the tokenizer.

  • batch_size – the calibration batch size for each calibration inference run.

  • calib_size – the total calibration dataset size.

  • auto_quantize_bits – the effective bits constraint for auto_quantize.

  • data – the name of the calibration dataset.
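To illustrate what the calibration dataset is for (and not ModelOpt's actual algorithm), here is a minimal sketch of amax-based symmetric int8 quantization: the calibration passes establish an activation range, which then fixes the quantization scale.

```python
def calibrate_amax(calib_batches):
    """Track the maximum absolute value seen across calibration batches."""
    return max(abs(v) for batch in calib_batches for v in batch)

def quantize_int8(values, amax):
    """Symmetric int8 quantization using the calibrated amax as the range."""
    scale = amax / 127.0
    return [max(-128, min(127, round(v / scale))) for v in values]

calib = [[0.5, -1.0], [2.0, 0.25]]  # calibration batches
amax = calibrate_amax(calib)        # 2.0
print(quantize_int8([2.0, 0.0], amax))  # [127, 0]
```

Real calibration observes per-tensor (or per-channel) activation statistics during the forward loop; this sketch collapses that to a single flat list.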

nemo_rl.modelopt.models.policy.workers.utils.get_modelopt_checkpoint_dir() str#

Gets the default modelopt checkpoint directory.

  1. Use NRL_MODELOPT_CHECKPOINT_DIR environment variable if set.

  2. Use HF_HOME/nemo_rl if HF_HOME is set.

  3. Use ~/.cache/huggingface/nemo_rl if neither are set.
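The three-step fallback above can be sketched as follows; this mirrors the documented resolution order but is not the module's actual code.

```python
import os
from pathlib import Path

def default_modelopt_checkpoint_dir() -> str:
    """Resolve the checkpoint directory using the documented fallback order."""
    if "NRL_MODELOPT_CHECKPOINT_DIR" in os.environ:
        return os.environ["NRL_MODELOPT_CHECKPOINT_DIR"]
    if "HF_HOME" in os.environ:
        return str(Path(os.environ["HF_HOME"]) / "nemo_rl")
    return str(Path.home() / ".cache" / "huggingface" / "nemo_rl")

os.environ["NRL_MODELOPT_CHECKPOINT_DIR"] = "/tmp/ckpts"
print(default_modelopt_checkpoint_dir())  # /tmp/ckpts
```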

nemo_rl.modelopt.models.policy.workers.utils.quantization_layer_spec(config)#

Layer specification for quantization with ModelOpt.

Disables the arbitrary attention mask, which is required for sequence packing support.