nemo_rl.modelopt.models.policy.workers.utils#

Module Contents#

Classes#

Functions#

symlink_pre_quantized_model

Symlink an external pre-quantized checkpoint as <pretrained_path>/iter_0000000.

get_tokenizer

Returns a tokenizer configured for ModelOpt calibration.

get_forward_loop_func

Gets the forward loop function for the model.

quantize_model

Quantizes the model with the provided calibration dataset.

get_modelopt_checkpoint_dir

Gets the default modelopt checkpoint directory.

quantization_layer_spec

Layer specification for quantization with ModelOpt.

Data#

API#

nemo_rl.modelopt.models.policy.workers.utils.MAX_SEQ_LEN#

2048

nemo_rl.modelopt.models.policy.workers.utils.MAX_OUTPUT_LEN#

512

nemo_rl.modelopt.models.policy.workers.utils.symlink_pre_quantized_model#

Symlink an external pre-quantized checkpoint as <pretrained_path>/iter_0000000.
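The symlink step above can be sketched as follows. This is a minimal illustration, not the module's implementation; the helper name `symlink_as_iter_0000000` and its parameters are assumptions for the example.

```python
import os
import tempfile
from pathlib import Path

def symlink_as_iter_0000000(external_ckpt: str, pretrained_path: str) -> Path:
    """Expose an external checkpoint under <pretrained_path>/iter_0000000."""
    link = Path(pretrained_path) / "iter_0000000"
    link.parent.mkdir(parents=True, exist_ok=True)
    if not link.exists():
        # Symlink rather than copy: the checkpoint may be large.
        os.symlink(os.path.abspath(external_ckpt), link)
    return link

# Example: link a pre-quantized checkpoint directory into a fresh workspace.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "quantized_ckpt"
    src.mkdir()
    dst = symlink_as_iter_0000000(str(src), str(Path(tmp) / "pretrained"))
    print(dst.is_symlink())  # True
```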

nemo_rl.modelopt.models.policy.workers.utils.get_tokenizer(ckpt_path, max_seq_len=MAX_SEQ_LEN)#

Returns a tokenizer configured for ModelOpt calibration.

Wraps nemo_rl.algorithms.utils.get_tokenizer and applies the extra configuration needed for batched calibration forward passes: padding_side="left" and model_max_length truncation.
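Left padding matters here because batched calibration forward passes need every sequence's most recent tokens right-aligned. A torch-free sketch of the padding-plus-truncation behavior (the helper name `pad_batch_left` is invented for illustration):

```python
def pad_batch_left(token_batches, pad_id=0, max_len=None):
    """Left-pad variable-length token sequences so their last tokens are
    right-aligned; sequences longer than max_len keep their trailing tokens."""
    max_len = max_len or max(len(s) for s in token_batches)
    return [
        [pad_id] * (max_len - len(s)) + s[-max_len:]
        for s in token_batches
    ]

batch = pad_batch_left([[5, 6, 7], [8]], pad_id=0)
print(batch)  # [[5, 6, 7], [0, 0, 8]]
```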

class nemo_rl.modelopt.models.policy.workers.utils._DictDataset(data)#

Bases: torch.utils.data.Dataset

Initialization

__getitem__(idx)#
__len__()#
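`_DictDataset` wraps a dict of pre-tokenized columns as a map-style dataset. A torch-free sketch of the same idea (the real class subclasses torch.utils.data.Dataset; column names here are assumptions):

```python
class DictDataset:
    """Minimal map-style dataset over a dict of equal-length columns."""
    def __init__(self, data):
        self.data = data
        # All columns are assumed to share the same length.
        self.length = len(next(iter(data.values())))

    def __getitem__(self, idx):
        # Return one row: the idx-th element of every column.
        return {key: col[idx] for key, col in self.data.items()}

    def __len__(self):
        return self.length

ds = DictDataset({"input_ids": [[1, 2], [3, 4]], "attention_mask": [[1, 1], [1, 0]]})
print(len(ds))  # 2
print(ds[1])    # {'input_ids': [3, 4], 'attention_mask': [1, 0]}
```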
nemo_rl.modelopt.models.policy.workers.utils.get_forward_loop_func(
is_megatron: bool,
calib_dataloader: torch.utils.data.DataLoader,
)#

Gets the forward loop function for the model.
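Calibration-style quantization typically expects a callable that takes the model and runs it over every calibration batch, so quantizers can observe activation ranges. A hedged, torch-free sketch of that shape (the factory name `get_forward_loop` is illustrative, not this module's implementation):

```python
def get_forward_loop(calib_batches):
    """Return a callable that feeds every calibration batch through the
    model; outputs are discarded, only the forward passes matter."""
    def forward_loop(model):
        for batch in calib_batches:
            model(batch)  # inference only, for activation observation
    return forward_loop

# Usage with a stand-in "model" that just records what it saw.
seen = []
loop = get_forward_loop([[1, 2], [3, 4]])
loop(seen.append)
print(seen)  # [[1, 2], [3, 4]]
```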

nemo_rl.modelopt.models.policy.workers.utils.quantize_model(
model: torch.nn.Module,
quant_cfg: str,
tokenizer,
calib_size,
is_megatron: bool = False,
batch_size=32,
data='cnn_dailymail',
max_sample_length=1024,
)#

Quantizes the model with the provided calibration dataset.

Parameters:
  • model – the model to be quantized.

  • quant_cfg – the quantization algorithm config name when simple quantization is used, or the list of quantization algorithm config names when auto quantization is used.

  • tokenizer – the tokenizer.

  • batch_size – the calibration batch size for each calibration inference run.

  • calib_size – the total calibration dataset size.

  • auto_quantize_bits – the effective bits constraint for auto_quantize.

  • data – the name of the calibration dataset.
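To illustrate what the calibration dataset is for (and not ModelOpt's actual algorithm), here is a minimal sketch of amax-based symmetric int8 quantization: the calibration passes establish an activation range, which then fixes the quantization scale.

```python
def calibrate_amax(calib_batches):
    """Track the maximum absolute value seen across calibration batches."""
    return max(abs(v) for batch in calib_batches for v in batch)

def quantize_int8(values, amax):
    """Symmetric int8 quantization using the calibrated amax as the range."""
    scale = amax / 127.0
    return [max(-128, min(127, round(v / scale))) for v in values]

calib = [[0.5, -1.0], [2.0, 0.25]]  # calibration batches
amax = calibrate_amax(calib)        # 2.0
print(quantize_int8([2.0, 0.0], amax))  # [127, 0]
```

Real calibration observes per-tensor (or per-channel) activation statistics during the forward loop; this sketch collapses that to a single flat list.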

nemo_rl.modelopt.models.policy.workers.utils.get_modelopt_checkpoint_dir() str#

Gets the default modelopt checkpoint directory.

  1. Use NRL_MODELOPT_CHECKPOINT_DIR environment variable if set.

  2. Use HF_HOME/nemo_rl if HF_HOME is set.

  3. Use ~/.cache/huggingface/nemo_rl if neither are set.
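The three-step fallback above can be sketched as follows; this mirrors the documented resolution order but is not the module's actual code.

```python
import os
from pathlib import Path

def default_modelopt_checkpoint_dir() -> str:
    """Resolve the checkpoint directory using the documented fallback order."""
    if "NRL_MODELOPT_CHECKPOINT_DIR" in os.environ:
        return os.environ["NRL_MODELOPT_CHECKPOINT_DIR"]
    if "HF_HOME" in os.environ:
        return str(Path(os.environ["HF_HOME"]) / "nemo_rl")
    return str(Path.home() / ".cache" / "huggingface" / "nemo_rl")

os.environ["NRL_MODELOPT_CHECKPOINT_DIR"] = "/tmp/ckpts"
print(default_modelopt_checkpoint_dir())  # /tmp/ckpts
```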

nemo_rl.modelopt.models.policy.workers.utils.quantization_layer_spec(config)#

Layer specification for quantization with ModelOpt.

Disables the arbitrary attention mask, which is required for sequence packing support.