`nemo_export.trt_llm.utils`#

Module Contents#

Functions#

`is_rank`	Check if the current MPI rank matches the specified rank.
`determine_quantization_settings`	Determines the exported model's quantization settings. Reads from NeMo config, with optional override.

API#

nemo_export.trt_llm.utils.is_rank(rank: Optional[int]) → bool#

Check if the current MPI rank matches the specified rank.

Parameters: rank (Optional[int]) – The rank to check against.
Returns: True if the current rank matches the specified rank or if rank is None.
Return type: bool
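
The documented behavior (a `None` rank matches every process, otherwise the ranks must be equal) can be illustrated with a minimal sketch. This is not the library's implementation: `is_rank_sketch` and `_rank_from_env` are hypothetical names, and the rank lookup is an assumed stand-in that reads the `OMPI_COMM_WORLD_RANK` environment variable set by Open MPI, defaulting to 0 outside an MPI launch.

```python
import os
from typing import Callable, Optional


def _rank_from_env() -> int:
    """Hypothetical rank lookup: reads Open MPI's rank variable, defaulting to 0."""
    return int(os.environ.get("OMPI_COMM_WORLD_RANK", "0"))


def is_rank_sketch(
    rank: Optional[int],
    get_rank: Callable[[], int] = _rank_from_env,
) -> bool:
    """Return True if `rank` is None or equals the current process rank."""
    if rank is None:
        # No rank specified: every process counts as a match.
        return True
    return get_rank() == rank


# Simulate a process running as rank 3 by injecting the lookup.
print(is_rank_sketch(None, get_rank=lambda: 3))
print(is_rank_sketch(3, get_rank=lambda: 3))
print(is_rank_sketch(0, get_rank=lambda: 3))
```

A check like this is typically used to guard work that should run on a single process, such as writing an exported file only on rank 0.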

nemo_export.trt_llm.utils.determine_quantization_settings(nemo_model_config: Dict[str, Any], fp8_quantized: Optional[bool] = None, fp8_kvcache: Optional[bool] = None) → Tuple[bool, bool]#

Determines the exported model's quantization settings. Reads from NeMo config, with optional override.

Parameters:

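nemo_model_config (dict) – NeMo model configuration.
fp8_quantized (optional, bool) – User-specified model quantization flag. When provided, it overrides the setting read from the NeMo config.
fp8_kvcache (optional, bool) – User-specified KV-cache quantization flag. When provided, it overrides the setting read from the NeMo config.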

Returns:

Model quantization flag
Model kv-cache quantization flag

Return type:

Tuple[bool, bool]
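
The "read from config, with optional override" pattern described above can be sketched as follows. This is an illustration under stated assumptions, not the library's code: `resolve_quantization_sketch` is a hypothetical name, and the config key `"fp8"` is an assumption about where the NeMo config records quantization.

```python
from typing import Any, Dict, Optional, Tuple


def resolve_quantization_sketch(
    nemo_model_config: Dict[str, Any],
    fp8_quantized: Optional[bool] = None,
    fp8_kvcache: Optional[bool] = None,
) -> Tuple[bool, bool]:
    """Resolve (model flag, KV-cache flag), preferring explicit user values."""
    # Assumed key: treat a truthy "fp8" entry as "the checkpoint is FP8-quantized".
    config_default = bool(nemo_model_config.get("fp8", False))

    # None means "not specified", so fall back to the config-derived default.
    if fp8_quantized is None:
        fp8_quantized = config_default
    if fp8_kvcache is None:
        fp8_kvcache = config_default

    return fp8_quantized, fp8_kvcache


# Config enables FP8, but the caller disables KV-cache quantization explicitly.
model_flag, kvcache_flag = resolve_quantization_sketch({"fp8": True}, fp8_kvcache=False)
print(model_flag, kvcache_flag)
```

Using `None` rather than `False` as the default lets the function distinguish "not specified" from an explicit request to disable quantization.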
