nemo_export.trt_llm.utils

Module Contents

Functions

is_rank

Check if the current MPI rank matches the specified rank.

determine_quantization_settings

Determines the exported model's quantization settings. Reads them from the NeMo config, with optional overrides.

API

nemo_export.trt_llm.utils.is_rank(rank: Optional[int]) -> bool

Check if the current MPI rank matches the specified rank.

Parameters:

rank (Optional[int]) – The rank to check against.

Returns:

True if the current rank matches the specified rank or if rank is None.

Return type:

bool
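
The matching rule described above can be illustrated with a minimal sketch. This is not the library's implementation: `is_rank_sketch` is a hypothetical stand-in that takes the current rank as an explicit `current_rank` argument instead of querying MPI, so that the example runs without an MPI environment.

```python
from typing import Optional


def is_rank_sketch(rank: Optional[int], current_rank: int) -> bool:
    """Hypothetical stand-in for is_rank.

    Assumption: the real function obtains the current rank from MPI;
    here it is passed in explicitly so the logic can be run anywhere.
    """
    if rank is None:
        # No specific rank requested, so every process matches.
        return True
    return current_rank == rank


# Simulate a 4-process job: which ranks would run a rank-0-only step?
rank_zero_only = [r for r in range(4) if is_rank_sketch(0, current_rank=r)]
# With rank=None, every simulated process matches.
every_rank = [r for r in range(4) if is_rank_sketch(None, current_rank=r)]
print(rank_zero_only, every_rank)
```

A check of this kind is typically used to guard work that should happen once per job, such as writing a file, while passing `None` lets the same code path run on all processes.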

nemo_export.trt_llm.utils.determine_quantization_settings(
nemo_model_config: Dict[str, Any],
fp8_quantized: Optional[bool] = None,
fp8_kvcache: Optional[bool] = None,
) -> Tuple[bool, bool]

Determines the exported model's quantization settings. Reads them from the NeMo config, with optional overrides.

Parameters:
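  • nemo_model_config (Dict[str, Any]) – NeMo model configuration.

  • fp8_quantized (Optional[bool]) – User-specified model quantization flag. Overrides the value read from the NeMo config when provided.

  • fp8_kvcache (Optional[bool]) – User-specified KV-cache quantization flag. Overrides the value read from the NeMo config when provided.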

Returns:

  • Model quantization flag (first element of the tuple)

  • Model KV-cache quantization flag (second element of the tuple)

Return type:

Tuple[bool, bool]
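
The "read from config, with optional override" behavior can be sketched as follows. This is a hedged illustration, not the library's code: the function name `resolve_quantization_sketch` and the config key `"fp8"` are assumptions made for the example, as is the choice to fall back to the same config value for both flags.

```python
from typing import Any, Dict, Optional, Tuple


def resolve_quantization_sketch(
    nemo_model_config: Dict[str, Any],
    fp8_quantized: Optional[bool] = None,
    fp8_kvcache: Optional[bool] = None,
    config_key: str = "fp8",  # Assumption: hypothetical key name.
) -> Tuple[bool, bool]:
    """Hypothetical sketch of config-with-override resolution."""
    # Default taken from the config; a missing key is treated as False.
    config_default = bool(nemo_model_config.get(config_key, False))

    # An explicit True/False from the caller wins; None means "use the config".
    if fp8_quantized is None:
        fp8_quantized = config_default
    if fp8_kvcache is None:
        fp8_kvcache = config_default

    return fp8_quantized, fp8_kvcache


# Config says FP8, but the caller disables KV-cache quantization.
model_flag, kvcache_flag = resolve_quantization_sketch(
    {"fp8": True}, fp8_kvcache=False
)
print(model_flag, kvcache_flag)
```

Comparing against `None` rather than testing truthiness matters here: an explicit `False` from the caller must be kept as an override and not mistaken for "not specified".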