nemo_export.trt_llm.utils
Module Contents
Functions
is_rank | Check if the current MPI rank matches the specified rank.
determine_quantization_settings | Determines the exported model's quantization settings. Reads from the NeMo config, with optional overrides.
API
- nemo_export.trt_llm.utils.is_rank(rank: Optional[int]) → bool
Check if the current MPI rank matches the specified rank.
- Parameters:
rank (Optional[int]) – The rank to check against.
- Returns:
True if the current rank matches the specified rank or if rank is None.
- Return type:
bool
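The rank check described above is typically used to restrict a side effect to one process. The following is a minimal sketch of that logic, not the library's implementation: `is_rank_sketch` is a hypothetical stand-in, and it takes the current rank as an explicit `current_rank` argument (an assumption made so the example runs without MPI), whereas `is_rank` looks up the current MPI rank itself.

```python
from typing import Optional


def is_rank_sketch(rank: Optional[int], current_rank: int) -> bool:
    """Hypothetical stand-in for is_rank with the current rank passed in."""
    if rank is None:
        # No specific rank requested, so every process matches.
        return True
    return current_rank == rank


# Simulate four MPI processes and let only rank 0 perform the action.
messages = []
for current in range(4):
    if is_rank_sketch(0, current):
        messages.append(f"rank {current}: performing single-rank work")
```

Passing `None` makes the check succeed on every rank, which is convenient when the same code path should either run everywhere or on one chosen rank.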
- nemo_export.trt_llm.utils.determine_quantization_settings(nemo_model_config: Dict[str, Any], fp8_quantized: Optional[bool] = None, fp8_kvcache: Optional[bool] = None) → Tuple[bool, bool]
Determines the exported model's quantization settings. Reads from the NeMo config, with optional overrides.
- Parameters:
nemo_model_config (dict) – NeMo model configuration
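fp8_quantized (optional, bool) – User-specified model quantization flag; overrides the value read from the NeMo config
fp8_kvcache (optional, bool) – User-specified KV-cache quantization flag; overrides the value read from the NeMo config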
- Returns:
A pair of flags: the model quantization flag, followed by the model KV-cache quantization flag.
- Return type:
Tuple[bool, bool]
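The "read from config, with optional override" behaviour can be sketched as below. This is an illustration under stated assumptions rather than the library's code: `resolve_fp8_settings` is a hypothetical name, and the `"fp8"` config key used as the default source is an assumption, since the documentation above does not say which key is read.

```python
from typing import Any, Dict, Optional, Tuple


def resolve_fp8_settings(
    nemo_model_config: Dict[str, Any],
    fp8_quantized: Optional[bool] = None,
    fp8_kvcache: Optional[bool] = None,
) -> Tuple[bool, bool]:
    """Hypothetical sketch: config supplies defaults, explicit flags win."""
    # Assumption: the config stores a boolean under the key "fp8".
    config_default = bool(nemo_model_config.get("fp8", False))
    if fp8_quantized is None:
        fp8_quantized = config_default
    if fp8_kvcache is None:
        fp8_kvcache = config_default
    return fp8_quantized, fp8_kvcache


# Both flags fall back to the config value when no override is given.
from_config = resolve_fp8_settings({"fp8": True})
# An explicit argument overrides the config for that flag only.
overridden = resolve_fp8_settings({"fp8": True}, fp8_kvcache=False)
```

Using `None` as the default lets the function tell "not specified" apart from an explicit `False`, so a caller can disable KV-cache quantization even when the config enables quantization.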