nemo_export.tensorrt_llm_deployable_ray
TensorRT-LLM Ray deployment functionality has been deprecated and removed from this codebase.
This module now contains only a placeholder class whose constructor and methods raise NotImplementedError.
Module Contents
Classes
TensorRTLLMRayDeployable | Placeholder class for TensorRT-LLM Ray deployment functionality.
Data
LOGGER
API
- nemo_export.tensorrt_llm_deployable_ray.LOGGER = 'getLogger(...)'
- class nemo_export.tensorrt_llm_deployable_ray.TensorRTLLMRayDeployable(
- trt_llm_path: str,
- model_id: str = 'tensorrt-llm-model',
- use_python_runtime: bool = True,
- enable_chunked_context: bool = None,
- max_tokens_in_paged_kv_cache: int = None,
- multi_block_mode: bool = False,
- lora_ckpt_list: List[str] = None,
- )
Placeholder class for TensorRT-LLM Ray deployment functionality.
Note: TensorRT-LLM deployment support has been removed from this codebase. All methods will raise NotImplementedError.
Initialization
Initialize the TensorRT-LLM model deployment.
- Raises:
NotImplementedError – This functionality has been removed.
- abstractmethod generate(*args, **kwargs)
Generate method.
- Raises:
NotImplementedError – This functionality has been removed.
- abstractmethod chat_completions(*args, **kwargs)
Chat completions method.
- Raises:
NotImplementedError – This functionality has been removed.
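Because the constructor and every method raise NotImplementedError, calling code that still references this class may want to detect the removal and fall back gracefully. The following is a minimal sketch of that pattern, not part of this module. `RemovedDeployableStandIn` and `build_or_none` are hypothetical names introduced for illustration, and the stand-in only mimics the documented behavior so the example runs without the package installed. The engine path is a placeholder.

```python
from typing import Any, Callable, Optional


class RemovedDeployableStandIn:
    """Hypothetical stand-in mimicking the documented placeholder behavior."""

    def __init__(self, trt_llm_path: str, model_id: str = "tensorrt-llm-model") -> None:
        # Mirrors the documented behavior: construction always fails.
        raise NotImplementedError("This functionality has been removed.")


def build_or_none(factory: Callable[..., Any], *args: Any, **kwargs: Any) -> Optional[Any]:
    """Call `factory`; return None if it reports removed functionality."""
    try:
        return factory(*args, **kwargs)
    except NotImplementedError:
        return None


# "/path/to/engine" is a placeholder path, not a real location.
deployable = build_or_none(RemovedDeployableStandIn, trt_llm_path="/path/to/engine")
status = "unavailable" if deployable is None else "available"
print(status)
```

With the real package installed, the same helper could presumably be given `TensorRTLLMRayDeployable` as the factory, assuming it behaves as documented above.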