nemo_export.vllm_hf_exporter#
Module Contents#
Classes#
vLLMHFExporter | The exporter class uses vLLM APIs to convert an HF model to vLLM and makes it deployable with Triton server.
API#
- class nemo_export.vllm_hf_exporter.vLLMHFExporter[source]#
Bases:
nemo_deploy.ITritonDeployable
The exporter class uses vLLM APIs to convert an HF model to vLLM and makes it deployable with Triton server.
Example:
    from nemo_export import vLLMHFExporter
    from nemo_deploy import DeployPyTriton

    exporter = vLLMHFExporter()
    exporter.export(model="/path/to/model/")

    server = DeployPyTriton(
        model=exporter,
        triton_model_name="model"
    )

    server.deploy()
    server.serve()
    server.stop()
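The example calls deploy(), serve(), and stop() in sequence, so stop() is skipped if serve() raises. A minimal sketch of one way to guard against that follows. It assumes only that the server object exposes those three methods, as DeployPyTriton does in the example, that serve() blocks until interrupted, and that stop() is safe to call after serve() fails. `serve_with_cleanup` and `RecordingServer` are hypothetical names used for illustration and are not part of nemo_deploy.

```python
def serve_with_cleanup(server):
    """Deploy and serve a server object, always stopping it afterwards.

    `server` is assumed to expose deploy(), serve(), and stop(),
    as DeployPyTriton does in the example above.
    """
    server.deploy()
    try:
        server.serve()  # assumed to block until interrupted
    finally:
        # Assumption: stop() is safe to call after serve() returns or raises.
        server.stop()


class RecordingServer:
    """Hypothetical stand-in for DeployPyTriton; records the call order only."""

    def __init__(self):
        self.calls = []

    def deploy(self):
        self.calls.append("deploy")

    def serve(self):
        self.calls.append("serve")

    def stop(self):
        self.calls.append("stop")


demo = RecordingServer()
serve_with_cleanup(demo)
print(demo.calls)  # ['deploy', 'serve', 'stop']
```

With a real deployment, the DeployPyTriton instance from the example would be passed in place of `RecordingServer()`. If deploy() itself fails, stop() is deliberately not called, since nothing has been started at that point.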
Initialization
- export(model, enable_lora: bool = False)[source]#
Exports the HF checkpoint to vLLM and initializes the engine.
- Parameters:
model (str) – HF model name or path to the checkpoint
- property get_triton_input#
- property get_triton_output#