Deployment Configuration#

Deployment configurations define how to provision and host model endpoints for evaluation.

Deployment Types#

Choose the deployment type for your evaluation:

- **None (external)**: Use existing API endpoints; no model deployment needed.
- **vLLM**: Deploy models using the vLLM serving framework.
- **SGLang**: Deploy models using the SGLang serving framework.
- **NIM**: Deploy models using NVIDIA NIM (NVIDIA Inference Microservices).
- **TRT-LLM**: Deploy models using NVIDIA TensorRT-LLM.
- **Generic**: Deploy models using a fully custom setup.
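As an illustration, a vLLM deployment section might look like the sketch below. The field names under `deployment` other than `type` (`checkpoint_path`, `served_model_name`, `tensor_parallel_size`) are assumptions chosen to mirror common vLLM server options, not a definitive schema; consult the deployment-specific pages above for the exact keys each type accepts.

```yaml
# Hypothetical example: field names below `type` are illustrative assumptions.
deployment:
  type: vllm
  checkpoint_path: /models/my-model        # assumed key: where the weights live
  served_model_name: my-model              # assumed key: name exposed by the endpoint
  tensor_parallel_size: 2                  # assumed key: GPUs per model replica
```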

Quick Reference#

```yaml
deployment:
  type: vllm  # or sglang, nim, trt-llm, generic, none
  # ... deployment-specific settings
```
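To make the role of the `type` field concrete, here is a minimal sketch of how a tool might dispatch on it after parsing the config. This is not the actual implementation; the function name, the parsed-dict shape, and the convention that a missing `deployment` block means an external endpoint are all assumptions for illustration.

```python
# Illustrative sketch: dispatching on deployment.type from a parsed config.
# The config is shown as an already-parsed dict (as a YAML loader would produce).

SUPPORTED_TYPES = {"none", "vllm", "sglang", "nim", "trt-llm", "generic"}

def resolve_deployment(config: dict) -> str:
    """Return a short description of the provisioning action for a config."""
    deployment = config.get("deployment", {})
    dep_type = deployment.get("type", "none")  # assumed default: external endpoint
    if dep_type not in SUPPORTED_TYPES:
        raise ValueError(f"unknown deployment type: {dep_type!r}")
    # "none" means an existing external API endpoint: nothing to provision.
    if dep_type == "none":
        return "skip provisioning"
    return f"provision {dep_type}"

print(resolve_deployment({"deployment": {"type": "vllm"}}))  # → provision vllm
print(resolve_deployment({}))                                # → skip provisioning
```

Centralizing the type check like this surfaces a typo in `type` as an immediate error instead of a confusing failure later in the run.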