Deployment Configuration#

Deployment configurations define how to provision and host model endpoints for evaluation.

Note

For an overview of all deployment strategies and when to use launcher-orchestrated vs. bring-your-own-endpoint approaches, see Serve and Deploy Models.

Deployment Types#

Choose the deployment type for your evaluation:

- None (External): Use existing API endpoints; no model deployment is needed. See None Deployment.
- vLLM: Deploy models using the vLLM serving framework. See vLLM Deployment.
- SGLang: Deploy models using the SGLang serving framework. See SGLang Deployment.
- NIM: Deploy models using NVIDIA Inference Microservices (NIM). See NIM Deployment.

Quick Reference#

```yaml
deployment:
  type: vllm  # or sglang, nim, none
  # ... deployment-specific settings
```
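As a minimal sketch of how the `type` field above might be validated before launching an evaluation, the following Python snippet checks a parsed config dictionary against the four supported deployment types. The function name and error message are illustrative, not part of any real launcher API:

```python
# Deployment types accepted by the quick-reference config above.
SUPPORTED_TYPES = {"vllm", "sglang", "nim", "none"}

def validate_deployment(config: dict) -> str:
    """Return the deployment type from a parsed config, or raise if unsupported.

    Expects the shape shown in the quick reference:
        {"deployment": {"type": "vllm", ...}}
    """
    deployment = config.get("deployment", {})
    dtype = deployment.get("type")
    if dtype not in SUPPORTED_TYPES:
        raise ValueError(
            f"Unsupported deployment type {dtype!r}; "
            f"expected one of {sorted(SUPPORTED_TYPES)}"
        )
    return dtype

# Example: a config matching the quick reference passes validation.
print(validate_deployment({"deployment": {"type": "vllm"}}))  # vllm
```

Choosing `type: none` here would skip provisioning entirely and point the evaluation at an existing external endpoint, as described under the deployment types above.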