Generic Deployment
Generic deployment provides flexible configuration for deploying any custom server that isn’t covered by built-in deployment configurations.
Configuration
See configs/deployment/generic.yaml for all available parameters.
Basic Settings
Key arguments:
image: Docker image to use for deployment (required)
command: Command to run the server with template variables (required)
served_model_name: Name of the served model (required)
endpoints: API endpoint paths (chat, completions, health)
checkpoint_path: Path to model checkpoint for mounting (default: null)
extra_args: Additional command line arguments
env_vars: Environment variables as {name: value} dict
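The arguments above can be sketched as a plain mapping with a quick check for the three required keys. This is only an illustration: every value below (image name, command, model name, endpoint paths, environment variable) is a hypothetical placeholder, and the helper `missing_required` is not part of any tool described here. The template-variable syntax for `command` is not shown in this section, so the example uses a literal command string.

```python
# Keys marked "(required)" in the list of key arguments.
REQUIRED_KEYS = ("image", "command", "served_model_name")


def missing_required(config):
    """Return the required keys that are absent or empty in `config`."""
    return [key for key in REQUIRED_KEYS if not config.get(key)]


# Hypothetical values for illustration only; consult
# configs/deployment/generic.yaml for the real parameters and defaults.
example_config = {
    "image": "my-registry/my-server:latest",
    "command": "my-server --port 8000",
    "served_model_name": "my-model",
    "endpoints": {
        "chat": "/v1/chat/completions",
        "completions": "/v1/completions",
        "health": "/health",
    },
    "checkpoint_path": None,  # documented default: null
    "extra_args": "",
    "env_vars": {"MY_VAR": "value"},
}

if __name__ == "__main__":
    missing = missing_required(example_config)
    print("missing required keys:", missing)
```

A check like this catches an incomplete configuration before any container is started.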
Best Practices
Ensure the server responds on its health check endpoint, and that the health path under endpoints matches the path the server actually exposes.
Test the configuration with --dry_run.
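A minimal sketch of a manual health probe, assuming the server is reachable over plain HTTP and that the health path is stored under a `health` key in the endpoints mapping. The helper names `health_url` and `is_healthy`, the host, and the port are hypothetical and not part of the deployment tooling.

```python
import urllib.error
import urllib.request


def health_url(host, port, endpoints):
    """Build the health check URL from the configured endpoint paths."""
    path = endpoints["health"]
    if not path.startswith("/"):
        path = "/" + path
    return f"http://{host}:{port}{path}"


def is_healthy(url, timeout=5.0):
    """Return True if a GET on `url` answers with a 2xx status code."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or a non-2xx response.
        return False


if __name__ == "__main__":
    # Hypothetical host and port; adjust to where the server is deployed.
    url = health_url("localhost", 8000, {"health": "/health"})
    print(url, "healthy" if is_healthy(url) else "not responding")
```

If the probe fails while the server is running, a mismatched health path is a likely cause.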
Contributing Permanent Configurations
If you’ve successfully applied the generic deployment to serve a specific model or framework, contributions are welcome! We’ll turn your working configuration into a permanent config file for the community.