The function autoscaler is a Rust service deployed as a horizontally scaled Kubernetes Deployment. It reads utilization and request metrics from a Prometheus-compatible timeseries database, stores discovered functions and coordination state in Cassandra, and writes desired instance counts to the NVCF API. Cassandra lightweight transactions handle leader election and short-lived per-function locks.
The work is split into two loops. A leader-elected discovery loop scans the timeseries database for active function versions and upserts them into Cassandra. A scaling loop runs on every replica, but each replica only handles the functions whose IDs hash into its assigned buckets, so the active set is sharded across replicas.
The discovery loop runs on one leader-elected replica. The scaling loop runs on every replica, but each replica only processes the function buckets assigned to it.
The function autoscaler is a read-only client of a Prometheus-compatible timeseries store. It calls the /api/v1/query_range HTTP endpoint and uses PromQL for every metric query, so any backend that implements that interface works: upstream Prometheus, Thanos, Grafana Mimir, or VictoriaMetrics. The reference NVCF deployments point at VictoriaMetrics via the timeseries_db_url setting.
The function autoscaler does not run a scrape config of its own and does not write samples. Before it can do anything useful, the rest of the data plane has to be feeding the same store:
nvcf_worker_service_worker_thread_busy_seconds_total, nvcf_worker_service_worker_thread_count_total, instance gauges).function_request, function_request_total) labeled by function_id, function_version_id, and nca_id. These labels are how the discovery loop finds active function versions.For a self-hosted control plane, you need three things in place before bringing the function autoscaler online:
timeseries_db_url. The function autoscaler reports not ready on its readiness probe until that endpoint responds.Coordination relies on Cassandra TTLs to recover from failures without operator intervention: