
Copyright © 2026, NVIDIA Corporation.


KAI Scheduler Integration Guide


KAI Scheduler is an open-source, Kubernetes-native scheduler for AI workloads at large scale. To use KAI Scheduler for NVCF workloads, apply the following configuration after KAI Scheduler has been installed in the cluster and Optimized AI Workload Scheduling has been enabled on the cluster. Once this cluster configuration is in place, NVCF workloads deployed to the cluster are bin-packed automatically.

KAI Scheduler Installation

Upgrading to the latest KAI Scheduler release is recommended to pick up the latest fixes and security patches.

NVCA’s KAI Scheduler integration expects two default queues to exist: default-parent-queue (the parent) and default-queue (its child). Other queues may also exist in the cluster.

NVCA expects every queue used to create NVCF workloads to have unlimited (-1) quotas and limits, which ensures full cluster capacity utilization and accurate usage tracking. If the cluster is partitioned to serve both NVCF and non-NVCF workloads, and the KAI Scheduler queue quotas and limits are restricted to reflect that split, then Shared Cluster mode must be enabled so that nodes running non-NVCF workloads are correctly excluded from NVCA's tracking and scheduling.

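Create a values file with the default queue attributes. The install command below reads it as values.yaml, so save it under that name or adjust the -f argument: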

kai-scheduler-queues.yaml
defaultQueue:
  createDefaultQueue: true
  parentName: default-parent-queue
  childName: default-queue
  parentResources:
    cpu:
      quota: -1
      limit: -1
      overQuotaWeight: 1
    gpu:
      quota: -1
      limit: -1
      overQuotaWeight: 1
    memory:
      quota: -1
      limit: -1
      overQuotaWeight: 1
  childResources:
    cpu:
      quota: -1
      limit: -1
      overQuotaWeight: 1
    gpu:
      quota: -1
      limit: -1
      overQuotaWeight: 1
    memory:
      quota: -1
      limit: -1
      overQuotaWeight: 1
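
Because NVCA expects every quota and limit on these queues to be -1, it can help to check the values file before installing. The following is a minimal sketch and not part of the NVCF or KAI Scheduler tooling: find_limited_entries is a hypothetical helper name, and it assumes each quota and limit key sits on its own line, as in the values file above.

```shell
# Hypothetical helper (an assumption for illustration, not an NVCF or KAI tool):
# print every quota/limit entry in a values file whose value is not -1.
# Prints nothing when all quotas and limits are unlimited.
find_limited_entries() {
  awk '
    # awk strips leading indentation, so $1 is the key and $2 the value.
    ($1 == "quota:" || $1 == "limit:") && $2 != "-1" {
      print FILENAME ":" NR ": " $1 " " $2
    }
  ' "$1"
}

# Usage (assumes the file was saved as values.yaml):
#   find_limited_entries values.yaml
```

Any line this prints points at a queue resource that is capped, which is the case where the text above calls for Shared Cluster mode.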
$ helm install kai-scheduler oci://ghcr.io/kai-scheduler/kai-scheduler/kai-scheduler -f values.yaml -n kai-scheduler --create-namespace --version v0.12.6
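
After the install, one way to confirm that the queues NVCA expects are present is to list the Queue resources and look for both names. This is a sketch under assumptions: has_default_queues is a hypothetical helper, and the resource name queues.scheduling.run.ai in the usage comment assumes the KAI Scheduler Queue CRD lives in that API group, so adjust it if your release differs.

```shell
# Hypothetical helper (an assumption for illustration): reads resource names
# on stdin, one per line in the "<kind>.<group>/<name>" form printed by
# `kubectl get ... -o name`, and reports whether both default queues exist.
has_default_queues() {
  names=$(sed 's|.*/||')   # keep only the part after the last "/"
  for q in default-parent-queue default-queue; do
    if ! printf '%s\n' "$names" | grep -Fqx "$q"; then
      echo "missing queue: $q"
      return 1
    fi
  done
  echo "default queues present"
}

# Usage (the resource name is an assumption, see the note above):
#   kubectl get queues.scheduling.run.ai -o name | has_default_queues
```

The helper only inspects names; it does not verify that the child queue is actually attached to the parent.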