
Copyright © 2026, NVIDIA Corporation.

Multi-cluster Local Development with Helmfile

Install the NVCF self-hosted control plane on one local k3d cluster and the NVCA operator on a separately registered compute cluster, all driven by the documented Helmfile workflow.

This setup is for local development only. It uses fake GPUs, a single Cassandra replica, and ephemeral storage. Do not use this for production workloads.

Topology

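| k3d cluster         | Role                         | kubectl context         |
|---------------------|------------------------------|-------------------------|
| ncp-local-cp        | Control plane                | k3d-ncp-local-cp        |
| ncp-local-compute-1 | Compute plane (first worker) | k3d-ncp-local-compute-1 |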

Cross-cluster traffic from the compute cluster reaches the control-plane load balancer via .test host aliases that tools/ncp-local-cluster/scripts/configure-control-plane-endpoints.sh provisions:

  • http://sis.nvcf-control-plane.test:8080
  • http://reval.nvcf-control-plane.test:8080
  • nats://nats.nvcf-control-plane.test:4222
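If you want to probe these endpoints, it helps to have them as host and port pairs. The following is a minimal sketch, not part of the repo: `url_host`, `url_port`, and `list_endpoints` are hypothetical helper names, and the sketch assumes every endpoint URL carries an explicit port, as the three above do.

```shell
# Hypothetical helpers (not part of the repo): split an endpoint URL
# into its host and port. Assumes the URL always has an explicit port.
url_host() {
  rest=${1#*://}      # drop the scheme
  rest=${rest%%/*}    # drop any path
  printf '%s\n' "${rest%%:*}"
}

url_port() {
  rest=${1#*://}
  rest=${rest%%/*}
  printf '%s\n' "${rest##*:}"
}

# The three control-plane endpoints listed above.
ENDPOINTS="http://sis.nvcf-control-plane.test:8080
http://reval.nvcf-control-plane.test:8080
nats://nats.nvcf-control-plane.test:4222"

# Print one "host port" pair per endpoint.
list_endpoints() {
  for url in $ENDPOINTS; do
    printf '%s %s\n' "$(url_host "$url")" "$(url_port "$url")"
  done
}
```

Each pair can then be fed to a reachability probe such as `nc -z <host> <port>`, run from an environment where the aliases resolve. This page does not state whether that includes the host, so treat the probe location as something to confirm.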

Prerequisites

Install the following tools:

  • Docker (running)

  • k3d v5.x or later

  • kubectl

  • helm >= 3.12

  • helmfile >= 1.1.0, < 1.2.0

  • helm-diff plugin: helm plugin install https://github.com/databus23/helm-diff

  • An NGC API key from ngc.nvidia.com with access to the NVCF chart and image registry.

  • The NGC organization and team slugs that hold the chart/image repository you have access to.

  • nvcf-cli built from this repo. Steps 9 and 10 pass NVCF_CLI=$(pwd)/nvcf-cli to the make targets, so the binary must exist on disk before those steps run:

    $go build -o nvcf-cli ./src/clis/nvcf-cli
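A quick preflight can catch a missing tool or an out-of-range helmfile before Step 1. The sketch below is illustrative only: `missing_tools` and `helmfile_version_ok` are hypothetical names, and how you obtain the helmfile version string is left to you, because the exact output format of `helmfile --version` is not covered on this page.

```shell
# Print each required tool that is not on PATH.
# Succeeds only when nothing is missing.
missing_tools() {
  rc=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      printf 'missing: %s\n' "$tool"
      rc=1
    fi
  done
  return "$rc"
}

# Succeed when a version string satisfies ">= 1.1.0, < 1.2.0",
# that is, any 1.1.x release. Accepts an optional leading "v".
# Pre-release suffixes (for example 1.1.0-rc1) are not treated specially.
helmfile_version_ok() {
  ver=${1#v}
  major=${ver%%.*}
  rest=${ver#*.}
  minor=${rest%%.*}
  [ "$major" = "1" ] && [ "$minor" = "1" ]
}

# Usage:
#   missing_tools docker k3d kubectl helm helmfile go || exit 1
#   helmfile_version_ok "<version string>" || echo "unsupported helmfile" >&2
```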

Export the env vars used below:

$export NGC_API_KEY="<your-ngc-api-key>"
$export SAMPLE_NGC_ORG="<your-ngc-org>"
$export SAMPLE_NGC_TEAM="<your-ngc-team>"
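Later steps interpolate these variables into files and secrets, so an empty value or a leftover `<your-...>` placeholder only surfaces much later as a pull or auth failure. A small guard, sketched here under the assumption that a value wrapped in angle brackets is an unfilled placeholder (`require_env` is a hypothetical helper name):

```shell
# Report required variables that are unset, empty, or still set to a
# "<...>" placeholder. Succeeds only when every variable looks filled in.
require_env() {
  rc=0
  for name in "$@"; do
    # Indirect lookup; names come from the caller, not from user input.
    eval "value=\${$name:-}"
    case "$value" in
      '' | '<'*'>')
        printf 'unset or placeholder: %s\n' "$name" >&2
        rc=1
        ;;
    esac
  done
  return "$rc"
}

# Usage:
#   require_env NGC_API_KEY SAMPLE_NGC_ORG SAMPLE_NGC_TEAM || exit 1
```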

Step 1: Bring up the multi-cluster topology

$make -C tools/ncp-local-cluster build-and-deploy-multicluster

The single-cluster (ncp-local) and multi-cluster (ncp-local-cp + ncp-local-compute-N) topologies both claim host ports 8080/8443/4222 and cannot coexist. If you already have the single-cluster topology running:

$make -C tools/ncp-local-cluster destroy CLUSTER_NAME=ncp-local
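To check for the conflicting topology before Step 1, you can filter the k3d cluster listing for an exact name match. This is a sketch under assumptions: `has_cluster` is a hypothetical helper, and it assumes `k3d cluster list --no-headers` prints one cluster per line with the name in the first column.

```shell
# Succeed when cluster name $1 appears as the first column of the
# cluster listing read from stdin. The match is exact, so "ncp-local"
# does not match "ncp-local-cp".
has_cluster() {
  awk -v name="$1" '$1 == name { found = 1 } END { exit (found ? 0 : 1) }'
}

# Usage (output format of the k3d command is an assumption):
#   if k3d cluster list --no-headers | has_cluster ncp-local; then
#     echo "single-cluster topology is running; destroy it first" >&2
#   fi
```

The exact match matters here because the single-cluster name is a prefix of both multi-cluster names.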

Step 2: Author the multi-cluster Helmfile environment file

The values-driven Helmfile path has no control-plane profile, so you must author topology-correct URLs in the environment file yourself. Use the multi-cluster fixture (NOT the single-cluster one):

$cp tests/bdd/fixtures/self-managed-local-bdd-multi.yaml \
> deploy/stacks/self-managed/environments/local-bdd.yaml

Substitute your NGC org and team:

$sed -i.bak \
> -e "s|REPLACE_WITH_SAMPLE_NGC_ORG|${SAMPLE_NGC_ORG}|g" \
> -e "s|REPLACE_WITH_SAMPLE_NGC_TEAM|${SAMPLE_NGC_TEAM}|g" \
> deploy/stacks/self-managed/environments/local-bdd.yaml
$rm deploy/stacks/self-managed/environments/local-bdd.yaml.bak
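Because `sed` exits successfully even when a pattern matches nothing, it is worth confirming that no placeholder survived. A minimal sketch, assuming every placeholder in the fixtures starts with `REPLACE_WITH_` as the ones shown on this page do (`no_placeholders_left` is a hypothetical helper name):

```shell
# Print any lines that still contain a REPLACE_WITH_ placeholder.
# Returns 0 when the file is fully substituted, 1 when placeholders
# remain, and 2 when the file does not exist.
no_placeholders_left() {
  [ -f "$1" ] || { printf 'no such file: %s\n' "$1" >&2; return 2; }
  if grep -n 'REPLACE_WITH_' "$1"; then
    return 1
  fi
  return 0
}

# Usage:
#   no_placeholders_left deploy/stacks/self-managed/environments/local-bdd.yaml || exit 1
```

The same check applies to the secrets file authored in Step 3.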

The multi-cluster fixture’s global.nvcaOperator.selfManaged.* URLs use .test hostnames. The single-cluster fixture’s in-cluster DNS names (for example http://api.sis.svc.cluster.local:8080) resolve only inside the control-plane cluster, and the NVCA agent on the compute cluster would fail with 401 errors against ICMS at runtime. Use the multi-cluster fixture.

Step 3: Author the secrets file

$cp deploy/stacks/self-managed/secrets/secrets.yaml.template \
> deploy/stacks/self-managed/secrets/local-bdd-secrets.yaml
$
$BASE64_CRED=$(printf '%s' "\$oauthtoken:${NGC_API_KEY}" | base64 | tr -d '\n')
$sed -i.bak "s|REPLACE_WITH_BASE64_DOCKER_CREDENTIAL|${BASE64_CRED}|g" \
> deploy/stacks/self-managed/secrets/local-bdd-secrets.yaml
$rm deploy/stacks/self-managed/secrets/local-bdd-secrets.yaml.bak
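A common slip in this step is dropping the backslash before `$oauthtoken`, which lets the shell expand it to an empty string and yields a credential of the form `:<key>`. The sketch below decodes the value and checks its shape. `valid_docker_credential` is a hypothetical helper; the `-d` then `-D` fallback is an assumption that covers GNU and older BSD/macOS `base64` variants.

```shell
# Succeed when the base64 string $1 decodes to '$oauthtoken:<non-empty key>'.
valid_docker_credential() {
  # Try -d first (GNU and recent macOS), then -D (older macOS).
  decoded=$(printf '%s' "$1" | base64 -d 2>/dev/null) ||
    decoded=$(printf '%s' "$1" | base64 -D 2>/dev/null) ||
    return 1
  case "$decoded" in
    '$oauthtoken:'?*) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage:
#   valid_docker_credential "$BASE64_CRED" || echo "credential is malformed" >&2
```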

Step 4: Set kubectl context to the control-plane cluster

The Helmfile install runs against the current kubectl context. Switch to the control-plane cluster so the install lands there:

$kubectl config use-context k3d-ncp-local-cp

Step 5: Pre-create the image pull secret in NVCF namespaces (cp cluster)

$for ns in cassandra-system nats-system nvcf api-keys ess sis \
> vault-system nvca-operator nvca-system nvcf-backend cert-manager; do
$ kubectl create namespace "$ns" --dry-run=client -o yaml | kubectl apply -f -
$ kubectl create secret docker-registry nvcr-pull-secret \
> --docker-server=nvcr.io \
> --docker-username='$oauthtoken' \
> --docker-password="${NGC_API_KEY}" \
> --namespace="$ns" \
> --dry-run=client -o yaml | kubectl apply -f -
$done
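To confirm the loop created the secret everywhere, you can query each namespace afterwards. This is a sketch, not a repo script: `missing_pull_secrets` and the `KUBECTL` override variable are hypothetical, and the check uses whatever kubectl context is current.

```shell
# kubectl binary or wrapper; overridable, which also makes the helper
# easy to exercise without a cluster.
KUBECTL=${KUBECTL:-kubectl}

# Print every namespace in "$@" that lacks nvcr-pull-secret.
# Succeeds only when all of them have it.
missing_pull_secrets() {
  rc=0
  for ns in "$@"; do
    if ! "$KUBECTL" get secret nvcr-pull-secret --namespace="$ns" >/dev/null 2>&1; then
      printf '%s\n' "$ns"
      rc=1
    fi
  done
  return "$rc"
}

# Usage (control-plane cluster):
#   missing_pull_secrets cassandra-system nats-system nvcf api-keys ess sis \
#     vault-system nvca-operator nvca-system nvcf-backend cert-manager
```

The same helper works for the three compute-cluster namespaces in Step 8 once the context has been switched.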

Step 6: Install the control plane

$make -C deploy/stacks/self-managed install HELMFILE_ENV=local-bdd

The 18 standard Helm releases land on k3d-ncp-local-cp (see the single-cluster Helmfile page for the full release list).

Step 7: Switch kubectl context to the compute cluster (CRITICAL)

$kubectl config use-context k3d-ncp-local-compute-1

This single context switch is the most error-prone step in the multi-cluster flow. The next step’s nvcf-cli cluster register (run internally by make register-cluster) auto-discovers the target cluster’s OIDC issuer and JWKS by running a probe Job in the CURRENT kubectl context. If you skip the switch, the control-plane cluster’s JWKS is registered as the compute cluster’s identity, and ICMS rejects the compute agent’s PSAT tokens with 401 at runtime.
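Since registration silently uses whatever context is active, a guard that fails fast on the wrong context is cheap insurance. A minimal sketch (`require_context` is a hypothetical helper; it takes the actual context as an argument so the comparison itself needs no cluster):

```shell
# Succeed only when the active context ($2) equals the expected one ($1).
require_context() {
  expected=$1
  actual=$2
  if [ "$actual" != "$expected" ]; then
    printf 'kubectl context is "%s", expected "%s"\n' "$actual" "$expected" >&2
    return 1
  fi
}

# Usage, immediately before Step 9:
#   require_context k3d-ncp-local-compute-1 "$(kubectl config current-context)" || exit 1
```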

Step 8: Pre-create the image pull secret on the compute cluster

$for ns in nvca-operator nvca-system nvcf-backend; do
$ kubectl create namespace "$ns" --dry-run=client -o yaml | kubectl apply -f -
$ kubectl create secret docker-registry nvcr-pull-secret \
> --docker-server=nvcr.io \
> --docker-username='$oauthtoken' \
> --docker-password="${NGC_API_KEY}" \
> --namespace="$ns" \
> --dry-run=client -o yaml | kubectl apply -f -
$done

Step 9: Register the compute cluster

$make -C deploy/stacks/self-managed register-cluster \
> CLUSTER_NAME=ncp-local-compute-1 \
> NVCF_CLI=$(pwd)/nvcf-cli \
> NVCF_CLI_CONFIG=$(pwd)/tests/bdd/fixtures/nvcf-cli-local.yaml

make register-cluster runs nvcf-cli init internally before cluster register, so this flow does not need a separate init step.

The target produces deploy/stacks/self-managed/out/ncp-local-compute-1-register-values.yaml.
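Before moving on to Step 10, you may want to confirm that registration actually wrote this file. The sketch below assumes the output path follows the pattern `out/<cluster-name>-register-values.yaml`, which is inferred from the single path shown above; both function names are hypothetical.

```shell
# Build the expected register-values path for a cluster name.
# The naming pattern is inferred from the one documented example.
register_values_path() {
  printf 'deploy/stacks/self-managed/out/%s-register-values.yaml\n' "$1"
}

# Succeed only when the file exists and is not empty.
require_nonempty_file() {
  if [ ! -s "$1" ]; then
    printf 'missing or empty: %s\n' "$1" >&2
    return 1
  fi
}

# Usage:
#   require_nonempty_file "$(register_values_path ncp-local-compute-1)" || exit 1
```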

Step 10: Install the NVCA operator on the compute cluster

$make -C deploy/stacks/self-managed install-nvca-operator \
> CLUSTER_NAME=ncp-local-compute-1 \
> HELMFILE_ENV=local-bdd \
> NVCF_CLI=$(pwd)/nvcf-cli \
> NVCF_CLI_CONFIG=$(pwd)/tests/bdd/fixtures/nvcf-cli-local.yaml

Step 11: Verify

The NVCFBackend resource is created on the compute cluster, not the control-plane cluster. Use the compute cluster context for all verification:

$kubectl rollout status deployment/nvca-operator \
> -n nvca-operator \
> --context k3d-ncp-local-compute-1 \
> --timeout=10m
$
$kubectl wait nvcfbackend ncp-local-compute-1 \
> -n nvca-operator \
> --context k3d-ncp-local-compute-1 \
> --for=jsonpath='{.status.agentStatus}'=healthy \
> --timeout=10m

Confirm the control-plane API is reachable (from the host):

$export NVCF_TOKEN=$(curl -s -X POST "http://api-keys.localhost:8080/v1/admin/keys" \
> | python3 -c "import sys,json; print(json.load(sys.stdin)['value'])")
$
$curl -s "http://api.localhost:8080/v2/nvcf/functions" \
> -H "Authorization: Bearer ${NVCF_TOKEN}" | python3 -m json.tool

Teardown

Remove the helm releases on both clusters but keep the topology (stack-only):

$tests/bdd/scripts/destroy-stack.sh multi

Or destroy the whole topology:

$make -C tools/ncp-local-cluster destroy-multicluster