
Copyright © 2026, NVIDIA Corporation.


Multi-cluster Local Development with the CLI


Install an NVCF self-hosted control plane on one local k3d cluster and a separately registered compute plane on a second cluster, all using nvcf-cli. This is useful when you want to exercise the multi-cluster install and registration paths before targeting real infrastructure.

This setup is for local development only. It uses fake GPUs, a single Cassandra replica, and ephemeral storage. Do not use this for production workloads.

Topology

k3d cluster          Role                          kubectl context
ncp-local-cp         Control plane                 k3d-ncp-local-cp
ncp-local-compute-1  Compute plane (first worker)  k3d-ncp-local-compute-1

The CLI writes .localhost URLs into the control-plane profile and passes them unchanged into the per-cluster register-values file. The NVCA agent on the compute cluster uses those URLs at runtime to reach control-plane services. Cross-cluster reachability comes from the Docker network shared between the two k3d clusters, plus the install-time wiring that make build-and-deploy-multicluster sets up.
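
To see which Docker network the two clusters have in common, you can compare the networks attached to one node container from each cluster. The following is a minimal sketch, not part of the repo: the helper names are hypothetical, and the container names in the usage comment assume k3d's default naming scheme (k3d-<cluster>-server-0).

```shell
# List the Docker networks attached to one container, one name per line.
container_networks() {
  docker inspect \
    --format '{{range $k, $v := .NetworkSettings.Networks}}{{println $k}}{{end}}' \
    "$1"
}

# Print the networks that both containers ($1 and $2) are attached to.
shared_networks() {
  nets_b=$(container_networks "$2")
  container_networks "$1" | while read -r net; do
    if [ -n "$net" ] && printf '%s\n' "$nets_b" | grep -Fxq -- "$net"; then
      printf '%s\n' "$net"
    fi
  done
}

# Usage (assumed container names, following k3d's default scheme):
#   shared_networks k3d-ncp-local-cp-server-0 k3d-ncp-local-compute-1-server-0
```

An empty result would suggest the clusters are not attached to a common network, in which case the compute cluster is unlikely to reach the control plane.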

For users coming from the Helmfile install path: that flow is values-driven and uses the .nvcf-control-plane.test aliases provisioned by tools/ncp-local-cluster/scripts/configure-control-plane-endpoints.sh. The CLI path does not depend on those aliases.

Prerequisites

Install the following tools:

  • Docker (running)

  • k3d v5.x or later

  • kubectl

  • helm >= 3.12

  • An NGC API key from ngc.nvidia.com with access to the NVCF chart and image registry.

  • The NGC organization and team slugs that hold the chart and image repository you have access to. make build-and-deploy-multicluster reads these from SAMPLE_NGC_ORG / SAMPLE_NGC_TEAM during its credential provider validation step; without them, the build target fails and skips its final gateway-API setup.

  • nvcf-cli built from this repo:

    $go build -o nvcf-cli ./src/clis/nvcf-cli

Export the env vars used by the cluster bootstrap and the install steps:

$export NGC_API_KEY="<your-ngc-api-key>"
$export SAMPLE_NGC_ORG="<your-ngc-org>"
$export SAMPLE_NGC_TEAM="<your-ngc-team>"
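
Before moving on, it can save time to confirm that the tools and variables above are in place. The sketch below is an optional preflight check; the function names are hypothetical and it only tests that each tool is on PATH and each variable is non-empty, not that versions or credentials are valid.

```shell
# Print the name of every required tool that is not on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Print the name of every listed environment variable that is unset or empty.
missing_env() {
  for name in "$@"; do
    eval "value=\${$name:-}"
    [ -n "$value" ] || printf '%s\n' "$name"
  done
}

# Usage:
#   missing_tools docker k3d kubectl helm go
#   missing_env NGC_API_KEY SAMPLE_NGC_ORG SAMPLE_NGC_TEAM
```

Both helpers print nothing when everything is present, so any output is a list of what still needs fixing.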

Step 1: Bring up the multi-cluster topology

$make -C tools/ncp-local-cluster build-and-deploy-multicluster

This creates ncp-local-cp plus ncp-local-compute-1, installs the fake GPU operator and CSI SMB driver on the compute cluster, configures DNS for the .test aliases, and validates Envoy Gateway on the control-plane cluster.

The single-cluster (ncp-local) and multi-cluster (ncp-local-cp + ncp-local-compute-N) topologies both claim host ports 8080/8443/4222 and cannot coexist. If you already have the single-cluster topology running:

$make -C tools/ncp-local-cluster destroy CLUSTER_NAME=ncp-local
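
If you are not sure whether the single-cluster topology is still running, a small filter over the k3d cluster list can tell you. This is a sketch under the assumption that k3d cluster list --no-headers prints the cluster name in the first column; has_cluster is a hypothetical helper.

```shell
# Succeeds when the cluster name given as $1 appears in the first column of
# `k3d cluster list --no-headers` output read from stdin.
has_cluster() {
  awk -v name="$1" '$1 == name { found = 1 } END { exit !found }'
}

# Usage:
#   if k3d cluster list --no-headers | has_cluster ncp-local; then
#     echo "ncp-local is still running; destroy it first" >&2
#   fi
```

The match is on the exact name, so ncp-local-cp does not count as ncp-local.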

build-and-deploy-multicluster runs setup-gateway-api, check-gateway-api, and validate-gateway on the control-plane cluster as its final steps. If any earlier step fails (for example, credential provider validation when SAMPLE_NGC_ORG / SAMPLE_NGC_TEAM are not set), gateway setup is skipped. After fixing the underlying issue, re-run only the gateway-API setup on the control-plane cluster:

$make -C tools/ncp-local-cluster setup-gateway-api CLUSTER_NAME=ncp-local-cp
$make -C tools/ncp-local-cluster check-gateway-api CLUSTER_NAME=ncp-local-cp
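
As an extra check alongside check-gateway-api, you can look for the Gateway API CRDs directly. The sketch below assumes that setup-gateway-api installs the standard upstream CRDs, including gateways.gateway.networking.k8s.io; that is an assumption about the make target, and the helper name is hypothetical.

```shell
# Succeeds when the Gateway API "gateways" CRD is installed in context $1.
gateway_api_installed() {
  kubectl --context "$1" get crd gateways.gateway.networking.k8s.io >/dev/null 2>&1
}

# Usage:
#   gateway_api_installed k3d-ncp-local-cp || echo "Gateway API CRDs missing" >&2
```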

Step 2: Author the local secrets file

$cp deploy/stacks/self-managed/secrets/secrets.yaml.template \
> deploy/stacks/self-managed/secrets/local-secrets.yaml
$
$BASE64_CRED=$(printf '%s' "\$oauthtoken:${NGC_API_KEY}" | base64 | tr -d '\n')
$sed -i.bak "s|REPLACE_WITH_BASE64_DOCKER_CREDENTIAL|${BASE64_CRED}|g" \
> deploy/stacks/self-managed/secrets/local-secrets.yaml
$rm deploy/stacks/self-managed/secrets/local-secrets.yaml.bak
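
To double-check the result, you can reproduce the encoding and confirm the placeholder is gone from the file. The helpers below are a hypothetical sketch; encode_cred strips newlines with tr so it does not depend on any line-wrapping option of base64.

```shell
# Encode an NGC API key ($1) as the base64 "$oauthtoken:<key>" credential.
encode_cred() {
  printf '%s' "\$oauthtoken:$1" | base64 | tr -d '\n'
}

# Succeeds when file $1 exists and no longer contains the template placeholder.
placeholder_replaced() {
  [ -f "$1" ] && ! grep -q 'REPLACE_WITH_BASE64_DOCKER_CREDENTIAL' "$1"
}

# Usage:
#   placeholder_replaced deploy/stacks/self-managed/secrets/local-secrets.yaml \
#     || echo "placeholder still present" >&2
```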

Step 3: Create the image pull secrets

nvcf-cli self-hosted install renders helmfile manifests that reference imagePullSecrets: [{name: nvcr-pull-secret}]. Before running install, create that secret in each NVCF namespace on the control-plane cluster (k3d-ncp-local-cp) so pods can pull images from nvcr.io. Switch the kubectl context to the control-plane cluster first if you have not already:

$kubectl config use-context k3d-ncp-local-cp
$
$for ns in cassandra-system nats-system nvcf api-keys ess sis \
> vault-system nvca-operator nvca-system nvcf-backend cert-manager; do
$ kubectl create namespace "$ns" --dry-run=client -o yaml | kubectl apply -f -
$ kubectl create secret docker-registry nvcr-pull-secret \
> --docker-server=nvcr.io \
> --docker-username='$oauthtoken' \
> --docker-password="${NGC_API_KEY}" \
> --namespace="$ns" \
> --dry-run=client -o yaml | kubectl apply -f -
$done

The loop is idempotent because every object is rendered with --dry-run=client and then applied with kubectl apply, so re-running it is safe. Pull secrets for the compute cluster (k3d-ncp-local-compute-1) are configured by compute-plane install in Step 7.
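
To confirm the loop covered every namespace, you can list the ones that still lack the secret. This is a minimal sketch with a hypothetical helper name; pass the context first and then whichever namespaces you want to check.

```shell
# Print every namespace (arguments after the context) that lacks the
# nvcr-pull-secret secret.
missing_pull_secrets() {
  ctx=$1
  shift
  for ns in "$@"; do
    kubectl --context "$ctx" -n "$ns" get secret nvcr-pull-secret >/dev/null 2>&1 \
      || printf '%s\n' "$ns"
  done
}

# Usage:
#   missing_pull_secrets k3d-ncp-local-cp cassandra-system nats-system nvcf
```

No output means every listed namespace has the secret.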

Step 4: Install the control plane

The install command needs both contexts so it knows which cluster gets each plane:

$nvcf-cli \
> --config tests/bdd/fixtures/nvcf-cli-local.yaml \
> self-hosted \
> --stack deploy/stacks/self-managed \
> --env local \
> --plain \
> --control-plane-context k3d-ncp-local-cp \
> --compute-plane-context k3d-ncp-local-compute-1 \
> --token DUMMY \
> install --control-plane \
> --cluster-name ncp-local-cp \
> --region us-west-1 \
> --nca-id nvcf-default

--token DUMMY skips the install command’s check-cp auth gate. The install path itself never consumes the token. See the single-cluster CLI page for the full explanation.
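
Because the install command takes both contexts on the command line, it can help to confirm that both exist in your kubeconfig before running it. The helper below is a hypothetical sketch built on kubectl config get-contexts -o name.

```shell
# Print each requested kubectl context that is not defined in the kubeconfig.
missing_contexts() {
  known=$(kubectl config get-contexts -o name)
  for ctx in "$@"; do
    printf '%s\n' "$known" | grep -Fxq -- "$ctx" || printf '%s\n' "$ctx"
  done
}

# Usage:
#   missing_contexts k3d-ncp-local-cp k3d-ncp-local-compute-1
```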

When this completes, a control-plane profile is written to deploy/stacks/self-managed/out/control-plane-profile.yaml. It carries both URL blocks:

  • controlPlane.endpoints.inCluster.* - resolves only inside the control-plane cluster (for example http://api.sis.svc.cluster.local:8080).
  • controlPlane.endpoints.computeReachable.* - the .localhost URLs the CLI writes for cluster-external consumers. These flow through to the register-values in Step 6 as-is; compute-plane register does not rewrite them.

compute-plane register picks the right block by comparing the --kube-context value with the control-plane context.
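
The selection rule can be pictured as a simple comparison. The function below is only an illustration of the rule as described on this page, not the CLI's actual code; its name and arguments are hypothetical.

```shell
# Illustrative only: echo which profile block a target context would map to,
# given the target context ($1) and the control-plane context ($2).
select_endpoint_block() {
  target_context=$1
  cp_context=$2
  if [ "$target_context" = "$cp_context" ]; then
    echo inCluster
  else
    echo computeReachable
  fi
}

# Usage:
#   select_endpoint_block k3d-ncp-local-compute-1 k3d-ncp-local-cp
```

In this topology the compute cluster has its own context, so the comparison fails and the computeReachable block is the one that applies.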

Step 5: Mint the admin JWT

$nvcf-cli \
> --config tests/bdd/fixtures/nvcf-cli-local.yaml \
> init

Step 6: Register the compute plane

The --kube-context flag selects the compute cluster. Because that context differs from the control-plane context, the CLI picks the computeReachable URL block from the profile and writes those URLs unchanged into the register-values file. The NVCA agent on the compute cluster uses them at runtime to reach control-plane services.

$nvcf-cli \
> --config tests/bdd/fixtures/nvcf-cli-local.yaml \
> self-hosted \
> --stack deploy/stacks/self-managed \
> --env local \
> --plain \
> compute-plane register \
> --control-plane-profile deploy/stacks/self-managed/out/control-plane-profile.yaml \
> --cluster-name ncp-local-compute-1 \
> --kube-context k3d-ncp-local-compute-1 \
> --region us-west-1 \
> --output deploy/stacks/self-managed/out/ncp-local-compute-1-register-values.yaml

The output file’s selfManaged block contains the .localhost hostnames from the computeReachable block, not the in-cluster service URLs.
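
Since the register-values file is meant to carry externally reachable hostnames rather than in-cluster service URLs, a quick grep can catch a mix-up between the two profile blocks. The helper name below is hypothetical, and the pattern assumes in-cluster URLs follow the svc.cluster.local form shown in Step 4.

```shell
# Succeeds when file $1 exists and contains no in-cluster service URLs.
no_in_cluster_urls() {
  [ -f "$1" ] && ! grep -q 'svc\.cluster\.local' "$1"
}

# Usage:
#   no_in_cluster_urls deploy/stacks/self-managed/out/ncp-local-compute-1-register-values.yaml \
#     || echo "register-values is missing or holds in-cluster URLs" >&2
```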

nvcf-cli cluster register (run internally during this step) auto-discovers the target cluster’s OIDC issuer and JWKS by running a probe Job in the cluster identified by --kube-context. That identity is what ICMS validates when the compute agent presents PSAT tokens at runtime. Always set --kube-context to the COMPUTE cluster.
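
To see what identity a given cluster would present, you can read the standard Kubernetes service account issuer discovery document yourself. Whether the CLI's probe Job reads this same endpoint is an assumption; the sketch below, with a hypothetical helper name, only shows the issuer and JWKS location that the cluster advertises.

```shell
# Print the issuer and jwks_uri fields from the OIDC discovery document of
# the cluster behind kubectl context $1.
show_oidc_issuer() {
  kubectl --context "$1" get --raw /.well-known/openid-configuration \
    | tr ',' '\n' | grep -E '"(issuer|jwks_uri)"'
}

# Usage:
#   show_oidc_issuer k3d-ncp-local-compute-1
```

Running it against both contexts makes it easy to spot which identity belongs to which cluster.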

Step 7: Install the compute plane

$nvcf-cli \
> --config tests/bdd/fixtures/nvcf-cli-local.yaml \
> self-hosted \
> --stack deploy/stacks/self-managed \
> --env local \
> --plain \
> compute-plane install \
> --values deploy/stacks/self-managed/out/ncp-local-compute-1-register-values.yaml \
> --kube-context k3d-ncp-local-compute-1 \
> --cluster-name ncp-local-compute-1

Step 8: Verify

The NVCFBackend resource is created on the compute cluster, not the control-plane cluster.

$kubectl wait nvcfbackend ncp-local-compute-1 \
> -n nvca-operator \
> --context k3d-ncp-local-compute-1 \
> --for=jsonpath='{.status.agentStatus}'=healthy \
> --timeout=10m
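
If the wait times out, reading the status field directly shows what the agent currently reports. This sketch reuses the resource kind, namespace, and JSONPath from the command above; the helper name is hypothetical.

```shell
# Print the agentStatus of NVCFBackend $1 using kubectl context $2.
agent_status() {
  kubectl --context "$2" -n nvca-operator get nvcfbackend "$1" \
    -o jsonpath='{.status.agentStatus}'
}

# Usage:
#   agent_status ncp-local-compute-1 k3d-ncp-local-compute-1
```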

Confirm the control-plane API is reachable (from the host, where api.localhost resolves to 127.0.0.1):

$export NVCF_TOKEN=$(curl -s -X POST "http://api-keys.localhost:8080/v1/admin/keys" \
> | python3 -c "import sys,json; print(json.load(sys.stdin)['value'])")
$
$curl -s "http://api.localhost:8080/v2/nvcf/functions" \
> -H "Authorization: Bearer ${NVCF_TOKEN}" | python3 -m json.tool
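
Because curl -s hides most failures, it can be easier to look at the HTTP status code first. The helpers below are a hypothetical sketch; the hints reflect general HTTP and curl behavior (curl reports 000 when no response was received), not anything specific to the NVCF API.

```shell
# Print only the HTTP status code of a GET to URL $1 with bearer token $2.
api_status() {
  curl -s -o /dev/null -w '%{http_code}' -H "Authorization: Bearer $2" "$1"
}

# Map a status code to a short hint.
explain_status() {
  case "$1" in
    2??) echo "ok" ;;
    401|403) echo "token rejected" ;;
    000) echo "no connection" ;;
    *) echo "unexpected status $1" ;;
  esac
}

# Usage:
#   explain_status "$(api_status http://api.localhost:8080/v2/nvcf/functions "$NVCF_TOKEN")"
```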

Teardown

Remove the Helm releases on both clusters but keep the k3d topology (stack-only teardown):

$tests/bdd/scripts/destroy-stack.sh multi

Or destroy the whole topology:

$make -C tools/ncp-local-cluster destroy-multicluster