Quickstart: Local k3d Installation

View as Markdown

Install a self-hosted NVCF stack on one local k3d cluster, register that cluster with the control plane, and confirm that the local deployment is healthy.

This quickstart uses a single k3d cluster named ncp-local, fake GPUs, and local route hostnames. It is for local development and validation only. For a remote deployment, or for separate control-plane and GPU clusters, use Helmfile Installation and Self-Managed Clusters.

Run the commands from the NVCF repository root unless a step says otherwise. The nvcf-cli self-hosted up command runs on your workstation. It does not run inside Kubernetes.

Prerequisites

Before you start, install and prepare:

  • Docker running on your workstation
  • k3d v5.x or later
  • kubectl
  • helm >= 3.14
  • helmfile >= 1.0. Use helmfile >= 1.5.0 with Helm 4.
  • helm-diff plugin
  • nvcf-cli on your PATH. See Installation to build it from the repository or download it from NGC.
  • An NGC API key with access to the NVCF chart and image registry
  • The NGC organization and team slugs for that registry access

Use these install helpers if you do not already have the local tools.

Install Docker from the Docker installation guide. After Docker starts, verify that the CLI can reach it:

$docker version

On macOS with Homebrew:

$brew install k3d kubectl helm helmfile
$
$k3d version
$kubectl version --client
$helm version
$helmfile version

For other systems, use the official installation guides: k3d, kubectl, Helm, and Helmfile.

Install the Helm plugin used by Helmfile:

$if helm version --short | grep -q '^v4'; then
$ helm plugin install https://github.com/databus23/helm-diff --verify=false
$else
$ helm plugin install https://github.com/databus23/helm-diff
$fi
$
$helm plugin list

See the helm-diff installation instructions for offline or Helm 4 installation options.

Run these commands from the NVCF repository root:

$./setup.sh
$bazel build //src/clis/nvcf-cli:nvcf-cli
$
$mkdir -p "${HOME}/.local/bin"
$install -m 0755 \
> bazel-bin/src/clis/nvcf-cli/nvcf-cli_/nvcf-cli \
> "${HOME}/.local/bin/nvcf-cli"
$
$export PATH="${HOME}/.local/bin:${PATH}"
$nvcf-cli version

For the packaged CLI release, see Installation.

self-hosted up defaults to --env local and supports only the single local k3d layout. It requires a current k3d-* kube context.

Step 1: Create the local k3d cluster

Export the registry credentials used by the local cluster bootstrap:

$export NGC_API_KEY="<ngc-api-key>"
$export SAMPLE_NGC_ORG="<ngc-org>"
$export SAMPLE_NGC_TEAM="<ngc-team>"

Create the single-cluster local topology:

$make -C tools/ncp-local-cluster build-and-deploy-cluster
$kubectl config use-context k3d-ncp-local
$kubectl config current-context

Expected output:

k3d-ncp-local

This creates a k3d cluster named ncp-local, installs the fake GPU operator, the CSI SMB driver, and Envoy Gateway, and makes the local kube context available to kubectl and nvcf-cli.

Step 2: Prepare local routing

Add these entries to /etc/hosts on the workstation running the CLI if they do not already resolve:

127.0.0.1 api.localhost
127.0.0.1 api-keys.localhost
127.0.0.1 sis.localhost
127.0.0.1 invocation.localhost

Use the local CLI configuration that points commands at the local routes:

$export NVCF_CLI_CONFIG=deploy/stacks/self-managed/nvcf-cli-local.yaml

Log in to the NGC container registry:

$helm registry login nvcr.io -u '$oauthtoken' -p "${NGC_API_KEY}"

The local hostnames route CLI traffic through the local Envoy Gateway. The local CLI configuration keeps later function commands on those local routes instead of the hosted NVCF endpoints.

Step 3: Create the local stack secrets file

Create the local secrets file used by the self-managed stack:

$cp deploy/stacks/self-managed/secrets/secrets.yaml.template \
> deploy/stacks/self-managed/secrets/local-secrets.yaml
$
$BASE64_CRED="$(printf '%s' "\$oauthtoken:${NGC_API_KEY}" | base64 | tr -d '\n')"
$sed -i.bak "s|REPLACE_WITH_BASE64_DOCKER_CREDENTIAL|${BASE64_CRED}|g" \
> deploy/stacks/self-managed/secrets/local-secrets.yaml
$rm deploy/stacks/self-managed/secrets/local-secrets.yaml.bak

local-secrets.yaml is gitignored. Keep your NGC key out of committed files.

Step 4: Run the install

Check local tools and Kubernetes access:

$nvcf-cli --config "${NVCF_CLI_CONFIG}" self-hosted check --pre

Run the local install:

$nvcf-cli --config "${NVCF_CLI_CONFIG}" self-hosted up \
> --env=local \
> --cluster-name=ncp-local \
> --nca-id=nvcf-default \
> --region=us-west-1 \
> --icms-url=http://sis.localhost:8080 \
> --stack=deploy/stacks/self-managed \
> --refresh-token

Expected result: the final screen reports a successful install, a registered cluster, and a healthy backend.

The command installs the control plane, mints CLI authentication, registers ncp-local, installs the compute-plane components, and waits for the NVCFBackend health check to report healthy.

Step 5: Verify the install

Run the full self-hosted health checks:

$nvcf-cli --config "${NVCF_CLI_CONFIG}" self-hosted check --all \
> --cluster-name=ncp-local

Inspect the local Kubernetes resources:

$kubectl get pods -A
$kubectl get httproute -A
$kubectl get nvcfbackends -A

Expected result: the CLI checks do not report failed checks, and the ncp-local NVCFBackend reports healthy.

Step 6: Deploy and invoke a sample function

Create a function from the load_tester_supreme sample image in the registry organization and team you exported earlier:

$nvcf-cli --config "${NVCF_CLI_CONFIG}" function create \
> --name quickstart-load-tester-supreme \
> --image "nvcr.io/${SAMPLE_NGC_ORG}/${SAMPLE_NGC_TEAM}/load_tester_supreme:0.0.8" \
> --inference-url /echo \
> --inference-port 8000 \
> --health-uri /health \
> --health-port 8000 \
> --health-timeout PT30S

Deploy the function to the local fake GPU backend:

$nvcf-cli --config "${NVCF_CLI_CONFIG}" function deploy create \
> --gpu H100 \
> --instance-type NCP.GPU.H100_8x \
> --backend ncp-local \
> --regions us-west-1 \
> --min-instances 1 \
> --max-instances 1 \
> --timeout 900

Generate an API key for invocation:

$nvcf-cli --config "${NVCF_CLI_CONFIG}" api-key generate \
> --description quickstart-load-tester-supreme \
> --scopes invoke_function,list_functions,queue_details,list_functions_details

Invoke the sample function:

$nvcf-cli --config "${NVCF_CLI_CONFIG}" function invoke \
> --request-body '{"message":"quickstart-echo","repeats":1}' \
> --timeout 120 \
> --poll-duration 5

Expected result: the invocation response contains quickstart-echo.

For other local fake GPU configurations, choose a --gpu and --instance-type that match the discovered node labels and GPU count.

Clean up

Remove the sample function deployment and function:

$nvcf-cli --config "${NVCF_CLI_CONFIG}" function deploy remove
$nvcf-cli --config "${NVCF_CLI_CONFIG}" function delete

Remove the compute-plane components:

$nvcf-cli --config "${NVCF_CLI_CONFIG}" self-hosted uninstall \
> --compute-plane \
> --cluster-name ncp-local

Remove the control plane:

$nvcf-cli --config "${NVCF_CLI_CONFIG}" self-hosted uninstall --control-plane

Destroy the local k3d cluster:

$make -C tools/ncp-local-cluster destroy

Troubleshooting

If the quickstart fails, start with Troubleshooting.

Common local k3d issues:

  • sis.localhost must resolve from the workstation running nvcf-cli.

  • kubectl config current-context must print k3d-ncp-local.

  • lookup api-keys.nvcf.nvidia.com: no such host means the CLI is using the hosted default endpoint. Run commands with --config "${NVCF_CLI_CONFIG}" from this quickstart.

  • node inotify limits below NVCA minimums means the local k3d nodes need higher Linux fs.inotify limits. This does not change your macOS shell limits. From the repository root, apply tools/ncp-local-cluster/apps/node-tuning/node-tuning.yaml, wait for the node-tuning DaemonSet in the kube-system namespace, then rerun nvcf-cli --config "${NVCF_CLI_CONFIG}" self-hosted check --pre:

    $kubectl apply -f tools/ncp-local-cluster/apps/node-tuning/node-tuning.yaml
    $kubectl -n kube-system rollout status ds/node-tuning --timeout=5m

    For non-local clusters, see Node inotify limits.

See Also