Phase 3: Gateway and Ingress

This phase installs the ingress layer that exposes NVCF services externally. It consists of two parts: the Envoy Gateway infrastructure and the NVCF Gateway Routes chart that creates HTTPRoutes and TCPRoutes for each service.

All core services from standalone-core-services must be running before proceeding. The Gateway Routes chart depends on the Notary Service and API Keys being available.

Install Kubernetes Gateway CRDs

Install the Kubernetes Gateway API CRDs if not already present on your cluster:

$kubectl apply -f https://github.com/kubernetes-sigs/gateway-api/releases/download/v1.2.0/experimental-install.yaml

If you install a different version than v1.2.0, verify that it remains compatible with the GatewayClass and Gateway resources created below.

Install Envoy Gateway

Install Envoy Gateway as the Gateway API controller:

$helm install eg oci://docker.io/envoyproxy/gateway-helm \
> --version v1.1.3 \
> -n envoy-gateway-system \
> --create-namespace

Verify the Envoy Gateway pods are running:

$kubectl get pods -n envoy-gateway-system
$
$# Expected: envoy-gateway pod Running

Create GatewayClass

Create the GatewayClass resource that binds to the Envoy Gateway controller:

$kubectl apply -f - <<EOF
$apiVersion: gateway.networking.k8s.io/v1
$kind: GatewayClass
$metadata:
$  name: eg
$spec:
$  controllerName: gateway.envoyproxy.io/gatewayclass-controller
$EOF

Create Gateway

Create the Gateway resource that provisions the external load balancer.

The annotations section is cloud-provider specific and controls how the external load balancer is provisioned:

  • AWS (EKS): Creates an internet-facing Network Load Balancer
  • GCP (GKE): Creates an external HTTP(S) load balancer
  • Azure (AKS): Creates a public load balancer
  • On-prem: Requires a load balancer solution like MetalLB, or use NodePort. Consult your infrastructure documentation.
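For the on-prem case, the exact setup depends on your load balancer solution. As one illustration only, if MetalLB is already installed, an address pool and L2 advertisement along these lines let the Gateway's Service receive an external IP (a sketch; the resource names and the IP range are placeholders you must adapt to your network):

```
kubectl apply -f - <<EOF
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: nvcf-gateway-pool
  namespace: metallb-system
spec:
  addresses:
  - 192.168.1.240-192.168.1.250   # placeholder range; use free IPs from your network
---
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: nvcf-gateway-l2
  namespace: metallb-system
spec:
  ipAddressPools:
  - nvcf-gateway-pool
EOF
```
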
$# Create the Gateway's namespace if it does not already exist
$kubectl create namespace envoy-gateway --dry-run=client -o yaml | kubectl apply -f -
$
$kubectl apply -f - <<EOF
$apiVersion: gateway.networking.k8s.io/v1
$kind: Gateway
$metadata:
$  name: nvcf-gateway
$  namespace: envoy-gateway
$  annotations:
$    # --- AWS (EKS) ---
$    service.beta.kubernetes.io/aws-load-balancer-type: "nlb"
$    service.beta.kubernetes.io/aws-load-balancer-scheme: "internet-facing"
$    # --- GCP (GKE) - use this instead for GCP ---
$    # cloud.google.com/load-balancer-type: "External"
$    # --- Azure (AKS) - use this instead for Azure ---
$    # service.beta.kubernetes.io/azure-load-balancer-internal: "false"
$spec:
$  gatewayClassName: eg
$  listeners:
$  - name: http
$    protocol: HTTP
$    port: 80
$    allowedRoutes:
$      namespaces:
$        from: Selector
$        selector:
$          matchLabels:
$            nvcf/platform: "true"
$  - name: tcp
$    protocol: TCP
$    port: 10081
$    allowedRoutes:
$      namespaces:
$        from: Selector
$        selector:
$          matchLabels:
$            nvcf/platform: "true"
$EOF

Verify the Gateway is ready and obtain the load balancer address:

$kubectl wait --for=condition=Programmed gateway/nvcf-gateway -n envoy-gateway --timeout=300s
$
$GATEWAY_ADDR=$(kubectl get gateway nvcf-gateway -n envoy-gateway -o jsonpath='{.status.addresses[0].value}')
$echo "Gateway Address: $GATEWAY_ADDR"

Save the GATEWAY_ADDR value. You will need it for the Gateway Routes configuration and for verifying API connectivity.
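DNS for a freshly provisioned cloud load balancer can lag its creation by several minutes. A bounded poll such as the following confirms the hostname resolves before you configure routes (a sketch; the `localhost` fallback only keeps the snippet self-contained when `GATEWAY_ADDR` is unset):

```shell
# Poll DNS for the gateway hostname, up to ~5 minutes.
# GATEWAY_ADDR comes from the previous step; localhost is only a
# placeholder fallback so the snippet runs standalone.
ADDR="${GATEWAY_ADDR:-localhost}"
for i in $(seq 1 30); do
  if getent hosts "$ADDR" >/dev/null 2>&1; then
    echo "DNS ready: $ADDR"
    break
  fi
  echo "waiting for DNS ($i/30)..."
  sleep 10
done
```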

Gateway Routes

The Gateway Routes chart creates HTTPRoutes and TCPRoutes that connect external traffic to NVCF services through the Gateway.

Chart: nvcf-gateway-routes
Version: 1.5.0
Namespace: envoy-gateway-system
Depends on: Notary Service, API Keys (must be running), Gateway (must be programmed)

Configuration

Create gateway-routes-values.yaml:

gateway-routes-values.yaml
# Gateway Routes values for standalone installation
# Replace <DOMAIN> with your gateway address (load balancer domain).

nvcfGatewayRoutes:
  domain: "<DOMAIN>" # e.g. abc123-4567890.us-west-2.elb.amazonaws.com
  gateways:
    shared:
      name: "nvcf-gateway"
      namespace: "envoy-gateway"
    grpc:
      name: "nvcf-gateway"
      namespace: "envoy-gateway"
  routes:
    nvcfApi:
      routeAnnotations: {}
    apiKeys:
      routeAnnotations: {}
    invocation:
      routeAnnotations: {}
    grpc:
      routeAnnotations: {}

Replace <DOMAIN> with the GATEWAY_ADDR value obtained above.

Install

$helm upgrade --install ingress \
> oci://${REGISTRY}/${REPOSITORY}/nvcf-gateway-routes \
> --version 1.5.0 \
> --namespace envoy-gateway-system \
> --wait --timeout 10m \
> -f gateway-routes-values.yaml

Verify

$kubectl get httproutes -A
$
$# Expected: HTTPRoutes for nvcf-api, api-keys, invocation, etc.
$
$kubectl get tcproutes -A
$
$# Expected: TCPRoute for gRPC proxy

For details on how routing works, verification commands, and production DNS/HTTPS setup, see gateway-routing.

Enable Admin Issuer Proxy Route

The Admin Token Issuer Proxy was installed in standalone-core-services with gateway.enabled: false because the Gateway CRDs did not yet exist. Now that the Gateway is running, upgrade it to enable the admin endpoint HTTPRoute:

$export GATEWAY_ADDR=$(kubectl get gateway nvcf-gateway -n envoy-gateway -o jsonpath='{.status.addresses[0].value}')
$
$helm upgrade admin-issuer-proxy \
> oci://${REGISTRY}/${REPOSITORY}/helm-admin-token-issuer-proxy \
> --version 1.2.2 \
> --namespace api-keys \
> --wait --timeout 10m \
> --reuse-values \
> --set adminIssuerProxy.gateway.enabled=true \
> --set adminIssuerProxy.gateway.namespace=envoy-gateway \
> --set adminIssuerProxy.gateway.gatewayRef.name=nvcf-gateway \
> --set "adminIssuerProxy.gateway.hostname=api-keys.${GATEWAY_ADDR}" \
> --set adminIssuerProxy.gateway.path=/v1/admin/keys

Verify the admin route was created:

$kubectl get httproutes -A | grep admin
$
$# Expected: admin-token-issuer-proxy HTTPRoute in api-keys namespace

Verify End-to-End Connectivity

With the gateway in place, verify the full stack is functional.

Generate an Admin Token

$export GATEWAY_ADDR=$(kubectl get gateway nvcf-gateway -n envoy-gateway -o jsonpath='{.status.addresses[0].value}')
$
$export NVCF_TOKEN=$(curl -s -X POST "http://${GATEWAY_ADDR}/v1/admin/keys" \
> -H "Host: api-keys.${GATEWAY_ADDR}" \
> | grep -o '"value":"[^"]*"' | cut -d'"' -f4)
$
$echo "Token generated: ${NVCF_TOKEN:0:20}..."

List Functions

$curl -s -X GET "http://${GATEWAY_ADDR}/v2/nvcf/functions" \
> -H "Host: api.${GATEWAY_ADDR}" \
> -H "Authorization: Bearer ${NVCF_TOKEN}" | jq .
$
$# Expected: empty list of functions (initial state)
$# Create a test function
$curl -s -X POST "http://${GATEWAY_ADDR}/v2/nvcf/functions" \
> -H "Host: api.${GATEWAY_ADDR}" \
> -H "Content-Type: application/json" \
> -H "Authorization: Bearer ${NVCF_TOKEN}" \
> -d '{
> "name": "my-echo-function",
> "inferenceUrl": "/echo",
> "healthUri": "/health",
> "inferencePort": 8000,
> "containerImage": "<YOUR_REGISTRY>/<YOUR_REPO>/load_tester_supreme:0.0.8"
> }' | jq .
$
$# Extract function and version IDs from the response
$export FUNCTION_ID=<function-id-from-response>
$export FUNCTION_VERSION_ID=<version-id-from-response>
$
$# Deploy the function (adjust instanceType and gpu for your cluster)
$curl -s -X POST "http://${GATEWAY_ADDR}/v2/nvcf/deployments/functions/${FUNCTION_ID}/versions/${FUNCTION_VERSION_ID}" \
> -H "Host: api.${GATEWAY_ADDR}" \
> -H "Content-Type: application/json" \
> -H "Authorization: Bearer ${NVCF_TOKEN}" \
> -d '{
> "deploymentSpecifications": [
> {
> "instanceType": "NCP.GPU.A10G_1x",
> "backend": "nvcf-default",
> "gpu": "A10G",
> "maxInstances": 1,
> "minInstances": 1
> }
> ]
> }' | jq .
$
$# Generate an invocation API key
$EXPIRES_AT=$(date -u -v+1d '+%Y-%m-%dT%H:%M:%SZ' 2>/dev/null || date -u -d '+1 day' '+%Y-%m-%dT%H:%M:%SZ')
$SERVICE_ID="nvidia-cloud-functions-ncp-service-id-aketm"
$
$export API_KEY=$(curl -s -X POST "http://${GATEWAY_ADDR}/v1/keys" \
> -H "Host: api-keys.${GATEWAY_ADDR}" \
> -H "Content-Type: application/json" \
> -H "Key-Issuer-Service: nvcf-api" \
> -H "Key-Issuer-Id: ${SERVICE_ID}" \
> -H "Key-Owner-Id: test@nvcf-api.local" \
> -d '{
> "description": "test invocation key",
> "expires_at": "'"${EXPIRES_AT}"'",
> "authorizations": {
> "policies": [{
> "aud": "'"${SERVICE_ID}"'",
> "auds": ["'"${SERVICE_ID}"'"],
> "product": "nv-cloud-functions",
> "resources": [
> {"id": "*", "type": "account-functions"},
> {"id": "*", "type": "authorized-functions"}
> ],
> "scopes": ["invoke_function", "list_functions", "queue_details", "list_functions_details"]
> }]
> },
> "audience_service_ids": ["'"${SERVICE_ID}"'"]
> }' | jq -r '.value')
$
$echo "API Key: ${API_KEY:0:20}..."
$
$# Wait for deployment to be ready, then invoke the function
$curl -s -X POST "http://${GATEWAY_ADDR}/echo" \
> -H "Host: ${FUNCTION_ID}.invocation.${GATEWAY_ADDR}" \
> -H "Content-Type: application/json" \
> -H "Authorization: Bearer ${API_KEY}" \
> -d '{"message": "hello world", "repeats": 1}' | jq .

The backend value should match the cluster group name registered by the NVCA operator. The instanceType and gpu values depend on the GPU types available in your cluster. For invocation, the Host header uses wildcard subdomain routing: <function-id>.invocation.<gateway-addr>.

Uninstalling

To remove all gateway components:

$# Remove Gateway Routes
$helm uninstall ingress -n envoy-gateway-system
$
$# Remove Gateway and GatewayClass resources
$kubectl delete gateway nvcf-gateway -n envoy-gateway --ignore-not-found
$kubectl delete gatewayclass eg --ignore-not-found
$
$# Uninstall Envoy Gateway
$helm uninstall eg -n envoy-gateway-system
$
$# Delete gateway namespaces
$kubectl delete namespace envoy-gateway envoy-gateway-system --ignore-not-found
$
$# (Optional) Remove Gateway API CRDs
$kubectl delete -f https://github.com/kubernetes-sigs/gateway-api/releases/download/v1.2.0/experimental-install.yaml

Next Steps

Your NVCF control plane is now fully installed and accessible. Proceed to self-managed-clusters to install the NVCA Operator and connect your GPU nodes to the control plane.