Phase 3: Gateway and Ingress

View as Markdown

This phase installs the ingress layer that exposes NVCF services externally. It uses shared Gateway API infrastructure plus the NVCF Gateway Routes chart that creates HTTPRoutes and TCPRoutes for each service.

All core services from standalone-core-services must be running before proceeding. The Gateway Routes chart depends on the Notary Service and API Keys being available.

Prepare Gateway API infrastructure

Complete Gateway quickstart before you install the Gateway Routes chart.

Keep these values from the Gateway quickstart:

$echo "$GATEWAY_ADDR"
$echo "$HTTP_GATEWAY_NAMESPACE/$HTTP_GATEWAY_NAME"
$echo "$GRPC_GATEWAY_NAMESPACE/$GRPC_GATEWAY_NAME"

You will use GATEWAY_ADDR in the Gateway Routes values and later connectivity checks.

Gateway Routes

The Gateway Routes chart creates HTTPRoutes and TCPRoutes that connect external traffic to NVCF services through the Gateway.

Chartnvcf-gateway-routes
Version1.11.0
Namespaceenvoy-gateway-system
Depends onNotary Service, API Keys (must be running), Gateway (must be programmed)

Configuration

Create gateway-routes-values.yaml (download template):

gateway-routes-values.yaml
1# Gateway Routes values for standalone installation
2# Replace <DOMAIN> with your gateway address (load balancer domain).
3
4nvcfGatewayRoutes:
5 domain: "<DOMAIN>" # e.g. abc123-4567890.us-west-2.elb.amazonaws.com
6 gateways:
7 shared:
8 name: "nvcf-gateway"
9 namespace: "envoy-gateway"
10 grpc:
11 name: "nvcf-gateway"
12 namespace: "envoy-gateway"
13 routes:
14 nvcfApi:
15 routeAnnotations: {}
16 apiKeys:
17 routeAnnotations: {}
18 invocation:
19 routeAnnotations: {}
20 grpc:
21 routeAnnotations: {}

Replace <DOMAIN> with the GATEWAY_ADDR value obtained above.

Install

$helm upgrade --install ingress \
> oci://${REGISTRY}/${REPOSITORY}/nvcf-gateway-routes \
> --version 1.11.0 \
> --namespace envoy-gateway-system \
> --wait --timeout 10m \
> -f gateway-routes-values.yaml

Verify

$kubectl get httproutes -A
$
$# Expected: HTTPRoutes for nvcf-api, api-keys, invocation, etc.
$
$kubectl get tcproutes -A
$
$# Expected: TCPRoute for gRPC proxy

For details on how routing works, verification commands, and production DNS/HTTPS setup, see gateway-routing.

Enable Admin Issuer Proxy Route

The Admin Token Issuer Proxy was installed in standalone-core-services with gateway.enabled: false because the Gateway CRDs did not yet exist. Now that the Gateway is running, upgrade it to enable the admin endpoint HTTPRoute:

$export GATEWAY_ADDR=$(kubectl get gateway nvcf-gateway -n envoy-gateway -o jsonpath='{.status.addresses[0].value}')
$
$helm upgrade admin-issuer-proxy \
> oci://${REGISTRY}/${REPOSITORY}/helm-admin-token-issuer-proxy \
> --version 1.3.2 \
> --namespace api-keys \
> --wait --timeout 10m \
> --reuse-values \
> --set adminIssuerProxy.gateway.enabled=true \
> --set adminIssuerProxy.gateway.namespace=envoy-gateway \
> --set adminIssuerProxy.gateway.gatewayRef.name=nvcf-gateway \
> --set "adminIssuerProxy.gateway.hostname=api-keys.${GATEWAY_ADDR}" \
> --set adminIssuerProxy.gateway.path=/v1/admin/keys

Verify the admin route was created:

$kubectl get httproutes -A | grep admin
$
$# Expected: admin-token-issuer-proxy HTTPRoute in api-keys namespace

Verify End-to-End Connectivity

With the gateway in place, verify the full stack is functional.

Generate an Admin Token

$export GATEWAY_ADDR=$(kubectl get gateway nvcf-gateway -n envoy-gateway -o jsonpath='{.status.addresses[0].value}')
$
$export NVCF_TOKEN=$(curl -s -X POST "http://${GATEWAY_ADDR}/v1/admin/keys" \
> -H "Host: api-keys.${GATEWAY_ADDR}" \
> | grep -o '"value":"[^"]*"' | cut -d'"' -f4)
$
$echo "Token generated: ${NVCF_TOKEN:0:20}..."

List Functions

$curl -s -X GET "http://${GATEWAY_ADDR}/v2/nvcf/functions" \
> -H "Host: api.${GATEWAY_ADDR}" \
> -H "Authorization: Bearer ${NVCF_TOKEN}" | jq .
$
$# Expected: empty list of functions (initial state)
$# Create a test function
$curl -s -X POST "http://${GATEWAY_ADDR}/v2/nvcf/functions" \
> -H "Host: api.${GATEWAY_ADDR}" \
> -H "Content-Type: application/json" \
> -H "Authorization: Bearer ${NVCF_TOKEN}" \
> -d '{
> "name": "my-echo-function",
> "inferenceUrl": "/echo",
> "healthUri": "/health",
> "inferencePort": 8000,
> "containerImage": "<YOUR_REGISTRY>/<YOUR_REPO>/load_tester_supreme:0.0.8"
> }' | jq .
$
$# Extract function and version IDs from the response
$export FUNCTION_ID=<function-id-from-response>
$export FUNCTION_VERSION_ID=<version-id-from-response>
$
$# Deploy the function (adjust instanceType and gpu for your cluster)
$curl -s -X POST "http://${GATEWAY_ADDR}/v2/nvcf/deployments/functions/${FUNCTION_ID}/versions/${FUNCTION_VERSION_ID}" \
> -H "Host: api.${GATEWAY_ADDR}" \
> -H "Content-Type: application/json" \
> -H "Authorization: Bearer ${NVCF_TOKEN}" \
> -d '{
> "deploymentSpecifications": [
> {
> "instanceType": "NCP.GPU.A10G_1x",
> "backend": "nvcf-default",
> "gpu": "A10G",
> "maxInstances": 1,
> "minInstances": 1
> }
> ]
> }' | jq .
$
$# Generate an invocation API key
$EXPIRES_AT=$(date -u -v+1d '+%Y-%m-%dT%H:%M:%SZ' 2>/dev/null || date -u -d '+1 day' '+%Y-%m-%dT%H:%M:%SZ')
$SERVICE_ID="nvidia-cloud-functions-ncp-service-id-aketm"
$
$export API_KEY=$(curl -s -X POST "http://${GATEWAY_ADDR}/v1/keys" \
> -H "Host: api-keys.${GATEWAY_ADDR}" \
> -H "Content-Type: application/json" \
> -H "Key-Issuer-Service: nvcf-api" \
> -H "Key-Issuer-Id: ${SERVICE_ID}" \
> -H "Key-Owner-Id: test@nvcf-api.local" \
> -d '{
> "description": "test invocation key",
> "expires_at": "'"${EXPIRES_AT}"'",
> "authorizations": {
> "policies": [{
> "aud": "'"${SERVICE_ID}"'",
> "auds": ["'"${SERVICE_ID}"'"],
> "product": "nv-cloud-functions",
> "resources": [
> {"id": "*", "type": "account-functions"},
> {"id": "*", "type": "authorized-functions"}
> ],
> "scopes": ["invoke_function", "list_functions", "queue_details", "list_functions_details"]
> }]
> },
> "audience_service_ids": ["'"${SERVICE_ID}"'"]
> }' | jq -r '.value')
$
$echo "API Key: ${API_KEY:0:20}..."
$
$# Wait for deployment to be ready, then invoke the function
$curl -s -X POST "http://${GATEWAY_ADDR}/echo" \
> -H "Host: ${FUNCTION_ID}.invocation.${GATEWAY_ADDR}" \
> -H "Content-Type: application/json" \
> -H "Authorization: Bearer ${API_KEY}" \
> -d '{"message": "hello world", "repeats": 1}' | jq .

The backend value should match the cluster group name registered by the NVCA operator. The instanceType and gpu values depend on the GPU types available in your cluster. For invocation, the Host header uses wildcard subdomain routing: <function-id>.invocation.<gateway-addr>. For full HTTP invocation behavior, streaming, and errors, see Generic HTTP Function Invocation.

Uninstalling

To remove all gateway components:

$# Remove Gateway Routes
$helm uninstall ingress -n envoy-gateway-system
$
$# Remove Gateway and GatewayClass resources
$kubectl delete gateway nvcf-gateway -n envoy-gateway --ignore-not-found
$kubectl delete gatewayclass eg --ignore-not-found
$
$# Uninstall Envoy Gateway
$helm uninstall eg -n envoy-gateway-system
$
$# Delete gateway namespaces
$kubectl delete namespace envoy-gateway envoy-gateway-system --ignore-not-found
$
$# (Optional) Remove Gateway API CRDs
$kubectl delete -f https://github.com/kubernetes-sigs/gateway-api/releases/download/v1.2.0/experimental-install.yaml

Next Steps

Your NVCF control plane is now fully installed and accessible. Proceed to Self-Managed Clusters to install the NVCA Operator and connect your GPU nodes to the control plane.