Phase 3: Gateway and Ingress#
This phase installs the ingress layer that exposes NVCF services externally. It consists of two parts: the Envoy Gateway infrastructure and the NVCF Gateway Routes chart that creates HTTPRoutes and TCPRoutes for each service.
Important
All core services from Phase 2: Core Services must be running before proceeding. The Gateway Routes chart depends on the Notary Service and API Keys being available.
Install Kubernetes Gateway CRDs#
Install the Kubernetes Gateway API CRDs if not already present on your cluster:
kubectl apply -f https://github.com/kubernetes-sigs/gateway-api/releases/download/v1.2.0/experimental-install.yaml
Note
If you use a Gateway API release other than v1.2.0, verify that it is compatible with the GatewayClass and Gateway resources created below.
Install Envoy Gateway#
Install Envoy Gateway as the Gateway API controller:
helm install eg oci://docker.io/envoyproxy/gateway-helm \
  --version v1.1.3 \
  -n envoy-gateway-system \
  --create-namespace
Verify the Envoy Gateway pods are running:
kubectl get pods -n envoy-gateway-system
# Expected: envoy-gateway pod Running
Create GatewayClass#
Create the GatewayClass resource that binds to the Envoy Gateway controller:
kubectl apply -f - <<EOF
apiVersion: gateway.networking.k8s.io/v1
kind: GatewayClass
metadata:
  name: eg
spec:
  controllerName: gateway.envoyproxy.io/gatewayclass-controller
EOF
Create Gateway#
Create the Gateway resource that provisions the external load balancer.
Note
The annotations section is cloud-provider specific and controls how the external load balancer is provisioned:
- AWS (EKS): creates an internet-facing Network Load Balancer
- GCP (GKE): creates an external HTTP(S) load balancer
- Azure (AKS): creates a public load balancer
- On-prem: requires a load balancer solution such as MetalLB, or use NodePort. Consult your infrastructure documentation.
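For on-prem clusters using MetalLB, a minimal Layer 2 configuration might look like the following sketch. The pool name and address range are placeholders for illustration; substitute IPs that are routable in your network and consult the MetalLB documentation for your topology.

```yaml
# Hypothetical MetalLB address pool for the gateway service.
# The 192.0.2.x range below is a documentation placeholder (TEST-NET-1).
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: nvcf-gateway-pool
  namespace: metallb-system
spec:
  addresses:
  - 192.0.2.10-192.0.2.20
---
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: nvcf-gateway-l2
  namespace: metallb-system
spec:
  ipAddressPools:
  - nvcf-gateway-pool
```

With a pool like this in place, MetalLB assigns the Gateway's LoadBalancer service an address from the range, which then appears in the Gateway status below.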
# Create the Gateway namespace if it does not already exist
kubectl create namespace envoy-gateway --dry-run=client -o yaml | kubectl apply -f -

kubectl apply -f - <<EOF
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: nvcf-gateway
  namespace: envoy-gateway
  annotations:
    # --- AWS (EKS) ---
    service.beta.kubernetes.io/aws-load-balancer-type: "nlb"
    service.beta.kubernetes.io/aws-load-balancer-scheme: "internet-facing"
    # --- GCP (GKE) - use this instead for GCP ---
    # cloud.google.com/load-balancer-type: "External"
    # --- Azure (AKS) - use this instead for Azure ---
    # service.beta.kubernetes.io/azure-load-balancer-internal: "false"
spec:
  gatewayClassName: eg
  listeners:
  - name: http
    protocol: HTTP
    port: 80
    allowedRoutes:
      namespaces:
        from: Selector
        selector:
          matchLabels:
            nvcf/platform: "true"
  - name: tcp
    protocol: TCP
    port: 10081
    allowedRoutes:
      namespaces:
        from: Selector
        selector:
          matchLabels:
            nvcf/platform: "true"
EOF
Verify the Gateway is ready and obtain the load balancer address:
kubectl wait --for=condition=Programmed gateway/nvcf-gateway -n envoy-gateway --timeout=300s
GATEWAY_ADDR=$(kubectl get gateway nvcf-gateway -n envoy-gateway -o jsonpath='{.status.addresses[0].value}')
echo "Gateway Address: $GATEWAY_ADDR"
Important
Save the GATEWAY_ADDR value. You will need it for the Gateway Routes configuration
and for verifying API connectivity.
Gateway Routes#
The Gateway Routes chart creates HTTPRoutes and TCPRoutes that connect external traffic to NVCF services through the Gateway.
Chart | nvcf-gateway-routes
Version | 1.5.0
Namespace | envoy-gateway-system
Depends on | Notary Service, API Keys (must be running), Gateway (must be programmed)
Configuration#
Create gateway-routes-values.yaml (download template):
gateway-routes-values.yaml
# Gateway Routes values for standalone installation
# Replace <DOMAIN> with your gateway address (load balancer domain).
nvcfGatewayRoutes:
  domain: "<DOMAIN>"  # e.g. abc123-4567890.us-west-2.elb.amazonaws.com
  gateways:
    shared:
      name: "nvcf-gateway"
      namespace: "envoy-gateway"
    grpc:
      name: "nvcf-gateway"
      namespace: "envoy-gateway"
  routes:
    nvcfApi:
      routeAnnotations: {}
    apiKeys:
      routeAnnotations: {}
    invocation:
      routeAnnotations: {}
    grpc:
      routeAnnotations: {}
Replace <DOMAIN> with the GATEWAY_ADDR value obtained above.
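The substitution can also be scripted. A sketch of the idea, using an example address and an inline snippet of the values file so the behavior is visible:

```shell
# GATEWAY_ADDR below is an illustrative value; use the address obtained
# from the kubectl get gateway step above.
GATEWAY_ADDR="abc123-4567890.us-west-2.elb.amazonaws.com"

# Render the placeholder into a fragment of the values file with sed.
RENDERED=$(sed "s/<DOMAIN>/${GATEWAY_ADDR}/" <<'EOF'
nvcfGatewayRoutes:
  domain: "<DOMAIN>"
EOF
)
echo "$RENDERED"
```

In practice you would run the same substitution in place on the real file, e.g. `sed -i "s/<DOMAIN>/${GATEWAY_ADDR}/" gateway-routes-values.yaml`.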
Install#
helm upgrade --install ingress \
  oci://${REGISTRY}/${REPOSITORY}/nvcf-gateway-routes \
  --version 1.5.0 \
  --namespace envoy-gateway-system \
  --wait --timeout 10m \
  -f gateway-routes-values.yaml
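The `${REGISTRY}` and `${REPOSITORY}` variables are assumed to have been exported during the earlier installation phases. A quick sketch of how the chart reference resolves, with example values only:

```shell
# Example values; use the registry and repository you mirrored the charts to.
REGISTRY="registry.example.com"
REPOSITORY="nvcf"

# The OCI chart reference passed to helm expands to:
CHART_REF="oci://${REGISTRY}/${REPOSITORY}/nvcf-gateway-routes"
echo "$CHART_REF"
```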
Verify#
kubectl get httproutes -A
# Expected: HTTPRoutes for nvcf-api, api-keys, invocation, etc.
kubectl get tcproutes -A
# Expected: TCPRoute for gRPC proxy
See also
For details on how routing works, verification commands, and production DNS/HTTPS setup, see Gateway Routing and DNS.
Enable Admin Issuer Proxy Route#
The Admin Token Issuer Proxy was installed in Phase 2: Core Services with
gateway.enabled: false because the Gateway CRDs did not yet exist. Now that the Gateway
is running, upgrade it to enable the admin endpoint HTTPRoute:
export GATEWAY_ADDR=$(kubectl get gateway nvcf-gateway -n envoy-gateway -o jsonpath='{.status.addresses[0].value}')
helm upgrade admin-issuer-proxy \
  oci://${REGISTRY}/${REPOSITORY}/helm-admin-token-issuer-proxy \
  --version 1.2.1 \
  --namespace api-keys \
  --wait --timeout 10m \
  --reuse-values \
  --set adminIssuerProxy.gateway.enabled=true \
  --set adminIssuerProxy.gateway.namespace=envoy-gateway \
  --set adminIssuerProxy.gateway.gatewayRef.name=nvcf-gateway \
  --set "adminIssuerProxy.gateway.hostname=api-keys.${GATEWAY_ADDR}" \
  --set adminIssuerProxy.gateway.path=/v1/admin/keys
Verify the admin route was created:
kubectl get httproutes -A | grep admin
# Expected: admin-token-issuer-proxy HTTPRoute in api-keys namespace
Verify End-to-End Connectivity#
With the gateway in place, verify the full stack is functional.
Generate an Admin Token#
export GATEWAY_ADDR=$(kubectl get gateway nvcf-gateway -n envoy-gateway -o jsonpath='{.status.addresses[0].value}')
export NVCF_TOKEN=$(curl -s -X POST "http://${GATEWAY_ADDR}/v1/admin/keys" \
  -H "Host: api-keys.${GATEWAY_ADDR}" \
  | grep -o '"value":"[^"]*"' | cut -d'"' -f4)
echo "Token generated: ${NVCF_TOKEN:0:20}..."
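The grep/cut pipeline above extracts the `value` field from the JSON response without requiring jq. A sketch of how it behaves against a sample payload (the token string below is made up), with a jq equivalent for hosts where jq is installed:

```shell
# Sample response body with an illustrative token value
SAMPLE='{"value":"nvapi-example-token","expires_at":"2030-01-01T00:00:00Z"}'

# Portable extraction: isolate the "value":"..." pair, then take the
# fourth double-quote-delimited field (the token itself).
TOKEN=$(echo "$SAMPLE" | grep -o '"value":"[^"]*"' | cut -d'"' -f4)
echo "$TOKEN"

# More robust when jq is available
command -v jq >/dev/null && echo "$SAMPLE" | jq -r '.value'
```

Note that the grep approach assumes the field is not split across lines and contains no escaped quotes; jq handles arbitrary JSON.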
List Functions#
curl -s -X GET "http://${GATEWAY_ADDR}/v2/nvcf/functions" \
  -H "Host: api.${GATEWAY_ADDR}" \
  -H "Authorization: Bearer ${NVCF_TOKEN}" | jq .
# Expected: empty list of functions (initial state)
(Optional) Create, Deploy, and Invoke a Test Function
# Create a test function
curl -s -X POST "http://${GATEWAY_ADDR}/v2/nvcf/functions" \
  -H "Host: api.${GATEWAY_ADDR}" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer ${NVCF_TOKEN}" \
  -d '{
    "name": "my-echo-function",
    "inferenceUrl": "/echo",
    "healthUri": "/health",
    "inferencePort": 8000,
    "containerImage": "<YOUR_REGISTRY>/<YOUR_REPO>/load_tester_supreme:0.0.8"
  }' | jq .
# Extract function and version IDs from the response
export FUNCTION_ID=<function-id-from-response>
export FUNCTION_VERSION_ID=<version-id-from-response>
# Deploy the function (adjust instanceType and gpu for your cluster)
curl -s -X POST "http://${GATEWAY_ADDR}/v2/nvcf/deployments/functions/${FUNCTION_ID}/versions/${FUNCTION_VERSION_ID}" \
  -H "Host: api.${GATEWAY_ADDR}" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer ${NVCF_TOKEN}" \
  -d '{
    "deploymentSpecifications": [
      {
        "instanceType": "NCP.GPU.A10G_1x",
        "backend": "nvcf-default",
        "gpu": "A10G",
        "maxInstances": 1,
        "minInstances": 1
      }
    ]
  }' | jq .
# Generate an invocation API key
# (tries BSD/macOS date syntax first, then falls back to GNU date)
EXPIRES_AT=$(date -u -v+1d '+%Y-%m-%dT%H:%M:%SZ' 2>/dev/null || date -u -d '+1 day' '+%Y-%m-%dT%H:%M:%SZ')
SERVICE_ID="nvidia-cloud-functions-ncp-service-id-aketm"
export API_KEY=$(curl -s -X POST "http://${GATEWAY_ADDR}/v1/keys" \
  -H "Host: api-keys.${GATEWAY_ADDR}" \
  -H "Content-Type: application/json" \
  -H "Key-Issuer-Service: nvcf-api" \
  -H "Key-Issuer-Id: ${SERVICE_ID}" \
  -H "Key-Owner-Id: test@nvcf-api.local" \
  -d '{
    "description": "test invocation key",
    "expires_at": "'"${EXPIRES_AT}"'",
    "authorizations": {
      "policies": [{
        "aud": "'"${SERVICE_ID}"'",
        "auds": ["'"${SERVICE_ID}"'"],
        "product": "nv-cloud-functions",
        "resources": [
          {"id": "*", "type": "account-functions"},
          {"id": "*", "type": "authorized-functions"}
        ],
        "scopes": ["invoke_function", "list_functions", "queue_details", "list_functions_details"]
      }]
    },
    "audience_service_ids": ["'"${SERVICE_ID}"'"]
  }' | jq -r '.value')
echo "API Key: ${API_KEY:0:20}..."
# Wait for deployment to be ready, then invoke the function
curl -s -X POST "http://${GATEWAY_ADDR}/echo" \
  -H "Host: ${FUNCTION_ID}.invocation.${GATEWAY_ADDR}" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer ${API_KEY}" \
  -d '{"message": "hello world", "repeats": 1}' | jq .
Note
The backend value should match the cluster group name registered by the NVCA operator.
The instanceType and gpu values depend on the GPU types available in your cluster.
For invocation, the Host header uses wildcard subdomain routing:
<function-id>.invocation.<gateway-addr>.
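As a concrete illustration of that wildcard pattern, the invocation Host header is assembled from the function ID and the gateway address. Both values below are made up for the example:

```shell
# Hypothetical values purely for illustration
FUNCTION_ID="d290f1ee-6c54-4b01-90e6-d701748f0851"
GATEWAY_ADDR="abc123-4567890.us-west-2.elb.amazonaws.com"

# Host header sent when invoking a function through the gateway
INVOKE_HOST="${FUNCTION_ID}.invocation.${GATEWAY_ADDR}"
echo "$INVOKE_HOST"
```

The gateway routes on this subdomain prefix, so each deployed function is addressable without creating a per-function route.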
Uninstalling#
To remove all gateway components:
# Remove Gateway Routes
helm uninstall ingress -n envoy-gateway-system
# Remove Gateway and GatewayClass resources
kubectl delete gateway nvcf-gateway -n envoy-gateway --ignore-not-found
kubectl delete gatewayclass eg --ignore-not-found
# Uninstall Envoy Gateway
helm uninstall eg -n envoy-gateway-system
# Delete gateway namespaces
kubectl delete namespace envoy-gateway envoy-gateway-system --ignore-not-found
# (Optional) Remove Gateway API CRDs
kubectl delete -f https://github.com/kubernetes-sigs/gateway-api/releases/download/v1.2.0/experimental-install.yaml
Next Steps#
Your NVCF control plane is now fully installed and accessible. Proceed to Self-Managed Clusters to install the NVCA Operator and connect your GPU nodes to the control plane.