Network Operator Application Notes 23.10.0 - Sphinx Test

Catalog API - Overview

This page provides Catalog API concepts along with an example.

The Catalog API provides a way to manage multiple aspects of Helm:

  • Define Helm Repositories that host Helm charts. Repositories can be HTTP/HTTPS or OCI based.

  • Define credentials to access Helm Repositories, if authentication is required.

  • Define credentials for container image pull secrets.

  • Manage Helm values to be used by Helm Releases.

  • Manage the lifecycle of Helm Releases (install/update/delete).

The API is implemented using Flux.

It is available to both Tenants and Admins. Tenants are identified by an additional gRPC header.

For Tenants, Helm releases are confined to run only on the tenant-allocated DPUs and have limited permissions.

For Admins, there are no permission restrictions when installing charts, and the workloads are not limited to any subset of DPUs or control-plane nodes.

Catalog API

Flux

Flux is an open and extensible continuous delivery solution for Kubernetes.

Flux is a GitOps tool that synchronizes the state of manifests (Helm Release) from a source (Helm Repository) to what is running in a cluster.

The Catalog API creates Flux CRs, which are then handled by the Flux controllers in order to install Helm charts.
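
Once repositories and releases have been created through the API, the underlying Flux objects can also be inspected directly on the infrastructure cluster. This is a sketch only: the resource names below are the standard Flux CRD names, while the namespace in which the objects are created depends on the deployment.

# Inspect the Flux custom resources created by the Catalog API
# (standard Flux CRD resource names; the namespace depends on the deployment)
kubectl get helmrepositories.source.toolkit.fluxcd.io -A
kubectl get helmreleases.helm.toolkit.fluxcd.io -A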

The following scheme describes the relationship between Flux CRDs and controllers:

The following infrastructure is used for demonstration:

Infrastructure cluster

✓ icp> kubectl get nodes
NAME          STATUS   ROLES                  AGE   VERSION
dpu1-host-a   Ready    <none>                 23m   v1.23.4
dpu1-host-b   Ready    <none>                 23m   v1.23.4
dpu1-host-c   Ready    <none>                 23m   v1.23.4
dpu1-host-d   Ready    <none>                 23m   v1.23.4
icp-master    Ready    control-plane,master   24m   v1.23.4

Universe components

Universe components are deployed to the infrastructure and tenant clusters by following the Deployment guide.

Note

The Infrastructure Helm chart automatically creates a Helm Repository for CloudAdmin named ngc-helm-repo, referring to https://helm.ngc.nvidia.com/nvidia.
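
This pre-created repository can be inspected with the GetHelmRepository call described later on this page, for example (a sketch, using the same certificates and endpoint as the Admin examples below):

grpcurl -cacert=ca.crt -cert=admin.crt -key=admin.key -servername api-gateway.local \
    -d '{"name": "ngc-helm-repo"}' \
    -proto universe/catalog/v1/source.proto 10.133.133.1:30001 \
    universe.catalog.v1.SourceService.GetHelmRepository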

For each tenant in the infrastructure, there is a dedicated namespace:

✓ icp> kubectl get ns | grep tenant
tenant-tenant1   Active   26h
tenant-tenant2   Active   28h

The nodes have tenant and host labels:

✓ icp> kubectl get nodes --show-labels
NAME          STATUS   ROLES                  AGE   VERSION   LABELS
dpu1-host-a   Ready    <none>                 16m   v1.23.4   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,hostname.icp.nvidia.com/host-a=,kubernetes.io/arch=amd64,kubernetes.io/hostname=dpu1-host-a,kubernetes.io/os=linux,tenant-id.icp.nvidia.com/tenant1=
dpu1-host-b   Ready    <none>                 16m   v1.23.4   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,hostname.icp.nvidia.com/host-b=,kubernetes.io/arch=amd64,kubernetes.io/hostname=dpu1-host-b,kubernetes.io/os=linux,tenant-id.icp.nvidia.com/tenant1=
dpu1-host-c   Ready    <none>                 16m   v1.23.4   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,hostname.icp.nvidia.com/host-c=,kubernetes.io/arch=amd64,kubernetes.io/hostname=dpu1-host-c,kubernetes.io/os=linux,tenant-id.icp.nvidia.com/tenant2=
dpu1-host-d   Ready    <none>                 16m   v1.23.4   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,hostname.icp.nvidia.com/host-d=,kubernetes.io/arch=amd64,kubernetes.io/hostname=dpu1-host-d,kubernetes.io/os=linux,tenant-id.icp.nvidia.com/tenant2=
icp-master    Ready    control-plane,master   17m   v1.23.4   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=icp-master,kubernetes.io/os=linux,node-role.kubernetes.io/control-plane=,node-role.kubernetes.io/master=,node.kubernetes.io/exclude-from-external-load-balancers=

Universe components in the infrastructure cluster:

✓ icp> kubectl get pods -n universe
NAME                                                         READY   STATUS    RESTARTS   AGE
icp-universe-infra-admin-controller-7d9586576c-x85d2         1/1     Running   0          14m
icp-universe-infra-api-gateway-7f96c7c99c-g29xd              1/1     Running   0          14m
icp-universe-infra-catalog-manager-84d5bd7f4c-hmtrg          1/1     Running   0          14m
icp-universe-infra-flux-controller-helm-7f5ccf78b9-sq4f5     1/1     Running   0          14m
icp-universe-infra-flux-controller-source-7bd6c66964-gg2z5   1/1     Running   0          14m
icp-universe-infra-provisioning-manager-5554d8cf96-t68qx     1/1     Running   0          14m
icp-universe-infra-resource-manager-5d4694f88c-lfhs2         1/1     Running   0          14m
icp-universe-infra-workload-controller-5c96658bbf-4dzrk      1/1     Running   0          14m
icp-universe-infra-workload-manager-6bb556c9d8-scvs4         1/1     Running   0          14m
icp-universe-infra-workload-rule-manager-59ff998c65-fzlch    1/1     Running   0          14m

Here we test the Catalog API by deploying a DOCA service as CloudAdmin in the infrastructure cluster.

Note that in the examples below, the gRPC calls are made from Universe control-plane nodes over the Kubernetes service IP. (In a real deployment, the API Gateway is accessed externally via a LoadBalancer IP or NodePort.)
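
The address 10.133.133.1:30001 used below is specific to this demonstration cluster. A possible way to look up the gateway endpoint, assuming the API Gateway Service name matches the pod names shown above:

# Hypothetical lookup of the API Gateway Service endpoint;
# the actual Service name may differ per release
kubectl get svc -n universe | grep api-gateway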

The proto files and the generated Go client for the API can be found in the universe-api repo (refer to the Manual GRPC API usage document before starting); the examples below use the ‘grpcurl’ tool to exercise the Catalog API.
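
As a quick sanity check, the services defined in the proto files can be listed locally with grpcurl before issuing any calls. This is a sketch; depending on the imports used by the proto files, additional -import-path flags may be required.

# List the gRPC services defined in the Catalog API protos
# (no server connection is needed when a -proto file is supplied)
grpcurl -proto universe/catalog/v1/source.proto list
grpcurl -proto universe/catalog/v1/helm.proto list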

First, we create a Helm Repository pointing to NGC DOCA NVstaging. Since this repository is not publicly available, an NGC API key must be provided.

Create Credentials

grpcurl -cacert=ca.crt -cert=admin.crt -key=admin.key -servername api-gateway.local \
    -d '{"cred": {"name": "doca-secret", "user_name": "$oauthtoken", "password": "<my-api-key>"}}' \
    -proto universe/catalog/v1/source.proto 10.133.133.1:30001 \
    universe.catalog.v1.SourceService.CreateCredential

Create Helm Repository with reference to credentials

grpcurl -cacert=ca.crt -cert=admin.crt -key=admin.key -servername api-gateway.local \
    -d '{"helm_repository": {"name": "ngc", "url": "https://helm.ngc.nvidia.com/nvstaging/doca", "credential_ref": {"name": "doca-secret"}}}' \
    -proto universe/catalog/v1/source.proto 10.133.133.1:30001 \
    universe.catalog.v1.SourceService.CreateHelmRepository

Get Helm Repository

grpcurl -cacert=ca.crt -cert=admin.crt -key=admin.key -servername api-gateway.local \
    -d '{"name": "ngc"}' \
    -proto universe/catalog/v1/source.proto 10.133.133.1:30001 \
    universe.catalog.v1.SourceService.GetHelmRepository

Create Image Registry Credentials

If a Helm chart deploys containers whose images are hosted on a container registry that requires authentication, a Docker registry secret for pulling these images can be created via the CreateImageRegistryCredential API.

Note that this secret name must be specified under imagePullSecrets in the values of the Helm Release.

grpcurl -cacert=ca.crt -cert=admin.crt -key=admin.key -servername api-gateway.local \
    -d '{"cred": {"name": "doca-nvstaging", "server": "nvcr.io", "user_name": "$oauthtoken", "password": "<my-api-key>"}}' \
    -proto universe/catalog/v1/source.proto 10.133.133.1:30001 \
    universe.catalog.v1.SourceService.CreateImageRegistryCredential

Create Values

Helm charts support customization via a values YAML file. In this example, a nodeSelector and an image pull secret are added. The fields supported in the values file are specific to each Helm chart.

Note

The values file is a binary field in the protobuf message. The grpcurl utility requires binary fields to be base64-encoded before they can be used as request parameters.

cat << EOF | tee values.yaml
nodeSelector:
  role: storage
imagePullSecrets:
  - name: doca-nvstaging
EOF

# put the base64-encoded values file into the VALUES shell variable
VALUES=$(cat values.yaml | base64 -w0)

# the -d @ argument for grpcurl means the request body is read from STDIN;
# the content of the VALUES shell variable is used as the values
grpcurl -cacert=ca.crt -cert=admin.crt -key=admin.key -servername api-gateway.local \
    -d @ -proto universe/catalog/v1/helm.proto 10.133.133.1:30001 \
    universe.catalog.v1.HelmService.CreateValues << EOM
{"name": "app-values", "values": "$VALUES"}
EOM

Create Helm Release

Before creating the Helm Release, label three nodes with the `role=storage` label.

✓ icp> kubectl label nodes dpu1-host-a role=storage
✓ icp> kubectl label nodes dpu1-host-b role=storage
✓ icp> kubectl label nodes dpu1-host-c role=storage
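
The labels can be verified before creating the release:

✓ icp> kubectl get nodes -l role=storage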

Create the Helm release:

cat << EOF | tee release.json
{
  "release": {
    "name": "flow-inspector",
    "chart": "doca-flow-inspector",
    "version": "0.1.0",
    "source_ref": {
      "name": "ngc"
    },
    "values_ref": {
      "name": "app-values"
    }
  }
}
EOF

grpcurl -cacert=ca.crt -cert=admin.crt -key=admin.key -servername api-gateway.local \
    -d "`cat release.json`" \
    -proto universe/catalog/v1/helm.proto 10.133.133.1:30001 \
    universe.catalog.v1.HelmService.CreateRelease

Get Helm Release

grpcurl -cacert=ca.crt -cert=admin.crt -key=admin.key -servername api-gateway.local \
    -d '{"name": "flow-inspector"}' \
    -proto universe/catalog/v1/helm.proto 10.133.133.1:30001 \
    universe.catalog.v1.HelmService.GetRelease

Check that the DaemonSet was created:

✓ icp> kubectl get ds -n universe
NAME                  DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
doca-flow-inspector   3         3         0       3            0           <none>          23m

Check that the Pods were created (note that the Pods here are Pending due to a missing huge pages resource on the nodes):

✓ icp> kubectl get pods -n universe | grep doca
doca-flow-inspector-6fcbh   0/1   Pending   0   24m
doca-flow-inspector-6lpqw   0/1   Pending   0   24m
doca-flow-inspector-w4hld   0/1   Pending   0   24m
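
To confirm why a Pod is Pending, its events can be inspected (using one of the Pod names from the output above); the Events section should show the FailedScheduling reason reported by the scheduler:

✓ icp> kubectl describe pod doca-flow-inspector-6fcbh -n universe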
