Catalog API - Overview
This page describes Catalog API concepts and walks through an example.
Catalog API provides a way to manage multiple aspects of Helm:
Define Helm Repositories that host Helm charts. Repositories can be HTTP/HTTPS or OCI based.
Define credentials to access Helm Repositories, if authentication is required.
Define credentials for container image pull secrets.
Manage Helm values to be used by Helm Releases.
Manage the lifecycle of Helm Releases (install/update/delete).
The API is implemented using Flux.
It is available to both Tenants and Admins. Tenants are identified by an additional gRPC header.
For tenants, Helm Releases are confined to run only on the tenant-allocated DPUs and have limited permissions.
For Admins, there are no permission restrictions when installing charts, and workloads are not limited to any subset of DPUs or control plane nodes.
Catalog API
Catalog gRPC API - the API used to install Helm Releases in the infrastructure cluster.
Flux
Flux is an open and extensible continuous delivery solution for Kubernetes.
Flux is a GitOps tool that synchronizes the state of manifests (Helm Release) from a source (Helm Repository) to what is running in a cluster.
The Catalog API creates Flux CRs, which are handled by the Flux controllers to install Helm charts.
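For illustration, the Flux objects involved look roughly like the following. This is a sketch using the upstream Flux HelmRepository (source controller) and HelmRelease (Helm controller) kinds; the exact CRs, names, and namespaces created by the catalog manager may differ:
apiVersion: source.toolkit.fluxcd.io/v1beta2
kind: HelmRepository              # reconciled by the source controller
metadata:
  name: ngc
  namespace: universe
spec:
  url: https://helm.ngc.nvidia.com/nvstaging/doca
  secretRef:
    name: doca-secret             # credentials created via CreateCredential
  interval: 10m
---
apiVersion: helm.toolkit.fluxcd.io/v2beta1
kind: HelmRelease                 # reconciled by the Helm controller
metadata:
  name: flow-inspector
  namespace: universe
spec:
  interval: 10m
  chart:
    spec:
      chart: doca-flow-inspector
      version: 0.1.0
      sourceRef:
        kind: HelmRepository
        name: ngc
  valuesFrom:
    - kind: ConfigMap             # assumption: values stored as a ConfigMap
      name: app-values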
The following diagram describes the relationship between the Flux CRDs and controllers:
The following infrastructure is used for demonstration:
Infrastructure cluster
✓ icp> kubectl get nodes
NAME STATUS ROLES AGE VERSION
dpu1-host-a Ready <none> 23m v1.23.4
dpu1-host-b Ready <none> 23m v1.23.4
dpu1-host-c Ready <none> 23m v1.23.4
dpu1-host-d Ready <none> 23m v1.23.4
icp-master Ready control-plane,master 24m v1.23.4
Universe components
Universe components are deployed to the infrastructure and tenant clusters by following the Deployment guide.
The Infrastructure Helm chart automatically creates a Helm Repository for the CloudAdmin, named ngc-helm-repo, referring to https://helm.ngc.nvidia.com/nvidia.
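The pre-created repository can be inspected with the same GetHelmRepository call used later in this guide (a usage sketch, reusing the address and certificates from the examples below):
grpcurl -cacert=ca.crt -cert=admin.crt -key=admin.key -servername api-gateway.local \
-d '{"name": "ngc-helm-repo"}' \
-proto universe/catalog/v1/source.proto 10.133.133.1:30001 \
universe.catalog.v1.SourceService.GetHelmRepository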
For each tenant in the infrastructure there is a namespace:
✓ icp> kubectl get ns | grep tenant
tenant-tenant1 Active 26h
tenant-tenant2 Active 28h
The nodes have tenant and host labels:
✓ icp> kubectl get nodes --show-labels
NAME STATUS ROLES AGE VERSION LABELS
dpu1-host-a Ready <none> 16m v1.23.4 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,hostname.icp.nvidia.com/host-a=,kubernetes.io/arch=amd64,kubernetes.io/hostname=dpu1-host-a,kubernetes.io/os=linux,tenant-id.icp.nvidia.com/tenant1=
dpu1-host-b Ready <none> 16m v1.23.4 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,hostname.icp.nvidia.com/host-b=,kubernetes.io/arch=amd64,kubernetes.io/hostname=dpu1-host-b,kubernetes.io/os=linux,tenant-id.icp.nvidia.com/tenant1=
dpu1-host-c Ready <none> 16m v1.23.4 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,hostname.icp.nvidia.com/host-c=,kubernetes.io/arch=amd64,kubernetes.io/hostname=dpu1-host-c,kubernetes.io/os=linux,tenant-id.icp.nvidia.com/tenant2=
dpu1-host-d Ready <none> 16m v1.23.4 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,hostname.icp.nvidia.com/host-d=,kubernetes.io/arch=amd64,kubernetes.io/hostname=dpu1-host-d,kubernetes.io/os=linux,tenant-id.icp.nvidia.com/tenant2=
icp-master Ready control-plane,master 17m v1.23.4 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=icp-master,kubernetes.io/os=linux,node-role.kubernetes.io/control-plane=,node-role.kubernetes.io/master=,node.kubernetes.io/exclude-from-external-load-balancers=
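For example, the nodes allocated to tenant1 can be listed by selecting on the tenant label (a usage sketch; output omitted):
✓ icp> kubectl get nodes -l tenant-id.icp.nvidia.com/tenant1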
Universe components in the infrastructure cluster:
✓ icp> kubectl get pods -n universe
NAME READY STATUS RESTARTS AGE
icp-universe-infra-admin-controller-7d9586576c-x85d2 1/1 Running 0 14m
icp-universe-infra-api-gateway-7f96c7c99c-g29xd 1/1 Running 0 14m
icp-universe-infra-catalog-manager-84d5bd7f4c-hmtrg 1/1 Running 0 14m
icp-universe-infra-flux-controller-helm-7f5ccf78b9-sq4f5 1/1 Running 0 14m
icp-universe-infra-flux-controller-source-7bd6c66964-gg2z5 1/1 Running 0 14m
icp-universe-infra-provisioning-manager-5554d8cf96-t68qx 1/1 Running 0 14m
icp-universe-infra-resource-manager-5d4694f88c-lfhs2 1/1 Running 0 14m
icp-universe-infra-workload-controller-5c96658bbf-4dzrk 1/1 Running 0 14m
icp-universe-infra-workload-manager-6bb556c9d8-scvs4 1/1 Running 0 14m
icp-universe-infra-workload-rule-manager-59ff998c65-fzlch 1/1 Running 0 14m
Here we will test the Catalog API by deploying a DOCA service as the CloudAdmin in the infrastructure cluster.
Note that in the examples below, the gRPC calls are made from Universe control plane nodes over the Kubernetes service IP. (In a real deployment, the API Gateway is accessed externally via a load balancer IP or NodePort.)
The proto file and generated Go client for the API can be found in the universe-api repo (refer to the Manual GRPC API usage document before starting). The examples below use the grpcurl tool to exercise the Catalog API.
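If the protos from the universe-api repo are available locally, grpcurl can also describe the Catalog services offline, without contacting the API Gateway (a sketch; the import path depends on where the repo is checked out):
grpcurl -import-path . -proto universe/catalog/v1/helm.proto describe universe.catalog.v1.HelmService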
First, we will create a Helm Repository pointing to NGC DOCA NVstaging. Since this repository is not publicly available, an NGC API key must be provided.
Create Credentials
grpcurl -cacert=ca.crt -cert=admin.crt -key=admin.key -servername api-gateway.local \
-d '{"cred": {"name": "doca-secret", "user_name": "$oauthtoken", "password": "<my-api-key>"}}'
-proto universe/catalog/v1/source.proto 10.133.133.1:30001 \
universe.catalog.v1.SourceService.CreateCredential
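To avoid placing the API key directly on the command line, the same request can be sent by reading the body from STDIN with -d @, the pattern also used for CreateValues later in this guide (a sketch; NGC_API_KEY is a placeholder shell variable):
# put the NGC API key in a shell variable (placeholder value shown)
NGC_API_KEY='<my-api-key>'
grpcurl -cacert=ca.crt -cert=admin.crt -key=admin.key -servername api-gateway.local \
-d @ -proto universe/catalog/v1/source.proto 10.133.133.1:30001 \
universe.catalog.v1.SourceService.CreateCredential << EOM
{"cred": {"name": "doca-secret", "user_name": "\$oauthtoken", "password": "$NGC_API_KEY"}}
EOM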
Create Helm Repository with reference to credentials
grpcurl -cacert=ca.crt -cert=admin.crt -key=admin.key -servername api-gateway.local \
-d '{"helm_repository" :{"name": "ngc", "url": "https://helm.ngc.nvidia.com/nvstaging/doca", "credential_ref": {"name": "doca-secret"}}}' \
-proto universe/catalog/v1/source.proto 10.133.133.1:30001 \
universe.catalog.v1.SourceService.CreateHelmRepository
Get Helm Repository
grpcurl -cacert=ca.crt -cert=admin.crt -key=admin.key -servername api-gateway.local \
-d '{"name": "ngc"}' \
-proto universe/catalog/v1/source.proto 10.133.133.1:30001 \
universe.catalog.v1.SourceService.GetHelmRepository
Create Image Registry Credentials
If a Helm chart deploys containers whose images are hosted on a container registry that requires authentication, a Docker registry secret for pulling these images can be created via the CreateImageRegistryCredential API.
Note that this secret name must be specified under imagePullSecrets in the values of the Helm Release.
grpcurl -cacert=ca.crt -cert=admin.crt -key=admin.key -servername api-gateway.local \
-d '{"cred": {"name": "doca-nvstaging", "server":"nvcr.io", "user_name": "$oauthtoken", "password": ""<my-api-key>"}}'
-proto universe/catalog/v1/source.proto 10.133.133.1:30001 \
universe.catalog.v1.SourceService.CreateImageRegistryCredential
Create Values
Helm charts can support customization via a values yaml file. In this example, a nodeSelector and a Secret to pull images are added. The fields supported in the Values files are specific to each Helm Chart.
The values file is a binary field in the protobuf message. The grpcurl utility requires binary fields to be base64 encoded before they can be used as request parameters.
cat << EOF | tee values.yaml
nodeSelector:
  role: storage
imagePullSecrets:
  - name: doca-nvstaging
EOF
# put base64 encoded values file to VALUES shell variable
VALUES=$(cat values.yaml | base64 -w0)
# the -d @ argument tells grpcurl to read the request body from STDIN
# use content of VALUES shell variable as values
grpcurl -cacert=ca.crt -cert=admin.crt -key=admin.key -servername api-gateway.local \
-d @ -proto universe/catalog/v1/helm.proto 10.133.133.1:30001 \
universe.catalog.v1.HelmService.CreateValues << EOM
{"name": "app-values", "values" :"$VALUES"}
EOM
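To sanity-check the encoded payload, the values can be decoded back to plain YAML before sending the request:
echo "$VALUES" | base64 -d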
Create Helm Release
Before creating the Helm Release, label three nodes with the `role=storage` label.
✓ icp> kubectl label nodes dpu1-host-a role=storage
✓ icp> kubectl label nodes dpu1-host-b role=storage
✓ icp> kubectl label nodes dpu1-host-c role=storage
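The labeled nodes can be verified with a label selector (output omitted):
✓ icp> kubectl get nodes -l role=storage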
Create the Helm release:
cat << EOF | tee release.json
{
  "release": {
    "name": "flow-inspector",
    "chart": "doca-flow-inspector",
    "version": "0.1.0",
    "source_ref": {
      "name": "ngc"
    },
    "values_ref": {
      "name": "app-values"
    }
  }
}
EOF
grpcurl -cacert=ca.crt -cert=admin.crt -key=admin.key -servername api-gateway.local \
-d "`cat release.json`" \
-proto universe/catalog/v1/helm.proto 10.133.133.1:30001 \
universe.catalog.v1.HelmService.CreateRelease
Get Helm Release
grpcurl -cacert=ca.crt -cert=admin.crt -key=admin.key -servername api-gateway.local \
-d '{"name": "flow-inspector"}' \
-proto universe/catalog/v1/helm.proto 10.133.133.1:30001 \
universe.catalog.v1.HelmService.GetRelease
Check that the DaemonSet was created:
✓ icp> kubectl get ds -n universe
NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
doca-flow-inspector 3 3 0 3 0 <none> 23m
Check that the Pods are created (note that the Pods here are Pending due to missing HugePages resources on the nodes):
✓ icp> kubectl get pods -n universe | grep doca
doca-flow-inspector-6fcbh 0/1 Pending 0 24m
doca-flow-inspector-6lpqw 0/1 Pending 0 24m
doca-flow-inspector-w4hld 0/1 Pending 0 24m
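To see why a Pod is Pending (for example, the missing HugePages resources mentioned above), describe one of the Pods and inspect its scheduling events (a usage sketch, using a Pod name from the listing above):
✓ icp> kubectl describe pod doca-flow-inspector-6fcbh -n universe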