Workload API - overview

This page describes Workload API concepts with examples.

Workload API provides a way to define a Tenant-driven or CloudAdmin-driven rule. When a workload (Pod) that matches the rule starts in the Tenant cluster, the rule triggers the creation of a Pod resource in the infrastructure cluster.

A workload is something that is running in the tenant cluster. For example, a workload can represent a Kubernetes Pod, an OpenStack VM, or something else. Currently, the only supported tenant orchestrator is Kubernetes, and the only supported resource type is Pod.

Workload API is a set of APIs; it includes Tenant and CloudAdmin APIs.

Tenant API

  • Workload GRPC API - API to deliver notifications about workloads running in the Tenant cluster to the infrastructure cluster

  • WorkloadRule GRPC API - API to create WorkloadRules in the Infrastructure cluster

  • WorkloadRule CRD - simplifies usage of the WorkloadRule GRPC API from the tenant cluster

CloudAdmin API

The following infrastructure is used for the demonstration:


Infrastructure cluster


✓ icp> kubectl get node
NAME          STATUS   ROLES           AGE   VERSION
dpu1-host-a   Ready    <none>          27h   v1.24.0
dpu1-host-b   Ready    <none>          27h   v1.24.0
dpu1-host-c   Ready    <none>          27h   v1.24.0
dpu1-host-d   Ready    <none>          27h   v1.24.0
icp-master    Ready    control-plane   27h   v1.24.0

Tenant1 cluster


✓ tenant1> kubectl get node
NAME             STATUS   ROLES           AGE   VERSION
host-a           Ready    <none>          27h   v1.24.0
host-b           Ready    <none>          27h   v1.24.0
tenant1-master   Ready    control-plane   27h   v1.24.0

Tenant2 cluster


✓ tenant2> kubectl get node
NAME             STATUS   ROLES           AGE   VERSION
host-c           Ready    <none>          27h   v1.24.0
host-d           Ready    <none>          27h   v1.24.0
tenant2-master   Ready    control-plane   27h   v1.24.0

Universe components

Universe components are deployed to the infrastructure and tenant clusters by following the Deployment guide.

As a result, the infrastructure cluster has a separate namespace for each tenant:


✓ icp> kubectl get ns | grep tenant
tenant-tenant1   Active   26h
tenant-tenant2   Active   28h

Universe components in infrastructure cluster


✓ icp> kubectl get po -n vault
NAME                                    READY   STATUS    RESTARTS   AGE
vault-0                                 1/1     Running   0          28h
vault-agent-injector-6fd8f84794-xqlg9   1/1     Running   0          21s
✓ icp> kubectl get po -n universe
NAME                                                        READY   STATUS    RESTARTS   AGE
icp-universe-infra-admin-controller-6c578657ff-5bggg        1/1     Running   0          32s
icp-universe-infra-api-gateway-888c7dd8b-bqgpn              2/2     Running   0          31s
icp-universe-infra-provisioning-manager-65ddd8d568-c9bzr    1/1     Running   0          32s
icp-universe-infra-resource-manager-5cfcd597bc-shk25        1/1     Running   0          32s
icp-universe-infra-workload-controller-68f7ffcc77-sfg9c     1/1     Running   0          32s
icp-universe-infra-workload-manager-58fdbd88bd-4z7l8        1/1     Running   0          31s
icp-universe-infra-workload-rule-manager-7d7686d6cc-56qfz   1/1     Running   0          31s

Tenant1 components


✓ tenant1> kubectl get po -n vault
NAME                                    READY   STATUS    RESTARTS   AGE
vault-agent-injector-55d7dc8c6f-67kcl   1/1     Running   0          64s
✓ tenant1> kubectl get po -n universe
NAME                                                           READY   STATUS    RESTARTS   AGE
tcp-universe-k8s-tenant-resource-plugin-8455d9cd59-dnl2q       3/3     Running   0          35s
tcp-universe-k8s-tenant-workload-plugin-857dcb4b8c-tpn8s       3/3     Running   0          35s
tcp-universe-k8s-tenant-workload-rule-plugin-f5bc8d45b-h66tq   3/3     Running   0          35s

Tenant2 components


✓ tenant2> kubectl get po -n vault
NAME                                    READY   STATUS    RESTARTS   AGE
vault-agent-injector-55d7dc8c6f-6xc52   1/1     Running   0          19s
✓ tenant2> kubectl get po -n universe
NAME                                                           READY   STATUS    RESTARTS   AGE
tcp-universe-k8s-tenant-resource-plugin-8455d9cd59-pqfb6       3/3     Running   0          13s
tcp-universe-k8s-tenant-workload-plugin-857dcb4b8c-5qmm5       3/3     Running   0          13s
tcp-universe-k8s-tenant-workload-rule-plugin-f5bc8d45b-pp6j2   3/3     Running   0          12s

Here we will test the Universe resource API in the tenant1 cluster.

Create UVSPod resource


✓ tenant1> cat << 'EOF' | tee tenant1-uvspod1.yaml
apiVersion:
kind: UVSPod
metadata:
  name: tenant1-uvspod1
  namespace: universe
spec:
  object:
    apiVersion: v1
    kind: Pod
    metadata:
      name: tenant1-uvspod1
    spec:
      containers:
      - name: nginx
        image: nginx:1.14.2
EOF
✓ tenant1> kubectl apply -f tenant1-uvspod1.yaml
created

Check UVSPod status in the tenant cluster


✓ tenant1> kubectl get -n universe tenant1-uvspod1
NAME              RESULT    MESSAGE
tenant1-uvspod1   success

If everything operates correctly, RESULT should be success.

Check that the Pod resource has been created in the tenant namespace in the infrastructure cluster.


✓ icp> kubectl get po -n tenant-tenant1
NAME              READY   STATUS    RESTARTS   AGE
tenant1-uvspod1   1/1     Running   0          4m16s

In the infrastructure cluster, we should see the tenant1-uvspod1 Pod with the spec from the UVSPod that we created in the tenant cluster.

Tenant1 cluster

Create tenant1-pod1 Pod in the default namespace in the tenant1 cluster


✓ tenant1> cat << 'EOF' | tee tenant1-pod1.yaml
apiVersion: v1
kind: Pod
metadata:
  name: tenant1-pod1
spec:
  containers:
  - name: nginx
    image: nginx:1.14.2
EOF
✓ tenant1> kubectl apply -f tenant1-pod1.yaml
pod/tenant1-pod1 created

Create tenant1-pod2 Pod in the default namespace in the tenant1 cluster


✓ tenant1> cat << 'EOF' | tee tenant1-pod2.yaml
apiVersion: v1
kind: Pod
metadata:
  name: tenant1-pod2
spec:
  containers:
  - name: nginx
    image: nginx:1.14.2
EOF
✓ tenant1> kubectl apply -f tenant1-pod2.yaml
pod/tenant1-pod2 created

Tenant2 cluster

Create tenant2-pod1 Pod in the default namespace in the tenant2 cluster


✓ tenant2> cat << 'EOF' | tee tenant2-pod1.yaml
apiVersion: v1
kind: Pod
metadata:
  name: tenant2-pod1
spec:
  containers:
  - name: nginx
    image: nginx:1.14.2
EOF
✓ tenant2> kubectl apply -f tenant2-pod1.yaml
pod/tenant2-pod1 created

Tenant1 cluster

Create tenant1-rule1 WorkloadRule CR in the tenant1 cluster. This rule will match a workload if it runs in the default namespace in the tenant cluster and the workload name is tenant1-pod1.

For matching workloads, the rule will trigger the creation of the Pod, defined in the CR template section (simple nginx pod in our case).

Check WorkloadRule CRD format description for details.
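The matching behaviour described above can be illustrated with a small Python sketch. It is not the Universe implementation: the workload is modelled as a plain nested dictionary, only the In operator is handled, and the assumption that expressions inside one term are ANDed while separate terms are ORed is borrowed from Kubernetes node selector terms and is not confirmed by this page.

```python
from typing import Any


def lookup(workload: dict, dotted_key: str) -> Any:
    """Resolve a dotted key such as 'metadata.resourceName' in a nested dict."""
    value: Any = workload
    for part in dotted_key.split("."):
        if not isinstance(value, dict) or part not in value:
            return None
        value = value[part]
    return value


def expression_matches(workload: dict, expr: dict) -> bool:
    """Evaluate one match expression; only the 'In' operator is sketched here."""
    if expr["operator"] != "In":
        raise ValueError(f"operator not covered by this sketch: {expr['operator']}")
    return lookup(workload, expr["key"]) in expr["values"]


def rule_matches(workload: dict, workload_terms: list) -> bool:
    """ASSUMPTION: expressions within a term are ANDed, terms are ORed."""
    return any(
        all(expression_matches(workload, e) for e in term["matchExpressions"])
        for term in workload_terms
    )


# Terms equivalent to the ones used by tenant1-rule1.
TERMS = [
    {
        "matchExpressions": [
            {"key": "metadata.resourceNamespace", "operator": "In", "values": ["default"]},
            {"key": "metadata.resourceName", "operator": "In", "values": ["tenant1-pod1"]},
        ]
    }
]

# Hypothetical workload representations used only for this illustration.
POD1 = {"metadata": {"resourceNamespace": "default", "resourceName": "tenant1-pod1"}}
POD2 = {"metadata": {"resourceNamespace": "default", "resourceName": "tenant1-pod2"}}
```

With these inputs, rule_matches(POD1, TERMS) is true and rule_matches(POD2, TERMS) is false, which mirrors the expectation that only tenant1-pod1 triggers Pod creation.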


✓ tenant1> cat << 'EOF' | tee tenant1-rule1.yaml
apiVersion:
kind: WorkloadRule
metadata:
  name: tenant1-rule1
  namespace: universe
spec:
  resourceType: v1/Pod
  workloadTerms:
  - matchExpressions:
    - key: metadata.resourceNamespace
      operator: In
      values:
      - default
    - key: metadata.resourceName
      operator: In
      values:
      - tenant1-pod1
  workloadInfoInject:
  - workloadKey: state.nodeName
    asAnnotation:
      name: tenant-node-name
  - workloadKey: state.extra.labels
    asAnnotation:
      name: tenant-workload-labels
  dpuSelectionPolicy: Any
  template:
    apiVersion: v1
    kind: Pod
    spec:
      containers:
      - name: nginx
        image: nginx:1.14.2
        volumeMounts:
        - name: workload-info
          mountPath: /workload-info
        - name: workload-labels
          mountPath: /workload-labels
      # standard k8s way to mount annotation as a volume
      volumes:
      - name: workload-info
        downwardAPI:
          items:
          - path: node-name
            fieldRef:
              fieldPath: metadata.annotations['tenant-node-name']
      - name: workload-labels
        downwardAPI:
          items:
          - path: labels
            fieldRef:
              fieldPath: metadata.annotations['tenant-workload-labels']
EOF
✓ tenant1> kubectl apply -f tenant1-rule1.yaml
created

Infrastructure cluster

tenant1-rule1 should match only the tenant1-pod1 Pod, so we expect a single nginx Pod to be created in the tenant-tenant1 namespace in the infrastructure cluster.


# tenant1-uvspod1 is a pod which we created earlier
✓ icp> kubectl get po -n tenant-tenant1
NAME                                                 READY   STATUS    RESTARTS   AGE
tenant1-rule1-0a7c0d7f-ba7f-4301-afed-8db108dbee1a   1/1     Running   0          63s
tenant1-uvspod1                                      1/1     Running   0          16m
# there should be no pods in tenant-tenant2 namespace
✓ icp> kubectl get po -n tenant-tenant2
No resources found in tenant-tenant2 namespace.

You can use the following snippet to check which Pod was created by which rule:


✓ icp> kubectl get pods -n tenant-tenant1 -o=jsonpath='{range .items[*]}{}{"\t"}{.metadata.annotations.workloadrule\.workload\.universe\.nvidia\.com/name}{"\n"}{end}'
tenant1-rule1-0a7c0d7f-ba7f-4301-afed-8db108dbee1a    tenant1-rule1
tenant1-uvspod1
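If the output of this snippet needs to be processed by a script, a parser along the following lines could be used. This is only a sketch: it assumes the output has one Pod per line with the Pod name and the rule name separated by a tab, as the jsonpath template requests, and the function name parse_pod_rules is made up for this example.

```python
from typing import Dict, Optional


def parse_pod_rules(output: str) -> Dict[str, Optional[str]]:
    """Map each Pod name to the rule that created it (None if no rule annotation)."""
    mapping: Dict[str, Optional[str]] = {}
    for line in output.splitlines():
        if not line.strip():
            continue  # skip blank lines
        pod, _, rule = line.partition("\t")
        # Pods that were not created by a rule have an empty second column.
        mapping[pod.strip()] = rule.strip() or None
    return mapping


# Sample input shaped like the output shown above.
SAMPLE = (
    "tenant1-rule1-0a7c0d7f-ba7f-4301-afed-8db108dbee1a\ttenant1-rule1\n"
    "tenant1-uvspod1\t\n"
)
```

Calling parse_pod_rules(SAMPLE) yields a dictionary in which the rule-created Pod maps to tenant1-rule1 and tenant1-uvspod1 maps to None.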

It is also possible to check the workload status in the infrastructure cluster to see which rules the workload matches and which Pods were created for it.


✓ icp> kubectl get -n tenant-tenant1 workload-0a7c0d7f-ba7f-4301-afed-8db108dbee1a -o jsonpath={.status} | jq
{
  "rules": {
    "tenant": [
      {
        "id": "tenant-tenant1/tenant1-rule1",
        "status": {
          "objRef": {
            "apiVersion": "v1",
            "kind": "Pod",
            "name": "tenant1-rule1-0a7c0d7f-ba7f-4301-afed-8db108dbee1a",
            "namespace": "tenant-tenant1"
          },
        }
      }
    ]
  }
}
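As an illustration of how this status could be consumed, the following Python sketch collects the Pods referenced by a workload status. It relies only on the fields visible in the output above (rules.tenant, id, status.objRef); whether other keys exist alongside "tenant" is not shown on this page, and the helper name created_pods is hypothetical.

```python
from typing import List, Tuple


def created_pods(status: dict) -> List[Tuple[str, str, str]]:
    """Return (rule id, namespace, pod name) for every Pod referenced in the status."""
    refs: List[Tuple[str, str, str]] = []
    for rule in status.get("rules", {}).get("tenant", []):
        obj = rule.get("status", {}).get("objRef")
        if obj and obj.get("kind") == "Pod":
            refs.append((rule["id"], obj["namespace"], obj["name"]))
    return refs


# Status shaped like the example output above.
STATUS = {
    "rules": {
        "tenant": [
            {
                "id": "tenant-tenant1/tenant1-rule1",
                "status": {
                    "objRef": {
                        "apiVersion": "v1",
                        "kind": "Pod",
                        "name": "tenant1-rule1-0a7c0d7f-ba7f-4301-afed-8db108dbee1a",
                        "namespace": "tenant-tenant1",
                    }
                },
            }
        ]
    }
}
```

For the sample status, created_pods(STATUS) returns a single entry that ties tenant-tenant1/tenant1-rule1 to the Pod created for the workload.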

Tenant1 cluster

Update Pod template in tenant1-rule1 WorkloadRule


kubectl patch -n universe --type='json' \
  -p '[{"op" : "replace","path" : "/spec/template/spec/containers/0/name", "value": "updated"}]' tenant1-rule1

Infrastructure cluster

The Pod in the infrastructure cluster should be recreated with the updated spec; its container should now be named updated.


✓ icp> kubectl get pods -n tenant-tenant1 -o=jsonpath='{range .items[*]}{}{"\t"}{.spec.containers[0].name}{"\n"}{end}'
tenant1-rule1-0a7c0d7f-ba7f-4301-afed-8db108dbee1a    updated
tenant1-uvspod1    nginx

Now let’s check that workload info injection works as expected. In tenant1-rule1 WorkloadRule, we have a section configuring Workload labels injection for the Pod created in the infra cluster. With the command below, we check the content of the /workload-labels/labels file, which should include workload labels in JSON format. Currently, it should be empty.


# find the Pod which was created by the tenant1-rule1 rule
icp > RULE_POD=$(kubectl get pod -n tenant-tenant1 -o jsonpath='{range .items[?(@.metadata.annotations.workloadrule\.workload\.universe\.nvidia\.com/name=="tenant1-rule1")]}{}{"\n"}{end}' | head -n1)
# check the file content inside the Pod
icp > kubectl exec -ti -n tenant-tenant1 $RULE_POD -- cat /workload-labels/labels; echo
{}

Tenant1 cluster

Update the labels of tenant1-pod1 in the tenant1 cluster. We expect this information to be propagated to the Pod that was created by the tenant1-rule1 WorkloadRule in the infrastructure cluster.


tenant1> kubectl label pod tenant1-pod1 foo=bar
pod/tenant1-pod1 labeled

Infrastructure cluster

Let’s check that the workload labels were injected into the Pod in the infrastructure cluster.


# find the Pod which was created by the tenant1-rule1 rule
icp > RULE_POD=$(kubectl get pod -n tenant-tenant1 -o jsonpath='{range .items[?(@.metadata.annotations.workloadrule\.workload\.universe\.nvidia\.com/name=="tenant1-rule1")]}{}{"\n"}{end}' | head -n1)
# check the file content inside the Pod
icp > kubectl exec -ti -n tenant-tenant1 $RULE_POD -- cat /workload-labels/labels; echo
{"foo":"bar"}
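An application running inside the infrastructure Pod could read the injected labels in a similar way. The Python sketch below assumes the file contains a single JSON object, as in the output above; the default path matches the mountPath and downwardAPI path from tenant1-rule1, while the function names are invented for this example.

```python
import json
from typing import Dict


def parse_labels(text: str) -> Dict[str, str]:
    """Parse the JSON object stored in the labels file; empty content means no labels."""
    text = text.strip()
    if not text:
        return {}
    labels = json.loads(text)
    if not isinstance(labels, dict):
        raise ValueError("expected a JSON object with workload labels")
    return labels


def read_workload_labels(path: str = "/workload-labels/labels") -> Dict[str, str]:
    """Read and parse the labels file projected into the Pod by the downward API volume."""
    with open(path, encoding="utf-8") as handle:
        return parse_labels(handle.read())
```

Because the kubelet refreshes downward API volume files when the annotation changes, re-reading the file after the label update returns the new labels without restarting the container.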

Tenant1 cluster

Remove resourceName constraint from tenant1-rule1 WorkloadRule


kubectl patch -n universe --type='json' \
  -p '[{"op" : "remove","path" : "/spec/workloadTerms/0/matchExpressions/1"}]' tenant1-rule1

Now the tenant1-rule1 rule should match all Pods that are running in the default namespace in the tenant1 cluster.
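To make the effect of the JSON Patch operations used in this walkthrough easier to follow, here is a simplified Python sketch that applies "replace" and "remove" operations to a nested structure. It is an illustration only, not what the Kubernetes API server runs: it supports just these two operations and ignores JSON Pointer escaping (~0 and ~1).

```python
import copy
from typing import Any, List


def apply_patch(doc: Any, ops: List[dict]) -> Any:
    """Apply 'replace' and 'remove' operations to a deep copy of doc."""
    result = copy.deepcopy(doc)
    for op in ops:
        parts = op["path"].split("/")[1:]  # drop the empty element before the first "/"
        parent = result
        for part in parts[:-1]:
            parent = parent[int(part)] if isinstance(parent, list) else parent[part]
        last = parts[-1]
        key = int(last) if isinstance(parent, list) else last
        if op["op"] == "replace":
            if isinstance(parent, dict) and key not in parent:
                raise KeyError(f"cannot replace missing member: {op['path']}")
            parent[key] = op["value"]
        elif op["op"] == "remove":
            del parent[key]
        else:
            raise ValueError(f"operation not covered by this sketch: {op['op']}")
    return result


# Reduced version of the tenant1-rule1 spec, enough to show both patches.
RULE = {
    "spec": {
        "workloadTerms": [
            {
                "matchExpressions": [
                    {"key": "metadata.resourceNamespace", "operator": "In", "values": ["default"]},
                    {"key": "metadata.resourceName", "operator": "In", "values": ["tenant1-pod1"]},
                ]
            }
        ],
        "template": {"spec": {"containers": [{"name": "nginx", "image": "nginx:1.14.2"}]}},
    }
}

PATCHED = apply_patch(
    RULE,
    [
        {"op": "replace", "path": "/spec/template/spec/containers/0/name", "value": "updated"},
        {"op": "remove", "path": "/spec/workloadTerms/0/matchExpressions/1"},
    ],
)
```

After both operations, PATCHED keeps only the namespace expression, so the rule no longer restricts the workload name, and the container in the template is named updated.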

Infrastructure cluster


✓ icp> kubectl get po -n tenant-tenant1
NAME                                                 READY   STATUS    RESTARTS   AGE
tenant1-rule1-0a7c0d7f-ba7f-4301-afed-8db108dbee1a   1/1     Running   0          3m20s
tenant1-rule1-10ead831-47b6-407f-a903-c5c4cd92e6e8   1/1     Running   0          3m22s
tenant1-uvspod1                                      1/1     Running   0          51m

An additional Pod was created in the infrastructure cluster because tenant1-rule1 now also matches tenant1-pod2 in the tenant1 cluster.

Tenant1 cluster

Mirror UVSPods should be created in the tenant1 cluster for tenant1-rule1-* Pods.


tenant1> kubectl get -n universe
NAME                                                 RESULT    MESSAGE
tenant1-rule1-0a7c0d7f-ba7f-4301-afed-8db108dbee1a   success
tenant1-rule1-10ead831-47b6-407f-a903-c5c4cd92e6e8   success

Now we will remove tenant1-pod2 in the tenant1 cluster.


✓ tenant1> kubectl delete po tenant1-pod2
pod "tenant1-pod2" deleted

Infrastructure cluster

As a result, the Pod in the infrastructure cluster that was created by the tenant1-rule1 rule for the tenant1-pod2 Pod should be removed.


✓ icp> kubectl get po -n tenant-tenant1
NAME                                                 READY   STATUS    RESTARTS   AGE
tenant1-rule1-0a7c0d7f-ba7f-4301-afed-8db108dbee1a   1/1     Running   0          9m6s
tenant1-uvspod1                                      1/1     Running   0          51m

Tenant1 cluster

Remove tenant1-rule1 rule in the tenant1 cluster


✓ tenant1> kubectl delete -n universe tenant1-rule1
"tenant1-rule1" deleted

Infrastructure cluster

As a result, all Pods created by the tenant1-rule1 rule in the infrastructure cluster should be removed.


✓ icp> kubectl get po -n tenant-tenant1
NAME              READY   STATUS    RESTARTS   AGE
tenant1-uvspod1   1/1     Running   0          59m

universe.admin.workload.v1 GRPC API documentation

The current state of the clusters is as follows:

  • tenant1 cluster has tenant1-pod1 pod

  • tenant2 cluster has tenant2-pod1 pod


tenant1> kubectl get po
NAME           READY   STATUS    RESTARTS   AGE
tenant1-pod1   1/1     Running   0          3h42m
tenant2> kubectl get po
NAME           READY   STATUS    RESTARTS   AGE
tenant2-pod1   1/1     Running   0          3h41m

Now we will define an AdminWorkloadRule that matches Pods from both tenants.

Check the Manual GRPC API usage doc for instructions on how to use CloudAdmin APIs with grpcurl.

From Cloud Admin host


# put the base64-encoded Pod spec into the RULE_TEMPLATE shell variable
RULE_TEMPLATE=$(cat << EOM | base64 -w0
{
  "apiVersion": "v1",
  "kind": "Pod",
  "metadata": {
    "name": "nginx"
  },
  "spec": {
    "containers": [
      {
        "name": "nginx",
        "image": "nginx:1.14.2",
        "ports": [
          {
            "containerPort": 80
          }
        ]
      }
    ]
  }
}
EOM
)
# the "-d @" argument tells grpcurl to read the request from STDIN
# the content of the RULE_TEMPLATE shell variable is expanded inside the request below
grpcurl -cacert=ca.crt -cert=admin.crt -key=admin.key -servername api-gateway.local \
  -d @ -proto universe/admin/workload/v1/admin_workload_rule.proto \
  universe.admin.workload.v1.AdminWorkloadRuleService.Create << EOM
{
  "rule": {
    "id": "adminrule1",
    "tenant_match": [
      "tenant1",
      "tenant2"
    ],
    "data": {
      "orchestrator_type": 1,
      "resource_type": "v1/Pod",
      "dpu_selection_policy": "SameNode",
      "workload_terms": [
        {
          "match_expressions": [
            {
              "key": "metadata.resourceNamespace",
              "operation": 1,
              "values": [
                "default"
              ]
            }
          ]
        }
      ],
      "workload_info_inject": [
        {
          "key": "@",
          "as_annotation": {
            "name": "full-workload-info"
          }
        }
      ],
      "rule_template": "$RULE_TEMPLATE"
    }
  }
}
EOM

The command above creates an AdminWorkloadRule that matches workloads (Pods) in the default namespace in both tenant clusters. This rule should match tenant1-pod1 and tenant2-pod1 and create a Pod in the universe namespace in the infrastructure cluster for each of them.
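The same request body can be assembled programmatically. The Python sketch below mirrors the shell snippet: it base64-encodes the Pod template and places it in the rule_template field. The field names and values are copied from the request above; the helper names encode_template and build_create_request are made up for this example, and actually sending the request (for instance through grpcurl) is left out.

```python
import base64
import json
from typing import List


def encode_template(pod: dict) -> str:
    """Serialize the Pod template to JSON and base64-encode it, like `base64 -w0`."""
    return base64.b64encode(json.dumps(pod).encode("utf-8")).decode("ascii")


def build_create_request(rule_id: str, tenants: List[str], pod: dict) -> dict:
    """Build the JSON body for AdminWorkloadRuleService.Create as used in this walkthrough."""
    return {
        "rule": {
            "id": rule_id,
            "tenant_match": tenants,
            "data": {
                "orchestrator_type": 1,
                "resource_type": "v1/Pod",
                "dpu_selection_policy": "SameNode",
                "workload_terms": [
                    {
                        "match_expressions": [
                            {
                                "key": "metadata.resourceNamespace",
                                "operation": 1,
                                "values": ["default"],
                            }
                        ]
                    }
                ],
                "workload_info_inject": [
                    {"key": "@", "as_annotation": {"name": "full-workload-info"}}
                ],
                "rule_template": encode_template(pod),
            },
        }
    }


NGINX_POD = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "nginx"},
    "spec": {
        "containers": [
            {"name": "nginx", "image": "nginx:1.14.2", "ports": [{"containerPort": 80}]}
        ]
    },
}

REQUEST = build_create_request("adminrule1", ["tenant1", "tenant2"], NGINX_POD)
```

Printing json.dumps(REQUEST) produces a document that can be piped to grpcurl with the -d @ argument in place of the here-document.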

The AdminWorkloadRule uses "dpu_selection_policy": "SameNode", which means that the Pod created in the infrastructure cluster should start on the DPU installed in the host on which the tenant workload is running.
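The difference between the two selection policies used on this page can be pictured with the following Python sketch. It is purely illustrative: the host-to-DPU table is a hypothetical lookup derived from the node names of the demo clusters (host-a and dpu1-host-a, and so on), the interpretation of Any as "every DPU node is a candidate" is an assumption, and the real scheduling logic is not described here.

```python
from typing import Dict, List

# ASSUMPTION: mapping inferred from the demo node names, not from a documented API.
DEMO_HOST_TO_DPU: Dict[str, str] = {
    "host-a": "dpu1-host-a",
    "host-b": "dpu1-host-b",
    "host-c": "dpu1-host-c",
    "host-d": "dpu1-host-d",
}


def candidate_dpu_nodes(
    policy: str, workload_node: str, host_to_dpu: Dict[str, str]
) -> List[str]:
    """Return the infrastructure nodes on which the rule's Pod could be placed."""
    if policy == "Any":
        # ASSUMPTION: with "Any", all DPU nodes are acceptable.
        return sorted(set(host_to_dpu.values()))
    if policy == "SameNode":
        # Only the DPU installed in the host that runs the tenant workload.
        dpu = host_to_dpu.get(workload_node)
        return [dpu] if dpu is not None else []
    raise ValueError(f"unknown policy: {policy}")
```

For a workload running on host-c, the SameNode policy narrows the candidates to dpu1-host-c, while Any leaves all four DPU nodes available.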

Infrastructure cluster


icp > kubectl get po -n universe | grep adminrule1
adminrule1-tenant-tenant1-0a7c0d7f-ba7f-4301-afed-8db108dbee1a   1/1   Running   0   3m26s
adminrule1-tenant-tenant2-6c815148-a769-4271-8c4c-a9485c59cfbd   1/1   Running   0   3m26s

From Cloud Admin host

Remove AdminWorkloadRule adminrule1 and check that related Pods will be removed from the infrastructure cluster


grpcurl -cacert=ca.crt -cert=admin.crt -key=admin.key -servername api-gateway.local \
  -d '{"id": "adminrule1"}' \
  -proto universe/admin/workload/v1/admin_workload_rule.proto \
  universe.admin.workload.v1.AdminWorkloadRuleService.Delete

Infrastructure cluster

All Pods created by adminrule1 should be removed


icp > kubectl get po -n universe | grep adminrule1

