Workload API - overview
This page describes Workload API concepts with examples.
The Workload API provides a way to define a tenant-driven or CloudAdmin-driven rule that triggers Pod resource creation in the infrastructure cluster when a workload (Pod) matching the rule starts in the tenant cluster.
A workload is something that runs in the tenant cluster; for example, a workload can represent a Kubernetes Pod, an OpenStack VM, or something else.
Currently, the only supported tenant orchestrator is Kubernetes, and the only supported resource type is Pod.
The Workload API is a set of APIs that includes the Tenant and CloudAdmin APIs.
Tenant API
* Workload GRPC API - API to deliver notifications about workloads running in the tenant cluster to the infrastructure cluster
* WorkloadRule GRPC API - API to create WorkloadRules in the infrastructure cluster
* WorkloadRule CRD - simplifies usage of the WorkloadRule GRPC API from the tenant cluster
CloudAdmin API
* AdminWorkloadRule GRPC API - API to create AdminWorkloadRules in the infrastructure cluster
The following infrastructure is used for the demonstration:
Infrastructure cluster
✓ icp> kubectl get node
NAME STATUS ROLES AGE VERSION
dpu1-host-a Ready <none> 27h v1.24.0
dpu1-host-b Ready <none> 27h v1.24.0
dpu1-host-c Ready <none> 27h v1.24.0
dpu1-host-d Ready <none> 27h v1.24.0
icp-master Ready control-plane 27h v1.24.0
Tenant1 cluster
✓ tenant1> kubectl get node
NAME STATUS ROLES AGE VERSION
host-a Ready <none> 27h v1.24.0
host-b Ready <none> 27h v1.24.0
tenant1-master Ready control-plane 27h v1.24.0
Tenant2 cluster
✓ tenant2> kubectl get node
NAME STATUS ROLES AGE VERSION
host-c Ready <none> 27h v1.24.0
host-d Ready <none> 27h v1.24.0
tenant2-master Ready control-plane 27h v1.24.0
Universe components
Universe components are deployed to the infrastructure and tenant clusters by following the Deployment guide.
As a result, in the infrastructure cluster we have a separate namespace for each tenant:
✓ icp> kubectl get ns | grep tenant
tenant-tenant1 Active 26h
tenant-tenant2 Active 28h
Universe components in infrastructure cluster
✓ icp> kubectl get po -n vault
NAME READY STATUS RESTARTS AGE
vault-0 1/1 Running 0 28h
vault-agent-injector-6fd8f84794-xqlg9 1/1 Running 0 21s
✓ icp> kubectl get po -n universe
NAME READY STATUS RESTARTS AGE
icp-universe-infra-admin-controller-6c578657ff-5bggg 1/1 Running 0 32s
icp-universe-infra-api-gateway-888c7dd8b-bqgpn 2/2 Running 0 31s
icp-universe-infra-provisioning-manager-65ddd8d568-c9bzr 1/1 Running 0 32s
icp-universe-infra-resource-manager-5cfcd597bc-shk25 1/1 Running 0 32s
icp-universe-infra-workload-controller-68f7ffcc77-sfg9c 1/1 Running 0 32s
icp-universe-infra-workload-manager-58fdbd88bd-4z7l8 1/1 Running 0 31s
icp-universe-infra-workload-rule-manager-7d7686d6cc-56qfz 1/1 Running 0 31s
Tenant1 components
✓ tenant1> kubectl get po -n vault
NAME READY STATUS RESTARTS AGE
vault-agent-injector-55d7dc8c6f-67kcl 1/1 Running 0 64s
✓ tenant1> kubectl get po -n universe
NAME READY STATUS RESTARTS AGE
tcp-universe-k8s-tenant-resource-plugin-8455d9cd59-dnl2q 3/3 Running 0 35s
tcp-universe-k8s-tenant-workload-plugin-857dcb4b8c-tpn8s 3/3 Running 0 35s
tcp-universe-k8s-tenant-workload-rule-plugin-f5bc8d45b-h66tq 3/3 Running 0 35s
Tenant2 components
✓ tenant2> kubectl get po -n vault
NAME READY STATUS RESTARTS AGE
vault-agent-injector-55d7dc8c6f-6xc52 1/1 Running 0 19s
✓ tenant2> kubectl get po -n universe
NAME READY STATUS RESTARTS AGE
tcp-universe-k8s-tenant-resource-plugin-8455d9cd59-pqfb6 3/3 Running 0 13s
tcp-universe-k8s-tenant-workload-plugin-857dcb4b8c-5qmm5 3/3 Running 0 13s
tcp-universe-k8s-tenant-workload-rule-plugin-f5bc8d45b-pp6j2 3/3 Running 0 12s
Here we will test the Universe resource API in the tenant1 cluster.
Create UVSPod resource
✓ tenant1> cat << 'EOF' | tee tenant1-uvspod1.yaml
apiVersion: resource.universe.nvidia.com/v1alpha1
kind: UVSPod
metadata:
  name: tenant1-uvspod1
  namespace: universe
spec:
  object:
    apiVersion: v1
    kind: Pod
    metadata:
      name: tenant1-uvspod1
    spec:
      containers:
        - name: nginx
          image: nginx:1.14.2
EOF
✓ tenant1> kubectl apply -f tenant1-uvspod1.yaml
uvspod.resource.universe.nvidia.com/tenant1-uvspod1 created
Check UVSPod status in the tenant cluster
✓ tenant1> kubectl get uvspods.resource.universe.nvidia.com -n universe tenant1-uvspod1
NAME RESULT MESSAGE
tenant1-uvspod1 success
If everything operates correctly, RESULT should be success.
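If you want to script this check, here is a minimal polling sketch; it parses the RESULT column of the default kubectl output, since the exact status field behind that printer column is not documented here:
# wait until the RESULT column reports success for tenant1-uvspod1
tenant1> until kubectl get uvspods.resource.universe.nvidia.com -n universe tenant1-uvspod1 --no-headers | awk '{print $2}' | grep -q success; do sleep 2; done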
Check that the Pod resource has been created in the tenant namespace in the infrastructure cluster.
✓ icp> kubectl get po -n tenant-tenant1
NAME READY STATUS RESTARTS AGE
tenant1-uvspod1 1/1 Running 0 4m16s
In the infrastructure cluster we should see the tenant1-uvspod1 Pod with the spec from the UVSPod which we created in the tenant cluster.
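As an optional sanity check, you can compare the container image of the created Pod with the one defined in the UVSPod spec (illustrative command):
✓ icp> kubectl get po -n tenant-tenant1 tenant1-uvspod1 -o jsonpath='{.spec.containers[0].image}'; echo
nginx:1.14.2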
Tenant1 cluster
Create tenant1-pod1 Pod in the default namespace in the tenant1 cluster
✓ tenant1> cat << 'EOF' | tee tenant1-pod1.yaml
apiVersion: v1
kind: Pod
metadata:
  name: tenant1-pod1
spec:
  containers:
    - name: nginx
      image: nginx:1.14.2
EOF
✓ tenant1> kubectl apply -f tenant1-pod1.yaml
pod/tenant1-pod1 created
Create tenant1-pod2 Pod in the default namespace in the tenant1 cluster
✓ tenant1> cat << 'EOF' | tee tenant1-pod2.yaml
apiVersion: v1
kind: Pod
metadata:
  name: tenant1-pod2
spec:
  containers:
    - name: nginx
      image: nginx:1.14.2
EOF
✓ tenant1> kubectl apply -f tenant1-pod2.yaml
pod/tenant1-pod2 created
Tenant2 cluster
Create tenant2-pod1 Pod in the default namespace in the tenant2 cluster
✓ tenant2> cat << 'EOF' | tee tenant2-pod1.yaml
apiVersion: v1
kind: Pod
metadata:
  name: tenant2-pod1
spec:
  containers:
    - name: nginx
      image: nginx:1.14.2
EOF
✓ tenant2> kubectl apply -f tenant2-pod1.yaml
pod/tenant2-pod1 created
Tenant1 cluster
Create tenant1-rule1 WorkloadRule CR in the tenant1 cluster.
This rule will match a workload if it runs in the default namespace in the tenant cluster and the workload name is tenant1-pod1.
For matching workloads, the rule will trigger the creation of the Pod defined in the CR template section (a simple nginx Pod in our case).
Check the WorkloadRule CRD format description for details.
✓ tenant1> cat << 'EOF' | tee tenant1-rule1.yaml
apiVersion: workload.universe.nvidia.com/v1alpha1
kind: WorkloadRule
metadata:
  name: tenant1-rule1
  namespace: universe
spec:
  resourceType: v1/Pod
  workloadTerms:
    - matchExpressions:
        - key: metadata.resourceNamespace
          operator: In
          values:
            - default
        - key: metadata.resourceName
          operator: In
          values:
            - tenant1-pod1
  workloadInfoInject:
    - workloadKey: state.nodeName
      asAnnotation:
        name: tenant-node-name
    - workloadKey: state.extra.labels
      asAnnotation:
        name: tenant-workload-labels
  dpuSelectionPolicy: Any
  template:
    apiVersion: v1
    kind: Pod
    spec:
      containers:
        - name: nginx
          image: nginx:1.14.2
          volumeMounts:
            - name: workload-info
              mountPath: /workload-info
            - name: workload-labels
              mountPath: /workload-labels
      # standard k8s way to mount annotation as a volume
      volumes:
        - name: workload-info
          downwardAPI:
            items:
              - path: node-name
                fieldRef:
                  fieldPath: metadata.annotations['tenant-node-name']
        - name: workload-labels
          downwardAPI:
            items:
              - path: labels
                fieldRef:
                  fieldPath: metadata.annotations['tenant-workload-labels']
EOF
✓ tenant1> kubectl apply -f tenant1-rule1.yaml
workloadrule.workload.universe.nvidia.com/tenant1-rule1 created
Infrastructure cluster
Since tenant1-rule1 should match only the tenant1-pod1 Pod, we expect that a single nginx Pod will be created in the tenant-tenant1 namespace in the infrastructure cluster.
# tenant1-uvspod1 is a pod which we created earlier
✓ icp> kubectl get po -n tenant-tenant1
NAME READY STATUS RESTARTS AGE
tenant1-rule1-0a7c0d7f-ba7f-4301-afed-8db108dbee1a 1/1 Running 0 63s
tenant1-uvspod1 1/1 Running 0 16m
# there should be no pods in tenant-tenant2 namespace
✓ icp> kubectl get po -n tenant-tenant2
No resources found in tenant-tenant2 namespace.
You can use the following snippet to check which Pod was created by which rule:
✓ icp> kubectl get pods -n tenant-tenant1 -o=jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.metadata.annotations.workloadrule\.workload\.universe\.nvidia\.com/name}{"\n"}{end}'
tenant1-rule1-0a7c0d7f-ba7f-4301-afed-8db108dbee1a tenant1-rule1
tenant1-uvspod1
It is also possible to check the workload status in the infrastructure cluster to see which rules the workload matches and which Pods were created for it.
✓ icp> kubectl get -n tenant-tenant1 workloads.workload.infra.universe.nvidia.com workload-0a7c0d7f-ba7f-4301-afed-8db108dbee1a -o jsonpath={.status} | jq
{
  "rules": {
    "tenant": [
      {
        "id": "tenant-tenant1/tenant1-rule1",
        "status": {
          "objRef": {
            "apiVersion": "v1",
            "kind": "Pod",
            "name": "tenant1-rule1-0a7c0d7f-ba7f-4301-afed-8db108dbee1a",
            "namespace": "tenant-tenant1"
          }
        }
      }
    ]
  }
}
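If you need to find the Workload object name used in the command above, you can first list the Workload objects in the tenant namespace (the generated names in your environment will differ):
icp> kubectl get workloads.workload.infra.universe.nvidia.com -n tenant-tenant1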
Tenant1 cluster
Update Pod template in tenant1-rule1 WorkloadRule
kubectl patch workloadrules.workload.universe.nvidia.com -n universe --type='json' \
-p '[{"op" : "replace","path" : "/spec/template/spec/containers/0/name", "value": "updated"}]' tenant1-rule1
Infrastructure cluster
The Pod in the infrastructure cluster should be recreated with the updated spec; the container should now have the name updated.
✓ icp> kubectl get pods -n tenant-tenant1 -o=jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.containers[0].name}{"\n"}{end}'
tenant1-rule1-0a7c0d7f-ba7f-4301-afed-8db108dbee1a updated
tenant1-uvspod1 nginx
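Because the container name is immutable on a running Pod, the workload controller is expected to delete and recreate the Pod rather than patch it in place. A quick way to confirm this is to compare creation timestamps; the tenant1-rule1-* Pod should now be younger than tenant1-uvspod1:
icp> kubectl get po -n tenant-tenant1 -o=jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.metadata.creationTimestamp}{"\n"}{end}'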
Now let’s check that workload info injection works as expected. In the tenant1-rule1 WorkloadRule, we have a section configuring workload labels injection for the Pod created in the infra cluster.
With the command below, we check the content of the /workload-labels/labels file, which should include the workload labels in JSON format. Currently, it should be empty.
# find the Pod which was created by the tenant1-rule1 rule
icp > RULE_POD=$(kubectl get pod -n tenant-tenant1 -o jsonpath='{range .items[?(@.metadata.annotations.workloadrule\.workload\.universe\.nvidia\.com/name=="tenant1-rule1")]}{ .metadata.name}{"\n"}{end}' | head -n1)
# check the file content inside the Pod
icp > kubectl exec -ti -n tenant-tenant1 $RULE_POD -- cat /workload-labels/labels; echo
{}
Tenant1 cluster
Update the labels for tenant1-pod1 in the tenant1 cluster.
We expect this info to be transferred to the Pod which was created by the tenant1-rule1 WorkloadRule in the infrastructure cluster.
tenant1> kubectl label pod tenant1-pod1 foo=bar
pod/tenant1-pod1 labeled
Infrastructure cluster
Let’s check that the workload labels were injected into the Pod in the infrastructure cluster.
# find the Pod which was created by the tenant1-rule1 rule
icp > RULE_POD=$(kubectl get pod -n tenant-tenant1 -o jsonpath='{range .items[?(@.metadata.annotations.workloadrule\.workload\.universe\.nvidia\.com/name=="tenant1-rule1")]}{ .metadata.name}{"\n"}{end}' | head -n1)
# check the file content inside the Pod
icp > kubectl exec -ti -n tenant-tenant1 $RULE_POD -- cat /workload-labels/labels; echo
{"foo":"bar"}
Tenant1 cluster
Remove resourceName constraint from tenant1-rule1 WorkloadRule
kubectl patch workloadrules.workload.universe.nvidia.com -n universe --type='json' \
-p '[{"op" : "remove","path" : "/spec/workloadTerms/0/matchExpressions/1"}]' tenant1-rule1
Now the tenant1-rule1 rule should match all Pods running in the default namespace in the tenant1 cluster.
Infrastructure cluster
✓ icp> kubectl get po -n tenant-tenant1
NAME READY STATUS RESTARTS AGE
tenant1-rule1-0a7c0d7f-ba7f-4301-afed-8db108dbee1a 1/1 Running 0 3m20s
tenant1-rule1-10ead831-47b6-407f-a903-c5c4cd92e6e8 1/1 Running 0 3m22s
tenant1-uvspod1 1/1 Running 0 51m
An additional Pod was created in the infrastructure cluster as a result of tenant1-rule1 matching tenant1-pod2 in the tenant1 cluster.
Tenant1 cluster
Mirror UVSPods should be created in the tenant1 cluster for tenant1-rule1-* Pods.
tenant1> kubectl get uvspods.resource.universe.nvidia.com -n universe
NAME RESULT MESSAGE
tenant1-rule1-0a7c0d7f-ba7f-4301-afed-8db108dbee1a success
tenant1-rule1-10ead831-47b6-407f-a903-c5c4cd92e6e8 success
Now we will remove tenant1-pod2 in the tenant1 cluster.
✓ tenant1> kubectl delete po tenant1-pod2
pod "tenant1-pod2" deleted
Infrastructure cluster
As a result, the Pod in the infrastructure cluster which was created by the tenant1-rule1 rule for the tenant1-pod2 Pod should be removed.
✓ icp> kubectl get po -n tenant-tenant1
NAME READY STATUS RESTARTS AGE
tenant1-rule1-0a7c0d7f-ba7f-4301-afed-8db108dbee1a 1/1 Running 0 9m6s
tenant1-uvspod1 1/1 Running 0 51m
Tenant1 cluster
Remove tenant1-rule1 rule in the tenant1 cluster
✓ tenant1> kubectl delete -n universe workloadrules.workload.universe.nvidia.com tenant1-rule1
workloadrule.workload.universe.nvidia.com "tenant1-rule1" deleted
Infrastructure cluster
As a result, all Pods created by the tenant1-rule1 rule in the infrastructure cluster should be removed.
✓ icp> kubectl get po -n tenant-tenant1
NAME READY STATUS RESTARTS AGE
tenant1-uvspod1 1/1 Running 0 59m
Next, we will test the CloudAdmin AdminWorkloadRule API; check the universe.admin.workload.v1 GRPC API documentation for details.
The current state of the clusters is the following:
* tenant1 cluster has tenant1-pod1 pod
* tenant2 cluster has tenant2-pod1 pod
tenant1> kubectl get po
NAME READY STATUS RESTARTS AGE
tenant1-pod1 1/1 Running 0 3h42m
tenant2> kubectl get po
NAME READY STATUS RESTARTS AGE
tenant2-pod1 1/1 Running 0 3h41m
Now we will define an AdminWorkloadRule which matches Pods from both tenants.
Check the Manual GRPC API usage doc for instructions on how to use the CloudAdmin APIs with grpcurl.
From Cloud Admin host
# put the base64-encoded Pod spec into the RULE_TEMPLATE shell variable
RULE_TEMPLATE=$(cat << EOM | base64 -w0
{
  "apiVersion": "v1",
  "kind": "Pod",
  "metadata": {
    "name": "nginx"
  },
  "spec": {
    "containers": [
      {
        "name": "nginx",
        "image": "nginx:1.14.2",
        "ports": [
          {
            "containerPort": 80
          }
        ]
      }
    ]
  }
}
EOM
)
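Optionally, you can verify locally that the encoded template decodes back to valid JSON before sending it:
# decode the template and pretty-print it; a jq parse error indicates a malformed template
echo "$RULE_TEMPLATE" | base64 -d | jq .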
# the -d @ argument for grpcurl means: read the request body from STDIN
# use the content of the RULE_TEMPLATE shell variable as rule.data.rule_template
grpcurl -cacert=ca.crt -cert=admin.crt -key=admin.key -servername api-gateway.local \
-d @ -proto universe/admin/workload/v1/admin_workload_rule.proto 10.133.133.1:30001 \
universe.admin.workload.v1.AdminWorkloadRuleService.Create << EOM
{
  "rule": {
    "id": "adminrule1",
    "tenant_match": [
      "tenant1", "tenant2"
    ],
    "data": {
      "orchestrator_type": 1,
      "resource_type": "v1/Pod",
      "dpu_selection_policy": "SameNode",
      "workload_terms": [
        {
          "match_expressions": [
            {
              "key": "metadata.resourceNamespace",
              "operation": 1,
              "values": [
                "default"
              ]
            }
          ]
        }
      ],
      "workload_info_inject": [
        {
          "key": "@",
          "as_annotation": {
            "name": "full-workload-info"
          }
        }
      ],
      "rule_template": "$RULE_TEMPLATE"
    }
  }
}
EOM
The command above will create an AdminWorkloadRule which matches workloads (Pods) in the default namespace in both tenant clusters.
This rule should match tenant1-pod1 and tenant2-pod1 and create a Pod in the universe namespace in the infrastructure cluster for each of them.
The AdminWorkloadRule uses "dpu_selection_policy": "SameNode", which means that the Pod created in the infrastructure cluster should start on the DPU installed in the host on which the tenant workload is running.
Infrastructure cluster
icp > kubectl get po -n universe | grep adminrule1
adminrule1-tenant-tenant1-0a7c0d7f-ba7f-4301-afed-8db108dbee1a 1/1 Running 0 3m26s
adminrule1-tenant-tenant2-6c815148-a769-4271-8c4c-a9485c59cfbd 1/1 Running 0 3m26s
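To see the SameNode policy in action, you can print the infrastructure node each created Pod was scheduled on. With this demo topology, each Pod is expected to land on the dpu1-host-* node that corresponds to the host where the matching tenant workload runs (exact node names depend on where tenant1-pod1 and tenant2-pod1 were scheduled):
icp > kubectl get po -n universe -o=jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.nodeName}{"\n"}{end}' | grep adminrule1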
From Cloud Admin host
Remove the AdminWorkloadRule adminrule1 and check that the related Pods are removed from the infrastructure cluster.
grpcurl -cacert=ca.crt -cert=admin.crt -key=admin.key -servername api-gateway.local \
-d '{"id": "adminrule1"}' \
-proto universe/admin/workload/v1/admin_workload_rule.proto 10.133.133.1:30001 \
universe.admin.workload.v1.AdminWorkloadRuleService.Delete
Infrastructure cluster
All Pods created by adminrule1 should be removed; the command below should return no output.
icp > kubectl get po -n universe | grep adminrule1