Appendix F. Deployment Detail Checklist
Implementation References
Fast-moving implementation detail belongs in product docs and deployment guides. This paper states what the architecture requires; the external references provide version-specific steps, commands, support limits, and product workflow details.
Table 16: Implementation References
| Area / External reference | Use it for | Keep in this paper |
|---|---|---|
| NVIDIA Confidential Containers Reference Architecture | General CoCo architecture, component roles, GPU Operator, Kata Containers, Trustee, and sample CoCo workflow. | This paper keeps only the confidential-inference trust model and deployment requirements. |
| NVIDIA Confidential Containers deployment | Node labeling, Kata installation, GPU Operator configuration, runtime classes, sample workloads, and operational prerequisites. | The reference implementation must name the validated runtime class, node pool, and configuration profile. |
| NVIDIA Confidential Containers attestation | Trustee provisioning, KBS endpoint configuration, CPU/GPU evidence flow, NRAS, policy customization, and troubleshooting. | The architecture states what must be proven and when model keys can be released. |
| NVIDIA Confidential Containers supported platforms | Current GPU, CPU, host OS, kernel, runtime, KBS protocol, GPU Operator, and Kata support status. | The paper records only the validated profile used by this reference implementation. |
| Upstream Confidential Containers Trustee | AA, CDH, KBS, AS, RVPS, KBS protocol, attestation policies, KBS resource policies, and initdata policy binding. | The paper uses the upstream terms only where they clarify the confidential-inference key-release flow. |
| NVIDIA confidential computing platform guidance | Hardware selection, BIOS/firmware prerequisites, CPU/GPU confidential-computing setup, and platform validation guidance. | The hardware profile is part of the architecture claim and must be recorded per deployment. |
| Kubernetes RuntimeClass | RuntimeClass behavior, scheduling overhead, and the Kubernetes mechanism for selecting a configured runtime handler. | CoCo uses a confidential runtime class to launch pods inside measured confidential VM sandboxes. |
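
The RuntimeClass mechanism named in the last row can be illustrated with a minimal sketch. It builds a `node.k8s.io/v1` RuntimeClass manifest and a pod that selects it through `runtimeClassName`, both as plain Python dictionaries. The class name, handler name, node label, overhead values, and image reference are hypothetical placeholders, not the validated profile of any deployment; the validated names come from the deployment guide.

```python
import json


def runtime_class_manifest(name, handler, node_selector, pod_overhead=None):
    """Build a node.k8s.io/v1 RuntimeClass manifest as a plain dict."""
    manifest = {
        "apiVersion": "node.k8s.io/v1",
        "kind": "RuntimeClass",
        "metadata": {"name": name},
        # The handler must match a runtime handler configured on the node.
        "handler": handler,
        # Restrict pods using this class to nodes carrying the given labels.
        "scheduling": {"nodeSelector": dict(node_selector)},
    }
    if pod_overhead:
        # Fixed per-pod resources charged for the sandbox itself.
        manifest["overhead"] = {"podFixed": dict(pod_overhead)}
    return manifest


def pod_manifest(name, image, runtime_class_name):
    """Build a Pod manifest that selects a runtime class by name."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "runtimeClassName": runtime_class_name,
            "containers": [{"name": "inference", "image": image}],
        },
    }


# All values below are hypothetical placeholders.
rc = runtime_class_manifest(
    name="example-confidential",
    handler="example-kata-cc-handler",
    node_selector={"example.com/cc-capable": "true"},
    pod_overhead={"memory": "2Gi", "cpu": "250m"},
)
pod = pod_manifest(
    name="inference-demo",
    image="registry.example.com/inference:signed",
    runtime_class_name=rc["metadata"]["name"],
)

print(json.dumps(rc, indent=2))
print(json.dumps(pod, indent=2))
```

Deriving the pod's `runtimeClassName` from the RuntimeClass object, rather than repeating the string, keeps the two manifests from drifting apart when the validated class name changes.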
Reference Deployment Bill of Materials
Table 17: Reference Deployment Bill of Materials
| BOM area | Items to capture | Owner |
|---|---|---|
| Server and accelerator hardware | Server model/count, CPU TEE, GPU SKU/form factor/count, memory, NVMe, NICs/DPU, switch/fabric role, management network | OEM/integrator with NVIDIA and platform operator |
| Firmware and host stack | BIOS/UEFI versions, secure boot, IOMMU/VFIO, CPU TEE settings, GPU CC mode, GPU firmware, host OS, kernel | OEM/integrator and platform operator |
| Kubernetes and runtime stack | Kubernetes platform, confidential runtime, RuntimeClass, node labels, admission/RBAC, network policy | Platform operator and CC software provider |
| Key-release services | KBS, Attestation Service, policy, KMS/HSM settings, reference values, collateral cache, audit/SIEM integration | CC software provider and key-release authority |
| NVIDIA software | GPU Operator, driver version, CUDA/runtime, device plugin, GPU CC mode, topology | NVIDIA and platform operator |
| Workload artifacts | Signed workload image, encrypted model artifact, artifact store, runtime policy, startup service, measurements, key IDs, policy versions | Model provider and ISV |
| Network and identity | DNS, Route/gateway/load balancer, firewall rules, allowed egress endpoints, TLS certificates, service identities, SIEM targets | Platform operator and security team |
| Validation assets | Non-sensitive sample key/model, positive/negative tests, expected logs/errors, runbook links | OEM/integrator and solution architects |
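
One way to use Table 17 as a checklist is to record the bill of materials as structured data and flag the items that are still unrecorded. The sketch below assumes a hypothetical record layout: the area keys and field names are illustrative shorthand for a subset of the "Items to capture" column, not a schema defined by this paper, and all example values are placeholders.

```python
# Hypothetical shorthand for a subset of the items in Table 17.
REQUIRED_ITEMS = {
    "server_and_accelerator_hardware": ["server_model", "cpu_tee", "gpu_sku", "gpu_count"],
    "firmware_and_host_stack": ["bios_version", "secure_boot", "gpu_cc_mode", "host_os", "kernel"],
    "kubernetes_and_runtime_stack": ["kubernetes_platform", "confidential_runtime", "runtime_class"],
    "key_release_services": ["kbs", "attestation_service", "policy_version"],
    "nvidia_software": ["gpu_operator", "driver_version"],
    "workload_artifacts": ["workload_image", "model_artifact", "key_ids"],
    "network_and_identity": ["allowed_egress", "tls_certificates"],
    "validation_assets": ["positive_tests", "negative_tests", "runbook_links"],
}

# Values treated as "not recorded". False and 0 are valid recorded values.
EMPTY_VALUES = (None, "", [], {})


def missing_items(bom):
    """Return {area: [unrecorded fields]} for every area with gaps."""
    gaps = {}
    for area, fields in REQUIRED_ITEMS.items():
        recorded = bom.get(area, {})
        absent = [f for f in fields if recorded.get(f) in EMPTY_VALUES]
        if absent:
            gaps[area] = absent
    return gaps


# A partially completed record; every value is a placeholder.
example_bom = {
    "server_and_accelerator_hardware": {
        "server_model": "example-server",
        "cpu_tee": "example-tee",
        "gpu_sku": "example-gpu",
        "gpu_count": 8,
    },
    "firmware_and_host_stack": {
        "bios_version": "0.0.0-example",
        "secure_boot": True,
        "gpu_cc_mode": "on",
        "host_os": "example-os",
    },
}

gaps = missing_items(example_bom)
for area, fields in gaps.items():
    print(f"{area}: missing {', '.join(fields)}")
```

A check like this only confirms that an item was written down; whether the recorded value matches the validated profile still has to be verified against the external references.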