Platform and Hardware Requirements

Hardware support is tied to a specific validated configuration. Each deployment must document the validated hardware and Kubernetes profile: server model, CPU TEE, GPU SKU and count, firmware, GPU confidential-computing mode, memory, storage, network adapters, management network, node-pool design, runtime configuration, and acceptance tests.
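As a minimal sketch, the documented profile can be kept as a structured record so that undocumented items are easy to spot. The class name `ValidatedProfile`, the helper `missing_fields`, the field names, and the sample values are all hypothetical. They mirror the list above and do not represent a schema defined by any product.

```python
from dataclasses import dataclass, fields


@dataclass
class ValidatedProfile:
    """Hypothetical record with one field per item the deployment documents."""
    server_model: str = ""
    cpu_tee: str = ""
    gpu_sku: str = ""
    gpu_count: int = 0
    firmware: str = ""
    gpu_cc_mode: str = ""
    memory: str = ""
    storage: str = ""
    network_adapters: str = ""
    management_network: str = ""
    node_pool_design: str = ""
    runtime_configuration: str = ""
    acceptance_tests: str = ""


def missing_fields(profile: ValidatedProfile) -> list[str]:
    """Return the names of fields that are still empty (or zero)."""
    return [f.name for f in fields(profile) if not getattr(profile, f.name)]


# Placeholder values for illustration only.
draft = ValidatedProfile(server_model="example-server", cpu_tee="SEV-SNP", gpu_count=8)
print("Still undocumented:", missing_fields(draft))
```

A profile would be treated as complete only when `missing_fields` returns an empty list. Real deployments will likely need richer types than plain strings, for example per-component firmware versions.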

Support status changes quickly. Treat this section as a point-in-time validation profile, not the source of truth for currently supported combinations. For NVIDIA GPU confidential-computing support, start with NVIDIA Confidential Computing, the Secure AI Compatibility Matrix, and the current NVIDIA Confidential Containers supported-platforms documentation. Also check the confidential-computing (CC) software provider’s product page and support matrix for the selected stack.

Kubernetes support is validated the same way. The validated profile records the Kubernetes platform, confidential runtime, Key Broker Service (KBS), Attestation Service, GPU Operator, NVIDIA driver, firmware, GPU mode, and RuntimeClass support status for the target deployment.

Before the pod sandbox launches, the worker node checks that the CPU TEE, secure boot, IOMMU, memory encryption, GPU firmware, GPU confidential-computing capability, and the runtime itself are all in the expected state. On AMD, that means SEV-SNP is enabled in firmware and exposed to the kernel and the Kata launch stack; on Intel, the same applies to TDX.
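A few of these host-side checks can be sketched by reading Linux sysfs. This is an illustration under stated assumptions, not the mechanism the runtime itself uses: it assumes the `kvm_amd` `sev_snp` and `kvm_intel` `tdx` module parameters are present (they depend on kernel version and configuration), it reads the UEFI `SecureBoot` variable through efivars, and it treats any entry under `/sys/class/iommu` as evidence of an active IOMMU. GPU firmware, GPU confidential-computing capability, and the runtime state are not covered. The function names are hypothetical.

```python
from pathlib import Path

# Standard UEFI global-variable GUID for the SecureBoot variable.
SECURE_BOOT_VAR = "firmware/efi/efivars/SecureBoot-8be4df61-93ca-11d2-aa0d-00e098032b8c"


def read_flag(path: Path) -> bool:
    """True if a kernel module parameter file reads Y or 1."""
    try:
        return path.read_text().strip() in ("Y", "1")
    except OSError:
        return False  # missing file or module not loaded


def secure_boot_enabled(sys_root: Path) -> bool:
    """efivars files hold 4 attribute bytes followed by the variable data."""
    try:
        data = (sys_root / SECURE_BOOT_VAR).read_bytes()
    except OSError:
        return False
    return len(data) >= 5 and data[4] == 1


def iommu_present(sys_root: Path) -> bool:
    """True if at least one IOMMU is registered with the kernel."""
    iommu_dir = sys_root / "class/iommu"
    return iommu_dir.is_dir() and any(iommu_dir.iterdir())


def preflight(sys_root: Path = Path("/sys")) -> dict[str, bool]:
    """Collect the host checks; sys_root is a parameter so it can be faked."""
    return {
        "sev_snp": read_flag(sys_root / "module/kvm_amd/parameters/sev_snp"),
        "tdx": read_flag(sys_root / "module/kvm_intel/parameters/tdx"),
        "secure_boot": secure_boot_enabled(sys_root),
        "iommu": iommu_present(sys_root),
    }


def node_ready(results: dict[str, bool]) -> bool:
    """Require one CPU TEE (AMD or Intel) plus secure boot and an IOMMU."""
    cpu_tee = results["sev_snp"] or results["tdx"]
    return cpu_tee and results["secure_boot"] and results["iommu"]


results = preflight()
print(results, "ready" if node_ready(results) else "not ready")
```

Passing `sys_root` explicitly keeps the checks testable against a fake directory tree. A production check would also need to confirm the state reported by the platform firmware rather than rely on kernel parameters alone.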

The GPU is placed in confidential-computing mode before the guest starts. The GPU Operator and node configuration bind devices for the confidential runtime and avoid loading conflicting drivers. Some mode transitions require a device reset or host-level privilege to complete.
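The driver-binding part of this can be inspected from the host through sysfs. The sketch below lists NVIDIA display-class PCI devices and reports any that are not bound to the expected driver. It assumes the confidential runtime uses PCI passthrough with `vfio-pci` as the host-side driver, which the text above does not state, so adjust `EXPECTED_DRIVER` to match the validated profile. The function names are hypothetical, and the sketch does not query or change the GPU confidential-computing mode itself.

```python
from pathlib import Path
from typing import Optional

NVIDIA_VENDOR_ID = "0x10de"    # PCI vendor ID assigned to NVIDIA
EXPECTED_DRIVER = "vfio-pci"   # assumption: passthrough via VFIO


def gpu_driver_bindings(
    pci_root: Path = Path("/sys/bus/pci/devices"),
) -> dict[str, Optional[str]]:
    """Map each NVIDIA display-class device address to its bound driver."""
    bindings: dict[str, Optional[str]] = {}
    if not pci_root.is_dir():
        return bindings
    for dev in sorted(pci_root.iterdir()):
        try:
            vendor = (dev / "vendor").read_text().strip()
            pci_class = (dev / "class").read_text().strip()
        except OSError:
            continue
        # Class 0x03xxxx is "display controller"; this skips bridges and switches.
        if vendor != NVIDIA_VENDOR_ID or not pci_class.startswith("0x03"):
            continue
        driver_link = dev / "driver"  # symlink to the bound driver, if any
        bindings[dev.name] = driver_link.resolve().name if driver_link.exists() else None
    return bindings


def unexpected_bindings(
    bindings: dict[str, Optional[str]], expected: str = EXPECTED_DRIVER
) -> dict[str, Optional[str]]:
    """Devices that are unbound or bound to a driver other than the expected one."""
    return {bdf: drv for bdf, drv in bindings.items() if drv != expected}


bindings = gpu_driver_bindings()
print("GPU bindings:", bindings)
print("Needs attention:", unexpected_bindings(bindings))
```

A device that appears in `unexpected_bindings` with a driver name may indicate a conflicting host driver was loaded; one that appears with `None` has no driver bound at all.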

GPU generation and topology matter. A configuration validated on one GPU generation, form factor, interconnect, driver version, or firmware level does not automatically cover another. PCIe, SXM, NVLink, NVSwitch, MIG, GPU Operator behavior, firmware, and GPU confidential-computing support status can change security and performance behavior.
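One way to make this rule concrete is to compare a candidate configuration against the validated one attribute by attribute and treat any difference as requiring its own validation. The sketch below is illustrative only: `GpuConfig`, `coverage_gaps`, the chosen attributes, and the placeholder values are hypothetical, and a real profile would track more dimensions (MIG, NVSwitch, GPU Operator version, and so on).

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class GpuConfig:
    """Hypothetical subset of the attributes a validation covers."""
    generation: str
    form_factor: str
    interconnect: str
    driver_version: str
    firmware_level: str


def coverage_gaps(validated: GpuConfig, candidate: GpuConfig) -> dict[str, tuple[str, str]]:
    """Return {attribute: (validated value, candidate value)} for every mismatch."""
    v, c = asdict(validated), asdict(candidate)
    return {name: (v[name], c[name]) for name in v if v[name] != c[name]}


# Placeholder values, not real SKUs or versions.
validated = GpuConfig("gen-A", "SXM", "NVLink", "driver-1", "fw-1")
candidate = GpuConfig("gen-A", "PCIe", "PCIe", "driver-1", "fw-1")

gaps = coverage_gaps(validated, candidate)
print("Covered by existing validation" if not gaps else f"Needs validation: {gaps}")
```

Exact-match comparison is deliberately strict. It reflects the statement that a validation on one form factor, interconnect, driver version, or firmware level does not automatically extend to another.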

Licensing can also shape the deployment. If a model provider license is tied to a specific GPU class or cannot span two validated profiles, the validation profile needs to say so.