Required Capabilities

A compliant implementation needs six things: a confidential runtime class, a measured confidential pod sandbox, a GPU in confidential-computing mode, an attestation verifier, a key-release service, and logs that record what happened without exposing what was processed.

Concretely, it must:

Schedule the workload through a confidential runtime class onto approved GPU nodes.
Launch the pod inside a measured confidential VM.
Produce fresh attestation evidence for CPU, GPU, guest, workload image, runtime policy, and firmware state.
Release the model key only after evidence matches policy and includes a fresh verifier-provided nonce.
Keep model keys out of Kubernetes Secrets, node-visible storage, host-visible volumes, and standard pod-management paths.
Keep model artifacts encrypted whenever they are outside the confidential guest.
Fail closed when attestation evidence, reference values, policy, or collateral do not match.
Audit attestation and key release without recording prompts, responses, model weights, keys, or customer data.
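The key-release, fail-closed, and audit requirements above can be illustrated with a minimal Python sketch. It is not a reference implementation: the names `KeyReleaseGate`, `Evidence`, `NONCE_TTL_SECONDS`, and the component labels are hypothetical, and the 60-second freshness window is an assumed value. A real verifier would also validate signed hardware evidence against vendor collateral, which this sketch reduces to a digest comparison against reference values.

```python
import hashlib
import hmac
import secrets
import time
from dataclasses import dataclass

NONCE_TTL_SECONDS = 60  # assumed freshness window, not taken from the source


@dataclass(frozen=True)
class Evidence:
    """Hypothetical evidence bundle: the echoed nonce plus one digest per component."""
    nonce: str
    measurements: dict  # component name -> hex digest


class KeyReleaseGate:
    """Hypothetical verifier and key-release gate that fails closed on any mismatch."""

    # Components named in the requirement list; the labels themselves are illustrative.
    REQUIRED = ("cpu", "gpu", "guest", "workload_image", "runtime_policy", "firmware")

    def __init__(self, reference_values, model_key):
        self._reference = dict(reference_values)
        self._model_key = model_key
        self._nonces = {}    # nonce -> issue time
        self.audit_log = []  # decisions only; never keys, prompts, or weights

    def issue_nonce(self, now=None):
        nonce = secrets.token_hex(16)
        self._nonces[nonce] = time.time() if now is None else now
        return nonce

    def _check(self, evidence, now):
        # Pop the nonce so it can be used at most once.
        issued = self._nonces.pop(evidence.nonce, None)
        if issued is None:
            return "unknown_or_reused_nonce"
        if now - issued > NONCE_TTL_SECONDS:
            return "stale_nonce"
        for name in self.REQUIRED:
            expected = self._reference.get(name)
            actual = evidence.measurements.get(name)
            if expected is None or actual is None:
                return f"missing:{name}"  # missing reference or evidence: deny
            if not hmac.compare_digest(expected, actual):
                return f"mismatch:{name}"
        return None

    def release_key(self, evidence, now=None):
        now = time.time() if now is None else now
        reason = self._check(evidence, now)
        # Record the decision and a hash of the nonce, nothing sensitive.
        self.audit_log.append({
            "time": now,
            "nonce_sha256": hashlib.sha256(evidence.nonce.encode()).hexdigest(),
            "decision": "deny" if reason else "release",
            "reason": reason or "ok",
        })
        return None if reason else self._model_key
```

In this sketch every failure path returns `None` rather than the key, so a missing reference value is treated the same way as a mismatched one. The audit entries carry only a timestamp, a hash of the nonce, and the decision, which keeps them useful for review without recording what was processed.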

Any stack that produces equivalent evidence, enforces equivalent policy, and preserves the same trust boundaries satisfies the architecture. Component responsibilities and interfaces are in Appendix C.