Kata Sandbox Workloads (Experimental)#

Warning

This feature is experimental and not fully supported. It is included as a preview for testing environments and is not recommended for production use. Functionality, implementation, and APIs may change in future releases. Kata Containers are the foundational technology for extending confidential computing to native Kubernetes deployments. This release adds support for the Kata sandbox runtime; support for Confidential Containers is planned for a future release of the NIM Operator.

The NIM Operator leverages the NVIDIA GPU Operator to run NIMs inside Kata Containers. This page outlines how to deploy a NIM workload inside a Kata Sandbox container.

Kata Containers are lightweight virtual machines (VMs) that act like containers while retaining the workload isolation and security advantages of VMs. Each Kata container runs inside a VM on the host, with its own guest operating system and kernel. Hardware virtualization and a separate kernel provide stronger workload isolation than traditional containers.

Running NIM inside Kata containers therefore adds lightweight, virtualized isolation for enhanced security.

Note

This use case has been tested only for NIM Service deployments using Kata sandbox containers. Confidential Containers and NIM Cache deployments have not been tested and are not supported in this release. Support for Confidential Containers is planned for a future release.

Prerequisites#

  • Make sure KubeletPodResourcesGet is enabled on your cluster. The NVIDIA GPU runtime classes use VFIO cold-plug, which requires the Kata runtime to query Kubelet’s Pod Resources API to discover allocated GPU devices during sandbox creation. For Kubernetes versions older than 1.34, you must explicitly enable the KubeletPodResourcesGet feature gate in your Kubelet configuration. For Kubernetes 1.34 and later, this feature is enabled by default.
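    The feature gate can be enabled through the kubelet configuration file. A minimal sketch follows; the file path varies by distribution (commonly /var/lib/kubelet/config.yaml), and the kubelet must be restarted after editing:

```yaml
# Excerpt from the kubelet configuration file.
# Only the featureGates entry below is relevant; keep your existing settings.
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
featureGates:
  KubeletPodResourcesGet: true
```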

  • Deploy Kata runtime and configure the NVIDIA GPU Operator components. Refer to the GPU Operator guide.

  • Set the GPU Operator to non-confidential computing mode. Add the nvidia.com/cc.mode=off label to all GPU nodes; the GPU Operator uses this label to ensure that confidential computing (CC) mode is disabled.

    $ kubectl label nodes <node-name> nvidia.com/cc.mode=off
    

    Wait until all GPU Operator operands return to the Running state. This can take several minutes because switching modes restarts several GPU Operator components.
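    You can confirm that the Kata runtime class is registered and watch the operands settle. A sketch, assuming the GPU Operator is installed in its default gpu-operator namespace:

```shell
# Confirm the Kata runtime class referenced later by the NIMService exists.
$ kubectl get runtimeclass kata-qemu-nvidia-gpu

# Watch the GPU Operator operands until all pods report Running or Completed.
$ kubectl get pods -n gpu-operator --watch
```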

Install the NIM Operator#

Deploy the NIM Operator onto your cluster following the installation instructions.

Deploy NIM in a Kata Container#

  1. Create a file, llm-kata-sandbox.yaml, based on the sample manifest.

    Note

    This sample creates a NIM service in the nim-service namespace. Ensure the namespace exists and has image pull secrets configured before running the sample.
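    For example, the namespace and secrets referenced by the sample manifest can be created as follows. This is a sketch that assumes your NGC API key is exported as the NGC_API_KEY environment variable; the secret names ngc-secret and ngc-api-secret match the sample manifest below:

```shell
$ kubectl create namespace nim-service

# Image pull secret for nvcr.io, referenced by spec.image.pullSecrets.
$ kubectl create secret docker-registry ngc-secret -n nim-service \
    --docker-server=nvcr.io \
    --docker-username='$oauthtoken' \
    --docker-password="$NGC_API_KEY"

# API key secret, referenced by spec.authSecret.
$ kubectl create secret generic ngc-api-secret -n nim-service \
    --from-literal=NGC_API_KEY="$NGC_API_KEY"
```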

    # NIMService example: Kata VM sandbox (runtimeClassName: kata-qemu-nvidia-gpu); does *not* enable encryption
    ---
    apiVersion: apps.nvidia.com/v1alpha1
    kind: NIMService
    metadata:
      name: meta-llama-3-2-1b-instruct-kata-sandbox
      namespace: nim-service
    spec:
      image:
        repository: nvcr.io/nim/meta/llama-3.2-1b-instruct
        tag: "1.12.0"
        pullPolicy: IfNotPresent
        pullSecrets:
          - ngc-secret
      authSecret: ngc-api-secret
      storage:
        emptyDir:
          sizeLimit: 10Gi
      replicas: 1
      resources:
        limits:
          nvidia.com/pgpu: "1"
          cpu: "8"
          memory: "16Gi"
      expose:
        service:
          type: ClusterIP
          port: 8000
      runtimeClassName: kata-qemu-nvidia-gpu
      userID: 0
      groupID: 0
    

    The following fields are required to deploy in a Kata container:

    • Set spec.runtimeClassName to kata-qemu-nvidia-gpu. This is the Kata runtime class.

    • Set spec.userID to 0 and spec.groupID to 0.

    • Use the spec.storage.emptyDir field to configure ephemeral storage sized for your NIM model.

  2. Apply the manifest:

    $ kubectl apply -f llm-kata-sandbox.yaml
    

Validate NIM Running in Kata Container#

Confirm the NIMService is running in a Kata container by checking the kernel version inside the pod; it should differ from the kernel version of the host where the pod is scheduled.

  1. Retrieve the NIM pod name:

    $ POD=$(kubectl get pods -n nim-service -o name | grep meta-llama-3-2-1b-instruct-kata-sandbox | head -1 | cut -d/ -f2)
    
  2. Verify the pod is running inside a Kata environment:

    $ kubectl exec -it $POD -n nim-service -- uname -a
    

    Example output:

    Linux meta-llama-3-2-1b-instruct-kata-sandbox-5689f9bc67-ljnlf 6.18.12-nvidia-gpu #1 SMP Fri Feb 27 09:33:52 UTC 2026 x86_64 x86_64 x86_64 GNU/Linux
    

    Note that if you are using a non-default Kata VM image, the output displays that VM's kernel instead.
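    To contrast the guest kernel with the host kernel, you can check the node the pod is scheduled on; the node's kernel version is reported in its status. A sketch:

```shell
# Find which node the pod is scheduled on.
$ kubectl get pod "$POD" -n nim-service -o jsonpath='{.spec.nodeName}{"\n"}'

# The host kernel version; it should differ from the guest kernel
# reported by uname inside the Kata VM.
$ kubectl get node <node-name> -o jsonpath='{.status.nodeInfo.kernelVersion}{"\n"}'
```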