GPU Operator with MIG#

About Multi-Instance GPU#

Multi-Instance GPU (MIG) enables GPUs based on the NVIDIA Ampere and later architectures, such as NVIDIA A100, to be partitioned into separate and secure GPU instances for CUDA applications. Refer to the MIG User Guide for more information about MIG.

GPU Operator deploys MIG Manager to manage MIG configuration on nodes in your Kubernetes cluster. You must enable MIG during installation by choosing a MIG strategy before you can configure MIG.

Refer to the architecture section for more information about how MIG is implemented in the GPU Operator.

Enabling MIG During Installation#

Use the following steps to enable MIG and deploy MIG Manager.

  1. Install the Operator:

    $ helm install --wait --generate-name \
        -n gpu-operator --create-namespace \
        nvidia/gpu-operator \
        --version=v26.3.0 \
        --set mig.strategy=single
    

    This example sets single as the MIG strategy. Available MIG strategy options:

    • single: MIG mode is enabled on all GPUs on a node and every GPU is configured with the same MIG profile. MIG devices are requested with the generic nvidia.com/gpu resource name.

    • mixed: Nodes can expose a mixture of MIG profiles, and MIG mode does not need to be enabled on every GPU. Each MIG device is requested with a profile-specific resource name, such as nvidia.com/mig-1g.10gb.

    In a cloud service provider (CSP) environment such as Google Cloud, also specify --set migManager.env[0].name=WITH_REBOOT --set-string migManager.env[0].value=true to ensure that the node reboots and can apply the MIG configuration.

    MIG Manager supports preinstalled drivers, meaning drivers that you installed directly on the host rather than drivers managed by the GPU Operator. If drivers are preinstalled, also specify --set driver.enabled=false. Refer to MIG Manager with Preinstalled Drivers for more details.
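    If you prefer, you can collect the same options in a Helm values file instead of passing individual flags. The following is a sketch that combines the flags described above; the WITH_REBOOT and driver.enabled settings are optional and apply only to the CSP and preinstalled-driver cases, respectively:

    ```yaml
    # values.yaml sketch: single MIG strategy, plus the optional CSP and
    # preinstalled-driver settings described above
    mig:
      strategy: single
    migManager:
      env:
      - name: WITH_REBOOT      # CSP environments only
        value: "true"
    # driver:
    #   enabled: false         # only when drivers are preinstalled on the host
    ```

    Pass the file to the same install command with -f values.yaml.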

    After several minutes, all GPU Operator pods, including the nvidia-mig-manager pod, are deployed on nodes that have MIG-capable GPUs.

    Note

    MIG Manager requires that no user workloads are running on the GPUs being configured. In some cases, such as in CSP environments, the node might need to be rebooted, so consider cordoning and draining the node before changing the MIG mode or the MIG geometry on the GPUs.

  2. Optional: Display the pods in the Operator namespace:

    $ kubectl get pods -n gpu-operator
    

    Example Output

    NAME                                                          READY   STATUS      RESTARTS   AGE
    gpu-feature-discovery-qmwb2                                   1/1     Running     0          14m
    gpu-operator-7bbf8bb6b7-xz664                                 1/1     Running     0          14m
    gpu-operator-node-feature-discovery-gc-79d6d968bb-sg4t6       1/1     Running     0          14m
    gpu-operator-node-feature-discovery-master-6d9f8d497c-7cwrp   1/1     Running     0          14m
    gpu-operator-node-feature-discovery-worker-x5z62              1/1     Running     0          14m
    nvidia-container-toolkit-daemonset-pkcpr                      1/1     Running     0          14m
    nvidia-cuda-validator-wt6bc                                   0/1     Completed   0          12m
    nvidia-dcgm-exporter-zsskv                                    1/1     Running     0          14m
    nvidia-device-plugin-daemonset-924x6                          1/1     Running     0          14m
    nvidia-driver-daemonset-klj5s                                 1/1     Running     0          14m
    nvidia-mig-manager-8d6wz                                      1/1     Running     0          12m
    nvidia-operator-validator-fnsmk                               1/1     Running     0          14m
    
  3. Optional: Display the labels applied to the node:

    $ kubectl get node -o json | jq '.items[].metadata.labels'
    

    Partial Output

      "nvidia.com/gpu.present": "true",
      "nvidia.com/gpu.product": "NVIDIA-H100-80GB-HBM3",
      "nvidia.com/gpu.replicas": "1",
      "nvidia.com/gpu.sharing-strategy": "none",
      "nvidia.com/mig.capable": "true",
      "nvidia.com/mig.config": "all-disabled",
      "nvidia.com/mig.config.state": "success",
      "nvidia.com/mig.strategy": "single",
      "nvidia.com/mps.capable": "false"
    }
    

Configuring MIG Profiles#

When MIG is enabled, nodes are labeled with nvidia.com/mig.config: all-disabled by default. To use a profile on a node, update the label value with the desired profile, for example, nvidia.com/mig.config=all-1g.10gb.

Starting with GPU Operator v26.3.0, MIG Manager generates the MIG configuration for a node at runtime from the available hardware. On startup, MIG Manager discovers the MIG profiles of each MIG-capable GPU on a node using the NVIDIA Management Library (NVML) and writes a ConfigMap for each MIG-capable node in your cluster. The ConfigMap is named <node-name>-mig-config, where <node-name> is the name of the node. Each ConfigMap contains a complete mig-parted config, including all-disabled, all-enabled, per-profile configs such as all-1g.10gb, and all-balanced with device-filter support for mixed GPU types. When a new MIG-capable GPU is added to a node, it is automatically added to the ConfigMap.
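For reference, each generated ConfigMap entry follows the mig-parted config schema. The following is an illustrative sketch only; the actual profile names, device counts, and configs are generated per node from the installed GPUs:

```yaml
# Sketch of the data in a generated <node-name>-mig-config ConfigMap
# (illustrative; real content depends on the GPUs discovered on the node)
config.yaml: |
  version: v1
  mig-configs:
    all-disabled:
      - devices: all
        mig-enabled: false
    all-1g.10gb:
      - devices: all
        mig-enabled: true
        mig-devices:
          "1g.10gb": 7
```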

If you need custom profiles, you can use a custom MIG configuration instead of the generated one. You can use the Helm chart to create a ConfigMap from values at install time, or create and reference your own ConfigMap. For an example, refer to Example: Custom MIG Configuration During Installation.

Note

Generated MIG configurations might not be available with older drivers, such as the 535 branch GPU drivers, which do not support querying MIG profiles while MIG mode is disabled. In those cases, the GPU Operator uses a static ConfigMap, default-mig-parted-config, for MIG profiles.

Example: Single MIG Strategy#

The following steps show how to use the single MIG strategy and configure the 1g.10gb profile on one node.

  1. Configure the MIG strategy to single if you are unsure of the current strategy:

    $ kubectl patch clusterpolicies.nvidia.com/cluster-policy \
        --type='json' \
        -p='[{"op":"replace", "path":"/spec/mig/strategy", "value":"single"}]'
    
  2. Label the nodes with the profile to configure:

    $ kubectl label nodes <node-name> nvidia.com/mig.config=all-1g.10gb --overwrite
    

    MIG Manager applies a mig.config.state label to the node and terminates all GPU pods in preparation for enabling MIG mode and configuring the GPUs into the desired MIG geometry.

  3. Optional: Display the node labels:

    $ kubectl get node <node-name> -o=jsonpath='{.metadata.labels}' | jq .
    

    Partial Output

      "nvidia.com/gpu.product": "NVIDIA-H100-80GB-HBM3",
      "nvidia.com/gpu.replicas": "1",
      "nvidia.com/gpu.sharing-strategy": "none",
      "nvidia.com/mig.capable": "true",
      "nvidia.com/mig.config": "all-1g.10gb",
      "nvidia.com/mig.config.state": "pending",
      "nvidia.com/mig.strategy": "single"
    }
    

    When the WITH_REBOOT option is set, MIG Manager sets the label to nvidia.com/mig.config.state: rebooting.

  4. Confirm that MIG Manager completed the configuration by checking the node labels:

    $ kubectl get node <node-name> -o=jsonpath='{.metadata.labels}' | jq .
    

    Check for the following labels:

    • nvidia.com/gpu.count: 7 (the value differs according to the GPU model)

    • nvidia.com/gpu.slices.ci: 1

    • nvidia.com/gpu.slices.gi: 1

    • nvidia.com/mig.config.state: success

    Partial Output

    "nvidia.com/gpu.count": "7",
    "nvidia.com/gpu.present": "true",
    "nvidia.com/gpu.product": "NVIDIA-H100-80GB-HBM3-MIG-1g.10gb",
    "nvidia.com/gpu.slices.ci": "1",
    "nvidia.com/gpu.slices.gi": "1",
    "nvidia.com/mig.capable": "true",
    "nvidia.com/mig.config": "all-1g.10gb",
    "nvidia.com/mig.config.state": "success",
    "nvidia.com/mig.strategy": "single"
    
  5. Optional: Run the nvidia-smi command in the driver container to verify that the MIG configuration has been applied.

    $ kubectl exec -it -n gpu-operator ds/nvidia-driver-daemonset -- nvidia-smi -L
    

    Example Output

    GPU 0: NVIDIA H100 80GB HBM3 (UUID: GPU-b4895dbf-9350-2524-a89b-98161ddd9fe4)
      MIG 1g.10gb     Device  0: (UUID: MIG-3f6f389f-b0cc-5e5c-8e32-eaa8fd067902)
      MIG 1g.10gb     Device  1: (UUID: MIG-35f93699-4b53-5a19-8289-80b8418eec60)
      MIG 1g.10gb     Device  2: (UUID: MIG-9d14fb21-4ae1-546f-a636-011582899c39)
      MIG 1g.10gb     Device  3: (UUID: MIG-0f709664-740c-52b0-ae79-3e4c9ede6d3b)
      MIG 1g.10gb     Device  4: (UUID: MIG-5d23f73a-d378-50ac-a6f5-3bf5184773bb)
      MIG 1g.10gb     Device  5: (UUID: MIG-6cea15c7-8a56-578c-b965-0e73cb6dfc10)
      MIG 1g.10gb     Device  6: (UUID: MIG-981c86e9-3607-57d7-9426-295347e4b925)
    

Example: Mixed MIG Strategy#

The following steps show how to use the mixed MIG strategy and configure the all-balanced profile on one node.

  1. Configure the MIG strategy to mixed if you are unsure of the current strategy:

    $ kubectl patch clusterpolicies.nvidia.com/cluster-policy \
        --type='json' \
        -p='[{"op":"replace", "path":"/spec/mig/strategy", "value":"mixed"}]'
    
  2. Label the nodes with the profile to configure:

    $ kubectl label nodes <node-name> nvidia.com/mig.config=all-balanced --overwrite
    

    MIG Manager applies a mig.config.state label to the node and terminates all GPU pods in preparation for enabling MIG mode and configuring the GPUs into the desired MIG geometry.

  3. Confirm that MIG Manager completed the configuration by checking the node labels:

    $ kubectl get node <node-name> -o=jsonpath='{.metadata.labels}' | jq .
    

    Check for labels like the following. The profiles and GPU counts differ according to the GPU model.

    • nvidia.com/mig-1g.10gb.count: 2

    • nvidia.com/mig-2g.20gb.count: 1

    • nvidia.com/mig-3g.40gb.count: 1

    • nvidia.com/mig.config.state: success

    Partial Output

      "nvidia.com/gpu.present": "true",
      "nvidia.com/gpu.product": "NVIDIA-H100-80GB-HBM3",
      "nvidia.com/gpu.replicas": "0",
      "nvidia.com/gpu.sharing-strategy": "none",
      "nvidia.com/mig-1g.10gb.count": "2",
      "nvidia.com/mig-1g.10gb.engines.copy": "1",
      "nvidia.com/mig-1g.10gb.engines.decoder": "1",
      "nvidia.com/mig-1g.10gb.engines.encoder": "0",
      "nvidia.com/mig-1g.10gb.engines.jpeg": "1",
      "nvidia.com/mig-1g.10gb.engines.ofa": "0",
      "nvidia.com/mig-1g.10gb.memory": "9984",
      "nvidia.com/mig-1g.10gb.multiprocessors": "16",
      "nvidia.com/mig-1g.10gb.product": "NVIDIA-H100-80GB-HBM3-MIG-1g.10gb",
      "nvidia.com/mig-1g.10gb.replicas": "1",
      "nvidia.com/mig-1g.10gb.sharing-strategy": "none",
      "nvidia.com/mig-1g.10gb.slices.ci": "1",
      "nvidia.com/mig-1g.10gb.slices.gi": "1",
      "nvidia.com/mig-2g.20gb.count": "1",
      "nvidia.com/mig-2g.20gb.engines.copy": "2",
      "nvidia.com/mig-2g.20gb.engines.decoder": "2",
      "nvidia.com/mig-2g.20gb.engines.encoder": "0",
      "nvidia.com/mig-2g.20gb.engines.jpeg": "2",
      "nvidia.com/mig-2g.20gb.engines.ofa": "0",
      "nvidia.com/mig-2g.20gb.memory": "20096",
      "nvidia.com/mig-2g.20gb.multiprocessors": "32",
      "nvidia.com/mig-2g.20gb.product": "NVIDIA-H100-80GB-HBM3-MIG-2g.20gb",
      "nvidia.com/mig-2g.20gb.replicas": "1",
      "nvidia.com/mig-2g.20gb.sharing-strategy": "none",
      "nvidia.com/mig-2g.20gb.slices.ci": "2",
      "nvidia.com/mig-2g.20gb.slices.gi": "2",
      "nvidia.com/mig-3g.40gb.count": "1",
      "nvidia.com/mig-3g.40gb.engines.copy": "3",
      "nvidia.com/mig-3g.40gb.engines.decoder": "3",
      "nvidia.com/mig-3g.40gb.engines.encoder": "0",
      "nvidia.com/mig-3g.40gb.engines.jpeg": "3",
      "nvidia.com/mig-3g.40gb.engines.ofa": "0",
      "nvidia.com/mig-3g.40gb.memory": "40320",
      "nvidia.com/mig-3g.40gb.multiprocessors": "60",
      "nvidia.com/mig-3g.40gb.product": "NVIDIA-H100-80GB-HBM3-MIG-3g.40gb",
      "nvidia.com/mig-3g.40gb.replicas": "1",
      "nvidia.com/mig-3g.40gb.sharing-strategy": "none",
      "nvidia.com/mig-3g.40gb.slices.ci": "3",
      "nvidia.com/mig-3g.40gb.slices.gi": "3",
      "nvidia.com/mig.capable": "true",
      "nvidia.com/mig.config": "all-balanced",
      "nvidia.com/mig.config.state": "success",
      "nvidia.com/mig.strategy": "mixed",
      "nvidia.com/mps.capable": "false"
    }
    
  4. Optional: Run the nvidia-smi command in the driver container to verify that the GPU has been configured.

    $ kubectl exec -it -n gpu-operator ds/nvidia-driver-daemonset -- nvidia-smi -L
    

    Example Output

    GPU 0: NVIDIA H100 80GB HBM3 (UUID: GPU-b4895dbf-9350-2524-a89b-98161ddd9fe4)
      MIG 3g.40gb     Device  0: (UUID: MIG-7089d0f3-293f-58c9-8f8c-5ea666eedbde)
      MIG 2g.20gb     Device  1: (UUID: MIG-56c30729-347f-5dd6-8da0-c3cc59e969e0)
      MIG 1g.10gb     Device  2: (UUID: MIG-9d14fb21-4ae1-546f-a636-011582899c39)
      MIG 1g.10gb     Device  3: (UUID: MIG-0f709664-740c-52b0-ae79-3e4c9ede6d3b)
    
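With the mixed strategy, workloads request a specific profile by its extended resource name rather than the generic nvidia.com/gpu name. The following pod spec is a sketch targeting one of the 1g.10gb devices configured above; the pod name and image are illustrative:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: mig-mixed-example        # illustrative name
spec:
  restartPolicy: OnFailure
  containers:
  - name: vectoradd
    image: nvidia/samples:vectoradd-cuda11.2.1
    resources:
      limits:
        nvidia.com/mig-1g.10gb: 1   # profile-specific resource name under the mixed strategy
```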

Example: Reconfiguring MIG Profiles#

MIG Manager supports dynamic reconfiguration of the MIG geometry. The following steps show how to update a GPU on a node to the 3g.40gb profile with the single MIG strategy.

  1. Label the node with the profile:

    $ kubectl label nodes <node-name> nvidia.com/mig.config=all-3g.40gb --overwrite
    
  2. Optional: Monitor the MIG Manager logs to confirm the new MIG geometry is applied:

    $ kubectl logs -n gpu-operator -l app=nvidia-mig-manager -c nvidia-mig-manager
    

    Example Output

    Applying the selected MIG config to the node
    time="2024-05-14T18:31:26Z" level=debug msg="Parsing config file..."
    time="2024-05-14T18:31:26Z" level=debug msg="Selecting specific MIG config..."
    time="2024-05-14T18:31:26Z" level=debug msg="Running apply-start hook"
    time="2024-05-14T18:31:26Z" level=debug msg="Checking current MIG mode..."
    time="2024-05-14T18:31:26Z" level=debug msg="Walking MigConfig for (devices=all)"
    time="2024-05-14T18:31:26Z" level=debug msg="  GPU 0: 0x233010DE"
    time="2024-05-14T18:31:26Z" level=debug msg="    Asserting MIG mode: Enabled"
    time="2024-05-14T18:31:26Z" level=debug msg="    MIG capable: true\n"
    time="2024-05-14T18:31:26Z" level=debug msg="    Current MIG mode: Enabled"
    time="2024-05-14T18:31:26Z" level=debug msg="Checking current MIG device configuration..."
    time="2024-05-14T18:31:26Z" level=debug msg="Walking MigConfig for (devices=all)"
    time="2024-05-14T18:31:26Z" level=debug msg="  GPU 0: 0x233010DE"
    time="2024-05-14T18:31:26Z" level=debug msg="    Asserting MIG config: map[3g.40gb:2]"
    time="2024-05-14T18:31:26Z" level=debug msg="Running pre-apply-config hook"
    time="2024-05-14T18:31:26Z" level=debug msg="Applying MIG device configuration..."
    time="2024-05-14T18:31:26Z" level=debug msg="Walking MigConfig for (devices=all)"
    time="2024-05-14T18:31:26Z" level=debug msg="  GPU 0: 0x233010DE"
    time="2024-05-14T18:31:26Z" level=debug msg="    MIG capable: true\n"
    time="2024-05-14T18:31:26Z" level=debug msg="    Updating MIG config: map[3g.40gb:2]"
    MIG configuration applied successfully
    time="2024-05-14T18:31:27Z" level=debug msg="Running apply-exit hook"
    Restarting validator pod to re-run all validations
    pod "nvidia-operator-validator-kmncw" deleted
    Restarting all GPU clients previously shutdown in Kubernetes by reenabling their component-specific nodeSelector labels
    node/node-name labeled
    Changing the 'nvidia.com/mig.config.state' node label to 'success'
    
  3. Optional: Display the node labels to confirm the GPU count (2), slices (3), and profile are set:

    $ kubectl get node <node-name> -o=jsonpath='{.metadata.labels}' | jq .
    

    Partial Output

      "nvidia.com/gpu.count": "2",
      "nvidia.com/gpu.present": "true",
      "nvidia.com/gpu.product": "NVIDIA-H100-80GB-HBM3-MIG-3g.40gb",
      "nvidia.com/gpu.replicas": "1",
      "nvidia.com/gpu.sharing-strategy": "none",
      "nvidia.com/gpu.slices.ci": "3",
      "nvidia.com/gpu.slices.gi": "3",
      "nvidia.com/mig.capable": "true",
      "nvidia.com/mig.config": "all-3g.40gb",
      "nvidia.com/mig.config.state": "success",
      "nvidia.com/mig.strategy": "single",
      "nvidia.com/mps.capable": "false"
    }
    
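The gpu.count of 2 reported above follows from slice arithmetic: a 3g profile consumes 3 of the 7 compute slices on an A100/H100-class GPU. The following sketch is a simplification that matches the uniform, single-strategy case; it is not how MIG Manager computes placement:

```python
# Sketch: how many instances of a uniform MIG profile fit on a 7-slice GPU
# (A100/H100-class). Profile names encode compute slices as "<g>g.<mem>gb".
def max_instances(profile: str, total_slices: int = 7) -> int:
    slices = int(profile.split("g.")[0])
    return total_slices // slices

print(max_instances("3g.40gb"))  # 2, matching nvidia.com/gpu.count above
print(max_instances("1g.10gb"))  # 7, as in the earlier all-1g.10gb example
```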

Example: Custom MIG Configuration During Installation#

If you need to use custom profiles, you can create a custom ConfigMap during installation by passing the ConfigMap name and data with the Helm command. The MIG Manager daemonset is then configured to use this ConfigMap instead of the auto-generated one.

  1. In your values.yaml file, set migManager.config.create to true, set migManager.config.name, and add the ConfigMap data under migManager.config.data, as in the following example:

    migManager:
      config:
        name: custom-mig-config
        create: true
        data:
          config.yaml: |-
            version: v1
            mig-configs:
              all-disabled:
                - devices: all
                  mig-enabled: false
              custom-mig:
                - devices: [0]
                  mig-enabled: false
                - devices: [1]
                  mig-enabled: true
                  mig-devices:
                    "1g.10gb": 2
                - devices: [2]
                  mig-enabled: true
                  mig-devices:
                    "2g.20gb": 2
                    "3g.40gb": 1
                - devices: [3]
                  mig-enabled: true
                  mig-devices:
                    "3g.40gb": 1
                    "4g.40gb": 1
    

Note

Custom ConfigMaps must contain a key named “config.yaml”.

  2. Install or upgrade the GPU Operator with this values file so the chart creates the ConfigMap:

    $ helm upgrade --install gpu-operator -n gpu-operator --create-namespace \
        nvidia/gpu-operator --version=v26.3.0 \
        -f values.yaml
    
  3. If the custom configuration specifies more than one instance profile, set the strategy to mixed:

    $ kubectl patch clusterpolicies.nvidia.com/cluster-policy \
        --type='json' \
        -p='[{"op":"replace", "path":"/spec/mig/strategy", "value":"mixed"}]'
    
  4. Label the nodes with the profile to configure:

    $ kubectl label nodes <node-name> nvidia.com/mig.config=custom-mig --overwrite
    
  5. Optional: Monitor the MIG Manager logs to confirm the new MIG geometry is applied:

    $ kubectl logs -n gpu-operator -l app=nvidia-mig-manager -c nvidia-mig-manager
    

    Example Output

    Applying the selected MIG config to the node
    time="2024-05-15T13:40:08Z" level=debug msg="Parsing config file..."
    time="2024-05-15T13:40:08Z" level=debug msg="Selecting specific MIG config..."
    time="2024-05-15T13:40:08Z" level=debug msg="Running apply-start hook"
    time="2024-05-15T13:40:08Z" level=debug msg="Checking current MIG mode..."
    time="2024-05-15T13:40:08Z" level=debug msg="Walking MigConfig for (devices=all)"
    time="2024-05-15T13:40:08Z" level=debug msg="  GPU 0: 0x233010DE"
    time="2024-05-15T13:40:08Z" level=debug msg="    Asserting MIG mode: Enabled"
    time="2024-05-15T13:40:08Z" level=debug msg="    MIG capable: true\n"
    time="2024-05-15T13:40:08Z" level=debug msg="    Current MIG mode: Enabled"
    time="2024-05-15T13:40:08Z" level=debug msg="Checking current MIG device configuration..."
    time="2024-05-15T13:40:08Z" level=debug msg="Walking MigConfig for (devices=all)"
    time="2024-05-15T13:40:08Z" level=debug msg="  GPU 0: 0x233010DE"
    time="2024-05-15T13:40:08Z" level=debug msg="    Asserting MIG config: map[1g.10gb:5 2g.20gb:1]"
    time="2024-05-15T13:40:08Z" level=debug msg="Running pre-apply-config hook"
    time="2024-05-15T13:40:08Z" level=debug msg="Applying MIG device configuration..."
    time="2024-05-15T13:40:08Z" level=debug msg="Walking MigConfig for (devices=all)"
    time="2024-05-15T13:40:08Z" level=debug msg="  GPU 0: 0x233010DE"
    time="2024-05-15T13:40:08Z" level=debug msg="    MIG capable: true\n"
    time="2024-05-15T13:40:08Z" level=debug msg="    Updating MIG config: map[1g.10gb:5 2g.20gb:1]"
    time="2024-05-15T13:40:09Z" level=debug msg="Running apply-exit hook"
    MIG configuration applied successfully
    

Example: Custom MIG Configuration#

You can create and apply a ConfigMap yourself if the default profiles do not meet your needs.

  1. Create a file, such as custom-mig-config.yaml, with contents like the following example:

    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: custom-mig-config
    data:
      config.yaml: |
        version: v1
        mig-configs:
          all-disabled:
            - devices: all
              mig-enabled: false
          
          five-1g-one-2g:
            - devices: all 
              mig-enabled: true
              mig-devices:
                "1g.10gb": 5
                "2g.20gb": 1
    

Note

Custom ConfigMaps must contain a key named “config.yaml”.

  2. Apply the manifest:

    $ kubectl apply -n gpu-operator -f custom-mig-config.yaml
    
  3. If the custom configuration specifies more than one instance profile, set the strategy to mixed:

    $ kubectl patch clusterpolicies.nvidia.com/cluster-policy \
        --type='json' \
        -p='[{"op":"replace", "path":"/spec/mig/strategy", "value":"mixed"}]'
    
  4. Patch the cluster policy so MIG Manager uses the custom ConfigMap:

    $ kubectl patch clusterpolicies.nvidia.com/cluster-policy \
        --type='json' \
        -p='[{"op":"replace", "path":"/spec/migManager/config/name", "value":"custom-mig-config"}]'
    
  5. Label the nodes with the profile to configure:

    $ kubectl label nodes <node-name> nvidia.com/mig.config=five-1g-one-2g --overwrite
    

Verification: Running Sample CUDA Workloads#

CUDA VectorAdd#

Let’s run a simple CUDA sample, vectorAdd, by requesting a GPU resource as you would normally in Kubernetes. Kubernetes schedules the pod on a single MIG device, and a nodeSelector directs the pod to a node with MIG devices. This example targets an A100 with the 1g.5gb profile; adjust the nvidia.com/gpu.product value to match the MIG product label on your nodes.

$ cat << EOF | kubectl create -f -
apiVersion: v1
kind: Pod
metadata:
  name: cuda-vectoradd
spec:
  restartPolicy: OnFailure
  containers:
  - name: vectoradd
    image: nvidia/samples:vectoradd-cuda11.2.1
    resources:
      limits:
        nvidia.com/gpu: 1
  nodeSelector:
    nvidia.com/gpu.product: A100-SXM4-40GB-MIG-1g.5gb
EOF

Concurrent Job Launch#

Now, let’s try a more complex example. In this example, we use Argo Workflows to launch concurrent jobs on MIG devices. The A100 has been configured into two MIG devices using the 3g.20gb profile.

First, install the Argo Workflows components into your Kubernetes cluster.

$ kubectl create ns argo \
    && kubectl apply -n argo \
    -f https://raw.githubusercontent.com/argoproj/argo-workflows/stable/manifests/quick-start-postgres.yaml

Next, download the latest Argo CLI from the releases page and follow the instructions to install the binary.

Now, we will craft an Argo example that launches multiple CUDA containers onto the MIG devices on the GPU. We will reuse the same vectorAdd example from before. Here is the job description, saved as vector-add.yaml:

$ cat << EOF > vector-add.yaml
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: argo-mig-example-
spec:
  entrypoint: argo-mig-result-example
  templates:
  - name: argo-mig-result-example
    steps:
    - - name: generate
        template: gen-mig-device-list
    # Iterate over the list of numbers generated by the generate step above
    - - name: argo-mig
        template: argo-mig
        arguments:
          parameters:
          - name: argo-mig
            value: "{{item}}"
        withParam: "{{steps.generate.outputs.result}}"

  # Generate a list of numbers in JSON format
  - name: gen-mig-device-list
    script:
      image: python:alpine3.6
      command: [python]
      source: |
        import json
        import sys
        json.dump([i for i in range(0, 2)], sys.stdout)

  - name: argo-mig
    retryStrategy:
      limit: 10
      retryPolicy: "Always"
    inputs:
      parameters:
      - name: argo-mig
    container:
      image: nvidia/samples:vectoradd-cuda11.2.1
      resources:
        limits:
          nvidia.com/gpu: 1
    nodeSelector:
      nvidia.com/gpu.product: A100-SXM4-40GB-MIG-3g.20gb
EOF

Launch the workflow:

$ argo submit -n argo --watch vector-add.yaml

Argo will print out the pods that have been launched:

Name:                argo-mig-example-z6mqd
Namespace:           argo
ServiceAccount:      default
Status:              Succeeded
Conditions:
 Completed           True
Created:             Wed Mar 24 14:44:51 -0700 (20 seconds ago)
Started:             Wed Mar 24 14:44:51 -0700 (20 seconds ago)
Finished:            Wed Mar 24 14:45:11 -0700 (now)
Duration:            20 seconds
Progress:            3/3
ResourcesDuration:   9s*(1 cpu),9s*(100Mi memory),1s*(1 nvidia.com/gpu)

STEP                       TEMPLATE                 PODNAME                           DURATION  MESSAGE
 ✔ argo-mig-example-z6mqd  argo-mig-result-example
 ├───✔ generate            gen-mig-device-list      argo-mig-example-z6mqd-562792713  8s
 └─┬─✔ argo-mig(0:0)(0)    argo-mig                 argo-mig-example-z6mqd-845918106  2s
   └─✔ argo-mig(1:1)(0)    argo-mig                 argo-mig-example-z6mqd-870679174  2s

If you observe the logs, you can see that the vector-add sample has completed on both devices:

$ argo logs -n argo @latest
argo-mig-example-z6mqd-562792713: [0, 1]
argo-mig-example-z6mqd-870679174: [Vector addition of 50000 elements]
argo-mig-example-z6mqd-870679174: Copy input data from the host memory to the CUDA device
argo-mig-example-z6mqd-870679174: CUDA kernel launch with 196 blocks of 256 threads
argo-mig-example-z6mqd-870679174: Copy output data from the CUDA device to the host memory
argo-mig-example-z6mqd-870679174: Test PASSED
argo-mig-example-z6mqd-870679174: Done
argo-mig-example-z6mqd-845918106: [Vector addition of 50000 elements]
argo-mig-example-z6mqd-845918106: Copy input data from the host memory to the CUDA device
argo-mig-example-z6mqd-845918106: CUDA kernel launch with 196 blocks of 256 threads
argo-mig-example-z6mqd-845918106: Copy output data from the CUDA device to the host memory
argo-mig-example-z6mqd-845918106: Test PASSED
argo-mig-example-z6mqd-845918106: Done

Disabling MIG#

You can disable MIG on a node by setting the nvidia.com/mig.config label to all-disabled:

$ kubectl label nodes <node-name> nvidia.com/mig.config=all-disabled --overwrite

MIG Manager with Preinstalled Drivers#

MIG Manager supports preinstalled drivers. Information in the preceding sections still applies; however, there are a few additional details to consider.

Install#

When drivers are preinstalled, you must set driver.enabled=false during GPU Operator installation. Use the following options to install the GPU Operator:

$ helm install gpu-operator \
    -n gpu-operator --create-namespace \
    nvidia/gpu-operator \
    --version=v26.3.0 \
    --set driver.enabled=false

Managing Host GPU Clients#

MIG Manager stops all operator-managed pods that have access to GPUs when applying a MIG reconfiguration. When drivers are preinstalled, there can be GPU clients on the host that also need to be stopped.

When drivers are preinstalled, MIG Manager attempts to stop and restart a list of systemd services on the host across a MIG reconfiguration. The list of services is specified in the default-gpu-clients ConfigMap.

The following sample GPU clients file, clients.yaml, is used to create the default-gpu-clients ConfigMap:

version: v1
systemd-services:
  - nvsm.service
  - nvsm-mqtt.service
  - nvsm-core.service
  - nvsm-api-gateway.service
  - nvsm-notifier.service
  - nv_peer_mem.service
  - nvidia-dcgm.service
  - dcgm.service
  - dcgm-exporter.service

You can modify the list by editing the ConfigMap after installation. Alternatively, you can create a custom ConfigMap for use by MIG Manager by performing the following steps:

  1. Create the gpu-operator namespace:

    $ kubectl create namespace gpu-operator
    
  2. Create a ConfigMap containing the custom clients.yaml file with a list of GPU clients:

    $ kubectl create configmap -n gpu-operator gpu-clients --from-file=clients.yaml
    
  3. Install the GPU Operator:

    $ helm install gpu-operator \
        -n gpu-operator --create-namespace \
        nvidia/gpu-operator \
        --version=v26.3.0 \
        --set migManager.gpuClientsConfig.name=gpu-clients \
        --set driver.enabled=false
    

Architecture#

MIG Manager is designed as a controller within Kubernetes. It watches for changes to the nvidia.com/mig.config label on the node and then applies the user-requested MIG configuration. When the label changes, MIG Manager first stops all GPU pods, including the device plugin, GPU Feature Discovery, and DCGM exporter. If drivers are preinstalled, it then stops all host GPU clients listed in the clients.yaml ConfigMap. Finally, it applies the MIG reconfiguration and restarts the GPU pods and, if applicable, the host GPU clients. The reconfiguration can also involve rebooting the node if a reboot is required to enable MIG mode.
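The flow described above can be sketched as a tiny state machine. This is illustrative only; the real MIG Manager is a Go controller that drives mig-parted:

```python
# Illustrative sketch of MIG Manager's reconfiguration flow
# (not the real implementation, which is a Go controller).
def reconcile(requested: str, current: str, apply_ok: bool = True) -> list:
    """Return the ordered actions taken when the mig.config label changes."""
    if requested == current:
        return []  # label unchanged: nothing to do
    actions = [
        "set mig.config.state=pending",
        "stop GPU operator pods",          # device plugin, GFD, DCGM exporter
        "stop host GPU clients",           # only when drivers are preinstalled
        f"apply MIG geometry: {requested}",  # via mig-parted; may reboot the node
        "restart GPU operator pods",
    ]
    actions.append("set mig.config.state=success" if apply_ok
                   else "set mig.config.state=failed")
    return actions

print(reconcile("all-3g.40gb", "all-disabled")[-1])  # set mig.config.state=success
```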

The default MIG profiles are specified in the <node-name>-mig-config ConfigMap. MIG Manager auto-generates this ConfigMap for each MIG-capable node, and it contains the standard MIG profiles for the GPUs on the node. You can also configure the Operator to use a custom ConfigMap instead of the auto-generated one.

You can apply one of these profiles by setting it as the value of the mig.config label, which triggers a reconfiguration of the MIG geometry.

MIG Manager uses the mig-parted tool to apply the configuration changes to the GPU, including enabling MIG mode and rebooting the node when required.

flowchart
  subgraph mig[MIG Manager]
    direction TB
    A[Controller] <--> B[MIG-Parted]
  end
  A -- on change --> C
  subgraph recon[Reconfiguration]
    C["Config is Pending or Rebooting"] --> D["Stop Operator Pods"]
    D --> E["Enable MIG Mode and Reboot if Required"]
    E --> F["Use mig-parted to Configure MIG Geometry"]
    F --> G["Restart Operator Pods"]
  end
  H["Set mig.config label to Success"]
  I["Set mig.config label to Failed"]
  G --> H
  G -- on failure --> I