GPU Operator with MIG

About Multi-Instance GPU

Multi-Instance GPU (MIG) enables GPUs based on the NVIDIA Ampere and later architectures, such as NVIDIA A100, to be partitioned into separate and secure GPU instances for CUDA applications. Refer to the MIG User Guide for more information about MIG.

GPU Operator deploys MIG Manager to manage MIG configuration on nodes in your Kubernetes cluster.

Enabling MIG During Installation

The following steps use the single MIG strategy. Alternatively, you can specify the mixed strategy.

Perform the following steps to install the Operator and configure MIG:

  1. Install the Operator:

    $ helm install --wait --generate-name \
        -n gpu-operator --create-namespace \
        nvidia/gpu-operator \
        --set mig.strategy=single
    

    Set mig.strategy to mixed when MIG mode is not enabled on all GPUs on a node.

    In a CSP environment such as Google Cloud, also specify --set migManager.env[0].name=WITH_REBOOT --set-string migManager.env[0].value=true to ensure that the node reboots and can apply the MIG configuration.

    MIG Manager supports preinstalled drivers. If drivers are preinstalled, also specify --set driver.enabled=false. Refer to MIG Manager with Preinstalled Drivers for more details.

    After several minutes, all the pods, including nvidia-mig-manager, are deployed on nodes that have MIG-capable GPUs.

  2. Optional: Display the pods in the Operator namespace:

    $ kubectl get pods -n gpu-operator
    

    Example Output

    NAME                                                          READY   STATUS      RESTARTS   AGE
    gpu-feature-discovery-qmwb2                                   1/1     Running     0          14m
    gpu-operator-7bbf8bb6b7-xz664                                 1/1     Running     0          14m
    gpu-operator-node-feature-discovery-gc-79d6d968bb-sg4t6       1/1     Running     0          14m
    gpu-operator-node-feature-discovery-master-6d9f8d497c-7cwrp   1/1     Running     0          14m
    gpu-operator-node-feature-discovery-worker-x5z62              1/1     Running     0          14m
    nvidia-container-toolkit-daemonset-pkcpr                      1/1     Running     0          14m
    nvidia-cuda-validator-wt6bc                                   0/1     Completed   0          12m
    nvidia-dcgm-exporter-zsskv                                    1/1     Running     0          14m
    nvidia-device-plugin-daemonset-924x6                          1/1     Running     0          14m
    nvidia-driver-daemonset-klj5s                                 1/1     Running     0          14m
    nvidia-mig-manager-8d6wz                                      1/1     Running     0          12m
    nvidia-operator-validator-fnsmk                               1/1     Running     0          14m
    
  3. Optional: Display the labels applied to the node:

    $ kubectl get node -o json | jq '.items[].metadata.labels'
    

    Partial Output

      "nvidia.com/gpu.present": "true",
      "nvidia.com/gpu.product": "NVIDIA-H100-80GB-HBM3",
      "nvidia.com/gpu.replicas": "1",
      "nvidia.com/gpu.sharing-strategy": "none",
      "nvidia.com/mig.capable": "true",
      "nvidia.com/mig.config": "all-disabled",
      "nvidia.com/mig.config.state": "success",
      "nvidia.com/mig.strategy": "single",
      "nvidia.com/mps.capable": "false"
    }
    

    Important

    MIG Manager requires that no user workloads are running on the GPUs being configured. In some cases, such as on CSP instances, the node may need to be rebooted, so you might need to cordon the node before changing the MIG mode or the MIG geometry on the GPUs.
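
    For example, you can cordon the node and drain GPU workloads before changing the MIG configuration. This is a minimal sketch using standard kubectl commands; substitute your node name:

    $ kubectl cordon <node-name>
    $ kubectl drain <node-name> --ignore-daemonsets

    When nvidia.com/mig.config.state reports success, return the node to service with kubectl uncordon <node-name>.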

Configuring MIG Profiles

By default, nodes are labeled with nvidia.com/mig.config: all-disabled and you must specify the MIG configuration to apply.

MIG Manager uses the default-mig-parted-config config map in the GPU Operator namespace to identify supported MIG profiles. Refer to the config map when you label the node or customize the config map.
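
For example, you can review the profile names that MIG Manager accepts by displaying the config map; the exact contents vary by GPU Operator version:

$ kubectl get configmap default-mig-parted-config -n gpu-operator -o yaml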

Example: Single MIG Strategy

The following steps show how to use the single MIG strategy and configure the 1g.10gb profile on one node.

  1. Configure the MIG strategy to single if you are unsure of the current strategy:

    $ kubectl patch clusterpolicies.nvidia.com/cluster-policy \
        --type='json' \
        -p='[{"op":"replace", "path":"/spec/mig/strategy", "value":"single"}]'
    
  2. Label the nodes with the profile to configure:

    $ kubectl label nodes <node-name> nvidia.com/mig.config=all-1g.10gb --overwrite
    

    MIG Manager proceeds to apply a mig.config.state label to the node and terminates all the GPU pods in preparation to enable MIG mode and configure the GPU into the desired MIG geometry.

  3. Optional: Display the node labels:

    $ kubectl get node <node-name> -o=jsonpath='{.metadata.labels}' | jq .
    

    Partial Output

      "nvidia.com/gpu.product": "NVIDIA-H100-80GB-HBM3",
      "nvidia.com/gpu.replicas": "1",
      "nvidia.com/gpu.sharing-strategy": "none",
      "nvidia.com/mig.capable": "true",
      "nvidia.com/mig.config": "all-1g.10gb",
      "nvidia.com/mig.config.state": "pending",
      "nvidia.com/mig.strategy": "single"
    }
    

    As described above, if the WITH_REBOOT option is set, MIG Manager sets the label to nvidia.com/mig.config.state: rebooting.

  4. Confirm that MIG Manager completed the configuration by checking the node labels:

    $ kubectl get node <node-name> -o=jsonpath='{.metadata.labels}' | jq .
    

    Check for the following labels:

    • nvidia.com/gpu.count: 7. The value differs according to the GPU model.

    • nvidia.com/gpu.slices.ci: 1

    • nvidia.com/gpu.slices.gi: 1

    • nvidia.com/mig.config.state: success

    Partial Output

    "nvidia.com/gpu.count": "7",
    "nvidia.com/gpu.present": "true",
    "nvidia.com/gpu.product": "NVIDIA-H100-80GB-HBM3-MIG-1g.10gb",
    "nvidia.com/gpu.slices.ci": "1",
    "nvidia.com/gpu.slices.gi": "1",
    "nvidia.com/mig.capable": "true",
    "nvidia.com/mig.config": "all-1g.10gb",
    "nvidia.com/mig.config.state": "success",
    "nvidia.com/mig.strategy": "single"
    
  5. Optional: Run the nvidia-smi command in the driver container to verify the MIG configuration:

    $ kubectl exec -it -n gpu-operator ds/nvidia-driver-daemonset -- nvidia-smi -L
    

    Example Output

    GPU 0: NVIDIA H100 80GB HBM3 (UUID: GPU-b4895dbf-9350-2524-a89b-98161ddd9fe4)
      MIG 1g.10gb     Device  0: (UUID: MIG-3f6f389f-b0cc-5e5c-8e32-eaa8fd067902)
      MIG 1g.10gb     Device  1: (UUID: MIG-35f93699-4b53-5a19-8289-80b8418eec60)
      MIG 1g.10gb     Device  2: (UUID: MIG-9d14fb21-4ae1-546f-a636-011582899c39)
      MIG 1g.10gb     Device  3: (UUID: MIG-0f709664-740c-52b0-ae79-3e4c9ede6d3b)
      MIG 1g.10gb     Device  4: (UUID: MIG-5d23f73a-d378-50ac-a6f5-3bf5184773bb)
      MIG 1g.10gb     Device  5: (UUID: MIG-6cea15c7-8a56-578c-b965-0e73cb6dfc10)
      MIG 1g.10gb     Device  6: (UUID: MIG-981c86e9-3607-57d7-9426-295347e4b925)
    

Example: Mixed MIG Strategy

The following steps show how to use the mixed MIG strategy and configure the all-balanced profile on one node.

  1. Configure the MIG strategy to mixed if you are unsure of the current strategy:

    $ kubectl patch clusterpolicies.nvidia.com/cluster-policy \
        --type='json' \
        -p='[{"op":"replace", "path":"/spec/mig/strategy", "value":"mixed"}]'
    
  2. Label the nodes with the profile to configure:

    $ kubectl label nodes <node-name> nvidia.com/mig.config=all-balanced --overwrite
    

    MIG Manager proceeds to apply a mig.config.state label to the node and terminates all the GPU pods in preparation to enable MIG mode and configure the GPU into the desired MIG geometry.

  3. Confirm that MIG Manager completed the configuration by checking the node labels:

    $ kubectl get node <node-name> -o=jsonpath='{.metadata.labels}' | jq .
    

    Check for labels like the following. The profiles and GPU counts differ according to the GPU model.

    • nvidia.com/mig-1g.10gb.count: 2

    • nvidia.com/mig-2g.20gb.count: 1

    • nvidia.com/mig-3g.40gb.count: 1

    • nvidia.com/mig.config.state: success

    Partial Output

      "nvidia.com/gpu.present": "true",
      "nvidia.com/gpu.product": "NVIDIA-H100-80GB-HBM3",
      "nvidia.com/gpu.replicas": "0",
      "nvidia.com/gpu.sharing-strategy": "none",
      "nvidia.com/mig-1g.10gb.count": "2",
      "nvidia.com/mig-1g.10gb.engines.copy": "1",
      "nvidia.com/mig-1g.10gb.engines.decoder": "1",
      "nvidia.com/mig-1g.10gb.engines.encoder": "0",
      "nvidia.com/mig-1g.10gb.engines.jpeg": "1",
      "nvidia.com/mig-1g.10gb.engines.ofa": "0",
      "nvidia.com/mig-1g.10gb.memory": "9984",
      "nvidia.com/mig-1g.10gb.multiprocessors": "16",
      "nvidia.com/mig-1g.10gb.product": "NVIDIA-H100-80GB-HBM3-MIG-1g.10gb",
      "nvidia.com/mig-1g.10gb.replicas": "1",
      "nvidia.com/mig-1g.10gb.sharing-strategy": "none",
      "nvidia.com/mig-1g.10gb.slices.ci": "1",
      "nvidia.com/mig-1g.10gb.slices.gi": "1",
      "nvidia.com/mig-2g.20gb.count": "1",
      "nvidia.com/mig-2g.20gb.engines.copy": "2",
      "nvidia.com/mig-2g.20gb.engines.decoder": "2",
      "nvidia.com/mig-2g.20gb.engines.encoder": "0",
      "nvidia.com/mig-2g.20gb.engines.jpeg": "2",
      "nvidia.com/mig-2g.20gb.engines.ofa": "0",
      "nvidia.com/mig-2g.20gb.memory": "20096",
      "nvidia.com/mig-2g.20gb.multiprocessors": "32",
      "nvidia.com/mig-2g.20gb.product": "NVIDIA-H100-80GB-HBM3-MIG-2g.20gb",
      "nvidia.com/mig-2g.20gb.replicas": "1",
      "nvidia.com/mig-2g.20gb.sharing-strategy": "none",
      "nvidia.com/mig-2g.20gb.slices.ci": "2",
      "nvidia.com/mig-2g.20gb.slices.gi": "2",
      "nvidia.com/mig-3g.40gb.count": "1",
      "nvidia.com/mig-3g.40gb.engines.copy": "3",
      "nvidia.com/mig-3g.40gb.engines.decoder": "3",
      "nvidia.com/mig-3g.40gb.engines.encoder": "0",
      "nvidia.com/mig-3g.40gb.engines.jpeg": "3",
      "nvidia.com/mig-3g.40gb.engines.ofa": "0",
      "nvidia.com/mig-3g.40gb.memory": "40320",
      "nvidia.com/mig-3g.40gb.multiprocessors": "60",
      "nvidia.com/mig-3g.40gb.product": "NVIDIA-H100-80GB-HBM3-MIG-3g.40gb",
      "nvidia.com/mig-3g.40gb.replicas": "1",
      "nvidia.com/mig-3g.40gb.sharing-strategy": "none",
      "nvidia.com/mig-3g.40gb.slices.ci": "3",
      "nvidia.com/mig-3g.40gb.slices.gi": "3",
      "nvidia.com/mig.capable": "true",
      "nvidia.com/mig.config": "all-balanced",
      "nvidia.com/mig.config.state": "success",
      "nvidia.com/mig.strategy": "mixed",
      "nvidia.com/mps.capable": "false"
    }
    
  4. Optional: Run the nvidia-smi command in the driver container to verify that the GPU has been configured:

    $ kubectl exec -it -n gpu-operator ds/nvidia-driver-daemonset -- nvidia-smi -L
    

    Example Output

    GPU 0: NVIDIA H100 80GB HBM3 (UUID: GPU-b4895dbf-9350-2524-a89b-98161ddd9fe4)
      MIG 3g.40gb     Device  0: (UUID: MIG-7089d0f3-293f-58c9-8f8c-5ea666eedbde)
      MIG 2g.20gb     Device  1: (UUID: MIG-56c30729-347f-5dd6-8da0-c3cc59e969e0)
      MIG 1g.10gb     Device  2: (UUID: MIG-9d14fb21-4ae1-546f-a636-011582899c39)
      MIG 1g.10gb     Device  3: (UUID: MIG-0f709664-740c-52b0-ae79-3e4c9ede6d3b)
    
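Because the mixed strategy advertises each MIG profile as its own extended resource, such as nvidia.com/mig-1g.10gb, pods request a specific slice size rather than the generic nvidia.com/gpu. The following pod spec is a minimal sketch; the pod name is illustrative and the resource name must match a profile that exists on the node:

$ cat << EOF | kubectl create -f -
apiVersion: v1
kind: Pod
metadata:
  name: cuda-vectoradd-mig
spec:
  restartPolicy: OnFailure
  containers:
  - name: vectoradd
    image: nvidia/samples:vectoradd-cuda11.2.1
    resources:
      limits:
        nvidia.com/mig-1g.10gb: 1
EOF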

Example: Reconfiguring MIG Profiles

MIG Manager supports dynamic reconfiguration of the MIG geometry. The following steps show how to update a GPU on a node to the 3g.40gb profile with the single MIG strategy.

  1. Label the node with the profile:

    $ kubectl label nodes <node-name> nvidia.com/mig.config=all-3g.40gb --overwrite
    
  2. Optional: Monitor the MIG Manager logs to confirm the new MIG geometry is applied:

    $ kubectl logs -n gpu-operator -l app=nvidia-mig-manager -c nvidia-mig-manager
    

    Example Output

    Applying the selected MIG config to the node
    time="2024-05-14T18:31:26Z" level=debug msg="Parsing config file..."
    time="2024-05-14T18:31:26Z" level=debug msg="Selecting specific MIG config..."
    time="2024-05-14T18:31:26Z" level=debug msg="Running apply-start hook"
    time="2024-05-14T18:31:26Z" level=debug msg="Checking current MIG mode..."
    time="2024-05-14T18:31:26Z" level=debug msg="Walking MigConfig for (devices=all)"
    time="2024-05-14T18:31:26Z" level=debug msg="  GPU 0: 0x233010DE"
    time="2024-05-14T18:31:26Z" level=debug msg="    Asserting MIG mode: Enabled"
    time="2024-05-14T18:31:26Z" level=debug msg="    MIG capable: true\n"
    time="2024-05-14T18:31:26Z" level=debug msg="    Current MIG mode: Enabled"
    time="2024-05-14T18:31:26Z" level=debug msg="Checking current MIG device configuration..."
    time="2024-05-14T18:31:26Z" level=debug msg="Walking MigConfig for (devices=all)"
    time="2024-05-14T18:31:26Z" level=debug msg="  GPU 0: 0x233010DE"
    time="2024-05-14T18:31:26Z" level=debug msg="    Asserting MIG config: map[3g.40gb:2]"
    time="2024-05-14T18:31:26Z" level=debug msg="Running pre-apply-config hook"
    time="2024-05-14T18:31:26Z" level=debug msg="Applying MIG device configuration..."
    time="2024-05-14T18:31:26Z" level=debug msg="Walking MigConfig for (devices=all)"
    time="2024-05-14T18:31:26Z" level=debug msg="  GPU 0: 0x233010DE"
    time="2024-05-14T18:31:26Z" level=debug msg="    MIG capable: true\n"
    time="2024-05-14T18:31:26Z" level=debug msg="    Updating MIG config: map[3g.40gb:2]"
    MIG configuration applied successfully
    time="2024-05-14T18:31:27Z" level=debug msg="Running apply-exit hook"
    Restarting validator pod to re-run all validations
    pod "nvidia-operator-validator-kmncw" deleted
    Restarting all GPU clients previously shutdown in Kubernetes by reenabling their component-specific nodeSelector labels
    node/node-name labeled
    Changing the 'nvidia.com/mig.config.state' node label to 'success'
    
  3. Optional: Display the node labels to confirm the GPU count (2), slices (3), and profile are set:

    $ kubectl get node <node-name> -o=jsonpath='{.metadata.labels}' | jq .
    

    Partial Output

      "nvidia.com/gpu.count": "2",
      "nvidia.com/gpu.present": "true",
      "nvidia.com/gpu.product": "NVIDIA-H100-80GB-HBM3-MIG-3g.40gb",
      "nvidia.com/gpu.replicas": "1",
      "nvidia.com/gpu.sharing-strategy": "none",
      "nvidia.com/gpu.slices.ci": "3",
      "nvidia.com/gpu.slices.gi": "3",
      "nvidia.com/mig.capable": "true",
      "nvidia.com/mig.config": "all-3g.40gb",
      "nvidia.com/mig.config.state": "success",
      "nvidia.com/mig.strategy": "single",
      "nvidia.com/mps.capable": "false"
    }
    

Example: Custom MIG Configuration

By default, the Operator creates the default-mig-parted-config config map and MIG Manager is configured to read profiles from that config map.

You can create a config map with a custom configuration if the default profiles do not meet your business needs.

  1. Create a file, such as custom-mig-config.yaml, with contents like the following example:

    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: custom-mig-config
    data:
      config.yaml: |
        version: v1
        mig-configs:
          all-disabled:
            - devices: all
              mig-enabled: false
          
          five-1g-one-2g:
            - devices: all 
              mig-enabled: true
              mig-devices:
                "1g.10gb": 5
                "2g.20gb": 1
    
  2. Apply the manifest:

    $ kubectl apply -n gpu-operator -f custom-mig-config.yaml
    
  3. If the custom configuration specifies more than one instance profile, set the strategy to mixed:

    $ kubectl patch clusterpolicies.nvidia.com/cluster-policy \
        --type='json' \
        -p='[{"op":"replace", "path":"/spec/mig/strategy", "value":"mixed"}]'
    
  4. Patch the cluster policy so MIG Manager uses the custom config map:

    $ kubectl patch clusterpolicies.nvidia.com/cluster-policy \
        --type='json' \
        -p='[{"op":"replace", "path":"/spec/migManager/config/name", "value":"custom-mig-config"}]'
    
  5. Label the nodes with the profile to configure:

    $ kubectl label nodes <node-name> nvidia.com/mig.config=five-1g-one-2g --overwrite
    
  6. Optional: Monitor the MIG Manager logs to confirm the new MIG geometry is applied:

    $ kubectl logs -n gpu-operator -l app=nvidia-mig-manager -c nvidia-mig-manager
    

    Example Output

    Applying the selected MIG config to the node
    time="2024-05-15T13:40:08Z" level=debug msg="Parsing config file..."
    time="2024-05-15T13:40:08Z" level=debug msg="Selecting specific MIG config..."
    time="2024-05-15T13:40:08Z" level=debug msg="Running apply-start hook"
    time="2024-05-15T13:40:08Z" level=debug msg="Checking current MIG mode..."
    time="2024-05-15T13:40:08Z" level=debug msg="Walking MigConfig for (devices=all)"
    time="2024-05-15T13:40:08Z" level=debug msg="  GPU 0: 0x233010DE"
    time="2024-05-15T13:40:08Z" level=debug msg="    Asserting MIG mode: Enabled"
    time="2024-05-15T13:40:08Z" level=debug msg="    MIG capable: true\n"
    time="2024-05-15T13:40:08Z" level=debug msg="    Current MIG mode: Enabled"
    time="2024-05-15T13:40:08Z" level=debug msg="Checking current MIG device configuration..."
    time="2024-05-15T13:40:08Z" level=debug msg="Walking MigConfig for (devices=all)"
    time="2024-05-15T13:40:08Z" level=debug msg="  GPU 0: 0x233010DE"
    time="2024-05-15T13:40:08Z" level=debug msg="    Asserting MIG config: map[1g.10gb:5 2g.20gb:1]"
    time="2024-05-15T13:40:08Z" level=debug msg="Running pre-apply-config hook"
    time="2024-05-15T13:40:08Z" level=debug msg="Applying MIG device configuration..."
    time="2024-05-15T13:40:08Z" level=debug msg="Walking MigConfig for (devices=all)"
    time="2024-05-15T13:40:08Z" level=debug msg="  GPU 0: 0x233010DE"
    time="2024-05-15T13:40:08Z" level=debug msg="    MIG capable: true\n"
    time="2024-05-15T13:40:08Z" level=debug msg="    Updating MIG config: map[1g.10gb:5 2g.20gb:1]"
    time="2024-05-15T13:40:09Z" level=debug msg="Running apply-exit hook"
    MIG configuration applied successfully
    
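The all device selector in the example applies the same geometry to every GPU on the node. The mig-parted configuration format also accepts explicit device indices, which lets you give different GPUs on the same node different geometries. The following fragment is an illustrative sketch; the profile name and device indices are placeholders, and you should compare the syntax against the entries in the default-mig-parted-config config map:

mig-configs:
  custom-per-gpu:
    - devices: [0]
      mig-enabled: true
      mig-devices:
        "1g.10gb": 5
    - devices: [1]
      mig-enabled: false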

Verification: Running Sample CUDA Workloads

CUDA VectorAdd

Let’s run a simple CUDA sample, in this case vectorAdd, by requesting a GPU resource as you would normally do in Kubernetes. Kubernetes schedules the pod on a single MIG device, and we use a nodeSelector to direct the pod to the node with the MIG devices.

$ cat << EOF | kubectl create -f -
apiVersion: v1
kind: Pod
metadata:
  name: cuda-vectoradd
spec:
  restartPolicy: OnFailure
  containers:
  - name: vectoradd
    image: nvidia/samples:vectoradd-cuda11.2.1
    resources:
      limits:
        nvidia.com/gpu: 1
  nodeSelector:
    nvidia.com/gpu.product: A100-SXM4-40GB-MIG-1g.5gb
EOF
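
After the pod completes, check its logs to confirm that the sample ran on the MIG device; the output should end with Test PASSED, similar to the per-pod logs shown in the Argo example below:

$ kubectl logs cuda-vectoradd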

Concurrent Job Launch

Now, let’s try a more complex example that uses Argo Workflows to launch concurrent jobs on MIG devices. In this example, the A100 has been configured into two MIG devices using the 3g.20gb profile.

First, install the Argo Workflows components into your Kubernetes cluster.

$ kubectl create ns argo \
    && kubectl apply -n argo \
    -f https://raw.githubusercontent.com/argoproj/argo-workflows/stable/manifests/quick-start-postgres.yaml

Next, download the latest Argo CLI from the releases page and follow the instructions to install the binary.
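
For reference, a typical CLI installation on a Linux amd64 host looks like the following sketch; the release version is a placeholder and asset names can change between releases, so confirm them on the releases page:

$ curl -sLO https://github.com/argoproj/argo-workflows/releases/download/<version>/argo-linux-amd64.gz
$ gunzip argo-linux-amd64.gz
$ chmod +x argo-linux-amd64
$ sudo mv argo-linux-amd64 /usr/local/bin/argo
$ argo version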

Now, we will craft an Argo example that launches multiple CUDA containers onto the MIG devices on the GPU. We will reuse the same vectorAdd example from before. Here is the job description, saved as vector-add.yaml:

$ cat << EOF > vector-add.yaml
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: argo-mig-example-
spec:
  entrypoint: argo-mig-result-example
  templates:
  - name: argo-mig-result-example
    steps:
    - - name: generate
        template: gen-mig-device-list
    # Iterate over the list of numbers generated by the generate step above
    - - name: argo-mig
        template: argo-mig
        arguments:
          parameters:
          - name: argo-mig
            value: "{{item}}"
        withParam: "{{steps.generate.outputs.result}}"

  # Generate a list of numbers in JSON format
  - name: gen-mig-device-list
    script:
      image: python:alpine3.6
      command: [python]
      source: |
        import json
        import sys
        json.dump([i for i in range(0, 2)], sys.stdout)

  - name: argo-mig
    retryStrategy:
      limit: 10
      retryPolicy: "Always"
    inputs:
      parameters:
      - name: argo-mig
    container:
      image: nvidia/samples:vectoradd-cuda11.2.1
      resources:
        limits:
          nvidia.com/gpu: 1
    nodeSelector:
      nvidia.com/gpu.product: A100-SXM4-40GB-MIG-3g.20gb
EOF

Launch the workflow:

$ argo submit -n argo --watch vector-add.yaml

Argo will print out the pods that have been launched:

Name:                argo-mig-example-z6mqd
Namespace:           argo
ServiceAccount:      default
Status:              Succeeded
Conditions:
 Completed           True
Created:             Wed Mar 24 14:44:51 -0700 (20 seconds ago)
Started:             Wed Mar 24 14:44:51 -0700 (20 seconds ago)
Finished:            Wed Mar 24 14:45:11 -0700 (now)
Duration:            20 seconds
Progress:            3/3
ResourcesDuration:   9s*(1 cpu),9s*(100Mi memory),1s*(1 nvidia.com/gpu)

STEP                       TEMPLATE                 PODNAME                           DURATION  MESSAGE
✔ argo-mig-example-z6mqd  argo-mig-result-example
├───✔ generate            gen-mig-device-list      argo-mig-example-z6mqd-562792713  8s
└─┬─✔ argo-mig(0:0)(0)    argo-mig                 argo-mig-example-z6mqd-845918106  2s
  └─✔ argo-mig(1:1)(0)    argo-mig                 argo-mig-example-z6mqd-870679174  2s

If you observe the logs, you can see that the vector-add sample has completed on both devices:

$ argo logs -n argo @latest
argo-mig-example-z6mqd-562792713: [0, 1]
argo-mig-example-z6mqd-870679174: [Vector addition of 50000 elements]
argo-mig-example-z6mqd-870679174: Copy input data from the host memory to the CUDA device
argo-mig-example-z6mqd-870679174: CUDA kernel launch with 196 blocks of 256 threads
argo-mig-example-z6mqd-870679174: Copy output data from the CUDA device to the host memory
argo-mig-example-z6mqd-870679174: Test PASSED
argo-mig-example-z6mqd-870679174: Done
argo-mig-example-z6mqd-845918106: [Vector addition of 50000 elements]
argo-mig-example-z6mqd-845918106: Copy input data from the host memory to the CUDA device
argo-mig-example-z6mqd-845918106: CUDA kernel launch with 196 blocks of 256 threads
argo-mig-example-z6mqd-845918106: Copy output data from the CUDA device to the host memory
argo-mig-example-z6mqd-845918106: Test PASSED
argo-mig-example-z6mqd-845918106: Done

Disabling MIG

You can disable MIG on a node by setting the nvidia.com/mig.config label to all-disabled:

$ kubectl label nodes <node-name> nvidia.com/mig.config=all-disabled --overwrite
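
You can confirm the change by checking the node labels again; nvidia.com/mig.config.state should return to success and the GPU should once again be advertised as a full, non-MIG device:

$ kubectl get node <node-name> -o=jsonpath='{.metadata.labels}' | jq .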

MIG Manager with Preinstalled Drivers

MIG Manager supports preinstalled drivers. Information in the preceding sections still applies; however, there are a few additional details to consider.

Install

During GPU Operator installation, driver.enabled=false must be set. The following options can be used to install the GPU Operator:

$ helm install gpu-operator \
    -n gpu-operator --create-namespace \
    nvidia/gpu-operator \
    --set driver.enabled=false

Managing Host GPU Clients

MIG Manager stops all operator-managed pods that have access to GPUs when applying a MIG reconfiguration. When drivers are preinstalled, there may be GPU clients on the host that also need to be stopped.

When drivers are preinstalled, MIG Manager attempts to stop and restart a list of systemd services on the host across a MIG reconfiguration. The list of services is specified in the default-gpu-clients config map.

The following sample GPU clients file, clients.yaml, is used to create the default-gpu-clients config map:

version: v1
systemd-services:
  - nvsm.service
  - nvsm-mqtt.service
  - nvsm-core.service
  - nvsm-api-gateway.service
  - nvsm-notifier.service
  - nv_peer_mem.service
  - nvidia-dcgm.service
  - dcgm.service
  - dcgm-exporter.service
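
If MIG Manager is already running, you can inspect the deployed list with a standard kubectl command:

$ kubectl get configmap -n gpu-operator default-gpu-clients -o yaml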

You can modify the list by editing the config map after installation. Alternatively, you can create a custom config map for use by MIG Manager by performing the following steps:

  1. Create the gpu-operator namespace:

    $ kubectl create namespace gpu-operator
    
  2. Create a ConfigMap containing the custom clients.yaml file with a list of GPU clients:

    $ kubectl create configmap -n gpu-operator gpu-clients --from-file=clients.yaml
    
  3. Install the GPU Operator:

    $ helm install gpu-operator \
        -n gpu-operator --create-namespace \
        nvidia/gpu-operator \
        --set migManager.gpuClientsConfig.name=gpu-clients \
        --set driver.enabled=false
    

Architecture

MIG Manager is designed as a controller within Kubernetes. It watches for changes to the nvidia.com/mig.config label on the node and then applies the user-requested MIG configuration. When the label changes, MIG Manager first stops all GPU pods, including the device plugin, GPU Feature Discovery, and DCGM exporter. If drivers are preinstalled, MIG Manager then stops all host GPU clients listed in the clients.yaml config map. Finally, it applies the MIG reconfiguration and restarts the GPU pods and, if applicable, the host GPU clients. The MIG reconfiguration can also involve rebooting a node if a reboot is required to enable MIG mode.

The default MIG profiles are specified in the default-mig-parted-config config map. You can specify one of these profiles to apply to the mig.config label to trigger a reconfiguration of the MIG geometry.

MIG Manager uses the mig-parted tool to apply the configuration changes to the GPU, including enabling MIG mode and, in some scenarios, rebooting the node.
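
A convenient way to follow a reconfiguration from the command line is to print the relevant labels as columns with the standard --label-columns option of kubectl get; the state moves from pending, or rebooting, through to success or failed:

$ kubectl get nodes -L nvidia.com/mig.config,nvidia.com/mig.config.state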

The following flowchart shows the reconfiguration sequence:

flowchart
  subgraph mig[MIG Manager]
    direction TB
    A[Controller] <--> B[MIG-Parted]
  end
  A -- on change --> C
  subgraph recon[Reconfiguration]
    C["Config is Pending or Rebooting"] --> D["Stop Operator Pods"]
    D --> E["Enable MIG Mode and Reboot if Required"]
    E --> F["Use mig-parted to Configure MIG Geometry"]
    F --> G["Restart Operator Pods"]
  end
  H["Set mig.config label to Success"]
  I["Set mig.config label to Failed"]
  G --> H
  G -- on failure --> I