dpsctl resource-group update Usage Guide

Update policies of a resource group (RG).

Usage

dpsctl resource-group update

Flags

Includes global dpsctl options.

   --resource-group string              resource group name
   --policy string                      resource group policy (optional)
   --remove-policy                      remove resource group policy (optional, true/false) (default: false)
   --entity-policy value                set entity policies, each instance of the flag is PolicyName=entityName (optional, can be used multiple times)
   --remove-entity-policy string        remove entity policies (comma-separated list of entities) (optional)
   --entity-gpu-policy value            set GPU policies for entities, each instance of the flag is nodeName=<watts1>,<watts2>,... (optional, can be used multiple times)
   --remove-entity-gpu-policy string    remove GPU policies for entities (comma-separated list of entity names) (optional)
   --strict-policy                      if requested power policy is not possible, fail instead of switching to a lower policy (only useful if the resource group is already active) (default: false)
   --workload-profile-ids string        workload power profile Redfish IDs (comma separated list of profile IDs) (optional)
   --remove-all-workload-profiles       remove workload power profiles (optional, true/false) (default: false)
   --sync                               synchronous update. Wait until resource group update is complete (default: false)
   --async                              asynchronous update (default). Return immediately after validating the resource group, but update continues asynchronously (default: false)
   --async-n-hosts int                  asynchronous update, but wait until at least this many hosts are configured before returning (default: 0)
   --async-percent-hosts int            asynchronous update, but wait until at least this percent of hosts are configured before returning (default: 0)
   --async-wait duration                asynchronous update, but wait this long before returning (default: 0s)
   --at-least-n-hosts int               at least this number of hosts must be updated for the resource group update to succeed. Remaining hosts may fail to be configured, but they remain part of the resource group (default: 0)
   --at-least-percent-hosts int         at least this percent of hosts must be updated for the resource group update to succeed. Remaining hosts may fail to be configured, but they remain part of the resource group (default: 0)
   --partial-timeout duration           wait at most this long for the resource group update to reach an acceptable level. If the resource group cannot be updated to the given failure-tolerance levels within this duration, the update fails. (default: 0s)
   --help, -h                           show help
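
The failure-tolerance flags above boil down to a count threshold and a percentage threshold on the number of configured hosts. A minimal Python sketch of that acceptance check, under the assumption that both thresholds must hold at once (the helper name is ours, not part of dpsctl):

```python
def update_acceptable(configured: int, total: int,
                      at_least_n: int = 0, at_least_percent: int = 0) -> bool:
    """True if enough hosts were configured to satisfy both tolerance flags.

    Mirrors --at-least-n-hosts and --at-least-percent-hosts as described
    above; whether dpsctl combines them exactly this way is an assumption.
    """
    if configured < at_least_n:
        return False
    if total and (100 * configured) / total < at_least_percent:
        return False
    return True

# update_acceptable(5, 10, at_least_percent=50)  → True
# update_acceptable(2, 10, at_least_percent=50)  → False
```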

Examples

Updating Default Resource Group Policy (Inactive RG)

To update the default policy of a resource group, use the --policy flag.

$ dpsctl resource-group update --resource-group example1 --policy Node-Med
{
  "status": {
    "ok": true,
    "diag_msg": "Success"
  }
}
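
Scripts wrapping dpsctl can verify a response like the one above by checking status.ok. A minimal Python sketch, assuming only the JSON shape shown (check_status is an illustrative helper, not part of dpsctl):

```python
import json

def check_status(raw: str) -> None:
    """Raise if a dpsctl JSON response does not report success."""
    resp = json.loads(raw)
    status = resp.get("status", {})
    if not status.get("ok"):
        raise RuntimeError(f"update failed: {status.get('diag_msg', 'unknown error')}")

# Passes silently for the response shown above:
check_status('{"status": {"ok": true, "diag_msg": "Success"}}')
```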

Removing Default Resource Group Policy (Inactive RG)

The default resource group policy can be removed using --remove-policy.

$ dpsctl resource-group update --resource-group example1 --remove-policy=true
{
  "status": {
    "ok": true,
    "diag_msg": "Success"
  }
}

Updating Resource Group Entity Policies (Inactive RG)

In addition to updating the default resource group policy, the entity-specific policies may be updated independently.

In this example, a previously created, inactive resource group example1 contains 3 nodes: node001, node002, and node003, with a default resource group policy of Node-High. However, we want node001 and node002 to use the Node-Low and Node-Med policies, respectively.

To make this update, we use the --entity-policy flag, which can be repeated in the same command to specify multiple updates.

$ dpsctl resource-group update --resource-group example1 --entity-policy Node-Low=node001 --entity-policy Node-Med=node002
{
  "status": {
    "ok": true,
    "diag_msg": "Success"
  }
}
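
When driving dpsctl from a script, the repeated --entity-policy flags can be generated from a mapping. A hypothetical helper (the function name is ours; the PolicyName=entityName form is as documented above):

```python
def entity_policy_args(policies: dict[str, str]) -> list[str]:
    """Build repeated --entity-policy flags from {entity: policy} pairs."""
    args = []
    for entity, policy in policies.items():
        args += ["--entity-policy", f"{policy}={entity}"]
    return args

# entity_policy_args({"node001": "Node-Low", "node002": "Node-Med"})
# → ["--entity-policy", "Node-Low=node001", "--entity-policy", "Node-Med=node002"]
```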

Removing Resource Group Entity Policies (Inactive RG)

In the example above, we updated the inactive resource group example1 so that node001 is using the Node-Low policy and node002 is using the Node-Med policy.

In this example, we want to return node001 to the default resource group policy, Node-High. Instead of setting this policy using the --entity-policy flag, we can simply remove the policy for node001. We’ll do this using the --remove-entity-policy flag.

$ dpsctl resource-group update --resource-group example1 --remove-entity-policy node001
{
  "status": {
    "ok": true,
    "diag_msg": "Success"
  }
}

Updating GPU Policies for Entities (Inactive RG)

Per-GPU power limits can be configured for entities using the --entity-gpu-policy flag. Each use of the flag specifies a node name and a comma-separated list of GPU power limits in watts (one per GPU, in GPU index order).

$ dpsctl resource-group update --resource-group example1 --entity-gpu-policy node001=500,550,600,700,650,700,550,600
{
  "status": {
    "ok": true,
    "diag_msg": "Success"
  }
}
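
The nodeName=<watts1>,<watts2>,... value can likewise be assembled from a list of per-GPU limits in GPU index order. A sketch, with an illustrative helper name:

```python
def gpu_policy_arg(node: str, watts: list[int]) -> str:
    """Format per-GPU power limits (in GPU index order) for --entity-gpu-policy."""
    return f"{node}={','.join(str(w) for w in watts)}"

# gpu_policy_arg("node001", [500, 550, 600, 700, 650, 700, 550, 600])
# → "node001=500,550,600,700,650,700,550,600"
```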

Multiple entities can be updated in the same command:

$ dpsctl resource-group update --resource-group example1 \
  --entity-gpu-policy node001=500,550,600,700,650,700,550,600 \
  --entity-gpu-policy node002=600,600,600,600,600,600,600,600
{
  "status": {
    "ok": true,
    "diag_msg": "Success"
  }
}

Removing GPU Policies for Entities (Inactive RG)

GPU policies can be removed from entities using --remove-entity-gpu-policy, which accepts a comma-separated list of entity names.

$ dpsctl resource-group update --resource-group example1 --remove-entity-gpu-policy node001,node002
{
  "status": {
    "ok": true,
    "diag_msg": "Success"
  }
}

Updating an Active Resource Group with Strict Policy Application

For more information on Strict vs Auto-Switching policy activation, see dpsctl resource-group activate.

$ dpsctl resource-group update --resource-group example1 --policy Node-High --strict-policy
{
  "node_statuses": {
    "eos0205": {
      "policy_apply_status": {
        "policy": {
          "Name": "Node-High",
          "capabilities": {
            "min": 5600,
            "max": 10200
          },
          "components": [
            {
              "id": "COMP_CPU",
              "limit": 1530
            },
            {
              "id": "COMP_GPU",
              "limit": 7650
            },
            {
              "id": "COMP_MEMORY",
              "limit": 1020
            }
          ]
        },
        "status": {
          "ok": true,
          "diag_msg": "Success"
        }
      }
    }
  },
  "status": {
    "ok": true,
    "diag_msg": "Success"
  }
}
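
The per-node response can also be consumed programmatically, for example to extract the applied component limits from node_statuses. A Python sketch assuming the response shape shown above (component_limits is an illustrative name):

```python
def component_limits(resp: dict) -> dict[str, dict[str, int]]:
    """Map each node in node_statuses to its applied {component id: limit} pairs."""
    out = {}
    for node, info in resp.get("node_statuses", {}).items():
        policy = info["policy_apply_status"]["policy"]
        out[node] = {c["id"]: c["limit"] for c in policy.get("components", [])}
    return out
```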

If the policy has already been applied, a simple success message is returned without node_statuses.

$ dpsctl resource-group update --resource-group example1 --policy Node-High --strict-policy
{
  "status": {
    "ok": true,
    "diag_msg": "Success"
  }
}

Updating GPU Policies on an Active Resource Group

When updating GPU policies on an active resource group, the response includes per-GPU results within gpu_power_policy_results.

$ dpsctl resource-group update --resource-group example1 --entity-gpu-policy node001=500,550,600,700,650,700,550,600
{
  "node_statuses": {
    "node001": {
      "policy_apply_status": {
        "policy": {
          "Name": "Node-High",
          "capabilities": {
            "min": 5600,
            "max": 10200
          },
          "components": [
            {
              "id": "COMP_CPU",
              "limit": 1530
            },
            {
              "id": "COMP_GPU",
              "limit": 4850
            },
            {
              "id": "COMP_MEMORY",
              "limit": 1020
            }
          ]
        },
        "status": {
          "ok": true,
          "diag_msg": "Success"
        }
      },
      "gpu_power_policy_results": {
        "gpu_power_policy_result": [
          {
            "gpu_id": 0,
            "ok": true,
            "set_limit": 500.0,
            "diag_msg": "Success"
          },
          {
            "gpu_id": 1,
            "ok": true,
            "set_limit": 550.0,
            "diag_msg": "Success"
          },
          {
            "gpu_id": 2,
            "ok": true,
            "set_limit": 600.0,
            "diag_msg": "Success"
          },
          {
            "gpu_id": 3,
            "ok": true,
            "set_limit": 700.0,
            "diag_msg": "Success"
          },
          {
            "gpu_id": 4,
            "ok": true,
            "set_limit": 650.0,
            "diag_msg": "Success"
          },
          {
            "gpu_id": 5,
            "ok": true,
            "set_limit": 700.0,
            "diag_msg": "Success"
          },
          {
            "gpu_id": 6,
            "ok": true,
            "set_limit": 550.0,
            "diag_msg": "Success"
          },
          {
            "gpu_id": 7,
            "ok": true,
            "set_limit": 600.0,
            "diag_msg": "Success"
          }
        ],
        "status": {
          "ok": true,
          "diag_msg": "Success"
        }
      }
    }
  },
  "status": {
    "ok": true,
    "diag_msg": "Success"
  }
}
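
When a per-GPU update partially fails, each gpu_power_policy_result entry carries its own ok field. A sketch that collects failing GPU IDs from a node's status, assuming the response shape above (the helper is ours):

```python
def failed_gpus(node_status: dict) -> list[int]:
    """Return the GPU IDs whose power-limit update did not succeed."""
    results = node_status.get("gpu_power_policy_results", {})
    return [r["gpu_id"]
            for r in results.get("gpu_power_policy_result", [])
            if not r.get("ok")]
```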