resourcegroup
API Reference: v1/resourcegroup.proto
Resource Groups manage power policies for temporary workload allocations on datacenter hardware. They provide ephemeral power management that overrides topology defaults during job execution.
A resource group is a collection of compute resources (nodes) allocated for a specific workload, such as a SLURM job or machine learning training run. Resource groups temporarily override the topology-specified power policies of their entities during workload execution, then restore topology defaults when the workload completes.
Resource groups follow a 5-step lifecycle:
- CREATE - Create an empty, inactive resource group with optional default power policy
- ADD - Add compute resources (nodes) to the group before activation
- ACTIVATE - Apply power policies to hardware and mark the group as active
- UPDATE (Optional) - Dynamically adjust power policies during workload execution
- DELETE - Deactivate and cleanup, restoring topology defaults
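The lifecycle above can be read as a small state machine. The following is a minimal in-memory sketch, not the service implementation: the class `ResourceGroupSketch` and its method names are hypothetical, and it only models the ordering rules stated in this reference (resources are added while inactive, activation needs an inactive group with resources, deletion deactivates first).

```python
class ResourceGroupSketch:
    """Toy model of the resource group lifecycle (hypothetical, illustration only)."""

    def __init__(self, name, policy=None):
        # CREATE: the group starts empty and inactive.
        self.name = name
        self.policy = policy
        self.resources = set()
        self.active = False

    def add(self, *nodes):
        # ADD: resources can only be added while the group is inactive.
        if self.active:
            raise RuntimeError("cannot add resources to an active group")
        self.resources.update(nodes)

    def activate(self):
        # ACTIVATE: requires an inactive group that already has resources.
        if self.active:
            raise RuntimeError("group is already active")
        if not self.resources:
            raise RuntimeError("group has no resources")
        self.active = True

    def update(self, policy):
        # UPDATE: the group policy may change in either state.
        self.policy = policy

    def delete(self):
        # DELETE: deactivate if needed, then release the resources.
        self.active = False
        self.resources.clear()
```

A typical sequence would be `add`, `activate`, an optional `update` while the workload runs, and `delete` when it completes.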
Power Policy Hierarchy
Resource groups use a 3-level policy hierarchy to determine effective power settings:
- Entity-level policy - Specific policy for individual nodes (highest priority)
- Resource group policy - Default policy for all nodes in the group
- Topology policy - Baseline policy from the datacenter topology (lowest priority)
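As an illustration of the priority order above, the sketch below resolves the effective policy for one node. The function name `effective_policy` is hypothetical, and policies are modeled as plain strings, with `None` or an empty string meaning "not set".

```python
from typing import Optional


def effective_policy(entity_policy: Optional[str],
                     group_policy: Optional[str],
                     topology_policy: str) -> str:
    """Pick the highest-priority policy that is set."""
    if entity_policy:
        return entity_policy      # entity-level policy wins
    if group_policy:
        return group_policy       # then the resource group default
    return topology_policy        # topology baseline as the fallback
```

For example, a node with no entity-level policy in a group whose policy is "Node-Med" would resolve to "Node-Med", whatever the topology baseline is.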
Dynamic Power Management
Resource groups support dynamic power management (DPM) through:
- Power Reservation Steering (PRS) - Automatic power redistribution based on telemetry
- Policy Updates - Runtime policy adjustments for optimization
- GPU Workload Profiles - Hardware-specific power optimization for GPU workloads
Integration with Workload Schedulers
Resource groups are designed to integrate with workload schedulers like SLURM:
- Use external IDs to map to scheduler job IDs (e.g., SLURM_JOB_ID)
- Follow scheduler lifecycle events (job start/end)
- Support scheduler-driven power policy updates
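One way a scheduler hook might derive the resource group identity from a job is sketched below. The helper `group_identity_from_env` is hypothetical; the `rg_` prefix follows the naming used in the examples in this reference, and the integer conversion assumes a plain numeric job ID, since `external_id` is an int64.

```python
import os


def group_identity_from_env(env=None):
    """Build group_name and external_id from the SLURM job environment."""
    env = os.environ if env is None else env
    job_id = env["SLURM_JOB_ID"]  # set by SLURM inside a job
    return {
        "group_name": f"rg_{job_id}",   # naming convention from the examples
        "external_id": int(job_id),     # external_id is an int64 field
    }
```

Passing an explicit mapping instead of reading `os.environ` makes the helper easy to exercise outside a real job.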
Power policies are defined in policy.proto, and topology entities are defined in topology.proto. Workload-specific optimizations use telemetry data structures from common.proto.
Table of Contents
- Services
- Messages
- ActivateResourceGroupRequest
- ActivateResourceGroupResponse
- ActivateResourceGroupResponse.NodeStatusesEntry
- DeactivateResourceGroupRequest
- DeactivateResourceGroupResponse
- GPUWorkloadProfileResponse
- GPUWorkloadProfileResponses
- NodeStatusResponse
- ResourceGroupAddResourcesRequest
- ResourceGroupAddResourcesResponse
- ResourceGroupAsyncOperationStatus
- ResourceGroupAsyncOperationStatusRequest
- ResourceGroupAsyncOperationStatusRequest.ResourceGroupStatusInfo
- ResourceGroupAsyncStrategy
- ResourceGroupCreateRequest
- ResourceGroupCreateResponse
- ResourceGroupDeleteRequest
- ResourceGroupDeleteResponse
- ResourceGroupListAllRequest
- ResourceGroupListAllResponse
- ResourceGroupListAllResponse.ResourceGroupInfo
- ResourceGroupListAllResponse.ResourceGroupInfo.ResourcePolicy
- ResourceGroupPartialActivation
- ResourceGroupRemoveResourcesRequest
- ResourceGroupRemoveResourcesResponse
- ResourceGroupUpdateRequest
- ResourceGroupUpdateResourcesRequest
- ResourceGroupUpdateResourcesRequest.ResourcePolicy
- ResourceGroupUpdateResourcesResponse
- ResourceGroupUpdateResourcesResponse.NodeStatusesEntry
- ResourceGroupUpdateResponse
- ResourceGroupUpdateResponse.NodeStatusesEntry
- UpdateGPUPoliciesRequest
- UpdateGPUPoliciesRequest.NodeGpuPoliciesEntry
- UpdateGPUPoliciesResponse
- UpdateGPUPoliciesResponse.Result
- WorkloadProfileIDs
Services
ResourceGroupManagementService
ResourceGroupManagementService manages ephemeral power policy allocations for workloads.
This service provides APIs to create, manage, and monitor resource groups - collections of compute resources with customized power policies for specific workloads. Resource groups temporarily override topology-specified power policies during workload execution, enabling dynamic power management optimized for specific computational tasks.
The service integrates with workload schedulers like SLURM to provide power management throughout the workload lifecycle. It supports both static policy assignment and dynamic policy updates based on real-time telemetry data.
Resource groups must be created from an active topology. The topology defines the available entities and baseline power policies that resource groups can override.
ResourceGroupCreate
rpc ResourceGroupCreate(ResourceGroupCreateRequest) ResourceGroupCreateResponse
Create a new empty and inactive resource group.
This is the first step in the resource group lifecycle. The created resource group is initially empty (no resources) and inactive (no policies applied to hardware). Resources must be added and the group activated before power policies take effect.
ResourceGroupDelete
rpc ResourceGroupDelete(ResourceGroupDeleteRequest) ResourceGroupDeleteResponse
Deactivate and delete a given resource group.
This permanently removes the resource group and restores all associated hardware resources to their topology-specified power policies. If the resource group is active, it is automatically deactivated before deletion. This is typically called when a workload completes.
ResourceGroupList
rpc ResourceGroupList(ResourceGroupListAllRequest) ResourceGroupListAllResponse
List all available resource groups.
Returns comprehensive information about all resource groups in the system, including their activation status, assigned resources, and power policies. This is used for monitoring, administration, and troubleshooting resource group state.
ResourceGroupAddResources
rpc ResourceGroupAddResources(ResourceGroupAddResourcesRequest) ResourceGroupAddResourcesResponse
Add resources to a given resource group. The resource group must be inactive.
This assigns compute resources (nodes) to the resource group before activation. Resources can be assigned group-level or entity-specific power policies. The resource group must be inactive - resources cannot be added to active resource groups.
ResourceGroupRemoveResources
rpc ResourceGroupRemoveResources(ResourceGroupRemoveResourcesRequest) ResourceGroupRemoveResourcesResponse
Remove resources from a given resource group. The resource group must be inactive.
This removes compute resources from the resource group before activation. Used to adjust resource allocation based on workload requirements. The resource group must be inactive - resources cannot be removed from active resource groups.
ActivateResourceGroup
rpc ActivateResourceGroup(ActivateResourceGroupRequest) ActivateResourceGroupResponse
Activate a given resource group. The resource group must be inactive.
This applies the resource group’s power policies to the assigned hardware resources and marks the group as active. The activation process validates power allocation constraints and applies policies to the hardware. Once active, the resource group manages power policies for its assigned resources until deactivation.
ResourceGroupUpdate
rpc ResourceGroupUpdate(ResourceGroupUpdateRequest) ResourceGroupUpdateResponse
Update a resource group policy. The new policy affects all entities in the resource group that do not have entity-level policies set. If the resource group is active, the policy is validated and applied immediately.
This changes the default power policy for the entire resource group. Entities with entity-specific policies are not affected. If the resource group is active, the new policy is immediately validated and applied to hardware. This enables dynamic power optimization during workload execution.
ResourceGroupUpdateResources
rpc ResourceGroupUpdateResources(ResourceGroupUpdateResourcesRequest) ResourceGroupUpdateResourcesResponse
Update policies for some of the resources in the resource group. If the resource group is active, the changes are applied immediately. This API may perform partial updates.
This updates entity-specific power policies for selected resources in the group. If the resource group is active, changes are immediately applied to hardware. The API supports partial updates - some policy changes may succeed while others fail, allowing for selective power optimization.
UpdateGPUPolicies
rpc UpdateGPUPolicies(UpdateGPUPoliciesRequest) UpdateGPUPoliciesResponse
UpdateGPUPolicies allows updating GPU policies without specifying the resource group. Because of this, a single call may end up updating multiple resource groups. If the resource groups are active, the changes are applied immediately.
This provides direct GPU power management without requiring knowledge of resource group assignments. It’s designed for telemetry-driven power optimization where external monitoring systems can adjust GPU power limits based on real-time performance data. A single call may affect multiple resource groups if GPUs span multiple groups.
AsyncOperationStatus
rpc AsyncOperationStatus(ResourceGroupAsyncOperationStatusRequest) ResourceGroupAsyncOperationStatus
Queries the server about the status of an asynchronous operation started by one of the service APIs.
Resource group operations (especially activation) can be performed asynchronously for better scalability. This API allows clients to check the progress and results of these asynchronous operations. The operation results are available until explicitly queried and forgotten.
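A client-side polling loop for this API could look like the sketch below. It makes no assumption about the generated client: `query_status` is a hypothetical callable that takes an operation ID and returns a dict with at least a `completed` key, mirroring the ResourceGroupAsyncOperationStatus message.

```python
import time


def wait_for_operation(query_status, operation_id, poll_interval=1.0, max_polls=60):
    """Poll until the operation reports completed, then return that status."""
    for _ in range(max_polls):
        status = query_status(operation_id)
        if status["completed"]:
            # Keep this result: the server is described as forgetting it
            # once completion has been reported.
            return status
        time.sleep(poll_interval)
    raise TimeoutError(f"operation {operation_id} still running after {max_polls} polls")
```

Because results are forgotten after a completed status is returned, the caller should store the returned value rather than query again.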
Messages
ActivateResourceGroupRequest
ActivateResourceGroupRequest is used by ResourceGroupManagementService.ActivateResourceGroup.
Validates and applies power policies to all hardware resources in the resource group. Resource group must be inactive with resources added. After activation, workload can start.
Examples:
dpsctl resource-group activate --resource-group "rg_$SLURM_JOB_ID" --sync # Synchronous activation
dpsctl resource-group activate --resource-group "rg_$SLURM_JOB_ID" # Asynchronous activation
dpsctl resource-group activate --resource-group "rg_$SLURM_JOB_ID" --partial-n-hosts 8 # Succeed if ≥8 hosts activate
| Field | Type | Description |
|---|---|---|
| group_name | string | Name of resource group to activate |
| strict | bool | If true, and if the given resource group cannot be activated because of power limits, do not try reducing the resource group policy. If false, the server will try a lower power policy if the given one fails. |
| async | ResourceGroupAsyncStrategy | Asynchronous activation strategy. If set, activation is asynchronous. If not set, activation is synchronous. |
| partial_activation | ResourceGroupPartialActivation | Options for partial activation with failure tolerance. Allows the resource group to be deemed active even if some hosts fail, as long as at least the specified number/fraction of hosts can be activated. |
| allow_reprovision | bool | If true, a resource group that lacks sufficient power to activate may lower the power policy of other resource groups in order to activate. If false, activation fails when sufficient power is not available for the resource group. |
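The effect of the `strict` flag can be pictured with the sketch below. How the server actually selects a lower policy is not specified here, so everything in it is an assumption for illustration: the function `choose_policy`, the ordered `ladder` of policy names, the "Node-Low" policy, and the watt figures are all hypothetical.

```python
def choose_policy(requested, ladder, power_of, available_watts, strict):
    """Return the policy to apply, or None if activation would fail.

    ladder:   policy names ordered from highest to lowest power (assumed)
    power_of: watts required per policy name (made-up numbers)
    """
    start = ladder.index(requested)
    # strict: only the requested policy is tried.
    # non-strict: the requested policy, then each lower one in turn.
    candidates = ladder[start:start + 1] if strict else ladder[start:]
    for name in candidates:
        if power_of[name] <= available_watts:
            return name
    return None
```

With a ladder of "Node-High", "Node-Med", "Node-Low" and not enough power for "Node-High", a strict request returns `None`, while a non-strict request falls back to the first lower policy that fits.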
ActivateResourceGroupResponse
ActivateResourceGroupResponse is returned if resource group activation was successful.
| Field | Type | Description |
|---|---|---|
| node_statuses | map ActivateResourceGroupResponse.NodeStatusesEntry | Node statuses is a map where key is entity name and value is a struct with policy apply status and workload profile results |
| status | Status | Operation status |
| operation_id | string | The operation id for asynchronous operations, if the status is “async” |
ActivateResourceGroupResponse.NodeStatusesEntry
| Field | Type | Description |
|---|---|---|
| key | string | none |
| value | NodeStatusResponse | none |
DeactivateResourceGroupRequest
DeactivateResourceGroupRequest is used to deactivate a resource group and remove applied policies from physical entities.
Removes power policies from hardware resources and returns them to topology defaults. Typically called automatically during resource group deletion. Resource group must be active. After deactivation, resources return to topology defaults.
| Field | Type | Description |
|---|---|---|
| group_name | string | Name of resource group to deactivate |
| wpps_disable_async_verification | bool | If true, workload power profile service (WPPS) operations will use synchronous verification instead of asynchronous. Default is false (asynchronous verification enabled). |
DeactivateResourceGroupResponse
Response from deactivating a resource group. It carries only the operation status.
| Field | Type | Description |
|---|---|---|
| status | Status | Operation status |
GPUWorkloadProfileResponse
GPUWorkloadProfileResponse describes the result of setting workload profile on one GPU
| Field | Type | Description |
|---|---|---|
| gpu_id | string | The GPU id within the node |
| ok | bool | Whether or not the operation was successful |
| enforced_workload_profile_ids | repeated int32 | The actual workload profiles set |
| diag_msg | string | Diagnostic msg, if any |
GPUWorkloadProfileResponses
GPUWorkloadProfileResponses collects the per-GPU results of setting workload profiles, together with an overall operation status.
| Field | Type | Description |
|---|---|---|
| workload_profile_result | repeated GPUWorkloadProfileResponse | Per-GPU workload profile results |
| status | Status | Operation status |
NodeStatusResponse
NodeStatusResponse combines the node status and workload profile results
| Field | Type | Description |
|---|---|---|
| policy_apply_status | PolicyApplyStatus | Struct with an activation status and an actual policy |
| workload_profile_results | GPUWorkloadProfileResponses | Result of the per-GPU workload profile operation |
ResourceGroupAddResourcesRequest
ResourceGroupAddResourcesRequest is used by ResourceGroupManagementService.ResourceGroupAddResources.
Adds compute resources (nodes) to an existing inactive resource group before activation. Resource group must be inactive. Resources cannot be added to active resource groups.
Examples:
dpsctl resource-group add --resource-group "rg_$SLURM_JOB_ID" --entities "node001,node002" --policy "Node-High" # Add nodes with policy
| Field | Type | Description |
|---|---|---|
| group_name | string | Name of resource group to add resource entities to |
| resource_names | repeated string | Names of resources to add to the resource group |
| oneof _policy_name.policy_name | optional string | Optional policy to add to each resource added, no policy defaults to resource group policy |
ResourceGroupAddResourcesResponse
ResourceGroupAddResourcesResponse is returned if adding resources to a resource group was successful.
| Field | Type | Description |
|---|---|---|
| status | Status | Operation status |
ResourceGroupAsyncOperationStatus
ResourceGroupAsyncOperationStatus returns the status of an asynchronous resource group operation. After it returns that the operation was completed, the results are forgotten.
| Field | Type | Description |
|---|---|---|
| operation_id | string | Internal ID of the asynchronous operation |
| completed | bool | Completed flag |
| n_hosts | uint32 | Number of hosts in the request |
| n_success | uint32 | Number of requests completed successfully |
| n_failed | uint32 | Number of requests failed |
| n_in_progress | uint32 | Number of requests in progress |
| oneof status.activate | ActivateResourceGroupResponse | Activate resource group status object |
| oneof status.update | ResourceGroupUpdateResponse | Update resource group status object |
ResourceGroupAsyncOperationStatusRequest
ResourceGroupAsyncOperationStatusRequest is used to query the status of an asynchronous operation.
Checks the progress and result of asynchronous resource group operations like activation or updates. Returns operation progress including success count, failure count, and completion status.
| Field | Type | Description |
|---|---|---|
| oneof id.operation_id | string | Operation ID returned from async resource group operation |
| oneof id.resource_group_info | ResourceGroupAsyncOperationStatusRequest.ResourceGroupStatusInfo | Resource group information |
ResourceGroupAsyncOperationStatusRequest.ResourceGroupStatusInfo
Resource group information
| Field | Type | Description |
|---|---|---|
| resource_group_name | string | Resource group name |
| operation_type | string | Operation type, i.e. “activate” or “update” |
ResourceGroupAsyncStrategy
ResourceGroupAsyncStrategy specifies the asynchronous resource group activation strategy
| Field | Type | Description |
|---|---|---|
| oneof Options.nHosts | uint32 | Wait until the operation is complete for nHosts, then return. The operation continues asynchronously. |
| oneof Options.fracHosts | double | Wait until the operation is completed for the given fraction of hosts, then return. The operation continues asynchronously. |
| oneof Options.wait | google.protobuf.Duration | Wait this long before returning. The operation continues asynchronously. If 0, the operation returns immediately. |
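The three options form a oneof, so exactly one of them decides when the call returns. The sketch below models that decision; `initial_wait_is_over` is a hypothetical helper, the strategy is passed as a one-entry dict, and `wait` is expressed in seconds rather than as a Duration message.

```python
def initial_wait_is_over(strategy, hosts_done, hosts_total, elapsed_seconds):
    """Decide whether the call may return; the operation itself keeps running."""
    if len(strategy) != 1:
        raise ValueError("exactly one of nHosts, fracHosts, wait must be set")
    option, value = next(iter(strategy.items()))
    if option == "nHosts":
        return hosts_done >= value
    if option == "fracHosts":
        # With no hosts at all there is nothing to wait for.
        return hosts_total == 0 or hosts_done / hosts_total >= value
    if option == "wait":
        return elapsed_seconds >= value  # wait == 0 returns immediately
    raise ValueError(f"unknown option: {option}")
```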
ResourceGroupCreateRequest
ResourceGroupCreateRequest is used by the ResourceGroupManagementService.ResourceGroupCreate.
Creates a new, empty, and inactive resource group for managing power policies during workload execution. A resource group must be identified by a unique name, such as one derived from the SLURM job ID. When created, the resource group has no resources and is not active.
Examples:
dpsctl resource-group create --resource-group "rg_$SLURM_JOB_ID" --external-id "$SLURM_JOB_ID" --policy "Node-Med"
| Field | Type | Description |
|---|---|---|
| external_id | int64 | External ID (e.g. SLURM Job ID) - Unique identifier from external workload scheduler (e.g. SLURM_JOB_ID) |
| group_name | string | Unique resource group name - Human-readable identifier, must be unique (e.g. “rg_12345”, “job12345”) |
| oneof _policy_name.policy_name | optional string | Optional policy for the whole resource group. If no policy is given, entities will use the topology policy. The policy with this name must already exist in the topology. |
| workload_profile_ids | repeated int32 | Array of requested workload profile IDs associated with the resource group |
| oneof _prs_enabled.prs_enabled | optional bool | Optional flag to enable or disable PRS dynamic power management for the resource group. Default is enabled. |
| properties | google.protobuf.Struct | Properties of the resource group |
| dpm_enable | bool | Enables all dynamic power management for the resource group. Resource groups with dpm_enable set to false follow strict policy management: if enough power for the selected policy is not available, activation fails, and allocated power is never dynamically adjusted during the lifetime of the resource group. Default is true (dynamic power management enabled). |
ResourceGroupCreateResponse
ResourceGroupCreateResponse is returned if the resource group is created successfully.
| Field | Type | Description |
|---|---|---|
| status | Status | Operation status |
ResourceGroupDeleteRequest
ResourceGroupDeleteRequest is used by ResourceGroupManagementService.ResourceGroupDelete.
Deactivates and deletes a resource group, returning all hardware resources to their topology defaults. The resource group is automatically deactivated if active before deletion.
| Field | Type | Description |
|---|---|---|
| group_name | string | Name of resource group to delete |
| wpps_disable_async_verification | bool | If true, workload power profile service (WPPS) operations will use synchronous verification instead of asynchronous. Default is false (asynchronous verification enabled). |
ResourceGroupDeleteResponse
ResourceGroupDeleteResponse is returned if the resource group deletion was successful.
| Field | Type | Description |
|---|---|---|
| status | Status | Operation status |
ResourceGroupListAllRequest
ResourceGroupListAllRequest is used to list resource groups.
Queries all resource groups in the system with optional filtering by activation status. Used for monitoring, administration, and troubleshooting resource group state. Returns comprehensive information including policies, resources, and status for each resource group.
| Field | Type | Description |
|---|---|---|
| oneof _list_active_only.list_active_only | optional bool | Set to true to filter by active resource groups only |
ResourceGroupListAllResponse
Response containing all (filtered) resource group information
| Field | Type | Description |
|---|---|---|
| status | Status | none |
| resource_groups | repeated ResourceGroupListAllResponse.ResourceGroupInfo | List of all resource groups |
ResourceGroupListAllResponse.ResourceGroupInfo
| Field | Type | Description |
|---|---|---|
| group_name | string | Name of resource group |
| external_id | int64 | External ID of resource group |
| activation_status | string | Activation status of the resource group |
| oneof _policy_name.policy_name | optional string | Optional default policy for resource group |
| resource_names | repeated string | Names of resources in resource group |
| oneof _workload_profile_ids.workload_profile_ids | optional WorkloadProfileIDs | Resource group workload profile ids |
| resource_policies | repeated ResourceGroupListAllResponse.ResourceGroupInfo.ResourcePolicy | List of policy/entity pairs |
| properties | google.protobuf.Struct | Properties of the resource group |
| oneof _prs_enabled.prs_enabled | optional bool | Optional flag indicating whether PRS dynamic power management is enabled for the resource group. Default is enabled. |
| dpm_enable | bool | Whether dynamic power management is enabled for the resource group. Default is true (DPM enabled). |
ResourceGroupListAllResponse.ResourceGroupInfo.ResourcePolicy
Defines the policy for a resource.
| Field | Type | Description |
|---|---|---|
| resource_name | string | Name of the resource |
| policy_name | string | Name of the policy |
| oneof _applied_policy_name.applied_policy_name | optional string | Optional name of applied policy when DPM is enabled |
ResourceGroupPartialActivation
ResourceGroupPartialActivation specifies the parameters for acceptable level of failure during host configuration
| Field | Type | Description |
|---|---|---|
| oneof Options.atleast_n_hosts | uint32 | At least this many hosts must be activated for the resource group activation to be successful |
| oneof Options.atleast_frac_hosts | double | At least this fraction of hosts must be activated for the resource group activation to be successful |
| host_activation_timeout | google.protobuf.Duration | Host activation timeout. If a host is not accessible after this duration, host is deemed inaccessible |
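The success criterion described by these fields can be sketched as follows. `activation_succeeded` is a hypothetical helper; the behavior when neither option is set (every host must activate) is an assumption, not something this reference states.

```python
def activation_succeeded(n_activated, n_total,
                         atleast_n_hosts=None, atleast_frac_hosts=None):
    """Apply the partial-activation threshold to the activation counts."""
    if atleast_n_hosts is not None and atleast_frac_hosts is not None:
        raise ValueError("atleast_n_hosts and atleast_frac_hosts are a oneof")
    if atleast_n_hosts is not None:
        return n_activated >= atleast_n_hosts
    if atleast_frac_hosts is not None:
        return n_activated >= atleast_frac_hosts * n_total
    # Assumption: without partial-activation options, all hosts must activate.
    return n_activated == n_total
```

Hosts that stay unreachable past `host_activation_timeout` would simply not be counted in `n_activated`.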
ResourceGroupRemoveResourcesRequest
ResourceGroupRemoveResourcesRequest is used by ResourceGroupManagementService.ResourceGroupRemoveResources.
Removes compute resources (nodes) from an existing inactive resource group before activation. Used to adjust resource allocation before the workload starts. Resource group must be inactive. Resources cannot be removed from active resource groups.
| Field | Type | Description |
|---|---|---|
| group_name | string | Name of the resource group to remove resource entities from |
| resource_names | repeated string | Names of resource entities to remove from the resource group |
ResourceGroupRemoveResourcesResponse
ResourceGroupRemoveResourcesResponse is returned if entity removal was successful.
| Field | Type | Description |
|---|---|---|
| status | Status | Operation status |
ResourceGroupUpdateRequest
ResourceGroupUpdateRequest is used by ResourceGroupManagementService.ResourceGroupUpdate to modify the resource group level policy setting.
Updates the default power policy for the entire resource group. Can be used on active or inactive resource groups. If resource group is active, changes are applied immediately to hardware.
Examples:
dpsctl resource-group update --resource-group "rg_$SLURM_JOB_ID" --policy "Node-High" --sync # Synchronous update
| Field | Type | Description |
|---|---|---|
| group_name | string | Name of resource group to update |
| oneof _policy_name.policy_name | optional string | Name of policy to update. Null policy removes the policy assignment from resource group. Modifying the resource group level policy setting updates the policies of all entities that do not have entity-level policies. If the policy assignment was removed, those entities will revert back to topology-specified policies. |
| strict | bool | Applies when the resource group is already active. If true and the new policy cannot be applied because of power limits, do not try reducing the resource group policy. If false, the server will try a lower power policy if the given one fails. |
| workload_profile_ids | WorkloadProfileIDs | Array of new requested workload profile IDs |
| async | ResourceGroupAsyncStrategy | Asynchronous activation strategy. If set, activation is asynchronous. If not set, activation is synchronous. |
| partial_activation | ResourceGroupPartialActivation | Options for partial activation with failure tolerance |
| wpps_disable_async_verification | bool | If true, workload power profile service (WPPS) operations will use synchronous verification instead of asynchronous. Default is false (asynchronous verification enabled). |
ResourceGroupUpdateResourcesRequest
ResourceGroupUpdateResourcesRequest is used by ResourceGroupManagementService.ResourceGroupUpdateResources.
Updates power policies for individual entities (nodes) within the resource group. If resource group is active, changes are applied immediately to hardware.
| Field | Type | Description |
|---|---|---|
| group_name | string | Name of resource group to update |
| updates | repeated ResourceGroupUpdateResourcesRequest.ResourcePolicy | List of policy/resource pairs to be updated |
| async | ResourceGroupAsyncStrategy | Asynchronous activation strategy. If set, activation is asynchronous. If not set, activation is synchronous. |
| partial_activation | ResourceGroupPartialActivation | Options for partial activation with failure tolerance |
ResourceGroupUpdateResourcesRequest.ResourcePolicy
Specifies the resource and the policy to apply to that resource. To remove a policy from a resource, omit the optional policy.
| Field | Type | Description |
|---|---|---|
| resource_name | string | Name of the resource |
| oneof PolicyUpdate.policy_name | string | Set the entity-level policy for the resource. The entity-level policy overrides the resource-group level or topology-level policy. To remove entity level policy assignment, use empty policy name |
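The set-or-remove semantics of `policy_name` can be illustrated with a small sketch. The function `apply_resource_updates` is hypothetical and works on a plain dict of entity-level policies; it is not the server logic.

```python
def apply_resource_updates(entity_policies, updates):
    """Return a new entity-policy map after applying (resource, policy) pairs."""
    result = dict(entity_policies)
    for resource_name, policy_name in updates:
        if policy_name:
            # A non-empty name sets the entity-level policy.
            result[resource_name] = policy_name
        else:
            # An empty or missing name removes the entity-level assignment,
            # so the node falls back to the group or topology policy.
            result.pop(resource_name, None)
    return result
```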
ResourceGroupUpdateResourcesResponse
Response from updating resources for a resource group.
| Field | Type | Description |
|---|---|---|
| node_statuses | map ResourceGroupUpdateResourcesResponse.NodeStatusesEntry | Node statuses is a map where key is entity name and value is a struct with policy apply status and workload profile results |
| status | Status | Operation status |
| operation_id | string | The operation id for asynchronous operations, if the status is “async” |
ResourceGroupUpdateResourcesResponse.NodeStatusesEntry
| Field | Type | Description |
|---|---|---|
| key | string | none |
| value | NodeStatusResponse | none |
ResourceGroupUpdateResponse
ResourceGroupUpdateResponse is returned if resource group update was successful.
| Field | Type | Description |
|---|---|---|
| node_statuses | map ResourceGroupUpdateResponse.NodeStatusesEntry | Node statuses is a map where key is entity name and value is a struct with policy apply status and workload profile results |
| status | Status | Operation status |
| operation_id | string | The operation id for asynchronous operations, if the status is “async” |
ResourceGroupUpdateResponse.NodeStatusesEntry
| Field | Type | Description |
|---|---|---|
| key | string | none |
| value | NodeStatusResponse | none |
UpdateGPUPoliciesRequest
UpdateGPUPoliciesRequest is used by ResourceGroupManagementService.UpdateGPUPolicies to update GPU power policies without a reference to the resource group.
Updates individual GPU power limits based on real-time telemetry data from external monitoring systems. Used for dynamic power optimization during workload execution without knowing resource group details. All GPUs in a node must be specified or the update fails. Updates node-level policy to satisfy aggregate GPU requirements.
| Field | Type | Description |
|---|---|---|
| node_gpu_policies | map UpdateGPUPoliciesRequest.NodeGpuPoliciesEntry | A map of node name -> GPU Policies |
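Since an update fails unless every GPU in a node is specified, a client might check its request before sending it. The sketch below assumes hypothetical shapes: the request is modeled as node name -> {gpu_id: power limit}, and `gpus_per_node` is an inventory the caller already has. The real GPUPolicies message is defined elsewhere and may look different.

```python
def missing_gpus(node_gpu_policies, gpus_per_node):
    """Return {node: [gpu ids not covered]} for nodes with incomplete entries."""
    missing = {}
    for node, limits in node_gpu_policies.items():
        absent = sorted(set(gpus_per_node[node]) - set(limits))
        if absent:
            missing[node] = absent
    return missing
```

An empty result means every listed node names all of its GPUs, which is the precondition described above.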
UpdateGPUPoliciesRequest.NodeGpuPoliciesEntry
| Field | Type | Description |
|---|---|---|
| key | string | none |
| value | GPUPolicies | none |
UpdateGPUPoliciesResponse
UpdateGPUPoliciesResponse describes the result of each update operation
| Field | Type | Description |
|---|---|---|
| results | repeated UpdateGPUPoliciesResponse.Result | Results of the per-GPU update operation |
| status | Status | Operation status |
UpdateGPUPoliciesResponse.Result
Result describes the result of setting the power limit of one GPU
| Field | Type | Description |
|---|---|---|
| resource_name | string | The node name containing this GPU |
| gpu_id | uint32 | The GPU id within the node |
| ok | bool | Whether or not the operation was successful |
| set_limit | double | The actual limit set |
| diag_msg | string | Diagnostic msg, if any |
WorkloadProfileIDs
WorkloadProfileIDs is a wrapper around an array of workload profile IDs
| Field | Type | Description |
|---|---|---|
| ids | repeated int32 | none |