Managing Devices

Overview

This guide provides step-by-step instructions for creating device specifications in DPS. Device specifications define the power characteristics and management capabilities of datacenter equipment and must be created before defining entities and topologies.

For more information about device specifications, see Device Specifications.

Prerequisites

  • DPS server running and accessible
  • dpsctl installed and authenticated
  • Understanding of your hardware’s power characteristics

Step 1: Create Device Specification YAML File

Device specifications are defined in YAML format. Create a file (e.g., devices.yaml) with your hardware specifications.

Basic Structure

- type: DeviceType
  model: DeviceModel  # Optional for some types
  spec:
    # Power characteristics
    minLoadWatts: 100
    maxLoadWatts: 1000
    efficiencyFactor: 0.95
    # Management integration
    powerPolicyPlugin: "PluginName"  # e.g., DGX_H100, DGX_B200

Example: Complete Device Registry

# Power infrastructure devices
- type: PowerDomain
  spec:
    efficiencyFactor: 0.95

- type: PowerDistribution
  model: RackPDU95_57500W
  spec:
    maxLoadWatts: 57500
    efficiencyFactor: 0.95

- type: PowerSupply
  model: PowerSupply95_3300W
  spec:
    maxLoadWatts: 3300
    efficiencyFactor: 0.95

# Compute devices
- type: ComputerSystem
  description: NVIDIA GB200 Compute Tray (Bianca)
  model: DGX_GB200
  spec:
    devices:
      - type: CPU
        model: Grace
        count: 2
      - type: GPU
        model: GB200
        count: 4
    minLoadWatts: 900
    maxLoadWatts: 5640
    processorModulesCount: 2

- type: GPU
  description: NVIDIA GB200 GPU
  model: GB200
  spec:
    minLoadWatts: 200
    maxLoadWatts: 1200
    wppsSupport: true

- type: CPU
  description: NVIDIA Grace CPU
  model: Grace
  spec:
    minLoadWatts: 50
    maxLoadWatts: 420
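
In the example registry, the ComputerSystem limits happen to equal the sum of its child devices' limits (2 Grace CPUs plus 4 GB200 GPUs). The guide does not say whether DPS enforces this relationship, so the Python sketch below is only a sanity check you might run on your own numbers. The helper name component_totals and the in-memory dictionaries are hypothetical; the values are copied from the example above.

```python
# Component specs from the example registry, keyed by (type, model).
COMPONENTS = {
    ("CPU", "Grace"): {"minLoadWatts": 50, "maxLoadWatts": 420},
    ("GPU", "GB200"): {"minLoadWatts": 200, "maxLoadWatts": 1200},
}

# The ComputerSystem entry from the example registry (power fields only).
SYSTEM = {
    "devices": [
        {"type": "CPU", "model": "Grace", "count": 2},
        {"type": "GPU", "model": "GB200", "count": 4},
    ],
    "minLoadWatts": 900,
    "maxLoadWatts": 5640,
}


def component_totals(system, components):
    """Sum count * watts over the system's child devices.

    Raises KeyError if a child (type, model) has no entry in `components`.
    """
    total_min = total_max = 0
    for child in system["devices"]:
        spec = components[(child["type"], child["model"])]
        total_min += child["count"] * spec["minLoadWatts"]
        total_max += child["count"] * spec["maxLoadWatts"]
    return total_min, total_max


if __name__ == "__main__":
    # 2*50 + 4*200 = 900 and 2*420 + 4*1200 = 5640
    print(component_totals(SYSTEM, COMPONENTS))  # (900, 5640)
```

A KeyError here would also point to a child device that references a type and model with no matching entry in the registry.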

Step 2: Import Device Specifications

Use dpsctl device upsert to import your device specifications:

# Import device specifications
dpsctl device upsert devices.yaml

Expected Output:

{
  "status": {
    "diag_msg": "Success",
    "ok": true
  }
}
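
If you script the import, you may want to check the response rather than read it by eye. The sketch below assumes the command prints JSON shaped like the expected output above and that you have already captured that text; the function name upsert_succeeded is hypothetical.

```python
import json


def upsert_succeeded(response_text):
    """Return True only if the response's status.ok field is true."""
    response = json.loads(response_text)
    return response.get("status", {}).get("ok") is True


# Sample text copied from the expected output shown above.
SAMPLE = '{"status": {"diag_msg": "Success", "ok": true}}'

if __name__ == "__main__":
    print(upsert_succeeded(SAMPLE))  # True
```

Treating a missing status or ok field as failure keeps the check conservative if the output format differs from the sample.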

Step 3: Verify Import

List all registered devices to confirm successful import:

# List all device specifications
dpsctl device list

Expected Output (abbreviated; only two of the imported devices are shown):

[
  {
    "type": "PowerDomain",
    "description": "Generic power domain",
    "Spec": {
      "PowerDomain": {
        "efficiency_factor": 0.95
      }
    }
  },
  {
    "type": "ComputerSystem",
    "model": "DGX_GB200",
    "description": "NVIDIA GB200 Compute Tray (Bianca)",
    "Spec": {
      "ComputerSystem": {
        "devices": [
          {
            "type": "CPU",
            "model": "Grace",
            "count": 2
          },
          {
            "type": "GPU",
            "model": "GB200",
            "count": 4
          }
        ],
        "min_load_watts": 900,
        "max_load_watts": 5640,
        "processor_modules_count": 2
      }
    }
  }
]
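
To confirm that every device you imported is present, you can compare the listed type and model pairs against the ones you expect. This is a minimal sketch that assumes the list output is a JSON array shaped like the sample above, where some entries (such as PowerDomain) have no model; the helper names are hypothetical.

```python
import json


def listed_keys(list_output):
    """Return the set of (type, model) pairs found in the list output.

    Entries without a model (e.g. PowerDomain) are recorded with model None.
    """
    return {(d["type"], d.get("model")) for d in json.loads(list_output)}


def missing_devices(list_output, expected):
    """Return the expected (type, model) pairs that are absent from the output."""
    # key=str avoids comparing None with str when sorting mixed pairs.
    return sorted(set(expected) - listed_keys(list_output), key=str)


# Trimmed version of the sample output above (Spec bodies omitted).
SAMPLE_LIST = """
[
  {"type": "PowerDomain", "description": "Generic power domain"},
  {"type": "ComputerSystem", "model": "DGX_GB200"}
]
"""

if __name__ == "__main__":
    expected = [("PowerDomain", None), ("ComputerSystem", "DGX_GB200"), ("GPU", "GB200")]
    print(missing_devices(SAMPLE_LIST, expected))  # [('GPU', 'GB200')]
```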

Step 4: Update Existing Devices

To modify existing device specifications, edit your YAML file and run the same import command:

# Update existing devices (upsert operation)
dpsctl device upsert devices.yaml

The upsert operation will:

  • Insert devices that do not yet exist
  • Update existing devices that match on type and model
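
The insert-or-update behavior can be pictured as a dictionary keyed by type and model. The sketch below is an illustrative model only, not the DPS implementation; in particular, it assumes an update replaces the stored entry as a whole, which the guide does not confirm.

```python
def upsert(registry, incoming):
    """Merge incoming device entries into registry, keyed by (type, model).

    Returns two lists of keys: those inserted and those updated.
    """
    inserted, updated = [], []
    for device in incoming:
        key = (device["type"], device.get("model"))
        (updated if key in registry else inserted).append(key)
        registry[key] = device  # assumption: the stored entry is replaced wholesale
    return inserted, updated


if __name__ == "__main__":
    registry = {}
    upsert(registry, [{"type": "GPU", "model": "GB200", "spec": {"maxLoadWatts": 1200}}])
    result = upsert(registry, [
        {"type": "GPU", "model": "GB200", "spec": {"maxLoadWatts": 1000}},  # same key
        {"type": "CPU", "model": "Grace", "spec": {"maxLoadWatts": 420}},   # new key
    ])
    print(result)  # ([('CPU', 'Grace')], [('GPU', 'GB200')])
```

Because matching is on type and model, renaming a model in your YAML file creates a new entry rather than changing the existing one under this model of the behavior.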

Troubleshooting

Common Errors

Invalid YAML Syntax:

Error: yaml: line 5: mapping values are not allowed in this context
  • Solution: Check the indentation and syntax at the line reported in the error (line 5 in this example)

Missing Required Fields:

Error: device specification missing required field 'spec'
  • Solution: Ensure all devices have a spec section

Invalid Power Values:

Error: maxLoadWatts must be greater than minLoadWatts
  • Solution: Ensure maxLoadWatts is greater than minLoadWatts for each device
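
The last two errors can be caught locally before importing. The sketch below works on device entries that have already been parsed into Python dictionaries (for example with a YAML parser of your choice); the function name find_problems is hypothetical, and the messages only mirror the errors listed above rather than reproduce the server's validation.

```python
def find_problems(devices):
    """Return a list of human-readable problems found in device entries."""
    problems = []
    for index, device in enumerate(devices):
        label = f"entry {index} ({device.get('type', '?')}/{device.get('model', '-')})"
        spec = device.get("spec")
        if not isinstance(spec, dict):
            problems.append(f"{label}: missing required field 'spec'")
            continue
        low, high = spec.get("minLoadWatts"), spec.get("maxLoadWatts")
        # Only compare when both limits are present; some types set just one.
        if low is not None and high is not None and high <= low:
            problems.append(f"{label}: maxLoadWatts must be greater than minLoadWatts")
    return problems


if __name__ == "__main__":
    devices = [
        {"type": "GPU", "model": "GB200"},  # no spec section
        {"type": "CPU", "model": "Grace", "spec": {"minLoadWatts": 500, "maxLoadWatts": 420}},
    ]
    for problem in find_problems(devices):
        print(problem)
```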

Validation

Before importing, validate your YAML file:

# Check YAML syntax
yaml-validator devices.yaml

# Test import in dry-run mode (if available)
dpsctl device upsert devices.yaml --dry-run

Next Steps

After creating device specifications:

  1. Manage Topologies - Connect entities together
  2. Manage Resource Groups - Set up workload power management