Managing Topologies

Overview

This guide provides step-by-step instructions for performing common topology management activities.

For more information about topologies, see Topologies.

Prerequisites

  • DPS server running and accessible
  • dpsctl installed and authenticated
  • Device specifications already defined (see Managing Devices)

Step 1: Prepare Topology JSON File

Topologies are defined in JSON format. Each file can include entities, topology structure, and policies.

Example Structure

{
  "Entities": [
    {
      "Type": "PowerDomain",
      "Name": "PD-A",
      "Constraints": {
        "PowerValue": {"Value": 1150000, "Type": "W"},
        "PowerFactor": 0.9
      }
    },
    {
      "Type": "ComputerSystem",
      "Model": "DGX_H100",
      "Name": "node001",
      "Policy": "Node-High",
      "Redfish": {
        "@odata.type": "#ComputerSystem.v1_23_0.ComputerSystem",
        "@odata.id": "/node001",
        "Id": "node001",
        "URL": "https://node001-bmc.example.com",
        "SecretName": "node001"
      }
    }
  ],
  "Topology": {
    "Name": "pdn",
    "Entities": [
      {
        "Name": "PD-A",
        "Children": ["node001"]
      },
      {
        "Name": "node001"
      }
    ]
  },
  "Policies": [
    {
      "Name": "Node-High",
      "Limits": [
        {
          "ElementType": "Node",
          "PowerLimit": {"Watts": 10200}
        }
      ]
    }
  ]
}
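The three sections of the file refer to each other by name: topology nodes and their Children point at entries in Entities, and an entity's Policy points at an entry in Policies. The following Python sketch shows one way to check those references locally before running dpsctl. The function name find_reference_errors is hypothetical, and the rule that every referenced policy must be defined in the same file is an assumption made for illustration, not documented DPS behavior.

```python
from typing import List


def find_reference_errors(doc: dict) -> List[str]:
    """Return human-readable problems found in a topology document.

    Assumption: every Policy, topology node, and child name must resolve
    to a definition in the same file.
    """
    errors = []
    entity_names = {e.get("Name") for e in doc.get("Entities", [])}
    policy_names = {p.get("Name") for p in doc.get("Policies", [])}

    # Every policy an entity names should be defined under "Policies".
    for entity in doc.get("Entities", []):
        policy = entity.get("Policy")
        if policy is not None and policy not in policy_names:
            errors.append(
                f"entity {entity.get('Name')!r} uses undefined policy {policy!r}"
            )

    # Every topology node and every child should be a defined entity.
    for node in doc.get("Topology", {}).get("Entities", []):
        name = node.get("Name")
        if name not in entity_names:
            errors.append(f"topology node {name!r} is not a defined entity")
        for child in node.get("Children", []):
            if child not in entity_names:
                errors.append(f"{name!r} lists undefined child {child!r}")
    return errors


# Trimmed copy of the example structure (Constraints and Redfish omitted).
example = {
    "Entities": [
        {"Type": "PowerDomain", "Name": "PD-A"},
        {"Type": "ComputerSystem", "Name": "node001", "Policy": "Node-High"},
    ],
    "Topology": {
        "Name": "pdn",
        "Entities": [
            {"Name": "PD-A", "Children": ["node001"]},
            {"Name": "node001"},
        ],
    },
    "Policies": [{"Name": "Node-High"}],
}

print(find_reference_errors(example))  # prints []
```

A check like this only complements dpsctl topology validate, which remains the authoritative validator.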

Step 2: Validate Topology File

Before importing, validate your topology file to catch errors early.

dpsctl topology validate --filename topology.json

Expected Output:

{
  "status": {
    "ok": true,
    "diag_msg": "Topology validation passed"
  }
}

If validation fails, the output lists the errors. Fix them and run the validation again before importing.
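When the command runs from a script, the JSON status can be inspected programmatically. The sketch below parses output shaped like the expected output above. The helper name check_status is hypothetical, and it assumes that a failed run reports "ok": false in the same status object, which this guide does not confirm.

```python
import json
from typing import Tuple


def check_status(output: str) -> Tuple[bool, str]:
    """Parse dpsctl-style JSON output and return (ok, diagnostic message)."""
    status = json.loads(output).get("status", {})
    # A missing "ok" field is treated as failure so problems are not hidden.
    return bool(status.get("ok", False)), status.get("diag_msg", "")


sample = '{"status": {"ok": true, "diag_msg": "Topology validation passed"}}'
ok, message = check_status(sample)
print(ok, message)
```

The same helper would fit any dpsctl output that uses this status object.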

Step 3: Import Topology

Import the validated topology into DPS:

dpsctl topology import --filename topology.json

Expected Output:

{
  "status": {
    "ok": true,
    "diag_msg": "Success"
  }
}

Step 4: List Topologies

To see all available topologies:

dpsctl topology list

Example Output:

{
  "topologies": [
    {
      "topology_name": "pdn",
      "leaf_node_names": ["node001"]
    }
  ]
}
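To confirm from a script that an import succeeded, the list output can be searched by topology name. This is a minimal sketch based only on the fields shown in the example output above. The function name leaf_nodes is hypothetical.

```python
import json
from typing import List, Optional


def leaf_nodes(list_output: str, topology_name: str) -> Optional[List[str]]:
    """Return the leaf node names of a topology, or None if it is not listed."""
    for topo in json.loads(list_output).get("topologies", []):
        if topo.get("topology_name") == topology_name:
            return topo.get("leaf_node_names", [])
    return None


listing = """
{
  "topologies": [
    {"topology_name": "pdn", "leaf_node_names": ["node001"]}
  ]
}
"""

print(leaf_nodes(listing, "pdn"))      # prints ['node001']
print(leaf_nodes(listing, "missing"))  # prints None
```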

Step 5: Activate a Topology

Only one topology can be active at a time. Activation applies power policies and enables management.

dpsctl topology activate --topology pdn

Flags:

  • --replace-topology: Replace the currently active topology
  • --ping-hosts: Test connectivity of all hosts (default: true)
  • --at-least-percent-hosts: Minimum percentage of hosts that must be reachable (default: 50)

Example:

dpsctl topology activate --topology pdn --ping-hosts --at-least-percent-hosts 80
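As a rough illustration of the --at-least-percent-hosts arithmetic, the sketch below treats the flag as an inclusive lower bound on the share of reachable hosts. That interpretation, including how DPS rounds or handles the exact boundary, is an assumption and is not confirmed by this guide. The function name is hypothetical.

```python
def meets_reachability_threshold(
    reachable: int, total: int, min_percent: float = 50
) -> bool:
    """Return True if reachable/total is at least min_percent percent.

    Assumption: the threshold is inclusive (exactly min_percent passes).
    """
    if total <= 0:
        raise ValueError("total must be positive")
    # Cross-multiply instead of dividing to avoid floating-point rounding.
    return reachable * 100 >= min_percent * total


# With 10 hosts and a threshold of 80, at least 8 must respond.
print(meets_reachability_threshold(8, 10, 80))  # prints True
print(meets_reachability_threshold(7, 10, 80))  # prints False
```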

Step 6: Deactivate a Topology

Deactivate the current topology (required before deleting or replacing):

dpsctl topology deactivate --topology pdn

Step 7: Update an Existing Topology

To update a topology (e.g., after editing the JSON file):

dpsctl topology update --filename updated-topology.json

  • Use --force to bypass certain validation checks (use with caution).

Step 8: Remove a Topology

To permanently delete a topology (must be deactivated first):

dpsctl topology remove --topology pdn
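Because removal requires a prior deactivation, a script can encode the order explicitly. The sketch below only builds and prints the two command lines, using the subcommands and flags shown in this guide. The helper name teardown_commands is hypothetical.

```python
import shlex
from typing import List


def teardown_commands(topology: str) -> List[List[str]]:
    """Return the dpsctl invocations, in order, to deactivate then remove."""
    return [
        ["dpsctl", "topology", "deactivate", "--topology", topology],
        ["dpsctl", "topology", "remove", "--topology", topology],
    ]


for argv in teardown_commands("pdn"):
    # shlex.join quotes arguments safely for display in a shell.
    print(shlex.join(argv))
```

To execute the commands rather than print them, each argument list could be passed to subprocess.run(argv, check=True), which stops at the first failure.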

Step 9: List and Manage Entities

List all entities or specific ones:

dpsctl topology list-entities
dpsctl topology list-entities node001

Remove entities (must not be in an active topology):

dpsctl topology remove-entities node001 node002

Troubleshooting & Validation

  • Use dpsctl topology validate to check for schema, reference, and logical errors.
  • Common validation errors include:
    • Invalid or duplicate names
    • Missing required fields
    • Circular dependencies in topology structure
    • References to non-existent entities
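Two of the errors above, duplicate names and circular dependencies, can be illustrated with a short local check. The sketch below is not how dpsctl implements validation. It is a depth-first search over the Name and Children fields of the Topology section, with hypothetical function names.

```python
from collections import Counter
from typing import Dict, List, Optional


def duplicate_names(entities: List[dict]) -> List[str]:
    """Return names that appear more than once, sorted alphabetically."""
    counts = Counter(e["Name"] for e in entities)
    return sorted(name for name, n in counts.items() if n > 1)


def find_cycle(nodes: List[dict]) -> Optional[List[str]]:
    """Return one cycle as a path (first name repeated at the end), or None."""
    children: Dict[str, List[str]] = {
        n["Name"]: n.get("Children", []) for n in nodes
    }
    IN_PROGRESS, DONE = 1, 2
    state: Dict[str, int] = {}

    def visit(name: str, path: List[str]) -> Optional[List[str]]:
        state[name] = IN_PROGRESS
        path.append(name)
        for child in children.get(name, []):
            if state.get(child) == IN_PROGRESS:
                # The child is already on the current path: cycle found.
                return path[path.index(child):] + [child]
            if child not in state:
                found = visit(child, path)
                if found:
                    return found
        path.pop()
        state[name] = DONE
        return None

    for name in children:
        if name not in state:
            cycle = visit(name, [])
            if cycle:
                return cycle
    return None


tree = [{"Name": "PD-A", "Children": ["node001"]}, {"Name": "node001"}]
loop = [{"Name": "A", "Children": ["B"]}, {"Name": "B", "Children": ["A"]}]

print(find_cycle(tree))  # prints None
print(find_cycle(loop))  # prints ['A', 'B', 'A']
print(duplicate_names([{"Name": "a"}, {"Name": "a"}, {"Name": "b"}]))  # prints ['a']
```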

Next Steps

After activating a topology:

  1. Manage Resource Groups - Create workload-specific power allocations
  2. Integrate with SLURM - Set up automated job scheduling integration
  3. Monitor Power Usage - Track power consumption and policy compliance