Managing Topologies
Overview
This guide provides step-by-step instructions for performing common topology management activities.
For more information about topologies, see Topologies.
Prerequisites
- DPS server running and accessible
- dpsctl installed and authenticated
- Device specifications already defined (see Managing Devices)
Step 1: Prepare Topology JSON File
Topologies are defined in JSON format. Each file can include entities, topology structure, and policies.
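For clusters with many nodes, the three sections can be assembled by a script instead of by hand. The following is a minimal sketch, not part of DPS: `build_topology` and its input keys (`name`, `model`, `policy`, `bmc_url`, `secret`) are hypothetical, and the emitted field names simply mirror the example structure shown in this step. It assumes a single power domain that is the parent of every node.

```python
import json


def build_topology(topology_name, domain, nodes, policies):
    """Assemble the Entities, Topology, and Policies sections of a topology file.

    domain:   a ready-made PowerDomain entity dict
    nodes:    list of dicts with hypothetical keys name, model, policy, bmc_url, secret
    policies: list of policy dicts, passed through unchanged
    """
    entities = [domain]
    for node in nodes:
        entities.append({
            "Type": "ComputerSystem",
            "Model": node["model"],
            "Name": node["name"],
            "Policy": node["policy"],
            # Redfish fields copied from the pattern in this guide's example
            "Redfish": {
                "@odata.type": "#ComputerSystem.v1_23_0.ComputerSystem",
                "@odata.id": "/" + node["name"],
                "Id": node["name"],
                "URL": node["bmc_url"],
                "SecretName": node["secret"],
            },
        })

    # Assumption: one power domain is the parent of every node.
    node_names = [node["name"] for node in nodes]
    structure = [{"Name": domain["Name"], "Children": node_names}]
    structure += [{"Name": name} for name in node_names]

    return {
        "Entities": entities,
        "Topology": {"Name": topology_name, "Entities": structure},
        "Policies": policies,
    }


doc = build_topology(
    "pdn",
    {"Type": "PowerDomain", "Name": "PD-A",
     "Constraints": {"PowerValue": {"Value": 1150000, "Type": "W"},
                     "PowerFactor": 0.9}},
    [{"name": "node001", "model": "DGX_H100", "policy": "Node-High",
      "bmc_url": "https://node001-bmc.example.com", "secret": "node001"}],
    [{"Name": "Node-High",
      "Limits": [{"ElementType": "Node", "PowerLimit": {"Watts": 10200}}]}],
)
print(json.dumps(doc, indent=2))
```

Redirecting the printed JSON to a file gives a starting point that should still be checked with dpsctl topology validate before import.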
Example Structure
{
  "Entities": [
    {
      "Type": "PowerDomain",
      "Name": "PD-A",
      "Constraints": {
        "PowerValue": {"Value": 1150000, "Type": "W"},
        "PowerFactor": 0.9
      }
    },
    {
      "Type": "ComputerSystem",
      "Model": "DGX_H100",
      "Name": "node001",
      "Policy": "Node-High",
      "Redfish": {
        "@odata.type": "#ComputerSystem.v1_23_0.ComputerSystem",
        "@odata.id": "/node001",
        "Id": "node001",
        "URL": "https://node001-bmc.example.com",
        "SecretName": "node001"
      }
    }
  ],
  "Topology": {
    "Name": "pdn",
    "Entities": [
      {
        "Name": "PD-A",
        "Children": ["node001"]
      },
      {
        "Name": "node001"
      }
    ]
  },
  "Policies": [
    {
      "Name": "Node-High",
      "Limits": [
        {
          "ElementType": "Node",
          "PowerLimit": {"Watts": 10200}
        }
      ]
    }
  ]
}
Step 2: Validate Topology File
Before importing, validate your topology file to catch errors early.
dpsctl topology validate --filename topology.json
Expected Output:
{
  "status": {
    "ok": true,
    "diag_msg": "Topology validation passed"
  }
}
If there are errors, they will be listed in the output. Fix them before proceeding.
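In scripts, the result can be checked by reading the `status` object rather than by eye. The sketch below assumes the command's JSON output has already been captured as text, and that a failed validation uses the same `status` shape with `"ok": false`; this guide only shows the success case, so treat the failure shape as an assumption. `validation_passed` is a hypothetical helper name.

```python
import json


def validation_passed(output_text: str) -> tuple[bool, str]:
    """Return (ok, message) parsed from dpsctl-style JSON output."""
    result = json.loads(output_text)
    status = result.get("status", {})
    # A missing "ok" field is treated as a failure, to stay on the safe side.
    return bool(status.get("ok", False)), status.get("diag_msg", "")


sample = '{"status": {"ok": true, "diag_msg": "Topology validation passed"}}'
ok, message = validation_passed(sample)
print(ok, message)
```

The same check applies to the import output in the next step, which uses the same `status` object.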
Step 3: Import Topology
Import the validated topology into DPS:
dpsctl topology import --filename topology.json
Expected Output:
{
  "status": {
    "ok": true,
    "diag_msg": "Success"
  }
}
Step 4: List Topologies
To see all available topologies:
dpsctl topology list
Example Output:
{
  "topologies": [
    {
      "topology_name": "pdn",
      "leaf_node_names": ["node001"]
    }
  ]
}
Step 5: Activate a Topology
Only one topology can be active at a time. Activation applies power policies and enables management.
dpsctl topology activate --topology pdn
Flags:
- --replace-topology: Replace the currently active topology
- --ping-hosts: Test connectivity of all hosts (default: true)
- --at-least-percent-hosts: Minimum percent of hosts that must be reachable (default: 50)
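The reachability threshold is a simple percentage check. The sketch below illustrates the arithmetic only; it is not DPS code, and the inclusive comparison and absence of rounding are assumptions, since this guide does not specify how the boundary case is handled. `enough_hosts_reachable` is a hypothetical name.

```python
def enough_hosts_reachable(reachable: int, total: int,
                           at_least_percent: float = 50) -> bool:
    """Return True if the share of reachable hosts meets the threshold."""
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= reachable <= total:
        raise ValueError("reachable must be between 0 and total")
    percent = 100 * reachable / total
    # Assumption: a value exactly at the threshold counts as passing.
    return percent >= at_least_percent


print(enough_hosts_reachable(4, 5, at_least_percent=80))  # 80% reachable
print(enough_hosts_reachable(3, 5, at_least_percent=80))  # 60% reachable
print(enough_hosts_reachable(3, 5))                       # default threshold of 50
```

Under these assumptions, 4 of 5 reachable hosts satisfies a threshold of 80, while 3 of 5 satisfies only the default of 50.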
Example:
dpsctl topology activate --topology pdn --ping-hosts --at-least-percent-hosts 80
Step 6: Deactivate a Topology
Deactivate the current topology (required before deleting or replacing):
dpsctl topology deactivate --topology pdn
Step 7: Update an Existing Topology
To update a topology (e.g., after editing the JSON file):
dpsctl topology update --filename updated-topology.json
- Use --force to bypass certain validation checks (use with caution).
Step 8: Remove a Topology
To permanently delete a topology (must be deactivated first):
dpsctl topology remove --topology pdn
Step 9: List and Manage Entities
List all entities or specific ones:
dpsctl topology list-entities
dpsctl topology list-entities node001
Remove entities (must not be in an active topology):
dpsctl topology remove-entities node001 node002
Troubleshooting & Validation
- Use dpsctl topology validate to check for schema, reference, and logical errors.
- Common validation errors include:
  - Invalid or duplicate names
  - Missing required fields
  - Circular dependencies in topology structure
  - References to non-existent entities
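Several of these errors can be caught locally before the file reaches DPS. The following sketch is not a substitute for dpsctl topology validate, and it does not reproduce its rules: `precheck_topology` is a hypothetical helper that assumes the file layout from Step 1 and checks only for duplicate entity names, references to undefined entities, and cycles in the Children relationships.

```python
def precheck_topology(doc: dict) -> list[str]:
    """Return a list of human-readable problems found in a topology document."""
    problems = []

    # Duplicate entity names
    known = set()
    for entity in doc.get("Entities", []):
        name = entity.get("Name")
        if name in known:
            problems.append(f"duplicate entity name: {name}")
        known.add(name)

    # References to entities that are not defined in "Entities"
    children = {}
    for node in doc.get("Topology", {}).get("Entities", []):
        name = node.get("Name")
        if name not in known:
            problems.append(f"topology references unknown entity: {name}")
        children[name] = node.get("Children", [])
        for child in children[name]:
            if child not in known:
                problems.append(f"{name} lists unknown child: {child}")

    # Circular dependencies: depth-first search over the Children edges
    state = {}  # name -> "visiting" or "done"

    def has_cycle(name):
        if state.get(name) == "done":
            return False
        if state.get(name) == "visiting":
            return True  # reached a node that is still on the current path
        state[name] = "visiting"
        found = any(has_cycle(child) for child in children.get(name, []))
        state[name] = "done"
        return found

    if any(has_cycle(name) for name in list(children)):
        problems.append("circular dependency in topology structure")

    return problems


example = {
    "Entities": [{"Name": "PD-A"}, {"Name": "node001"}],
    "Topology": {"Name": "pdn", "Entities": [
        {"Name": "PD-A", "Children": ["node001"]},
        {"Name": "node001"},
    ]},
}
print(precheck_topology(example))
```

An empty list means none of these three problems were found; schema and required-field checks are still left to dpsctl.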
Next Steps
After activating a topology:
- Manage Resource Groups - Create workload-specific power allocations
- Integrate with SLURM - Set up automated job scheduling integration
- Monitor Power Usage - Track power consumption and policy compliance