dpsctl
Overview
dpsctl is the command-line interface for DPS. It is the primary tool that administrators, operators, and users rely on to manage and monitor a DPS deployment from the command line.
Command Categories
dpsctl organizes functionality into logical command groups:
Infrastructure Management
- topology - Import, activate, and manage datacenter topologies
- import - Import entities from external sources (BCM, Nautobot)
- device - Manage device specifications and hardware capabilities
- policy - Create and manage power policies
Workload Management
- resource-group - Create, activate, and manage workload power allocation
- gpu-policy - Set per-GPU power limits for fine-grained control
Authentication & Access
- login - Authenticate with DPS server and manage user sessions
- verify - Check DPS deployment status and connectivity
Monitoring & Diagnostics
- check - Diagnostic operations, connectivity tests, and health monitoring
- server-version - Get DPS server version and build information
- task - Monitor ongoing async tasks (e.g., activation/deactivation operations)
Usage
dpsctl can be installed as a native binary or run directly from the published nvcr.io/nvidia/dpsctl container image. See Installing dpsctl for both options.
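As a minimal sketch of choosing between the two options at run time, the wrapper below uses a native dpsctl binary when one is on the PATH and otherwise falls back to the container image. The helper names (dpsctl_prefix, run_dpsctl), the DPSCTL_IMAGE variable, the use of Docker as the container runtime, and the assumption that the image's entrypoint is dpsctl are all illustrative and not taken from the DPS documentation; Installing dpsctl remains the authoritative reference.

```shell
#!/usr/bin/env bash
# Sketch: prefer a native dpsctl binary, fall back to the container image.
# Assumptions (not from the DPS docs): Docker is the container runtime, the
# image's entrypoint is dpsctl, and the image's default tag is usable.

# Print the command prefix that would be used to invoke dpsctl.
dpsctl_prefix() {
  if command -v dpsctl >/dev/null 2>&1; then
    # A native binary is on the PATH.
    echo "dpsctl"
  else
    # Hypothetical container fallback; DPSCTL_IMAGE overrides the image name.
    echo "docker run --rm -it ${DPSCTL_IMAGE:-nvcr.io/nvidia/dpsctl}"
  fi
}

# Run dpsctl through whichever option is available.
run_dpsctl() {
  # Word splitting of the prefix is intentional here.
  $(dpsctl_prefix) "$@"
}
```

With this in place, `run_dpsctl check connection` would run the same subcommand through whichever option is available. A containerized run may also need configuration or credentials mounted into it, which this sketch does not attempt.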
Basic dpsctl operations:
# Authenticate with DPS
dpsctl login --username alice
# Import datacenter topology
dpsctl topology import datacenter.json
# Create workload resource group
dpsctl resource-group create --resource-group "ml-job" --policy "Node-High"
# Check system health
dpsctl check connection
Further Reading
- dpsctl Installation - Install and configure the CLI
- User Accounts - Authentication for interactive users
- Automation Accounts - Authentication for programmatic access
- dpsctl Command Reference - Complete command documentation
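The basic operations listed under Usage can also be chained into a single script. The sketch below runs them in the same order and stops at the first failing step. It uses only the subcommands and flags shown under Usage; the function name bootstrap_dps and the DPSCTL override variable are hypothetical, introduced here for illustration.

```shell
#!/usr/bin/env bash
# Sketch: run the basic dpsctl operations in order, stopping at the first failure.
# The subcommands and flags are the ones shown under Usage; the function name
# and the DPSCTL override variable are illustrative assumptions.

DPSCTL="${DPSCTL:-dpsctl}"   # hypothetical override, e.g. a wrapper script

bootstrap_dps() {
  local user="$1" topology_file="$2" group="$3" policy="$4"

  # Authenticate with DPS
  "$DPSCTL" login --username "$user" || return 1
  # Import datacenter topology
  "$DPSCTL" topology import "$topology_file" || return 1
  # Create workload resource group
  "$DPSCTL" resource-group create --resource-group "$group" --policy "$policy" || return 1
  # Check system health
  "$DPSCTL" check connection || return 1
}

# Example invocation, using the values from the Usage section:
# bootstrap_dps alice datacenter.json "ml-job" "Node-High"
```

Each step returns early on a non-zero exit status, so a failed login or import prevents the later steps from running against a system in an unknown state.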