SDK Simulator User Guide

The DPS SDK Simulator is a Kubernetes-based development environment that provides a complete emulated datacenter for testing power management solutions. It includes 144 simulated DGX GB300 nodes and built-in monitoring, so you can develop and test DPS integrations without physical hardware.

What’s Included

Emulated Infrastructure

  • 144 DGX GB300 Compute Nodes - Organized across 8 racks (18 nodes per rack)
  • Complete Power Distribution Hierarchy - Utility → Switchboard → Floor PDUs → Rack PSUs → Compute Systems
  • BMC Simulator - Provides Redfish API endpoints for all emulated nodes
  • Pre-configured Power Policies - High (5600W), Medium (3200W), and Low (1600W) settings

For detailed information about the simulator’s datacenter topology structure, see the Simulator Topology Guide.
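The topology and policy figures above imply some simple aggregate numbers. The following is a minimal sketch of that arithmetic. It assumes (the text above does not state this) that each policy value is a per-node power cap. The function names are hypothetical and not part of the SDK.

```shell
#!/bin/sh
# Figures taken from the "Emulated Infrastructure" list.
RACKS=8
NODES_PER_RACK=18

# Total emulated nodes: racks x nodes per rack.
total_nodes() {
  echo $((RACKS * NODES_PER_RACK))
}

# Aggregate power for one rack if every node runs at the given cap (watts).
# ASSUMPTION: the policy value is a per-node cap.
rack_watts() {
  echo $((NODES_PER_RACK * $1))
}

# Aggregate power for all simulated nodes at the given cap (watts).
datacenter_watts() {
  echo $(( $(total_nodes) * $1 ))
}

for policy in High:5600 Medium:3200 Low:1600; do
  name=${policy%%:*}    # text before the colon
  watts=${policy##*:}   # text after the colon
  echo "$name: $(rack_watts "$watts") W per rack, $(datacenter_watts "$watts") W total"
done
```

Under that assumption, the High policy corresponds to 806400 W across all 144 nodes and the Low policy to 230400 W.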

Monitoring and Visualization

  • Grafana Dashboards - Real-time metrics for datacenter power, resource groups, and system operations
  • Prometheus - Time-series metrics collection and alerting
  • Pyroscope - Continuous profiling for performance analysis
  • Web UI - Interactive interface for managing DPS

Automated Simulation Playbooks

  • Resource Group Simulation - Automated workload lifecycle testing with configurable parameters
  • Grid Simulation - Domain-level power management and grid integration testing
  • Load Shedding Simulation - Power reduction events and recovery scenarios
  • Combined Simulations - Run multiple scenarios simultaneously for comprehensive testing

Use Cases

The simulator is ideal for:

  • Learning DPS - Explore concepts, APIs, and workflows in a safe environment
  • SDK Development - Build and test custom integrations without hardware dependencies
  • Partner Integration - Develop workload scheduler, grid, or optimization integrations
  • Quick Start - Jump-start your custom datacenter deployment

Quick Start

System Requirements:

  • Linux (Ubuntu/Debian) or macOS (limited support)
  • Minimum 8GB RAM, 20GB free disk space
  • Internet connection for dependencies
  • Access to the GitLab DPS SDK repository
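The memory and disk minimums above can be checked before running setup. The following is a rough preflight sketch, not part of the SDK: all function names are hypothetical, the memory check works only on Linux (it reads /proc/meminfo), and the check for the `task` command is an assumption based on the setup commands in this guide. Because the kernel reports slightly less than the installed RAM, a machine with exactly 8 GB will show a warning, so the script warns rather than fails.

```shell
#!/bin/sh
# Minimums from the requirements list, in kB (1 GB taken as 1024 * 1024 kB).
MIN_MEM_KB=$((8 * 1024 * 1024))
MIN_DISK_KB=$((20 * 1024 * 1024))

# Succeeds when the first number is at least the second.
at_least() {
  [ "$1" -ge "$2" ]
}

# Total RAM in kB. Linux only; prints 0 where /proc/meminfo is unavailable.
mem_total_kb() {
  if [ -r /proc/meminfo ]; then
    awk '/^MemTotal:/ {print $2}' /proc/meminfo
  else
    echo 0
  fi
}

# Free space in kB on the filesystem that holds the given directory.
disk_free_kb() {
  df -Pk "$1" | awk 'NR==2 {print $4}'
}

# Print one OK/WARN line per requirement.
preflight() {
  if at_least "$(mem_total_kb)" "$MIN_MEM_KB"; then
    echo "OK   memory"
  else
    echo "WARN memory below 8 GB or not measurable on this platform"
  fi
  if at_least "$(disk_free_kb .)" "$MIN_DISK_KB"; then
    echo "OK   disk"
  else
    echo "WARN less than 20 GB free in the current directory"
  fi
  # ASSUMPTION: the 'task' runner must already be on PATH for 'task setup'.
  if command -v task >/dev/null 2>&1; then
    echo "OK   task"
  else
    echo "WARN 'task' command not found"
  fi
}

preflight
```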

Setup Instructions:

  1. Download the DPS SDK archive from the NVIDIA NVOnline Portal and extract it

  2. Run Setup

    cd dps-sdk
    task setup  # Installs kubectl, helm, dpsctl

  3. Deploy Simulator

    task sdk    # Create cluster and deploy DPS
    task sim    # Initialize and run simulation

  4. Access Services

    • DPS API: http://api.dps.sdk.local
    • Web UI: http://ui.dps.sdk.local
    • Documentation: http://docs.dps.sdk.local
    • Grafana: http://grafana.dps.sdk.local (username admin, password dps)
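Once the deployment finishes, a quick way to confirm that the endpoints above respond is to request each one with curl. The following is a minimal sketch under stated assumptions: the function names are hypothetical, the hostnames are assumed to resolve from the machine running the check, and any 2xx, 3xx, or 4xx status is treated as "the service answered".

```shell
#!/bin/sh
# Endpoints copied from the "Access Services" list.
service_urls() {
  printf '%s\n' \
    http://api.dps.sdk.local \
    http://ui.dps.sdk.local \
    http://docs.dps.sdk.local \
    http://grafana.dps.sdk.local
}

# Print the HTTP status code for a URL; curl prints 000 when it gets no response.
http_status() {
  curl -s -o /dev/null --max-time 5 -w '%{http_code}' "$1" || true
}

# Succeeds for 2xx, 3xx, and 4xx codes, meaning something answered the request.
is_reachable() {
  case "$1" in
    [234][0-9][0-9]) return 0 ;;
    *) return 1 ;;
  esac
}

# Print one UP/DOWN line per endpoint.
check_services() {
  service_urls | while read -r url; do
    status=$(http_status "$url")
    if is_reachable "$status"; then
      echo "UP    $url ($status)"
    else
      echo "DOWN  $url ($status)"
    fi
  done
}
```

Calling `check_services` after `task sdk` completes prints one line per endpoint. A DOWN result with status 000 usually means the hostname did not resolve or nothing is listening yet.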

Simulator Playbooks

Explore automated testing scenarios with detailed playbooks covering:

  • Resource group simulation and management
  • Grid-level power operations
  • Performance benchmarking scenarios
  • Custom simulation workflows