Redfish API

Overview

Redfish is a standard RESTful API specification developed by the Distributed Management Task Force (DMTF) for managing and monitoring server hardware, storage, networking, and converged infrastructure. It provides a standardized way to interact with Baseboard Management Controllers (BMCs) and other hardware management interfaces.

DPS uses Redfish as its primary interface for communicating with NVIDIA DGX compute nodes. Through Redfish, DPS can:

  • Set and monitor power limits for nodes, GPUs, CPUs, and memory
  • Collect power consumption metrics and performance telemetry
  • Manage Workload Power Profile Settings (WPPS) for supported platforms

Supported Platforms

DPS supports the following NVIDIA DGX systems via the Redfish API:

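DGX H100/H200

  • Power Management: Node Manager (domains and policies)
  • Minimum BMC Version: 24.0.0

DGX B200

  • Power Management: EnvironmentMetrics (per-component power limits)
  • WPPS Support: Yes
  • Minimum BMC Version: 25.0.0

DGX GB200 NVL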

  • Power Management: EnvironmentMetrics (per-component power limits)
  • WPPS Support: Yes
  • Minimum BMC Version: 25.0.0

DGX B300 / B300 NVL / GB300 NVL

  • Power Management: EnvironmentMetrics (per-component power limits)
  • WPPS Support: Yes
  • Minimum BMC Version: 25.0.0

Session Management

DPS creates an authenticated session with each BMC and uses it for all subsequent API calls. Sessions are long-lived and reused for efficiency.

For detailed session management specifications, see DMTF Redfish Specification DSP0266 v1.23.0 - Section 9.2.4 Session Management.

Create Session

  • Endpoint: POST /redfish/v1/SessionService/Sessions
  • Arguments (request payload):
    • UserName (string) - BMC username from Kubernetes secret
    • Password (string) - BMC password from Kubernetes secret
  • Attributes (response headers):
    • X-Auth-Token - Session token for subsequent API requests
    • Location - Session URL for deletion

Example:

POST /redfish/v1/SessionService/Sessions
Content-Type: application/json

{
  "UserName": "admin",
  "Password": "password123"
}
HTTP/1.1 201 Created
X-Auth-Token: abc123-session-token-xyz789
Location: /redfish/v1/SessionService/Sessions/1

Example Error Response (Invalid Credentials):

HTTP/1.1 401 Unauthorized
Content-Type: application/json

{
  "error": {
    "code": "Base.1.13.0.SecurityAccessDenied",
    "message": "While attempting to establish a connection to /redfish/v1/SessionService/Sessions, the service was denied access.",
    "@Message.ExtendedInfo": [
      {
        "@odata.type": "#Message.v1_1_1.Message",
        "MessageId": "Base.1.13.0.SecurityAccessDenied",
        "Message": "While attempting to establish a connection to /redfish/v1/SessionService/Sessions, the service was denied access.",
        "Severity": "Critical",
        "Resolution": "Attempt to ensure that the URI is correct and that the service has the appropriate credentials."
      }
    ]
  }
}
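
The create-session exchange above can be sketched in Python using only the standard library. This is a minimal illustration, not the DPS client: the helper names `build_session_request` and `parse_session_headers` and the host `bmc.example.com` are hypothetical, and the request is only constructed here, never sent.

```python
import json
import urllib.request


def build_session_request(base_url, username, password):
    """Build (but do not send) the POST that creates a Redfish session."""
    body = json.dumps({"UserName": username, "Password": password}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/redfish/v1/SessionService/Sessions",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_session_headers(headers):
    """Extract the session token and session URL from the response headers."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    token = lowered.get("x-auth-token")
    location = lowered.get("location")
    if not token or not location:
        raise ValueError("response is missing X-Auth-Token or Location")
    return token, location


# Hypothetical host; the header values mirror the example response above.
req = build_session_request("https://bmc.example.com", "admin", "password123")
token, location = parse_session_headers({
    "X-Auth-Token": "abc123-session-token-xyz789",
    "Location": "/redfish/v1/SessionService/Sessions/1",
})
```

The returned `location` is the URL that a later DELETE request would target, and `token` is the value to send in the X-Auth-Token header.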

Delete Session

  • Endpoint: DELETE /redfish/v1/SessionService/Sessions/{session_id}
  • Path Parameters:
    • {session_id} - Session ID from Location header
  • Request Headers:
    • X-Auth-Token - Session token from create response

Example:

DELETE /redfish/v1/SessionService/Sessions/1
X-Auth-Token: abc123-session-token-xyz789
HTTP/1.1 204 No Content

DGX H100/H200 Systems

Reference: NVIDIA DGX H100/H200 Redfish API Guide

DGX H100/H200 systems use the Node Manager API for power management through domains and policies. DPS creates a custom domain named dps-managed-domain to manage power allocation.

Node Manager Domains

Domains represent power management scopes. DPS manages power through a dedicated domain.

List Domains

  • Endpoint: GET /redfish/v1/Managers/BMC/NodeManager/Domains
  • Attributes (response):
    • Members[].@odata.id - URLs to individual domain resources

Example:

GET /redfish/v1/Managers/BMC/NodeManager/Domains
HTTP/1.1 200 OK
Content-Type: application/json

{
  "@odata.id": "/redfish/v1/Managers/BMC/NodeManager/Domains",
  "Members": [
    { "@odata.id": "/redfish/v1/Managers/BMC/NodeManager/Domains/0" },
    { "@odata.id": "/redfish/v1/Managers/BMC/NodeManager/Domains/1" }
  ],
  "Members@odata.count": 2
}

Get Domain

  • Endpoint: GET /redfish/v1/Managers/BMC/NodeManager/Domains/{id}
  • Attributes (response):
    • Id - Domain identifier for update/delete operations
    • Name - Domain name; "dps-managed-domain" indicates managed domain
    • Capabilities.Max - Maximum power capability in Watts
    • Capabilities.Min - Minimum power capability in Watts
    • Policies.Members[] - Nested policy objects

Example:

GET /redfish/v1/Managers/BMC/NodeManager/Domains/1
HTTP/1.1 200 OK
Content-Type: application/json

{
  "@odata.id": "/redfish/v1/Managers/BMC/NodeManager/Domains/1",
  "Id": "1",
  "Name": "dps-managed-domain",
  "Status": { "State": "Enabled" },
  "Capabilities": {
    "Max": 10200,
    "Min": 2500
  },
  "Policies": {
    "Members": [
      {
        "ComponentId": "COMP_GPU",
        "Limit": 5600.0,
        "PercentageOfDomainBudget": 75.0,
        "Status": { "State": "Selected" }
      },
      {
        "ComponentId": "COMP_CPU",
        "Limit": 700.0,
        "PercentageOfDomainBudget": 10.0,
        "Status": { "State": "Selected" }
      }
    ]
  }
}
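
Because the managed domain is identified by its Name, a client that has fetched each domain resource can pick it out with a simple scan. The sketch below assumes the domain documents are already parsed into dictionaries; `find_managed_domain`, `policy_limits`, and the unmanaged domain named `default-domain` are hypothetical and used only for illustration.

```python
MANAGED_DOMAIN_NAME = "dps-managed-domain"


def find_managed_domain(domains):
    """Return the first domain document named dps-managed-domain, or None."""
    for domain in domains:
        if domain.get("Name") == MANAGED_DOMAIN_NAME:
            return domain
    return None


def policy_limits(domain):
    """Map ComponentId -> Limit (Watts) from the nested policy objects."""
    members = domain.get("Policies", {}).get("Members", [])
    return {policy["ComponentId"]: policy["Limit"] for policy in members}


domains = [
    {"Id": "0", "Name": "default-domain"},  # hypothetical unmanaged domain
    {
        "Id": "1",
        "Name": "dps-managed-domain",
        "Capabilities": {"Max": 10200, "Min": 2500},
        "Policies": {"Members": [
            {"ComponentId": "COMP_GPU", "Limit": 5600.0},
            {"ComponentId": "COMP_CPU", "Limit": 700.0},
        ]},
    },
]
managed = find_managed_domain(domains)
```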

Create Domain

  • Endpoint: POST /redfish/v1/Managers/BMC/NodeManager/Domains
  • Arguments (request payload):
    • Id - Domain identifier (default: "0")
    • Name - Always "dps-managed-domain" (identifies managed domains)
    • Status.State - Always "Enabled" (required for active state)
    • Capabilities.Max - Node maximum power limit in Watts
    • Capabilities.Min - Node minimum power limit in Watts
    • Policies.Members[] - Array of policy objects (see Policy Object Fields)

Example:

POST /redfish/v1/Managers/BMC/NodeManager/Domains
Content-Type: application/json

{
  "Id": "1",
  "Name": "dps-managed-domain",
  "Status": { "State": "Enabled" },
  "Capabilities": {
    "Max": 10200,
    "Min": 2500
  },
  "Policies": {
    "Members": [
      {
        "ComponentId": "COMP_GPU",
        "Limit": 5600.0,
        "PercentageOfDomainBudget": 75.0,
        "Status": { "State": "Selected" }
      },
      {
        "ComponentId": "COMP_CPU",
        "Limit": 700.0,
        "PercentageOfDomainBudget": 10.0,
        "Status": { "State": "Selected" }
      },
      {
        "ComponentId": "COMP_MEMORY",
        "Limit": 200.0,
        "PercentageOfDomainBudget": 5.0,
        "Status": { "State": "Selected" }
      }
    ]
  }
}
HTTP/1.1 204 No Content
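
The request body above can be assembled programmatically. The following is a sketch under stated assumptions: `build_domain_payload` is a hypothetical helper, and the check that Min does not exceed Max is an assumed sanity check rather than a documented BMC rule.

```python
import json


def build_domain_payload(max_watts, min_watts, component_limits, domain_id="0"):
    """Assemble a create-domain body.

    component_limits maps ComponentId -> (limit_watts, percent_of_budget).
    """
    if min_watts > max_watts:
        # Assumed sanity check, not taken from the BMC documentation.
        raise ValueError("Capabilities.Min must not exceed Capabilities.Max")
    members = [
        {
            "ComponentId": component_id,
            "Limit": float(limit),
            "PercentageOfDomainBudget": float(percent),
            "Status": {"State": "Selected"},
        }
        for component_id, (limit, percent) in component_limits.items()
    ]
    return {
        "Id": domain_id,
        "Name": "dps-managed-domain",
        "Status": {"State": "Enabled"},
        "Capabilities": {"Max": max_watts, "Min": min_watts},
        "Policies": {"Members": members},
    }


payload = build_domain_payload(
    10200,
    2500,
    {"COMP_GPU": (5600, 75), "COMP_CPU": (700, 10), "COMP_MEMORY": (200, 5)},
    domain_id="1",
)
print(json.dumps(payload, indent=2))
```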

Update Domain

  • Endpoint: PATCH /redfish/v1/Managers/BMC/NodeManager/Domains/{id}
  • Arguments (request payload):
    • Capabilities.Max - Maximum power limit in Watts
    • Capabilities.Min - Minimum power limit in Watts
    • Policies.Members[] - Array of policy objects (see Policy Object Fields)

Example:

PATCH /redfish/v1/Managers/BMC/NodeManager/Domains/1
Content-Type: application/json

{
  "Capabilities": {
    "Max": 9500,
    "Min": 2000
  },
  "Policies": {
    "Members": [
      {
        "ComponentId": "COMP_GPU",
        "Limit": 5000.0,
        "PercentageOfDomainBudget": 70.0,
        "Status": { "State": "Selected" }
      },
      {
        "ComponentId": "COMP_CPU",
        "Limit": 600.0,
        "PercentageOfDomainBudget": 10.0,
        "Status": { "State": "Selected" }
      },
      {
        "ComponentId": "COMP_MEMORY",
        "Limit": 180.0,
        "PercentageOfDomainBudget": 5.0,
        "Status": { "State": "Selected" }
      }
    ]
  }
}
HTTP/1.1 200 OK
Content-Type: application/json

{
  "@Message.ExtendedInfo": [
    {
      "@odata.type": "#Message.v1_1_1.Message",
      "Message": "The request completed successfully.",
      "MessageId": "Base.1.18.1.Success",
      "Severity": "OK",
      "Resolution": "None."
    }
  ]
}

Delete Domain

  • Endpoint: DELETE /redfish/v1/Managers/BMC/NodeManager/Domains/{id}
  • Path Parameters:
    • {id} - Domain identifier

Example:

DELETE /redfish/v1/Managers/BMC/NodeManager/Domains/1
HTTP/1.1 204 No Content

Domain Policies

Policies define power limits for specific components within a domain. Each policy object in Policies.Members[] contains the following fields:

Policy Object Fields (arguments when creating/updating)

  • ComponentId (string) - Component type identifier (exact string required):
    • "COMP_CPU" - CPU component
    • "COMP_MEMORY" - Memory/DRAM component
    • "COMP_GPU" - GPU component
  • Limit (float) - Power limit in Watts
  • PercentageOfDomainBudget (float) - Percentage of total domain budget (0.0-100.0)
  • Status.State (string) - Always "Selected" (required for active policies)
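
The field constraints listed above lend themselves to a client-side check before a request is sent. This is a minimal sketch: `policy_errors` is a hypothetical helper, and the requirement that Limit be non-negative is an assumption not stated in the field list.

```python
VALID_COMPONENT_IDS = {"COMP_CPU", "COMP_MEMORY", "COMP_GPU"}


def _is_number(value):
    """True for int or float, but not bool (bool is a subclass of int)."""
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def policy_errors(policy):
    """Return a list of problems with a policy object; empty means valid."""
    errors = []
    if policy.get("ComponentId") not in VALID_COMPONENT_IDS:
        errors.append("ComponentId must be COMP_CPU, COMP_MEMORY, or COMP_GPU")
    limit = policy.get("Limit")
    if not _is_number(limit) or limit < 0:  # non-negative is an assumption
        errors.append("Limit must be a non-negative number of Watts")
    percent = policy.get("PercentageOfDomainBudget")
    if not _is_number(percent) or not 0.0 <= percent <= 100.0:
        errors.append("PercentageOfDomainBudget must be within 0.0-100.0")
    if policy.get("Status", {}).get("State") != "Selected":
        errors.append('Status.State must be "Selected"')
    return errors


good = {
    "ComponentId": "COMP_GPU",
    "Limit": 5600.0,
    "PercentageOfDomainBudget": 75.0,
    "Status": {"State": "Selected"},
}
bad = {
    "ComponentId": "gpu",  # wrong: exact string required
    "Limit": -1,
    "PercentageOfDomainBudget": 120,
    "Status": {"State": "Enabled"},
}
```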

Get Policy

  • Endpoint: GET /redfish/v1/Managers/BMC/NodeManager/Domains/{domain_id}/Policies/{policy_id}
  • Path Parameters:
    • {domain_id} - Domain identifier
    • {policy_id} - Policy identifier
  • Attributes (response):
    • ComponentId - Component type ("COMP_CPU", "COMP_MEMORY", "COMP_GPU")
    • Limit - Power limit in Watts
    • PercentageOfDomainBudget - Budget percentage allocation

Example:

GET /redfish/v1/Managers/BMC/NodeManager/Domains/1/Policies/0
HTTP/1.1 200 OK
Content-Type: application/json

{
  "@odata.id": "/redfish/v1/Managers/BMC/NodeManager/Domains/1/Policies/0",
  "Id": "0",
  "ComponentId": "COMP_GPU",
  "Limit": 5600.0,
  "PercentageOfDomainBudget": 75.0,
  "Status": { "State": "Selected" }
}

PSU Policies

PSU policies define power limits based on PSU redundancy configuration. DPS validates that requested power caps don’t exceed the active PSU policy’s LimitMax.

List PSU Policies

  • Endpoint: GET /redfish/v1/Managers/BMC/NodeManager/PSUPolicies

Example:

GET /redfish/v1/Managers/BMC/NodeManager/PSUPolicies
HTTP/1.1 200 OK
Content-Type: application/json

{
  "@odata.id": "/redfish/v1/Managers/BMC/NodeManager/PSUPolicies",
  "Members": [
    { "@odata.id": "/redfish/v1/Managers/BMC/NodeManager/PSUPolicies/0" },
    { "@odata.id": "/redfish/v1/Managers/BMC/NodeManager/PSUPolicies/1" },
    { "@odata.id": "/redfish/v1/Managers/BMC/NodeManager/PSUPolicies/2" }
  ],
  "Members@odata.count": 3
}

Get PSU Policy

  • Endpoint: GET /redfish/v1/Managers/BMC/NodeManager/PSUPolicies/{id}
  • Path Parameter Values:
    • "0" - Limp mode (minimal PSU configuration)
    • "1" - No Redundancy
    • "2" - Full Redundancy
  • Attributes (response):
    • LimitMax - Maximum power limit in Watts; power caps must not exceed this value

Example (Full Redundancy):

GET /redfish/v1/Managers/BMC/NodeManager/PSUPolicies/2
HTTP/1.1 200 OK
Content-Type: application/json

{
  "@odata.id": "/redfish/v1/Managers/BMC/NodeManager/PSUPolicies/2",
  "Id": "2",
  "Name": "Full Redundancy",
  "LimitMax": 12000,
  "MaxPSU": 6,
  "MinPSU": 4,
  "Status": { "State": "Selected" }
}
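
The validation described above (a requested cap must not exceed the active PSU policy's LimitMax) might look like the sketch below. It assumes the active policy is the one whose Status.State is "Selected", as in the example response; the helper names and the No Redundancy entry's LimitMax of 9000 are made up for illustration.

```python
def active_limit_max(psu_policies):
    """Return LimitMax of the policy whose Status.State is "Selected"."""
    for policy in psu_policies:
        if policy.get("Status", {}).get("State") == "Selected":
            return policy["LimitMax"]
    raise LookupError("no PSU policy is marked Selected")


def cap_is_allowed(requested_watts, psu_policies):
    """True when the requested power cap fits under the active LimitMax."""
    return requested_watts <= active_limit_max(psu_policies)


psu_policies = [
    # Hypothetical inactive policy; its LimitMax is an invented value.
    {"Id": "1", "Name": "No Redundancy", "LimitMax": 9000},
    # Mirrors the Full Redundancy example above.
    {"Id": "2", "Name": "Full Redundancy", "LimitMax": 12000,
     "Status": {"State": "Selected"}},
]
```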

Telemetry Service

DGX H100/H200 systems use the TelemetryService for real-time power and performance metrics.

Get Metric Reports

  • Endpoint: GET /redfish/v1/TelemetryService/MetricReports/NvidiaNMMetrics_0
  • Attributes (response):
    • MetricValues[].MetricId - Metric identifier (case-sensitive)
    • MetricValues[].MetricValue - Metric value as string (e.g., "8500.0" not 8500.0)

Example (partial response):

GET /redfish/v1/TelemetryService/MetricReports/NvidiaNMMetrics_0
HTTP/1.1 200 OK
Content-Type: application/json

{
  "@odata.id": "/redfish/v1/TelemetryService/MetricReports/NvidiaNMMetrics_0",
  "MetricValues": [
    { "MetricId": "dcPlatformPower_avg", "MetricValue": "8500.0" },
    { "MetricId": "AvblNoGPU", "MetricValue": "8" },
    { "MetricId": "AvblNoCPU", "MetricValue": "2" },
    { "MetricId": "gpuPower_avg_0", "MetricValue": "63.0" },
    { "MetricId": "gpuPowerLimit_0", "MetricValue": "700.0" },
    { "MetricId": "gpuPowerCapabilitiesMin_0", "MetricValue": "200.0" },
    { "MetricId": "gpuPowerCapabilitiesMax_0", "MetricValue": "700.0" },
    { "MetricId": "cpuPackagePower_avg_0", "MetricValue": "182.0" },
    { "MetricId": "dramPower_avg_0", "MetricValue": "45.0" }
  ]
}

Node-Level Metrics

  • dcPlatformPower_avg - Total DC platform power in Watts
  • AvblNoGPU - Available GPU count
  • AvblNoCPU - Available CPU count

Per-GPU Metrics (for each {id} from 0 to 7)

  • gpuPower_avg_{id} - GPU average power in Watts
  • gpuPowerLimit_{id} - GPU power limit in Watts
  • gpuPowerCapabilitiesMin_{id} - GPU minimum power limit in Watts
  • gpuPowerCapabilitiesMax_{id} - GPU maximum power limit in Watts

Per-CPU Metrics (for each {id} from 0 to 1)

  • cpuPackagePower_avg_{id} - CPU average power in Watts
  • cpuPackagePowerLimit1_{id} - CPU power limit 1 in Watts
  • cpuPackagePowerCapabilitiesMin_{id} - CPU minimum power in Watts
  • cpuPackagePowerCapabilitiesMax_{id} - CPU maximum power in Watts
  • cpuEnergy_{id} - CPU energy in kWh

Per-Memory Metrics (for each {id} from 0 to 1)

  • dramPower_avg_{id} - DRAM average power in Watts
  • dramPowerLimit_{id} - DRAM power limit in Watts
  • dramPackagePowerCapabilitiesMin_{id} - DRAM minimum power in Watts
  • dramPackagePowerCapabilitiesMax_{id} - DRAM maximum power in Watts
  • dramEnergy_{id} - DRAM energy in kWh
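
Since MetricValue arrives as a string and per-component metrics carry a numeric suffix, a consumer typically converts and groups them. The sketch below is one possible approach, not the DPS implementation; `parse_metric_report` is a hypothetical name, and it assumes every MetricValue parses as a float.

```python
import re
from collections import defaultdict

# Metric IDs ending in "_<digits>" are treated as per-component metrics.
_INDEXED = re.compile(r"^(?P<name>.+)_(?P<index>\d+)$")


def parse_metric_report(report):
    """Split MetricValues into node-level and per-index metrics as floats."""
    node = {}
    indexed = defaultdict(dict)
    for entry in report.get("MetricValues", []):
        value = float(entry["MetricValue"])  # values arrive as strings
        match = _INDEXED.match(entry["MetricId"])
        if match:
            indexed[match["name"]][int(match["index"])] = value
        else:
            node[entry["MetricId"]] = value
    return node, dict(indexed)


report = {
    "MetricValues": [
        {"MetricId": "dcPlatformPower_avg", "MetricValue": "8500.0"},
        {"MetricId": "AvblNoGPU", "MetricValue": "8"},
        {"MetricId": "gpuPower_avg_0", "MetricValue": "63.0"},
        {"MetricId": "gpuPowerLimit_0", "MetricValue": "700.0"},
        {"MetricId": "cpuPackagePower_avg_0", "MetricValue": "182.0"},
    ]
}
node_metrics, component_metrics = parse_metric_report(report)
```

Here `component_metrics["gpuPower_avg"][0]` holds the average power of GPU 0, while `node_metrics` keeps IDs such as dcPlatformPower_avg that have no numeric suffix.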

Firmware Validation

  • Endpoint: GET /redfish/v1/UpdateService/FirmwareInventory/HGX_FW_BMC_0
  • Attributes (response):
    • Version - BMC firmware version; must be >= 24.0.0

Example:

GET /redfish/v1/UpdateService/FirmwareInventory/HGX_FW_BMC_0
HTTP/1.1 200 OK
Content-Type: application/json

{
  "@odata.id": "/redfish/v1/UpdateService/FirmwareInventory/HGX_FW_BMC_0",
  "Id": "HGX_FW_BMC_0",
  "Name": "HGX BMC",
  "Version": "24.08.25.01"
}
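
Comparing a version such as "24.08.25.01" against the minimum needs a numeric comparison, because plain string comparison would rank "9.0.0" above "24.0.0". The following sketch assumes versions consist only of dot-separated integers; `parse_version` and `meets_minimum` are hypothetical helper names.

```python
def parse_version(version):
    """Turn a dotted version such as "24.08.25.01" into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def meets_minimum(version, minimum):
    """Compare numerically, padding the shorter version with zeros."""
    have, need = parse_version(version), parse_version(minimum)
    width = max(len(have), len(need))
    have += (0,) * (width - len(have))
    need += (0,) * (width - len(need))
    return have >= need


ok = meets_minimum("24.08.25.01", "24.0.0")
too_old = meets_minimum("23.12.1", "24.0.0")
```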

DGX B200 Systems

Reference: NVIDIA DGX B200 Redfish API Guide

DGX B200 systems use the EnvironmentMetrics API for direct per-component power management.

Processors Collection

  • Endpoint: GET /redfish/v1/Systems/HGX_Baseboard_0/Processors
  • Attributes (response):
    • Members[].@odata.id - URLs to individual processors:
      • GPU_0 through GPU_7 - GPUs
      • CPU_0, CPU_1 - CPUs

Example (abbreviated; GPU_2 through GPU_6 are omitted from Members):

GET /redfish/v1/Systems/HGX_Baseboard_0/Processors
HTTP/1.1 200 OK
Content-Type: application/json

{
  "@odata.id": "/redfish/v1/Systems/HGX_Baseboard_0/Processors",
  "Members": [
    { "@odata.id": "/redfish/v1/Systems/HGX_Baseboard_0/Processors/GPU_0" },
    { "@odata.id": "/redfish/v1/Systems/HGX_Baseboard_0/Processors/GPU_1" },
    { "@odata.id": "/redfish/v1/Systems/HGX_Baseboard_0/Processors/GPU_7" },
    { "@odata.id": "/redfish/v1/Systems/HGX_Baseboard_0/Processors/CPU_0" },
    { "@odata.id": "/redfish/v1/Systems/HGX_Baseboard_0/Processors/CPU_1" }
  ],
  "Members@odata.count": 10
}

Get Processor

  • Endpoint: GET /redfish/v1/Systems/HGX_Baseboard_0/Processors/{processor_id}
  • Path Parameters:
    • {processor_id} - Processor ID from collection (GPU_0-GPU_7 or CPU_0-CPU_1)
  • Attributes (response):
    • Id - Processor identifier
    • Oem.Nvidia.WorkloadPowerProfile.@odata.id - Presence indicates WPPS support; provides URL to WPPS endpoint

Example (GPU with WPPS support):

GET /redfish/v1/Systems/HGX_Baseboard_0/Processors/GPU_0
HTTP/1.1 200 OK
Content-Type: application/json

{
  "@odata.id": "/redfish/v1/Systems/HGX_Baseboard_0/Processors/GPU_0",
  "Id": "GPU_0",
  "Name": "GPU 0",
  "Oem": {
    "Nvidia": {
      "WorkloadPowerProfile": {
        "@odata.id": "/redfish/v1/Systems/HGX_Baseboard_0/Processors/GPU_0/Oem/Nvidia/WorkloadPowerProfile"
      }
    }
  }
}
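
WPPS support is signalled by the presence of the OEM link, so detection reduces to a safe nested lookup. A minimal sketch, assuming the processor resource is already parsed into a dictionary; `wpps_url` is a hypothetical helper, and the CPU document without an Oem block is invented for contrast.

```python
def wpps_url(processor):
    """Return the WPPS endpoint URL, or None when the OEM link is absent."""
    link = (
        processor.get("Oem", {})
        .get("Nvidia", {})
        .get("WorkloadPowerProfile", {})
    )
    return link.get("@odata.id")


gpu = {
    "Id": "GPU_0",
    "Oem": {"Nvidia": {"WorkloadPowerProfile": {
        "@odata.id": "/redfish/v1/Systems/HGX_Baseboard_0/Processors/GPU_0"
                     "/Oem/Nvidia/WorkloadPowerProfile"
    }}},
}
cpu = {"Id": "CPU_0"}  # hypothetical processor without the OEM link
```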

EnvironmentMetrics

Get Environment Metrics

  • Endpoint: GET /redfish/v1/Systems/HGX_Baseboard_0/Processors/{processor_id}/EnvironmentMetrics
  • Path Parameters:
    • {processor_id} - Processor ID (GPU_0-GPU_7 or CPU_0-CPU_1)
  • Attributes (response):
    • PowerWatts.Reading - Current instantaneous power in Watts
    • PowerLimitWatts.SetPoint - Current power limit setpoint in Watts
    • PowerLimitWatts.AllowableMin - Minimum allowed SetPoint value
    • PowerLimitWatts.AllowableMax - Maximum allowed SetPoint value
    • PowerLimitWatts.DefaultSetPoint - Factory default setpoint in Watts
    • EnergyJoules.Reading - Cumulative energy in Joules
    • EnergykWh.Reading - Cumulative energy in kWh
    • TemperatureCelsius.Reading - Current temperature in Celsius

Example (GPU):

GET /redfish/v1/Systems/HGX_Baseboard_0/Processors/GPU_0/EnvironmentMetrics
HTTP/1.1 200 OK
Content-Type: application/json

{
  "@odata.id": "/redfish/v1/Systems/HGX_Baseboard_0/Processors/GPU_0/EnvironmentMetrics",
  "EnergyJoules": { "Reading": 12345.67 },
  "EnergykWh": { "Reading": 1.234 },
  "PowerWatts": { "Reading": 64.342 },
  "TemperatureCelsius": { "Reading": 45.5 },
  "PowerLimitWatts": {
    "Reading": 700.0,
    "SetPoint": 700,
    "AllowableMin": 200,
    "AllowableMax": 1000,
    "DefaultSetPoint": 700
  }
}

Set Power Limit

  • Endpoint: PATCH /redfish/v1/Systems/HGX_Baseboard_0/Processors/{processor_id}/EnvironmentMetrics
  • Path Parameters:
    • {processor_id} - Processor ID (GPU_0-GPU_7 or CPU_0-CPU_1)
  • Arguments (request payload):
    • PowerLimitWatts.SetPoint - Power limit in Watts; must be within AllowableMin to AllowableMax range

Example:

PATCH /redfish/v1/Systems/HGX_Baseboard_0/Processors/GPU_0/EnvironmentMetrics
Content-Type: application/json

{
  "PowerLimitWatts": {
    "SetPoint": 700
  }
}
HTTP/1.1 200 OK
Content-Type: application/json

{
  "@Message.ExtendedInfo": [
    {
      "@odata.type": "#Message.v1_1_1.Message",
      "Message": "The request completed successfully.",
      "MessageId": "Base.1.18.1.Success",
      "Severity": "OK",
      "Resolution": "None."
    }
  ]
}

Example Error Response (Out of Range):

HTTP/1.1 400 Bad Request
Content-Type: application/json

{
  "error": {
    "code": "Base.1.13.0.PropertyValueNotInList",
    "message": "Invalid SetPoint: value 50 is out of range [200, 1000]",
    "@Message.ExtendedInfo": [
      {
        "@odata.type": "#Message.v1_1_1.Message",
        "MessageId": "Base.1.13.0.PropertyValueNotInList",
        "Message": "Invalid SetPoint: value 50 is out of range [200, 1000]",
        "Severity": "Warning",
        "Resolution": "Correct the value for the property in the request body and resubmit the request if the operation failed."
      }
    ]
  }
}
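
To avoid the out-of-range error shown above, a client can check the requested value against AllowableMin and AllowableMax from a prior GET before building the PATCH body. This is a sketch with a hypothetical helper name, `build_set_point_patch`; it treats both bounds as inclusive, which is an assumption.

```python
def build_set_point_patch(requested_watts, metrics):
    """Validate against the allowable range, then return the PATCH body."""
    limits = metrics["PowerLimitWatts"]
    low, high = limits["AllowableMin"], limits["AllowableMax"]
    if not low <= requested_watts <= high:  # inclusive bounds are assumed
        raise ValueError(
            f"SetPoint {requested_watts} is outside [{low}, {high}]"
        )
    return {"PowerLimitWatts": {"SetPoint": requested_watts}}


# Values mirror the Get Environment Metrics example for GPU_0.
metrics = {
    "PowerLimitWatts": {
        "SetPoint": 700,
        "AllowableMin": 200,
        "AllowableMax": 1000,
        "DefaultSetPoint": 700,
    }
}
patch_body = build_set_point_patch(700, metrics)
```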

Chassis Environment Metrics

For total system power:

  • Endpoint: GET /redfish/v1/Chassis/Chassis_0/EnvironmentMetrics
  • Attributes (response):
    • PowerWatts.Reading - Total chassis power for node-level metrics

Example:

GET /redfish/v1/Chassis/Chassis_0/EnvironmentMetrics
HTTP/1.1 200 OK
Content-Type: application/json

{
  "@odata.id": "/redfish/v1/Chassis/Chassis_0/EnvironmentMetrics",
  "PowerWatts": { "Reading": 8500.0 }
}

Workload Power Profiles (WPPS)

WPPS allows fine-grained control over GPU power behavior for specific workload types. Profiles are identified by index (0-255) and represented as hex bitmasks.

Get WPPS Configuration

  • Endpoint: GET /redfish/v1/Systems/HGX_Baseboard_0/Processors/{gpu_id}/Oem/Nvidia/WorkloadPowerProfile
  • Path Parameters:
    • {gpu_id} - GPU ID (GPU_0-GPU_7)
  • Attributes (response):
    • SupportedProfileMask - Hex mask of available profiles on this GPU
    • RequestedProfileMask - Hex mask of requested profiles (may differ from enforced during transition)
    • EnforcedProfileMask - Hex mask of currently active profiles

Example:

GET /redfish/v1/Systems/HGX_Baseboard_0/Processors/GPU_0/Oem/Nvidia/WorkloadPowerProfile
HTTP/1.1 200 OK
Content-Type: application/json

{
  "@odata.id": "/redfish/v1/Systems/HGX_Baseboard_0/Processors/GPU_0/Oem/Nvidia/WorkloadPowerProfile",
  "SupportedProfileMask": "0x10f98",
  "RequestedProfileMask": "0x0",
  "EnforcedProfileMask": "0x0"
}

Enable Workload Profiles

  • Endpoint: POST /redfish/v1/Systems/HGX_Baseboard_0/Processors/{gpu_id}/Oem/Nvidia/WorkloadPowerProfile/Actions/NvidiaWorkloadPower.EnableProfiles
  • Path Parameters:
    • {gpu_id} - GPU ID (GPU_0-GPU_7)
  • Arguments (request payload):
    • ProfileMask - Hex mask of profiles to enable; must use 0x prefix (e.g., "0x7" for profiles 0, 1, 2)

Example:

POST /redfish/v1/Systems/HGX_Baseboard_0/Processors/GPU_0/Oem/Nvidia/WorkloadPowerProfile/Actions/NvidiaWorkloadPower.EnableProfiles
Content-Type: application/json

{
  "ProfileMask": "0x7"
}
HTTP/1.1 200 OK
Content-Type: application/json

{
  "@Message.ExtendedInfo": [
    {
      "@odata.type": "#Message.v1_1_1.Message",
      "Message": "The request completed successfully.",
      "MessageId": "Base.1.18.1.Success",
      "Severity": "OK",
      "Resolution": "None."
    }
  ]
}

Disable Workload Profiles

  • Endpoint: POST /redfish/v1/Systems/HGX_Baseboard_0/Processors/{gpu_id}/Oem/Nvidia/WorkloadPowerProfile/Actions/NvidiaWorkloadPower.DisableProfiles
  • Path Parameters:
    • {gpu_id} - GPU ID (GPU_0-GPU_7)
  • Arguments (request payload):
    • ProfileMask - Hex mask of profiles to disable; must use 0x prefix (e.g., "0x7" to disable profiles 0, 1, 2)

Example:

POST /redfish/v1/Systems/HGX_Baseboard_0/Processors/GPU_0/Oem/Nvidia/WorkloadPowerProfile/Actions/NvidiaWorkloadPower.DisableProfiles
Content-Type: application/json

{
  "ProfileMask": "0x7"
}
HTTP/1.1 200 OK
Content-Type: application/json

{
  "@Message.ExtendedInfo": [
    {
      "@odata.type": "#Message.v1_1_1.Message",
      "Message": "The request completed successfully.",
      "MessageId": "Base.1.18.1.Success",
      "Severity": "OK",
      "Resolution": "None."
    }
  ]
}

Profile Mask Format

The ProfileMask is a hex string with 0x prefix where each bit position represents a profile ID:

  • "0x0" - No profiles enabled
  • "0x1" - Profile 0 only (bit 0 set)
  • "0x7" - Profiles 0, 1, and 2 (bits 0, 1, 2 set)

Firmware Validation

  • Endpoint: GET /redfish/v1/UpdateService/FirmwareInventory/HostBMC_0
  • Attributes (response):
    • Version - BMC firmware version; must be >= 25.0.0

Example:

GET /redfish/v1/UpdateService/FirmwareInventory/HostBMC_0
HTTP/1.1 200 OK
Content-Type: application/json

{
  "@odata.id": "/redfish/v1/UpdateService/FirmwareInventory/HostBMC_0",
  "Id": "HostBMC_0",
  "Name": "Host BMC",
  "Version": "25.04.17.00"
}

DGX GB200 NVL Systems

DGX GB200 NVL systems use the EnvironmentMetrics API, as DGX B200 does, and add support for multi-GPU processor modules.

Processor EnvironmentMetrics

Same API as DGX B200, with different processor counts:

  • GPU Endpoint: GET/PATCH /redfish/v1/Systems/HGX_Baseboard_0/Processors/{gpu_id}/EnvironmentMetrics
    • {gpu_id} - GPU ID (GPU_0-GPU_3, 4 GPUs total)
  • CPU Endpoint: GET/PATCH /redfish/v1/Systems/HGX_Baseboard_0/Processors/{cpu_id}/EnvironmentMetrics
    • {cpu_id} - CPU ID (CPU_0-CPU_1, 2 CPUs total)

Processor Module EnvironmentMetrics

For multi-GPU module power management:

Get Module Metrics

  • Endpoint: GET /redfish/v1/Chassis/HGX_ProcessorModule_{index}/EnvironmentMetrics
  • Path Parameters:
    • {index} - Processor module index (0 or 1)
  • Attributes (response): Same as processor EnvironmentMetrics - PowerWatts.Reading, PowerLimitWatts.* fields

Example:

GET /redfish/v1/Chassis/HGX_ProcessorModule_0/EnvironmentMetrics
HTTP/1.1 200 OK
Content-Type: application/json

{
  "@odata.id": "/redfish/v1/Chassis/HGX_ProcessorModule_0/EnvironmentMetrics",
  "PowerWatts": { "Reading": 2500.0 },
  "PowerLimitWatts": {
    "Reading": 2820.0,
    "SetPoint": 2820,
    "AllowableMin": 1500,
    "AllowableMax": 3000,
    "DefaultSetPoint": 2820
  }
}

Set Module Power Limit

  • Endpoint: PATCH /redfish/v1/Chassis/HGX_ProcessorModule_{index}/EnvironmentMetrics
  • Path Parameters:
    • {index} - Processor module index (0 or 1)
  • Arguments (request payload):
    • PowerLimitWatts.SetPoint - Module power limit in Watts

Example:

PATCH /redfish/v1/Chassis/HGX_ProcessorModule_0/EnvironmentMetrics
Content-Type: application/json

{
  "PowerLimitWatts": {
    "SetPoint": 2820
  }
}
HTTP/1.1 200 OK
Content-Type: application/json

Workload Power Profiles (WPPS)

Same as DGX B200; see Workload Power Profiles (WPPS) under DGX B200 Systems.

Firmware Validation

  • Endpoint: GET /redfish/v1/UpdateService/FirmwareInventory/FW_BMC_0
  • Attributes (response):
    • Oem.Nvidia.ActiveFirmwareSlot.Version - BMC firmware version (NVIDIA OEM format); must be >= 25.0.0

Example (NVIDIA OEM format):

GET /redfish/v1/UpdateService/FirmwareInventory/FW_BMC_0
HTTP/1.1 200 OK
Content-Type: application/json

{
  "@odata.id": "/redfish/v1/UpdateService/FirmwareInventory/FW_BMC_0",
  "Id": "FW_BMC_0",
  "Name": "BMC firmware",
  "Oem": {
    "Nvidia": {
      "ActiveFirmwareSlot": {
        "Version": "25.25.2",
        "FirmwareState": "Activated"
      }
    }
  }
}
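
Because the version lives at the top level on some platforms and under the NVIDIA OEM block on others, a client needs to handle both layouts. The sketch below shows one way to do that; `bmc_version` is a hypothetical helper, and preferring the OEM field when both are present is an assumption rather than documented DPS behavior.

```python
def bmc_version(inventory):
    """Return the BMC version from either layout, or None if absent."""
    slot = (
        inventory.get("Oem", {})
        .get("Nvidia", {})
        .get("ActiveFirmwareSlot", {})
    )
    # Assumed precedence: OEM slot version first, then top-level Version.
    return slot.get("Version") or inventory.get("Version")


oem_format = {
    "Id": "FW_BMC_0",
    "Oem": {"Nvidia": {"ActiveFirmwareSlot": {
        "Version": "25.25.2",
        "FirmwareState": "Activated",
    }}},
}
standard_format = {"Id": "HostBMC_0", "Version": "25.04.17.00"}
```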

DGX B300 / B300 NVL / GB300 NVL Systems

These systems follow API patterns similar to DGX B200 and GB200 NVL:

  • Power Management: EnvironmentMetrics API
  • WPPS Support: Yes

For multi-module systems (B300 NVL, GB300 NVL), processor module endpoints similar to those on GB200 NVL are available.

Firmware Validation

  • DGX B300, B300 NVL: GET /redfish/v1/UpdateService/FirmwareInventory/HostBMC_0
  • DGX GB300 NVL: GET /redfish/v1/UpdateService/FirmwareInventory/FW_BMC_0

Complete Endpoint Reference

Session Management (All Platforms)

  • POST /redfish/v1/SessionService/Sessions - Create session
  • DELETE /redfish/v1/SessionService/Sessions/{id} - Delete session

Firmware Inventory (All Platforms)

  • GET /redfish/v1/UpdateService/FirmwareInventory/HGX_FW_BMC_0 - DGX H100/H200
  • GET /redfish/v1/UpdateService/FirmwareInventory/HostBMC_0 - DGX B200, B300, B300 NVL
  • GET /redfish/v1/UpdateService/FirmwareInventory/FW_BMC_0 - DGX GB200 NVL, GB300 NVL

Node Manager (DGX H100/H200 Only)

  • GET /redfish/v1/Managers/BMC/NodeManager/Domains - List domains
  • GET /redfish/v1/Managers/BMC/NodeManager/Domains/{id} - Get domain
  • POST /redfish/v1/Managers/BMC/NodeManager/Domains - Create domain
  • PATCH /redfish/v1/Managers/BMC/NodeManager/Domains/{id} - Update domain
  • DELETE /redfish/v1/Managers/BMC/NodeManager/Domains/{id} - Delete domain
  • GET /redfish/v1/Managers/BMC/NodeManager/Domains/{domain_id}/Policies/{policy_id} - Get policy
  • GET /redfish/v1/Managers/BMC/NodeManager/PSUPolicies - List PSU policies
  • GET /redfish/v1/Managers/BMC/NodeManager/PSUPolicies/{id} - Get PSU policy

Telemetry (DGX H100/H200)

  • GET /redfish/v1/TelemetryService/MetricReports - List metric reports
  • GET /redfish/v1/TelemetryService/MetricReports/NvidiaNMMetrics_0 - Get NVIDIA Node Manager metrics

Processors (DGX B200, GB200, B300 Series)

  • GET /redfish/v1/Systems/HGX_Baseboard_0/Processors - List processors
  • GET /redfish/v1/Systems/HGX_Baseboard_0/Processors/{id} - Get processor

Environment Metrics (DGX B200, GB200, B300 Series)

  • GET /redfish/v1/Systems/HGX_Baseboard_0/Processors/{processor_id}/EnvironmentMetrics - Get processor metrics
  • PATCH /redfish/v1/Systems/HGX_Baseboard_0/Processors/{processor_id}/EnvironmentMetrics - Set power limit
  • GET /redfish/v1/Chassis/Chassis_0/EnvironmentMetrics - Get chassis metrics
  • GET /redfish/v1/Chassis/HGX_ProcessorModule_{id}/EnvironmentMetrics - Get processor module metrics (GB200/GB300)
  • PATCH /redfish/v1/Chassis/HGX_ProcessorModule_{id}/EnvironmentMetrics - Set module power limit (GB200/GB300)

Workload Power Profiles (DGX B200, GB200, B300 Series)

  • GET /redfish/v1/Systems/HGX_Baseboard_0/Processors/{gpu_id}/Oem/Nvidia/WorkloadPowerProfile - Get WPPS configuration
  • POST /redfish/v1/Systems/HGX_Baseboard_0/Processors/{gpu_id}/Oem/Nvidia/WorkloadPowerProfile/Actions/NvidiaWorkloadPower.EnableProfiles - Enable profiles
  • POST /redfish/v1/Systems/HGX_Baseboard_0/Processors/{gpu_id}/Oem/Nvidia/WorkloadPowerProfile/Actions/NvidiaWorkloadPower.DisableProfiles - Disable profiles

References

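NVIDIA Platform Documentation

  • NVIDIA DGX H100/H200 Redfish API Guide
  • NVIDIA DGX B200 Redfish API Guide

DMTF Redfish Specification

  • DMTF Redfish Specification DSP0266 v1.23.0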