Redfish API
Overview
Redfish is a standard RESTful API specification developed by the Distributed Management Task Force (DMTF) for managing and monitoring server hardware, storage, networking, and converged infrastructure. It provides a standardized way to interact with Baseboard Management Controllers (BMCs) and other hardware management interfaces.
DPS uses Redfish as its primary interface for communicating with NVIDIA DGX compute nodes. Through Redfish, DPS can:
- Set and monitor power limits for nodes, GPUs, CPUs, and memory
- Collect power consumption metrics and performance telemetry
- Manage Workload Power Profile Settings (WPPS) for supported platforms
Supported Platforms
DPS supports the following NVIDIA DGX systems via Redfish API:
DGX H100/H200
- Power Management: Node Manager (domain-based power capping)
- WPPS Support: No
- Minimum BMC Version: 24.0.0
- Documentation: NVIDIA DGX H100/H200 Redfish API Guide
DGX B200
- Power Management: EnvironmentMetrics (per-component power limits)
- WPPS Support: Yes
- Minimum BMC Version: 25.0.0
- Documentation: NVIDIA DGX B200 Redfish API Guide
DGX GB200 NVL
- Power Management: EnvironmentMetrics (per-component power limits)
- WPPS Support: Yes
- Minimum BMC Version: 25.0.0
DGX B300 / B300 NVL / GB300 NVL
- Power Management: EnvironmentMetrics (per-component power limits)
- WPPS Support: Yes
- Minimum BMC Version: 25.0.0
Session Management
DPS creates an authenticated session with the BMC and uses it for all subsequent API calls. Sessions are kept as long-lived connections and reused for efficiency.
For detailed session management specifications, see DMTF Redfish Specification DSP0266 v1.23.0 - Section 9.2.4 Session Management.
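The session lifecycle can be sketched with the Python standard library. This is a minimal illustration, not the DPS implementation: the helper names and the `base_url` parameter are hypothetical, and the sketch assumes the `Location` header carries a path (as in the example in this section) rather than an absolute URL. The functions only build requests and parse headers; nothing is sent.

```python
import json
import urllib.request

SESSIONS_PATH = "/redfish/v1/SessionService/Sessions"


def build_create_session_request(base_url, username, password):
    """Build (but do not send) the POST that creates a Redfish session."""
    body = json.dumps({"UserName": username, "Password": password}).encode("utf-8")
    return urllib.request.Request(
        base_url + SESSIONS_PATH,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_session_headers(headers):
    """Return (token, session_path) from the create-session response headers."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    return lowered["x-auth-token"], lowered["location"]


def build_delete_session_request(base_url, session_path, token):
    """Build the DELETE for the session named by the Location header."""
    # Assumption: session_path is a path such as /redfish/v1/SessionService/Sessions/1.
    return urllib.request.Request(
        base_url + session_path,
        headers={"X-Auth-Token": token},
        method="DELETE",
    )
```

A caller would pass each request to `urllib.request.urlopen` (with TLS settings appropriate for the BMC) and attach the token as `X-Auth-Token` on every later call.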
Create Session
- Endpoint: POST /redfish/v1/SessionService/Sessions
- Arguments (request payload):
  - UserName (string) - BMC username from Kubernetes secret
  - Password (string) - BMC password from Kubernetes secret
- Attributes (response headers):
  - X-Auth-Token - Session token for subsequent API requests
  - Location - Session URL for deletion
Example:
POST /redfish/v1/SessionService/Sessions
Content-Type: application/json
{
"UserName": "admin",
"Password": "password123"
}
HTTP/1.1 201 Created
X-Auth-Token: abc123-session-token-xyz789
Location: /redfish/v1/SessionService/Sessions/1
Example Error Response (Invalid Credentials):
HTTP/1.1 401 Unauthorized
Content-Type: application/json
{
"error": {
"code": "Base.1.13.0.SecurityAccessDenied",
"message": "While attempting to establish a connection to /redfish/v1/SessionService/Sessions, the service was denied access.",
"@Message.ExtendedInfo": [
{
"@odata.type": "#Message.v1_1_1.Message",
"MessageId": "Base.1.13.0.SecurityAccessDenied",
"Message": "While attempting to establish a connection to /redfish/v1/SessionService/Sessions, the service was denied access.",
"Severity": "Critical",
"Resolution": "Attempt to ensure that the URI is correct and that the service has the appropriate credentials."
}
]
}
}
Delete Session
- Endpoint: DELETE /redfish/v1/SessionService/Sessions/{session_id}
- Path Parameters:
  - {session_id} - Session ID from Location header
- Request Headers:
  - X-Auth-Token - Session token from create response
Example:
DELETE /redfish/v1/SessionService/Sessions/1
X-Auth-Token: abc123-session-token-xyz789
HTTP/1.1 204 No Content
DGX H100/H200 Systems
Reference: NVIDIA DGX H100/H200 Redfish API Guide
DGX H100/H200 systems use the Node Manager API for power management through
domains and policies. DPS creates a custom
domain named dps-managed-domain to manage power allocation.
Node Manager Domains
Domains represent power management scopes. DPS manages power through a dedicated domain.
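One way to locate the dedicated domain is to fetch each member of the domain collection and match on its Name. The sketch below assumes the collection and domain resources have already been retrieved and decoded from JSON; `member_urls` and `find_managed_domain` are hypothetical helper names, not part of DPS.

```python
MANAGED_DOMAIN_NAME = "dps-managed-domain"


def member_urls(collection):
    """List the @odata.id URLs from a Redfish collection response."""
    return [member["@odata.id"] for member in collection.get("Members", [])]


def find_managed_domain(domains):
    """Return the first domain resource named dps-managed-domain, or None."""
    for domain in domains:
        if domain.get("Name") == MANAGED_DOMAIN_NAME:
            return domain
    return None
```

If no domain matches, a client would presumably need to create one before applying power policies.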
List Domains
- Endpoint: GET /redfish/v1/Managers/BMC/NodeManager/Domains
- Attributes (response):
  - Members[].@odata.id - URLs to individual domain resources
Example:
GET /redfish/v1/Managers/BMC/NodeManager/Domains
HTTP/1.1 200 OK
Content-Type: application/json
{
"@odata.id": "/redfish/v1/Managers/BMC/NodeManager/Domains",
"Members": [
{ "@odata.id": "/redfish/v1/Managers/BMC/NodeManager/Domains/0" },
{ "@odata.id": "/redfish/v1/Managers/BMC/NodeManager/Domains/1" }
],
"Members@odata.count": 2
}
Get Domain
- Endpoint: GET /redfish/v1/Managers/BMC/NodeManager/Domains/{id}
- Attributes (response):
  - Id - Domain identifier for update/delete operations
  - Name - Domain name; "dps-managed-domain" indicates managed domain
  - Capabilities.Max - Maximum power capability in Watts
  - Capabilities.Min - Minimum power capability in Watts
  - Policies.Members[] - Nested policy objects
Example:
GET /redfish/v1/Managers/BMC/NodeManager/Domains/1
HTTP/1.1 200 OK
Content-Type: application/json
{
"@odata.id": "/redfish/v1/Managers/BMC/NodeManager/Domains/1",
"Id": "1",
"Name": "dps-managed-domain",
"Status": { "State": "Enabled" },
"Capabilities": {
"Max": 10200,
"Min": 2500
},
"Policies": {
"Members": [
{
"ComponentId": "COMP_GPU",
"Limit": 5600.0,
"PercentageOfDomainBudget": 75.0,
"Status": { "State": "Selected" }
},
{
"ComponentId": "COMP_CPU",
"Limit": 700.0,
"PercentageOfDomainBudget": 10.0,
"Status": { "State": "Selected" }
}
]
}
}
Create Domain
- Endpoint: POST /redfish/v1/Managers/BMC/NodeManager/Domains
- Arguments (request payload):
  - Id - Domain identifier (default: "0")
  - Name - Always "dps-managed-domain" (identifies managed domains)
  - Status.State - Always "Enabled" (required for active state)
  - Capabilities.Max - Node maximum power limit in Watts
  - Capabilities.Min - Node minimum power limit in Watts
  - Policies.Members[] - Array of policy objects (see Policy Object Fields)
Example:
POST /redfish/v1/Managers/BMC/NodeManager/Domains
Content-Type: application/json
{
"Id": "1",
"Name": "dps-managed-domain",
"Status": { "State": "Enabled" },
"Capabilities": {
"Max": 10200,
"Min": 2500
},
"Policies": {
"Members": [
{
"ComponentId": "COMP_GPU",
"Limit": 5600.0,
"PercentageOfDomainBudget": 75.0,
"Status": { "State": "Selected" }
},
{
"ComponentId": "COMP_CPU",
"Limit": 700.0,
"PercentageOfDomainBudget": 10.0,
"Status": { "State": "Selected" }
},
{
"ComponentId": "COMP_MEMORY",
"Limit": 200.0,
"PercentageOfDomainBudget": 5.0,
"Status": { "State": "Selected" }
}
]
}
}
HTTP/1.1 204 No Content
Update Domain
- Endpoint: PATCH /redfish/v1/Managers/BMC/NodeManager/Domains/{id}
- Arguments (request payload):
  - Capabilities.Max - Maximum power limit in Watts
  - Capabilities.Min - Minimum power limit in Watts
  - Policies.Members[] - Array of policy objects (see Policy Object Fields)
Example:
PATCH /redfish/v1/Managers/BMC/NodeManager/Domains/1
Content-Type: application/json
{
"Capabilities": {
"Max": 9500,
"Min": 2000
},
"Policies": {
"Members": [
{
"ComponentId": "COMP_GPU",
"Limit": 5000.0,
"PercentageOfDomainBudget": 70.0,
"Status": { "State": "Selected" }
},
{
"ComponentId": "COMP_CPU",
"Limit": 600.0,
"PercentageOfDomainBudget": 10.0,
"Status": { "State": "Selected" }
},
{
"ComponentId": "COMP_MEMORY",
"Limit": 180.0,
"PercentageOfDomainBudget": 5.0,
"Status": { "State": "Selected" }
}
]
}
}
HTTP/1.1 200 OK
Content-Type: application/json
{
"@Message.ExtendedInfo": [
{
"@odata.type": "#Message.v1_1_1.Message",
"Message": "The request completed successfully.",
"MessageId": "Base.1.18.1.Success",
"Severity": "OK",
"Resolution": "None."
}
]
}
Delete Domain
- Endpoint: DELETE /redfish/v1/Managers/BMC/NodeManager/Domains/{id}
- Path Parameters:
  - {id} - Domain identifier
Example:
DELETE /redfish/v1/Managers/BMC/NodeManager/Domains/1
HTTP/1.1 204 No Content
Domain Policies
Policies define power limits for specific components within a domain. Each
policy object in Policies.Members[] contains the following fields:
Policy Object Fields (arguments when creating/updating)
- ComponentId (string) - Component type identifier (exact string required):
  - "COMP_CPU" - CPU component
  - "COMP_MEMORY" - Memory/DRAM component
  - "COMP_GPU" - GPU component
- Limit (float) - Power limit in Watts
- PercentageOfDomainBudget (float) - Percentage of total domain budget (0.0-100.0)
- Status.State (string) - Always "Selected" (required for active policies)
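The field rules above can be captured in a small builder. This is an illustrative sketch only: `make_policy` and `make_domain_update` are hypothetical names, and the validation shown (known ComponentId, percentage within 0.0-100.0) reflects just the constraints listed in this section.

```python
VALID_COMPONENTS = ("COMP_CPU", "COMP_MEMORY", "COMP_GPU")


def make_policy(component_id, limit_watts, percent_of_budget):
    """Build one policy object for Policies.Members[]."""
    if component_id not in VALID_COMPONENTS:
        raise ValueError(f"unknown ComponentId: {component_id!r}")
    if not 0.0 <= percent_of_budget <= 100.0:
        raise ValueError("PercentageOfDomainBudget must be within 0.0-100.0")
    return {
        "ComponentId": component_id,
        "Limit": float(limit_watts),
        "PercentageOfDomainBudget": float(percent_of_budget),
        # "Selected" is required for the policy to be active.
        "Status": {"State": "Selected"},
    }


def make_domain_update(max_watts, min_watts, policies):
    """Build the request payload for an Update Domain PATCH."""
    return {
        "Capabilities": {"Max": max_watts, "Min": min_watts},
        "Policies": {"Members": list(policies)},
    }
```

For instance, `make_domain_update(9500, 2000, [make_policy("COMP_GPU", 5000, 70)])` yields a payload with the same shape as the Update Domain example.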
Get Policy
- Endpoint: GET /redfish/v1/Managers/BMC/NodeManager/Domains/{domain_id}/Policies/{policy_id}
- Path Parameters:
  - {domain_id} - Domain identifier
  - {policy_id} - Policy identifier
- Attributes (response):
  - ComponentId - Component type ("COMP_CPU", "COMP_MEMORY", "COMP_GPU")
  - Limit - Power limit in Watts
  - PercentageOfDomainBudget - Budget percentage allocation
Example:
GET /redfish/v1/Managers/BMC/NodeManager/Domains/1/Policies/0
HTTP/1.1 200 OK
Content-Type: application/json
{
"@odata.id": "/redfish/v1/Managers/BMC/NodeManager/Domains/1/Policies/0",
"Id": "0",
"ComponentId": "COMP_GPU",
"Limit": 5600.0,
"PercentageOfDomainBudget": 75.0,
"Status": { "State": "Selected" }
}
PSU Policies
PSU policies define power limits based on PSU redundancy configuration.
DPS validates that requested power caps don’t
exceed the active PSU policy’s LimitMax.
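The check described above might look like the following sketch. It assumes the PSU policy resources have already been fetched and decoded, and that the active policy is the one whose Status.State is "Selected" (as in the Get PSU Policy example in this section); both function names are hypothetical.

```python
def active_psu_policy(psu_policies):
    """Return the PSU policy marked "Selected", or None if there is none."""
    # Assumption: Status.State == "Selected" identifies the active policy.
    for policy in psu_policies:
        if policy.get("Status", {}).get("State") == "Selected":
            return policy
    return None


def validate_power_cap(requested_watts, psu_policies):
    """Raise ValueError if the cap exceeds the active policy's LimitMax."""
    policy = active_psu_policy(psu_policies)
    if policy is None:
        raise ValueError("no active PSU policy found")
    if requested_watts > policy["LimitMax"]:
        raise ValueError(
            f"requested cap {requested_watts} W exceeds LimitMax {policy['LimitMax']} W"
        )
    return requested_watts
```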
List PSU Policies
- Endpoint: GET /redfish/v1/Managers/BMC/NodeManager/PSUPolicies
Example:
GET /redfish/v1/Managers/BMC/NodeManager/PSUPolicies
HTTP/1.1 200 OK
Content-Type: application/json
{
"@odata.id": "/redfish/v1/Managers/BMC/NodeManager/PSUPolicies",
"Members": [
{ "@odata.id": "/redfish/v1/Managers/BMC/NodeManager/PSUPolicies/0" },
{ "@odata.id": "/redfish/v1/Managers/BMC/NodeManager/PSUPolicies/1" },
{ "@odata.id": "/redfish/v1/Managers/BMC/NodeManager/PSUPolicies/2" }
],
"Members@odata.count": 3
}
Get PSU Policy
- Endpoint: GET /redfish/v1/Managers/BMC/NodeManager/PSUPolicies/{id}
- Path Parameter Values:
  - "0" - Limp mode (minimal PSU configuration)
  - "1" - No Redundancy
  - "2" - Full Redundancy
- Attributes (response):
  - LimitMax - Maximum power limit in Watts; power caps must not exceed this value
Example (Full Redundancy):
GET /redfish/v1/Managers/BMC/NodeManager/PSUPolicies/2
HTTP/1.1 200 OK
Content-Type: application/json
{
"@odata.id": "/redfish/v1/Managers/BMC/NodeManager/PSUPolicies/2",
"Id": "2",
"Name": "Full Redundancy",
"LimitMax": 12000,
"MaxPSU": 6,
"MinPSU": 4,
"Status": { "State": "Selected" }
}
Telemetry Service
DGX H100/H200 uses the TelemetryService for real-time power and performance metrics.
Get Metric Reports
- Endpoint: GET /redfish/v1/TelemetryService/MetricReports/NvidiaNMMetrics_0
- Attributes (response):
  - MetricValues[].MetricId - Metric identifier (case-sensitive)
  - MetricValues[].MetricValue - Metric value as string (e.g., "8500.0" not 8500.0)
Example (partial response):
GET /redfish/v1/TelemetryService/MetricReports/NvidiaNMMetrics_0
HTTP/1.1 200 OK
Content-Type: application/json
{
"@odata.id": "/redfish/v1/TelemetryService/MetricReports/NvidiaNMMetrics_0",
"MetricValues": [
{ "MetricId": "dcPlatformPower_avg", "MetricValue": "8500.0" },
{ "MetricId": "AvblNoGPU", "MetricValue": "8" },
{ "MetricId": "AvblNoCPU", "MetricValue": "2" },
{ "MetricId": "gpuPower_avg_0", "MetricValue": "63.0" },
{ "MetricId": "gpuPowerLimit_0", "MetricValue": "700.0" },
{ "MetricId": "gpuPowerCapabilitiesMin_0", "MetricValue": "200.0" },
{ "MetricId": "gpuPowerCapabilitiesMax_0", "MetricValue": "700.0" },
{ "MetricId": "cpuPackagePower_avg_0", "MetricValue": "182.0" },
{ "MetricId": "dramPower_avg_0", "MetricValue": "45.0" }
]
}
Node-Level Metrics
- dcPlatformPower_avg - Total DC platform power in Watts
- AvblNoGPU - Available GPU count
- AvblNoCPU - Available CPU count
Per-GPU Metrics (for each {id} from 0 to 7)
- gpuPower_avg_{id} - GPU average power in Watts
- gpuPowerLimit_{id} - GPU power limit in Watts
- gpuPowerCapabilitiesMin_{id} - GPU minimum power limit in Watts
- gpuPowerCapabilitiesMax_{id} - GPU maximum power limit in Watts
Per-CPU Metrics (for each {id} from 0 to 1)
- cpuPackagePower_avg_{id} - CPU average power in Watts
- cpuPackagePowerLimit1_{id} - CPU power limit 1 in Watts
- cpuPackagePowerCapabilitiesMin_{id} - CPU minimum power in Watts
- cpuPackagePowerCapabilitiesMax_{id} - CPU maximum power in Watts
- cpuEnergy_{id} - CPU energy in kWh
Per-Memory Metrics (for each {id} from 0 to 1)
- dramPower_avg_{id} - DRAM average power in Watts
- dramPowerLimit_{id} - DRAM power limit in Watts
- dramPackagePowerCapabilitiesMin_{id} - DRAM minimum power in Watts
- dramPackagePowerCapabilitiesMax_{id} - DRAM maximum power in Watts
- dramEnergy_{id} - DRAM energy in kWh
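Because every MetricValue arrives as a string, a client has to convert values before doing arithmetic. The sketch below shows one way to do that for a decoded metric report; the function names are hypothetical, and it assumes every value in the report parses as a number, which holds for the metrics listed above.

```python
def parse_metric_report(report):
    """Map each MetricId to its value as a float."""
    metrics = {}
    for entry in report.get("MetricValues", []):
        # MetricId is case-sensitive; MetricValue is a string such as "8500.0".
        metrics[entry["MetricId"]] = float(entry["MetricValue"])
    return metrics


def per_gpu_power(metrics):
    """Collect gpuPower_avg_{id} readings keyed by integer GPU id."""
    prefix = "gpuPower_avg_"
    return {
        int(name[len(prefix):]): value
        for name, value in metrics.items()
        if name.startswith(prefix)
    }
```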
Firmware Validation
- Endpoint: GET /redfish/v1/UpdateService/FirmwareInventory/HGX_FW_BMC_0
- Attributes (response):
  - Version - BMC firmware version; must be >= 24.0.0
Example:
GET /redfish/v1/UpdateService/FirmwareInventory/HGX_FW_BMC_0
HTTP/1.1 200 OK
Content-Type: application/json
{
"@odata.id": "/redfish/v1/UpdateService/FirmwareInventory/HGX_FW_BMC_0",
"Id": "HGX_FW_BMC_0",
"Name": "HGX BMC",
"Version": "24.08.25.01"
}
DGX B200 Systems
Reference: NVIDIA DGX B200 Redfish API Guide
DGX B200 systems use the EnvironmentMetrics API for direct per-component power management.
Processors Collection
- Endpoint: GET /redfish/v1/Systems/HGX_Baseboard_0/Processors
- Attributes (response):
  - Members[].@odata.id - URLs to individual processors:
    - GPU_0 through GPU_7 - GPUs
    - CPU_0, CPU_1 - CPUs
Example (abbreviated; GPU_2 through GPU_6 omitted):
GET /redfish/v1/Systems/HGX_Baseboard_0/Processors
HTTP/1.1 200 OK
Content-Type: application/json
{
"@odata.id": "/redfish/v1/Systems/HGX_Baseboard_0/Processors",
"Members": [
{ "@odata.id": "/redfish/v1/Systems/HGX_Baseboard_0/Processors/GPU_0" },
{ "@odata.id": "/redfish/v1/Systems/HGX_Baseboard_0/Processors/GPU_1" },
{ "@odata.id": "/redfish/v1/Systems/HGX_Baseboard_0/Processors/GPU_7" },
{ "@odata.id": "/redfish/v1/Systems/HGX_Baseboard_0/Processors/CPU_0" },
{ "@odata.id": "/redfish/v1/Systems/HGX_Baseboard_0/Processors/CPU_1" }
],
"Members@odata.count": 10
}
Get Processor
- Endpoint: GET /redfish/v1/Systems/HGX_Baseboard_0/Processors/{processor_id}
- Path Parameters:
  - {processor_id} - Processor ID from collection (GPU_0-GPU_7 or CPU_0-CPU_1)
- Attributes (response):
  - Id - Processor identifier
  - Oem.Nvidia.WorkloadPowerProfile.@odata.id - Presence indicates WPPS support; provides URL to WPPS endpoint
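Detecting WPPS support from a decoded processor resource comes down to a guarded lookup of the OEM link. A minimal sketch, with `wpps_url` as a hypothetical helper name:

```python
def wpps_url(processor):
    """Return the WPPS endpoint URL for a processor, or None if unsupported."""
    # Each level may be absent (for example on CPUs), so default to {}.
    profile = processor.get("Oem", {}).get("Nvidia", {}).get("WorkloadPowerProfile")
    if not profile:
        return None
    return profile.get("@odata.id")
```

A return value of None means the resource does not advertise the WorkloadPowerProfile link.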
Example (GPU with WPPS support):
GET /redfish/v1/Systems/HGX_Baseboard_0/Processors/GPU_0
HTTP/1.1 200 OK
Content-Type: application/json
{
"@odata.id": "/redfish/v1/Systems/HGX_Baseboard_0/Processors/GPU_0",
"Id": "GPU_0",
"Name": "GPU 0",
"Oem": {
"Nvidia": {
"WorkloadPowerProfile": {
"@odata.id": "/redfish/v1/Systems/HGX_Baseboard_0/Processors/GPU_0/Oem/Nvidia/WorkloadPowerProfile"
}
}
}
}
EnvironmentMetrics
Get Environment Metrics
- Endpoint: GET /redfish/v1/Systems/HGX_Baseboard_0/Processors/{processor_id}/EnvironmentMetrics
- Path Parameters:
  - {processor_id} - Processor ID (GPU_0-GPU_7 or CPU_0-CPU_1)
- Attributes (response):
  - PowerWatts.Reading - Current instantaneous power in Watts
  - PowerLimitWatts.SetPoint - Current power limit setpoint in Watts
  - PowerLimitWatts.AllowableMin - Minimum allowed SetPoint value
  - PowerLimitWatts.AllowableMax - Maximum allowed SetPoint value
  - PowerLimitWatts.DefaultSetPoint - Factory default setpoint in Watts
  - EnergyJoules.Reading - Cumulative energy in Joules
  - EnergykWh.Reading - Cumulative energy in kWh
  - TemperatureCelsius.Reading - Current temperature in Celsius
Example (GPU):
GET /redfish/v1/Systems/HGX_Baseboard_0/Processors/GPU_0/EnvironmentMetrics
HTTP/1.1 200 OK
Content-Type: application/json
{
"@odata.id": "/redfish/v1/Systems/HGX_Baseboard_0/Processors/GPU_0/EnvironmentMetrics",
"EnergyJoules": { "Reading": 12345.67 },
"EnergykWh": { "Reading": 1.234 },
"PowerWatts": { "Reading": 64.342 },
"TemperatureCelsius": { "Reading": 45.5 },
"PowerLimitWatts": {
"Reading": 700.0,
"SetPoint": 700,
"AllowableMin": 200,
"AllowableMax": 1000,
"DefaultSetPoint": 700
}
}
Set Power Limit
- Endpoint: PATCH /redfish/v1/Systems/HGX_Baseboard_0/Processors/{processor_id}/EnvironmentMetrics
- Path Parameters:
  - {processor_id} - Processor ID (GPU_0-GPU_7 or CPU_0-CPU_1)
- Arguments (request payload):
  - PowerLimitWatts.SetPoint - Power limit in Watts; must be within AllowableMin to AllowableMax range
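A client can avoid an out-of-range rejection by checking the requested value against the limits reported by a prior GET of the same resource. The following is a sketch under that assumption; `build_set_point_patch` is a hypothetical name and only builds the payload.

```python
def build_set_point_patch(requested_watts, environment_metrics):
    """Build the PATCH payload after checking the allowable range."""
    limits = environment_metrics["PowerLimitWatts"]
    low, high = limits["AllowableMin"], limits["AllowableMax"]
    if not low <= requested_watts <= high:
        raise ValueError(
            f"SetPoint {requested_watts} is out of range [{low}, {high}]"
        )
    return {"PowerLimitWatts": {"SetPoint": requested_watts}}
```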
Example:
PATCH /redfish/v1/Systems/HGX_Baseboard_0/Processors/GPU_0/EnvironmentMetrics
Content-Type: application/json
{
"PowerLimitWatts": {
"SetPoint": 700
}
}
HTTP/1.1 200 OK
Content-Type: application/json
{
"@Message.ExtendedInfo": [
{
"@odata.type": "#Message.v1_1_1.Message",
"Message": "The request completed successfully.",
"MessageId": "Base.1.18.1.Success",
"Severity": "OK",
"Resolution": "None."
}
]
}
Example Error Response (Out of Range):
HTTP/1.1 400 Bad Request
Content-Type: application/json
{
"error": {
"code": "Base.1.13.0.PropertyValueNotInList",
"message": "Invalid SetPoint: value 50 is out of range [200, 1000]",
"@Message.ExtendedInfo": [
{
"@odata.type": "#Message.v1_1_1.Message",
"MessageId": "Base.1.13.0.PropertyValueNotInList",
"Message": "Invalid SetPoint: value 50 is out of range [200, 1000]",
"Severity": "Warning",
"Resolution": "Correct the value for the property in the request body and resubmit the request if the operation failed."
}
]
}
}
Chassis Environment Metrics
For total system power:
- Endpoint: GET /redfish/v1/Chassis/Chassis_0/EnvironmentMetrics
- Attributes (response):
  - PowerWatts.Reading - Total chassis power for node-level metrics
Example:
GET /redfish/v1/Chassis/Chassis_0/EnvironmentMetrics
HTTP/1.1 200 OK
Content-Type: application/json
{
"@odata.id": "/redfish/v1/Chassis/Chassis_0/EnvironmentMetrics",
"PowerWatts": { "Reading": 8500.0 }
}
Workload Power Profiles (WPPS)
WPPS allows fine-grained control over GPU power behavior for specific workload types. Profiles are identified by index (0-255) and represented as hex bitmasks.
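Since each bit position in a mask stands for one profile index, converting between a list of indices and the hex string is plain bit arithmetic. The helpers below are a hypothetical sketch of that mapping, not an NVIDIA or DPS API.

```python
def profiles_to_mask(profile_ids):
    """Encode profile indices (0-255) as a hex mask string with 0x prefix."""
    mask = 0
    for profile_id in profile_ids:
        if not 0 <= profile_id <= 255:
            raise ValueError(f"profile index out of range: {profile_id}")
        mask |= 1 << profile_id
    return hex(mask)


def mask_to_profiles(mask):
    """Decode a hex mask string into a sorted list of profile indices."""
    value = int(mask, 16)  # int() accepts the 0x prefix when base is 16
    return [bit for bit in range(value.bit_length()) if (value >> bit) & 1]
```

Decoding a GPU's SupportedProfileMask this way lists which profile indices can be requested on that GPU.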
Get WPPS Configuration
- Endpoint: GET /redfish/v1/Systems/HGX_Baseboard_0/Processors/{gpu_id}/Oem/Nvidia/WorkloadPowerProfile
- Path Parameters:
  - {gpu_id} - GPU ID (GPU_0-GPU_7)
- Attributes (response):
  - SupportedProfileMask - Hex mask of available profiles on this GPU
  - RequestedProfileMask - Hex mask of requested profiles (may differ from enforced during transition)
  - EnforcedProfileMask - Hex mask of currently active profiles
Example:
GET /redfish/v1/Systems/HGX_Baseboard_0/Processors/GPU_0/Oem/Nvidia/WorkloadPowerProfile
HTTP/1.1 200 OK
Content-Type: application/json
{
"@odata.id": "/redfish/v1/Systems/HGX_Baseboard_0/Processors/GPU_0/Oem/Nvidia/WorkloadPowerProfile",
"SupportedProfileMask": "0x10f98",
"RequestedProfileMask": "0x0",
"EnforcedProfileMask": "0x0"
}
Enable Workload Profiles
- Endpoint: POST /redfish/v1/Systems/HGX_Baseboard_0/Processors/{gpu_id}/Oem/Nvidia/WorkloadPowerProfile/Actions/NvidiaWorkloadPower.EnableProfiles
- Path Parameters:
  - {gpu_id} - GPU ID (GPU_0-GPU_7)
- Arguments (request payload):
  - ProfileMask - Hex mask of profiles to enable; must use 0x prefix (e.g., "0x7" for profiles 0, 1, 2)
Example:
POST /redfish/v1/Systems/HGX_Baseboard_0/Processors/GPU_0/Oem/Nvidia/WorkloadPowerProfile/Actions/NvidiaWorkloadPower.EnableProfiles
Content-Type: application/json
{
"ProfileMask": "0x7"
}
HTTP/1.1 200 OK
Content-Type: application/json
{
"@Message.ExtendedInfo": [
{
"@odata.type": "#Message.v1_1_1.Message",
"Message": "The request completed successfully.",
"MessageId": "Base.1.18.1.Success",
"Severity": "OK",
"Resolution": "None."
}
]
}
Disable Workload Profiles
- Endpoint: POST /redfish/v1/Systems/HGX_Baseboard_0/Processors/{gpu_id}/Oem/Nvidia/WorkloadPowerProfile/Actions/NvidiaWorkloadPower.DisableProfiles
- Path Parameters:
  - {gpu_id} - GPU ID (GPU_0-GPU_7)
- Arguments (request payload):
  - ProfileMask - Hex mask of profiles to disable; must use 0x prefix (e.g., "0x7" to disable profiles 0, 1, 2)
Example:
POST /redfish/v1/Systems/HGX_Baseboard_0/Processors/GPU_0/Oem/Nvidia/WorkloadPowerProfile/Actions/NvidiaWorkloadPower.DisableProfiles
Content-Type: application/json
{
"ProfileMask": "0x7"
}
HTTP/1.1 200 OK
Content-Type: application/json
{
"@Message.ExtendedInfo": [
{
"@odata.type": "#Message.v1_1_1.Message",
"Message": "The request completed successfully.",
"MessageId": "Base.1.18.1.Success",
"Severity": "OK",
"Resolution": "None."
}
]
}
Profile Mask Format
The ProfileMask is a hex string with 0x prefix where each bit position
represents a profile ID:
- "0x0" - No profiles enabled
- "0x1" - Profile 0 only (bit 0 set)
- "0x7" - Profiles 0, 1, and 2 (bits 0, 1, 2 set)
Firmware Validation
- Endpoint: GET /redfish/v1/UpdateService/FirmwareInventory/HostBMC_0
- Attributes (response):
  - Version - BMC firmware version; must be >= 25.0.0
Example:
GET /redfish/v1/UpdateService/FirmwareInventory/HostBMC_0
HTTP/1.1 200 OK
Content-Type: application/json
{
"@odata.id": "/redfish/v1/UpdateService/FirmwareInventory/HostBMC_0",
"Id": "HostBMC_0",
"Name": "Host BMC",
"Version": "25.04.17.00"
}
DGX GB200 NVL Systems
DGX GB200 NVL systems use the EnvironmentMetrics API, similar to DGX B200, with additional support for multi-GPU processor modules.
Processor EnvironmentMetrics
Same API as DGX B200, with different processor counts:
- GPU Endpoint: GET/PATCH /redfish/v1/Systems/HGX_Baseboard_0/Processors/{gpu_id}/EnvironmentMetrics
  - {gpu_id} - GPU ID (GPU_0-GPU_3, 4 GPUs total)
- CPU Endpoint: GET/PATCH /redfish/v1/Systems/HGX_Baseboard_0/Processors/{cpu_id}/EnvironmentMetrics
  - {cpu_id} - CPU ID (CPU_0-CPU_1, 2 CPUs total)
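Since only the processor counts differ between platforms, the per-processor URLs can be generated from those counts. The sketch below defaults to the GB200 NVL counts given above; the function name and its parameters are hypothetical.

```python
PROCESSORS_BASE = "/redfish/v1/Systems/HGX_Baseboard_0/Processors"


def environment_metrics_urls(gpu_count=4, cpu_count=2):
    """Map each processor ID to its EnvironmentMetrics URL."""
    processor_ids = [f"GPU_{i}" for i in range(gpu_count)]
    processor_ids += [f"CPU_{i}" for i in range(cpu_count)]
    return {
        processor_id: f"{PROCESSORS_BASE}/{processor_id}/EnvironmentMetrics"
        for processor_id in processor_ids
    }
```

Calling `environment_metrics_urls(gpu_count=8)` would produce the ten-processor DGX B200 layout instead.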
Processor Module EnvironmentMetrics
For multi-GPU module power management:
Get Module Metrics
- Endpoint: GET /redfish/v1/Chassis/HGX_ProcessorModule_{index}/EnvironmentMetrics
- Path Parameters:
  - {index} - Processor module index (0 or 1)
- Attributes (response): Same as processor EnvironmentMetrics - PowerWatts.Reading, PowerLimitWatts.* fields
Example:
GET /redfish/v1/Chassis/HGX_ProcessorModule_0/EnvironmentMetrics
HTTP/1.1 200 OK
Content-Type: application/json
{
"@odata.id": "/redfish/v1/Chassis/HGX_ProcessorModule_0/EnvironmentMetrics",
"PowerWatts": { "Reading": 2500.0 },
"PowerLimitWatts": {
"Reading": 2820.0,
"SetPoint": 2820,
"AllowableMin": 1500,
"AllowableMax": 3000,
"DefaultSetPoint": 2820
}
}
Set Module Power Limit
- Endpoint: PATCH /redfish/v1/Chassis/HGX_ProcessorModule_{index}/EnvironmentMetrics
- Path Parameters:
  - {index} - Processor module index (0 or 1)
- Arguments (request payload):
  - PowerLimitWatts.SetPoint - Module power limit in Watts
Example:
PATCH /redfish/v1/Chassis/HGX_ProcessorModule_0/EnvironmentMetrics
Content-Type: application/json
{
"PowerLimitWatts": {
"SetPoint": 2820
}
}
HTTP/1.1 200 OK
Content-Type: application/json
Workload Power Profiles (WPPS)
Same as DGX B200 - see DGX B200 WPPS section.
Firmware Validation
- Endpoint: GET /redfish/v1/UpdateService/FirmwareInventory/FW_BMC_0
- Attributes (response):
  - Oem.Nvidia.ActiveFirmwareSlot.Version - BMC firmware version (NVIDIA OEM format); must be >= 25.0.0
Example (NVIDIA OEM format):
GET /redfish/v1/UpdateService/FirmwareInventory/FW_BMC_0
HTTP/1.1 200 OK
Content-Type: application/json
{
"@odata.id": "/redfish/v1/UpdateService/FirmwareInventory/FW_BMC_0",
"Id": "FW_BMC_0",
"Name": "BMC firmware",
"Oem": {
"Nvidia": {
"ActiveFirmwareSlot": {
"Version": "25.25.2",
"FirmwareState": "Activated"
}
}
}
}
DGX B300 / B300 NVL / GB300 NVL Systems
These systems follow API patterns similar to DGX B200 and GB200 NVL:
- Power Management: EnvironmentMetrics API
- WPPS Support: Yes
For multi-module systems (B300 NVL, GB300 NVL), processor module endpoints are available similar to GB200 NVL.
Firmware Validation
- DGX B300, B300 NVL: GET /redfish/v1/UpdateService/FirmwareInventory/HostBMC_0
- DGX GB300 NVL: GET /redfish/v1/UpdateService/FirmwareInventory/FW_BMC_0
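The firmware inventory responses in this document report the version in two places: a top-level Version, or Oem.Nvidia.ActiveFirmwareSlot.Version in the NVIDIA OEM format. A minimum-version check that handles both could be sketched as below. The helper names are hypothetical, and comparing dotted versions as integer tuples is an assumption about how ">=" is meant, not a confirmed DPS rule.

```python
def bmc_version(inventory):
    """Read the BMC version from the OEM slot if present, else from Version."""
    slot = inventory.get("Oem", {}).get("Nvidia", {}).get("ActiveFirmwareSlot", {})
    return slot.get("Version") or inventory.get("Version")


def version_tuple(version):
    """Turn a dotted version such as "24.08.25.01" into (24, 8, 25, 1)."""
    return tuple(int(part) for part in version.split("."))


def meets_minimum(inventory, minimum):
    """Return True if the reported version is at least the minimum."""
    version = bmc_version(inventory)
    if version is None:
        return False
    # Assumption: versions are purely numeric and compare element by element.
    return version_tuple(version) >= version_tuple(minimum)
```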
Complete Endpoint Reference
Session Management (All Platforms)
- POST /redfish/v1/SessionService/Sessions - Create session
- DELETE /redfish/v1/SessionService/Sessions/{id} - Delete session
Firmware Inventory (All Platforms)
- GET /redfish/v1/UpdateService/FirmwareInventory/HGX_FW_BMC_0 - DGX H100/H200
- GET /redfish/v1/UpdateService/FirmwareInventory/HostBMC_0 - DGX B200, B300, B300 NVL
- GET /redfish/v1/UpdateService/FirmwareInventory/FW_BMC_0 - DGX GB200 NVL, GB300 NVL
Node Manager (DGX H100/H200 Only)
- GET /redfish/v1/Managers/BMC/NodeManager/Domains - List domains
- GET /redfish/v1/Managers/BMC/NodeManager/Domains/{id} - Get domain
- POST /redfish/v1/Managers/BMC/NodeManager/Domains - Create domain
- PATCH /redfish/v1/Managers/BMC/NodeManager/Domains/{id} - Update domain
- DELETE /redfish/v1/Managers/BMC/NodeManager/Domains/{id} - Delete domain
- GET /redfish/v1/Managers/BMC/NodeManager/Domains/{domain_id}/Policies/{policy_id} - Get policy
- GET /redfish/v1/Managers/BMC/NodeManager/PSUPolicies - List PSU policies
- GET /redfish/v1/Managers/BMC/NodeManager/PSUPolicies/{id} - Get PSU policy
Telemetry (DGX H100/H200)
- GET /redfish/v1/TelemetryService/MetricReports - List metric reports
- GET /redfish/v1/TelemetryService/MetricReports/NvidiaNMMetrics_0 - Get NVIDIA Node Manager metrics
Processors (DGX B200, GB200, B300 Series)
- GET /redfish/v1/Systems/HGX_Baseboard_0/Processors - List processors
- GET /redfish/v1/Systems/HGX_Baseboard_0/Processors/{id} - Get processor
Environment Metrics (DGX B200, GB200, B300 Series)
- GET /redfish/v1/Systems/HGX_Baseboard_0/Processors/{processor_id}/EnvironmentMetrics - Get processor metrics
- PATCH /redfish/v1/Systems/HGX_Baseboard_0/Processors/{processor_id}/EnvironmentMetrics - Set power limit
- GET /redfish/v1/Chassis/Chassis_0/EnvironmentMetrics - Get chassis metrics
- GET /redfish/v1/Chassis/HGX_ProcessorModule_{id}/EnvironmentMetrics - Get processor module metrics (GB200/GB300)
- PATCH /redfish/v1/Chassis/HGX_ProcessorModule_{id}/EnvironmentMetrics - Set module power limit (GB200/GB300)
Workload Power Profiles (DGX B200, GB200, B300 Series)
- GET /redfish/v1/Systems/HGX_Baseboard_0/Processors/{gpu_id}/Oem/Nvidia/WorkloadPowerProfile - Get WPPS configuration
- POST /redfish/v1/Systems/HGX_Baseboard_0/Processors/{gpu_id}/Oem/Nvidia/WorkloadPowerProfile/Actions/NvidiaWorkloadPower.EnableProfiles - Enable profiles
- POST /redfish/v1/Systems/HGX_Baseboard_0/Processors/{gpu_id}/Oem/Nvidia/WorkloadPowerProfile/Actions/NvidiaWorkloadPower.DisableProfiles - Disable profiles
References
NVIDIA Platform Documentation
- NVIDIA DGX H100/H200 Redfish API Guide
- NVIDIA DGX B200 Redfish API Guide
DMTF Redfish Specification
- DMTF Redfish Specification
- DMTF Redfish Specification DSP0266 v1.23.0 (PDF) - Session Management (Section 9.2.4)
- Redfish Schema Index