NVIDIA Network Operator v26.1.0

[TECH PREVIEW] NVIDIA Spectrum-X NIC Configuration

NVIDIA NIC Configuration Operator offers NVIDIA Spectrum-X-specific NIC configuration for different versions of the Reference Architecture (RA1.3, RA2.0, and RA2.1). RA2.1 introduces multiplane mode support for enhanced network performance with multiple data planes.

Note

Currently, only ConnectX-8 (device ID 1023) and BlueField-3 SuperNIC (device ID a2dc) devices are supported for Spectrum-X configuration. Hardware Packet Load Balancing (hwplb) multiplane mode is only supported on ConnectX-8.

To install the operator and for more information on the CRDs, see NIC Firmware Configuration and Configuration Details.

Note

For Spectrum-X RA2.1 and later, the DOCA SPC-X CC algorithm package is included in the operator image and does not need to be deployed separately. For RA2.0 and earlier, the package must be deployed manually using the example below.

To enable the DOCA SPC-X CC algorithm on NIC devices, the DOCA SPC-X CC .deb package for Ubuntu 22.04 is required. This configuration step will be removed once the DOCA SPC-X CC algorithm is publicly available. To obtain the package, contact your NVIDIA CPM. Make the package available at a URL reachable from the cluster, then provide that URL in the docaSpcXCCUrlSource field of the NicFirmwareSource CR.

apiVersion: configuration.net.nvidia.com/v1alpha1
kind: NicFirmwareSource
metadata:
  name: spectrum-x-configuration
  namespace: nvidia-network-operator
spec:
  # should point to the URL of the DOCA SPC-X CC .deb package for Ubuntu 22.04
  docaSpcXCCUrlSource: "https://example.com/doca-spcx-cc_3.1.0105-1_amd64.deb"

Firmware Upgrade

If the firmware on the devices needs to be updated, extend the NicFirmwareSource CR with fields for ConnectX and BlueField firmware. Make sure to use the correct firmware for your devices.

apiVersion: configuration.net.nvidia.com/v1alpha1
kind: NicFirmwareSource
metadata:
  name: spectrum-x-configuration
  namespace: nvidia-network-operator
spec:
  # should point to the URL of the DOCA SPC-X CC .deb package for Ubuntu 22.04
  docaSpcXCCUrlSource: "https://example.com/doca-spcx-cc_3.1.0105-1_amd64.deb"
  # a list of firmware binaries zip archives from the Mellanox website, can point to any URL accessible from the cluster
  binUrlSources:
    - https://www.mellanox.com/downloads/firmware/fw-ConnectX8-rel-40_46_3048-900-9X85E-00NX-MC0_Ax-UEFI-14.39.14-FlexBoot-3.8.100.signed.bin.zip
  # a URL to the BlueField Bundle (BFB) file, can point to any URL accessible from the cluster
  bfbUrlSource:
    - https://example.com/bf-fwbundle-3.1.0-77_25.07-prod.bfb

Configure and apply the NicFirmwareTemplate CR:

apiVersion: configuration.net.nvidia.com/v1alpha1
kind: NicFirmwareTemplate
metadata:
  name: spectrum-x-configuration
  namespace: nvidia-network-operator
spec:
  nicSelector:
    nicType: "a2dc" # BlueField-3 SuperNIC, Can also be "1023" for ConnectX-8
  template:
    nicFirmwareSourceRef: spectrum-x-configuration
    updatePolicy: Update

Configure and apply the NicConfigurationTemplate CR:

apiVersion: configuration.net.nvidia.com/v1alpha1
kind: NicConfigurationTemplate
metadata:
  name: spectrum-x-configuration
  namespace: nvidia-network-operator
spec:
  nodeSelector:
    feature.node.kubernetes.io/network-sriov.capable: "true"
  nicSelector:
    nicType: a2dc # BlueField-3 SuperNIC, Can also be "1023" for ConnectX-8
  template:
    numVfs: 1
    linkType: Ethernet
    spectrumXOptimized:
      enabled: true
      version: "RA2.0" # For Reference Architecture v1.3, use "RA1.3" value for this field.
      overlay: "none" # For L3 overlay, use "l3" value for this field.

RA2.1 configuration with multiplane support

Reference Architecture 2.1 introduces multiplane mode support, allowing NICs to be configured with multiple data planes for enhanced network performance.

Note

It is recommended to perform a NIC configuration reset before applying or switching between multiplane configurations to ensure a clean and consistent initial state. See Reset NIC Configuration to Default for details.

To enable multiplane support, set spectrumXOptimized.version to RA2.1 and configure the multiplaneMode and numberOfPlanes fields.

apiVersion: configuration.net.nvidia.com/v1alpha1
kind: NicConfigurationTemplate
metadata:
  name: spectrum-x-multiplane-configuration
  namespace: nvidia-network-operator
spec:
  nodeSelector:
    feature.node.kubernetes.io/network-sriov.capable: "true"
  nicSelector:
    nicType: "1023" # ConnectX-8. Use "a2dc" for BlueField-3 SuperNIC (hwplb not supported on BF3)
  template:
    numVfs: 1
    linkType: Ethernet
    spectrumXOptimized:
      enabled: true
      version: "RA2.1"
      overlay: "none"
      multiplaneMode: "hwplb" # Hardware Packet Load Balancing, ConnectX-8 only
      numberOfPlanes: 4

Multiplane modes

The following multiplane modes are available with RA2.1:

| Mode | Description | Supported NICs | Planes |
|------|-------------|----------------|--------|
| none | Single plane mode (no multiplane). This is the default. | ConnectX-8, BF3 SuperNIC | 1 |
| swplb | Software Packet Load Balancing. The NIC port is split into multiple PFs, each assigned to a separate data plane. | ConnectX-8, BF3 SuperNIC | 2, 4 |
| hwplb | Hardware Packet Load Balancing. Uses hardware LAG resource allocation and NIC-level plane configuration for load balancing across planes. | ConnectX-8 only | 2, 4 |
| uniplane | Uniplane mode. Each port is configured as a separate plane without plane-level load balancing. | ConnectX-8, BF3 SuperNIC | 2 |

Note

Multiplane modes (swplb, hwplb, uniplane) are only supported with RA2.1. For RA1.3 and RA2.0, multiplaneMode must be none and numberOfPlanes must be 1.

NIC type constraints

| NIC Type | Device ID | Supported Multiplane Modes |
|----------|-----------|----------------------------|
| ConnectX-8 | 1023 | none, swplb, hwplb, uniplane |
| BlueField-3 SuperNIC | a2dc | none, swplb, uniplane |

Warning

The hwplb multiplane mode is only supported on ConnectX-8 (device ID 1023). Attempting to configure hwplb on a BlueField-3 SuperNIC will be rejected by the API validation.
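
The two tables above can be combined into a small lookup, which may be handy for checking a planned configuration before writing the CR. The following is a minimal Python sketch, not part of the operator: the names SUPPORTED_MODES, ALLOWED_PLANES, and allowed_plane_counts are hypothetical, and the data is transcribed from the tables in this section.

```python
# Hypothetical helper; the data below is transcribed from the tables above.
SUPPORTED_MODES = {
    "1023": {"none", "swplb", "hwplb", "uniplane"},  # ConnectX-8
    "a2dc": {"none", "swplb", "uniplane"},           # BlueField-3 SuperNIC
}

ALLOWED_PLANES = {
    "none": {1},
    "swplb": {2, 4},
    "hwplb": {2, 4},
    "uniplane": {2},
}


def allowed_plane_counts(device_id: str, mode: str) -> set[int]:
    """Return the plane counts listed for this NIC and mode.

    An empty set means the mode is not listed as supported on that NIC.
    """
    if mode not in SUPPORTED_MODES.get(device_id, set()):
        return set()
    return set(ALLOWED_PLANES[mode])
```

For example, allowed_plane_counts("a2dc", "hwplb") yields an empty set, which mirrors the warning above.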

Configure custom interface names

The NicInterfaceNameTemplate CRD allows you to define custom naming patterns for RDMA and network device interfaces on Spectrum-X NICs. This is useful in multiplane and multi-rail deployments where predictable interface naming is required.

The operator deploys udev rules to the host to rename network and RDMA interfaces according to the specified naming template.

The template uses the following placeholders for device name construction:

  • %nic_id%: The index of the NIC in the flattened list of NICs

  • %plane_id%: The index of the plane of the specific NIC

  • %rail_id%: The index of the rail to which the NIC belongs

apiVersion: configuration.net.nvidia.com/v1alpha1
kind: NicInterfaceNameTemplate
metadata:
  name: spectrum-x-interface-names
  namespace: nvidia-network-operator
spec:
  # Number of PFs per NIC, used to calculate the number of planes per NIC
  pfsPerNic: 2
  # Template for RDMA device names. Placeholders: %nic_id%, %plane_id%, %rail_id%
  rdmaDevicePrefix: "rdma_%nic_id%_%plane_id%_%rail_id%"
  # Template for net device names. Placeholders: %nic_id%, %plane_id%, %rail_id%
  netDevicePrefix: "net_%nic_id%_%plane_id%_%rail_id%"
  # PCI address to rail mapping. First dimension is rail index, second is NIC PCI addresses in the rail
  railPciAddresses:
    - ["0000:1a:00.0", "0000:2a:00.0"]
    - ["0000:3a:00.0", "0000:4a:00.0"]

The railPciAddresses field defines the PCI address to rail mapping. The first dimension is the rail index and the second dimension is the list of PCI addresses of the NICs in that rail.
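
To make the two-dimensional layout of railPciAddresses concrete, the following Python sketch inverts it into a PCI-address-to-rail lookup. It is an illustration only, not operator code; the function name rail_index_by_pci is hypothetical, and the sample data is copied from the example CR.

```python
def rail_index_by_pci(rail_pci_addresses: list[list[str]]) -> dict[str, int]:
    """Map each NIC PCI address to the index of the rail that lists it."""
    lookup: dict[str, int] = {}
    for rail_index, addresses in enumerate(rail_pci_addresses):
        for pci in addresses:
            lookup[pci] = rail_index
    return lookup


# Same values as railPciAddresses in the example CR.
RAIL_PCI_ADDRESSES = [
    ["0000:1a:00.0", "0000:2a:00.0"],  # rail 0
    ["0000:3a:00.0", "0000:4a:00.0"],  # rail 1
]

RAIL_BY_PCI = rail_index_by_pci(RAIL_PCI_ADDRESSES)
```

With this data, both 0000:1a:00.0 and 0000:2a:00.0 map to rail 0, and the two remaining addresses map to rail 1.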

Generated udev rules

The operator generates udev rules based on the template and writes them to the host. The rules are written to two separate files.

Example generated udev rules for net devices (/etc/udev/rules.d/10-nic-net-interface-naming.rules):

# Auto-generated by nic-configuration-operator
# Do not edit manually
SUBSYSTEM=="net", ACTION=="add", KERNELS=="0000:1a:00.0", NAME="net_0_0_0"
SUBSYSTEM=="net", ACTION=="add", KERNELS=="0000:1a:00.1", NAME="net_0_1_0"
SUBSYSTEM=="net", ACTION=="add", KERNELS=="0000:3a:00.0", NAME="net_1_0_1"
SUBSYSTEM=="net", ACTION=="add", KERNELS=="0000:3a:00.1", NAME="net_1_1_1"

Example generated udev rules for RDMA devices (/etc/udev/rules.d/10-nic-rdma-interface-naming.rules):

# Auto-generated by nic-configuration-operator
# Do not edit manually
ACTION=="add", KERNELS=="0000:1a:00.0", SUBSYSTEM=="infiniband", RUN+="/usr/bin/rdma dev set %k name rdma_0_0_0"
ACTION=="add", KERNELS=="0000:1a:00.1", SUBSYSTEM=="infiniband", RUN+="/usr/bin/rdma dev set %k name rdma_0_1_0"
ACTION=="add", KERNELS=="0000:3a:00.0", SUBSYSTEM=="infiniband", RUN+="/usr/bin/rdma dev set %k name rdma_1_0_1"
ACTION=="add", KERNELS=="0000:3a:00.1", SUBSYSTEM=="infiniband", RUN+="/usr/bin/rdma dev set %k name rdma_1_1_1"
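
The relationship between the name templates and the generated rules can be sketched in Python as follows. This is an approximation for illustration, not the operator's implementation: the functions expand, net_rule, and rdma_rule are hypothetical, the rule text is modeled on the examples above, and how the operator derives the nic_id, plane_id, and rail_id indices for a given PF is deliberately left out.

```python
def expand(template: str, nic_id: int, plane_id: int, rail_id: int) -> str:
    """Substitute the three documented placeholders in a name template."""
    return (
        template.replace("%nic_id%", str(nic_id))
        .replace("%plane_id%", str(plane_id))
        .replace("%rail_id%", str(rail_id))
    )


def net_rule(pf_pci: str, name: str) -> str:
    """Render a net-device rule line shaped like the example above."""
    return f'SUBSYSTEM=="net", ACTION=="add", KERNELS=="{pf_pci}", NAME="{name}"'


def rdma_rule(pf_pci: str, name: str) -> str:
    """Render an RDMA-device rule line shaped like the example above."""
    return (
        f'ACTION=="add", KERNELS=="{pf_pci}", SUBSYSTEM=="infiniband", '
        f'RUN+="/usr/bin/rdma dev set %k name {name}"'
    )
```

Calling net_rule("0000:3a:00.1", expand("net_%nic_id%_%plane_id%_%rail_id%", 1, 1, 1)) reproduces the last line of the net device example.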

The following validation rules are enforced by the API:

  • Spectrum-X optimizations can only be enabled when linkType is Ethernet and numVfs is 1.

  • Spectrum-X optimizations can only be enabled for ConnectX-8 (nicType: 1023) or BlueField-3 SuperNIC (nicType: a2dc).

  • When Spectrum-X optimizations are enabled, roceOptimized must not be enabled (RoCE settings are included in the Spectrum-X configuration).

  • When Spectrum-X optimizations are enabled, rawNvConfig must be empty.

  • When multiplaneMode is none, numberOfPlanes must be 1.

  • When multiplaneMode is not none, numberOfPlanes must not be 1.

  • When version is RA1.3 or RA2.0, multiplaneMode must be none and numberOfPlanes must be 1.

  • The hwplb multiplane mode can only be enabled for ConnectX-8 (nicType: 1023).
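
The rules above can be approximated client-side before submitting a CR. The Python sketch below is an assumption-laden illustration rather than the operator's validation code: validate_template is a hypothetical name, the dictionary mirrors spec.template of NicConfigurationTemplate, multiplaneMode and numberOfPlanes are assumed to sit under spectrumXOptimized with defaults of none and 1, and roceOptimized is assumed to carry an enabled flag.

```python
SPECTRUM_X_NIC_TYPES = {"1023", "a2dc"}  # ConnectX-8, BlueField-3 SuperNIC


def validate_template(nic_type: str, template: dict) -> list[str]:
    """Return one message per violated rule; an empty list means no violation."""
    sx = template.get("spectrumXOptimized") or {}
    if not sx.get("enabled"):
        return []  # the rules only apply when Spectrum-X optimizations are on

    errors = []
    if template.get("linkType") != "Ethernet" or template.get("numVfs") != 1:
        errors.append("linkType must be Ethernet and numVfs must be 1")
    if nic_type not in SPECTRUM_X_NIC_TYPES:
        errors.append("nicType must be 1023 or a2dc")
    if (template.get("roceOptimized") or {}).get("enabled"):
        errors.append("roceOptimized must not be enabled")
    if template.get("rawNvConfig"):
        errors.append("rawNvConfig must be empty")

    # Assumed defaults: single plane when the fields are omitted.
    mode = sx.get("multiplaneMode", "none")
    planes = sx.get("numberOfPlanes", 1)
    if mode == "none" and planes != 1:
        errors.append("numberOfPlanes must be 1 when multiplaneMode is none")
    if mode != "none" and planes == 1:
        errors.append("numberOfPlanes must not be 1 when multiplaneMode is set")
    if sx.get("version") in {"RA1.3", "RA2.0"} and (mode != "none" or planes != 1):
        errors.append("multiplane requires version RA2.1")
    if mode == "hwplb" and nic_type != "1023":
        errors.append("hwplb is only supported on ConnectX-8 (nicType 1023)")
    return errors
```

The API server remains the authority; a check like this only gives earlier feedback.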

The Spectrum-X configuration parameters depend on the Reference Architecture version. The operator applies the following NVConfig and runtime parameters based on the selected version.

When spectrumXOptimized.enabled == true and spectrumXOptimized.version == "RA2.1", the following configuration parameters are applied:

swplb:
  2:
    - name: Number of PFs
      value: 2
      dmsPath: /nvidia/physical-functions/config/num-of-pf
      valueType: int
    - name: Number of Planes
      value: 0
      mlxconfig: "NUM_OF_PLANES_P1"
    - name: LAG Resource Allocation
      value: 0
      mlxconfig: "LAG_RESOURCE_ALLOCATION"
    # CX8-specific parameters
    - name: Lanes for Module 0 and Port 1
      value: "0..3"
      alternativeValue: "[0,1,2,3]"
      valueType: string
      dmsPath: /nvidia/device/config/module[module-id=0]/port[port-id=1]/lanes
      deviceId: "1023"
    - name: Lanes for Module 0 and Port 2
      value: "4..7"
      alternativeValue: "[4,5,6,7]"
      valueType: string
      dmsPath: /nvidia/device/config/module[module-id=0]/port[port-id=2]/lanes
      deviceId: "1023"
    - name: Lanes for Module 0 and Port 255
      value: "8..15"
      alternativeValue: "[8,9,10,11,12,13,14,15]"
      valueType: string
      dmsPath: /nvidia/device/config/module[module-id=0]/port[port-id=255]/lanes
      deviceId: "1023"
    # BF3-specific parameters
    - name: Lanes for Module 0 and Port 1
      value: "0..1"
      alternativeValue: "[0,1]"
      valueType: string
      dmsPath: /nvidia/device/config/module[module-id=0]/port[port-id=1]/lanes
      deviceId: "a2dc"
    - name: Lanes for Module 0 and Port 2
      value: "2..3"
      alternativeValue: "[2,3]"
      valueType: string
      dmsPath: /nvidia/device/config/module[module-id=0]/port[port-id=2]/lanes
      deviceId: "a2dc"
    - name: Lanes for Module 0 and Port 255
      value: "4..15"
      alternativeValue: "[4,5,6,7,8,9,10,11,12,13,14,15]"
      valueType: string
      dmsPath: /nvidia/device/config/module[module-id=0]/port[port-id=255]/lanes
      deviceId: "a2dc"
  4:
    - name: Number of PFs
      value: 4
      dmsPath: /nvidia/physical-functions/config/num-of-pf
      valueType: int
    - name: Number of Planes
      value: 0
      mlxconfig: "NUM_OF_PLANES_P1"
    - name: LAG Resource Allocation
      value: 0
      mlxconfig: "LAG_RESOURCE_ALLOCATION"
    # CX8-specific parameters
    - name: Lanes for Module 0 and Port 1
      value: "0..1"
      alternativeValue: "[0,1]"
      valueType: string
      dmsPath: /nvidia/device/config/module[module-id=0]/port[port-id=1]/lanes
      deviceId: "1023"
    - name: Lanes for Module 0 and Port 2
      value: "2..3"
      alternativeValue: "[2,3]"
      valueType: string
      dmsPath: /nvidia/device/config/module[module-id=0]/port[port-id=2]/lanes
      deviceId: "1023"
    - name: Lanes for Module 0 and Port 3
      value: "4..5"
      alternativeValue: "[4,5]"
      valueType: string
      dmsPath: /nvidia/device/config/module[module-id=0]/port[port-id=2]/lanes
      deviceId: "1023"
    - name: Lanes for Module 0 and Port 4
      value: "6..7"
      alternativeValue: "[6,7]"
      valueType: string
      dmsPath: /nvidia/device/config/module[module-id=0]/port[port-id=2]/lanes
      deviceId: "1023"
    - name: Lanes for Module 0 and Port 255
      value: "8..15"
      alternativeValue: "[8,9,10,11,12,13,14,15]"
      valueType: string
      dmsPath: /nvidia/device/config/module[module-id=0]/port[port-id=255]/lanes
      deviceId: "1023"
    # BF3-specific parameters
    - name: Lanes for Module 0 and Port 1
      value: "0..0"
      alternativeValue: "[0]"
      valueType: string
      dmsPath: /nvidia/device/config/module[module-id=0]/port[port-id=1]/lanes
      deviceId: "a2dc"
    - name: Lanes for Module 0 and Port 2
      value: "1..1"
      alternativeValue: "[1]"
      valueType: string
      dmsPath: /nvidia/device/config/module[module-id=0]/port[port-id=2]/lanes
      deviceId: "a2dc"
    - name: Lanes for Module 0 and Port 3
      value: "2..2"
      alternativeValue: "[2]"
      valueType: string
      dmsPath: /nvidia/device/config/module[module-id=0]/port[port-id=3]/lanes
      deviceId: "a2dc"
    - name: Lanes for Module 0 and Port 4
      value: "3..3"
      alternativeValue: "[3]"
      valueType: string
      dmsPath: /nvidia/device/config/module[module-id=0]/port[port-id=4]/lanes
      deviceId: "a2dc"
    - name: Lanes for Module 0 and Port 255
      value: "4..15"
      alternativeValue: "[4,5,6,7,8,9,10,11,12,13,14,15]"
      valueType: string
      dmsPath: /nvidia/device/config/module[module-id=0]/port[port-id=255]/lanes
      deviceId: "a2dc"
hwplb:
  2:
    - name: Number of PFs
      value: 2
      dmsPath: /nvidia/physical-functions/config/num-of-pf
      valueType: int
    - name: Number of Planes
      value: 2
      mlxconfig: "NUM_OF_PLANES_P1"
    - name: LAG Resource Allocation
      value: 1
      mlxconfig: "LAG_RESOURCE_ALLOCATION"
    # CX8-specific parameters
    - name: Lanes for Module 0 and Port 1
      value: "0..7"
      alternativeValue: "[0,1,2,3,4,5,6,7]"
      valueType: string
      dmsPath: /nvidia/device/config/module[module-id=0]/port[port-id=1]/lanes
      deviceId: "1023"
    - name: Lanes for Module 0 and Port 255
      value: "8..15"
      alternativeValue: "[8,9,10,11,12,13,14,15]"
      valueType: string
      dmsPath: /nvidia/device/config/module[module-id=0]/port[port-id=255]/lanes
      deviceId: "1023"
  4:
    - name: Number of PFs
      value: 4
      dmsPath: /nvidia/physical-functions/config/num-of-pf
      valueType: int
    - name: Number of Planes
      value: 4
      mlxconfig: "NUM_OF_PLANES_P1"
    - name: LAG Resource Allocation
      value: 1
      mlxconfig: "LAG_RESOURCE_ALLOCATION"
    # CX8-specific parameters
    - name: Lanes for Module 0 and Port 1
      value: "0..7"
      alternativeValue: "[0,1,2,3,4,5,6,7]"
      valueType: string
      dmsPath: /nvidia/device/config/module[module-id=0]/port[port-id=1]/lanes
      deviceId: "1023"
    - name: Lanes for Module 0 and Port 255
      value: "8..15"
      alternativeValue: "[8,9,10,11,12,13,14,15]"
      valueType: string
      dmsPath: /nvidia/device/config/module[module-id=0]/port[port-id=255]/lanes
      deviceId: "1023"
uniplane:
  2:
    - name: Number of PFs
      value: 2
      dmsPath: /nvidia/physical-functions/config/num-of-pf
      valueType: int
    - name: Number of Planes P1
      value: 0
      mlxconfig: "NUM_OF_PLANES_P1"
    - name: Number of Planes P2
      value: 0
      mlxconfig: "NUM_OF_PLANES_P2"
    - name: LAG Resource Allocation
      value: 0
      mlxconfig: "LAG_RESOURCE_ALLOCATION"
    # CX8-specific parameters
    - name: Lanes for Module 0 and Port 1
      value: "0..3"
      alternativeValue: "[0,1,2,3]"
      valueType: string
      dmsPath: /nvidia/device/config/module[module-id=0]/port[port-id=1]/lanes
      deviceId: "1023"
    - name: Lanes for Module 0 and Port 2
      value: "4..7"
      alternativeValue: "[4,5,6,7]"
      valueType: string
      dmsPath: /nvidia/device/config/module[module-id=0]/port[port-id=2]/lanes
      deviceId: "1023"
    - name: Lanes for Module 0 and Port 255
      value: "8..15"
      alternativeValue: "[8,9,10,11,12,13,14,15]"
      valueType: string
      dmsPath: /nvidia/device/config/module[module-id=0]/port[port-id=255]/lanes
      deviceId: "1023"
    # BF3-specific parameters
    - name: Lanes for Module 0 and Port 1
      value: "0..1"
      alternativeValue: "[0,1]"
      valueType: string
      dmsPath: /nvidia/device/config/module[module-id=0]/port[port-id=1]/lanes
      deviceId: "a2dc"
    - name: Lanes for Module 0 and Port 2
      value: "2..3"
      alternativeValue: "[2,3]"
      valueType: string
      dmsPath: /nvidia/device/config/module[module-id=0]/port[port-id=2]/lanes
      deviceId: "a2dc"
    - name: Lanes for Module 0 and Port 255
      value: "4..15"
      alternativeValue: "[4,5,6,7,8,9,10,11,12,13,14,15]"
      valueType: string
      dmsPath: /nvidia/device/config/module[module-id=0]/port[port-id=255]/lanes
      deviceId: "a2dc"
nvConfig:
  - name: Ethernet mode
    value: 2
    mlxconfig: "LINK_TYPE_P1"
  - name: Enable SR-IOV
    value: 1
    mlxconfig: "SRIOV_EN"
  - name: Set NUM_OF_VFS
    value: 1
    mlxconfig: "NUM_OF_VFS"
  - name: NIC mode
    value: NIC
    dmsPath: /nvidia/mode/config/mode
    valueType: string
    deviceId: "a2dc"
  - name: RoCE Adaptive Routing
    value: true
    dmsPath: /nvidia/roce/config/adaptive-routing
    valueType: bool
  - name: Programmable Congestion Control
    value: true
    dmsPath: /nvidia/cc/config/user-programmable
    valueType: bool
  - name: RoCE TX Scheduling Locality Mode
    value: TX_SCHED_LOCALITY_ACCUMULATIVE
    dmsPath: /nvidia/roce/config/tx-sched-locality-mode
    valueType: string
  - name: RoCE CC Steering Ext
    value: ENABLED
    dmsPath: /nvidia/roce/config/cc-steering-ext
    valueType: string
  # swplb/uniplane/none-specific settings (via DMS)
  - name: CNP DSCP
    value: 0
    dmsPath: /interfaces/interface/nvidia/roce/config/rtt-resp-dscp
    valueType: int
    multiplane: swplb
  - name: CNP DSCP
    value: 0
    dmsPath: /interfaces/interface/nvidia/roce/config/rtt-resp-dscp
    valueType: int
    multiplane: uniplane
  - name: CNP DSCP
    value: 0
    dmsPath: /interfaces/interface/nvidia/roce/config/rtt-resp-dscp
    valueType: int
    multiplane: none
  - name: CNP DSCP mode
    value: RTT_RESP_DSCP_DEFAULT
    dmsPath: /interfaces/interface/nvidia/roce/config/rtt-resp-dscp-mode
    valueType: string
    multiplane: swplb
  - name: CNP DSCP mode
    value: RTT_RESP_DSCP_DEFAULT
    dmsPath: /interfaces/interface/nvidia/roce/config/rtt-resp-dscp-mode
    valueType: string
    multiplane: uniplane
  - name: CNP DSCP mode
    value: RTT_RESP_DSCP_DEFAULT
    dmsPath: /interfaces/interface/nvidia/roce/config/rtt-resp-dscp-mode
    valueType: string
    multiplane: none
  - name: RoCE Multipath DSCP
    value: MULTIPATH_DSCP_DEFAULT
    dmsPath: /nvidia/roce/config/multipath-dscp
    valueType: string
    alternativeValue: "unknown"
    ignoreError: true
    multiplane: swplb
  - name: RoCE Multipath DSCP
    value: MULTIPATH_DSCP_DEFAULT
    dmsPath: /nvidia/roce/config/multipath-dscp
    valueType: string
    alternativeValue: "unknown"
    ignoreError: true
    multiplane: uniplane
  - name: RoCE Multipath DSCP
    value: MULTIPATH_DSCP_DEFAULT
    dmsPath: /nvidia/roce/config/multipath-dscp
    valueType: string
    alternativeValue: "unknown"
    ignoreError: true
    multiplane: none
  # hwplb-specific settings (via mlxconfig)
  - name: CNP DSCP
    value: 48
    mlxconfig: "ROCE_RTT_RESP_DSCP_P1"
    multiplane: hwplb
  - name: CNP DSCP mode
    value: 1
    mlxconfig: "ROCE_RTT_RESP_DSCP_MODE_P1"
    multiplane: hwplb
  - name: Flex Parser Profile
    value: 10
    mlxconfig: "FLEX_PARSER_PROFILE_ENABLE"
    multiplane: hwplb
  - name: Disable RDE
    value: 1
    mlxconfig: "RDE_DISABLE"
    multiplane: hwplb
  - name: VF LOG BAR size
    value: 5
    mlxconfig: "VF_LOG_BAR_SIZE"
    multiplane: hwplb
runtimeConfig:
  roce:
    - name: Trust
      value: dscp
      dmsPath: /interfaces/interface/nvidia/qos/config/trust-mode
      valueType: string
      alternativeValue: QOS_TRUST_MODE_DSCP
    - name: PFC
      value: "00010000"
      dmsPath: /interfaces/interface/nvidia/qos/config/pfc
      valueType: string
    # TODO: figure out if NIC operator or RDMA cni needs to set the tos
    # - name: Type of Service
    #   value: 96
    #   dmsPath: /interfaces/interface/nvidia/roce/config/tos
    #   valueType: int
  adaptiveRouting:
    - name: Enable CC per plane
      value: true
      dmsPath: /interfaces/interface/nvidia/roce/config/cc-per-plane
      valueType: bool
      multiplane: hwplb
    - name: Adaptive Retransmission
      value: true
      dmsPath: /interfaces/interface/nvidia/roce/config/adaptive-retransmission
      valueType: bool
    - name: Tx Window
      value: true
      dmsPath: /interfaces/interface/nvidia/roce/config/tx-window
      valueType: bool
    - name: Slow Restart
      value: false
      dmsPath: /interfaces/interface/nvidia/roce/config/slow-restart
      valueType: bool
    - name: Slow Restart Idle
      value: false
      dmsPath: /interfaces/interface/nvidia/roce/config/slow-restart-idle
      valueType: bool
    - name: CC Probe MP mode
      value: true
      dmsPath: /interfaces/interface/nvidia/roce/config/cc-probe-mp-mode
      valueType: bool
      multiplane: hwplb
    - name: Adaptive Routing Force
      value: true
      dmsPath: /interfaces/interface/nvidia/roce/config/adaptive-routing-force
      valueType: bool
  congestionControl:
    - name: Congestion Control on RP points
      value: true
      dmsPath: /interfaces/interface/nvidia/cc/config/priority/rp_enabled # priority[id=0..7]
      valueType: bool
      alternativeValue: "1"
    - name: Congestion Control on NP points
      value: true
      dmsPath: /interfaces/interface/nvidia/cc/config/priority/np_enabled # priority[id=0..7]
      valueType: bool
      alternativeValue: "1"
    - name: Congestion Control
      value: true
      dmsPath: /interfaces/interface/nvidia/cc/slot[id=0]/config/enabled
      valueType: bool
    - name: Congestion Control with Counters
      value: true
      dmsPath: /interfaces/interface/nvidia/cc/slot[id=0]/config/counter_enable
      valueType: bool
    - name: DCQCN
      value: false
      dmsPath: /interfaces/interface/nvidia/cc/slot[id=15]/config/enabled
      valueType: bool
    # Bandwidth configuration for different cases
    - name: Bandwidth
      value: 400
      dmsPath: /interfaces/interface/nvidia/cc/slot[id=0]/param[id=0]/config/value
      valueType: int
      deviceId: "1023"
      breakout: 2
    - name: Bandwidth
      value: 200
      dmsPath: /interfaces/interface/nvidia/cc/slot[id=0]/param[id=0]/config/value
      valueType: int
      deviceId: "1023"
      breakout: 4
    - name: Bandwidth
      value: 200
      dmsPath: /interfaces/interface/nvidia/cc/slot[id=0]/param[id=0]/config/value
      valueType: int
      deviceId: "a2dc"
      breakout: 2
    - name: Bandwidth
      value: 100
      dmsPath: /interfaces/interface/nvidia/cc/slot[id=0]/param[id=0]/config/value
      valueType: int
      deviceId: "a2dc"
      breakout: 4
    - name: Responsiveness Alpha Factor
      value: 6553
      dmsPath: /interfaces/interface/nvidia/cc/slot[id=0]/param[id=1]/config/value
      valueType: int
    - name: Maximum Decrease Factor
      value: 63570
      dmsPath: /interfaces/interface/nvidia/cc/slot[id=0]/param[id=2]/config/value
      valueType: int
    - name: Maximum Increase Factor
      value: 69468
      dmsPath: /interfaces/interface/nvidia/cc/slot[id=0]/param[id=3]/config/value
      valueType: int
    - name: Additive Increase Step Size
      value: 96
      dmsPath: /interfaces/interface/nvidia/cc/slot[id=0]/param[id=4]/config/value
      valueType: int
    - name: High Additive Increase Step Size
      value: 1700
      dmsPath: /interfaces/interface/nvidia/cc/slot[id=0]/param[id=5]/config/value
      valueType: int
    - name: High Additive Increase Interval Period
      value: 2000000
      dmsPath: /interfaces/interface/nvidia/cc/slot[id=0]/param[id=6]/config/value
      valueType: int
    - name: ZTR_CC_CONGESTION_DELAY_THRESHOLD
      value: 13000
      dmsPath: /interfaces/interface/nvidia/cc/slot[id=0]/param[id=7]/config/value
      valueType: int
    - name: Maximum Queuing Delay
      value: 250000
      dmsPath: /interfaces/interface/nvidia/cc/slot[id=0]/param[id=8]/config/value
      valueType: int
    - name: Rate on First Congestion
      value: 524288
      dmsPath: /interfaces/interface/nvidia/cc/slot[id=0]/param[id=9]/config/value
      valueType: int
    - name: Delay Only
      value: 0
      dmsPath: /interfaces/interface/nvidia/cc/slot[id=0]/param[id=10]/config/value
      valueType: int
    - name: CNP Validity
      value: 1
      dmsPath: /interfaces/interface/nvidia/cc/slot[id=0]/param[id=11]/config/value
      valueType: int
    - name: Transmit Rate Decrement Step
      value: 1
      dmsPath: /interfaces/interface/nvidia/cc/slot[id=0]/param[id=12]/config/value
      valueType: int
    - name: Fixed Transmission Rate
      value: 0
      dmsPath: /interfaces/interface/nvidia/cc/slot[id=0]/param[id=13]/config/value
      valueType: int
    - name: Fast Scheduling Factor
      value: 2097152
      dmsPath: /interfaces/interface/nvidia/cc/slot[id=0]/param[id=14]/config/value
      valueType: int
    - name: Topology Awareness
      value: 1
      dmsPath: /interfaces/interface/nvidia/cc/slot[id=0]/param[id=15]/config/value
      valueType: int
    - name: Advanced Features
      value: 1
      dmsPath: /interfaces/interface/nvidia/cc/slot[id=0]/param[id=16]/config/value
      valueType: int
    - name: Troubleshooting Capabilities
      value: 0
      dmsPath: /interfaces/interface/nvidia/cc/slot[id=0]/param[id=17]/config/value
      valueType: int
    - name: CC_FIXED_CWND
      value: 0
      dmsPath: /interfaces/interface/nvidia/cc/slot[id=0]/param[id=18]/config/value
      valueType: int
    - name: Enable CC Plane Failure Detection
      value: 1
      dmsPath: /interfaces/interface/nvidia/cc/slot[id=0]/param[id=19]/config/value
      valueType: int
    - name: CC Plane Failure Threshold
      value: 3
      dmsPath: /interfaces/interface/nvidia/cc/slot[id=0]/param[id=20]/config/value
      valueType: int
    - name: CC Plane Recovery Threshold
      value: 1
      dmsPath: /interfaces/interface/nvidia/cc/slot[id=0]/param[id=21]/config/value
      valueType: int
  interPacketGap:
    pureL3:
      - name: Inter Packet Gap for no overlay
        value: 25
        dmsPath: /interfaces/interface/ethernet/nvidia/config/inter-packet-gap
        valueType: int
    l3EVPN:
      - name: Inter Packet Gap for L3 EVPN overlay
        value: 33
        dmsPath: /interfaces/interface/ethernet/nvidia/config/inter-packet-gap
        valueType: int
docaCCVersion: 3.3.0
useSoftwareCCAlgorithm: true
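
Several entries in the RA2.1 listing carry deviceId, multiplane, or breakout keys. One plausible reading, which is an assumption and not confirmed by this page, is that such keys restrict an entry to matching devices and modes, while entries without them apply everywhere. The Python sketch below illustrates that reading using the four Bandwidth entries; applies and select are hypothetical helper names.

```python
def applies(entry: dict, device_id: str, multiplane: str, breakout: int) -> bool:
    """Assumed semantics: a condition key, when present, must match exactly."""
    conditions = (
        ("deviceId", device_id),
        ("multiplane", multiplane),
        ("breakout", breakout),
    )
    return all(entry[key] == actual for key, actual in conditions if key in entry)


def select(entries: list[dict], device_id: str, multiplane: str, breakout: int) -> list[dict]:
    """Keep only the entries whose condition keys match the given NIC setup."""
    return [e for e in entries if applies(e, device_id, multiplane, breakout)]


# The four Bandwidth entries from the RA2.1 listing (dmsPath omitted for brevity).
BANDWIDTH_ENTRIES = [
    {"name": "Bandwidth", "value": 400, "deviceId": "1023", "breakout": 2},
    {"name": "Bandwidth", "value": 200, "deviceId": "1023", "breakout": 4},
    {"name": "Bandwidth", "value": 200, "deviceId": "a2dc", "breakout": 2},
    {"name": "Bandwidth", "value": 100, "deviceId": "a2dc", "breakout": 4},
]
```

Under this reading, a BlueField-3 SuperNIC with a breakout of 4 would select the entry with value 100, and a ConnectX-8 with a breakout of 2 the entry with value 400.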

When spectrumXOptimized.enabled == true and spectrumXOptimized.version == "RA2.0", the following configuration parameters are applied:

nvConfig:
  - name: Ethernet mode
    value: 2
    mlxconfig: "LINK_TYPE_P1"
  - name: Enable SR-IOV
    value: 1
    mlxconfig: "SRIOV_EN"
  - name: Set NUM_OF_VFS
    value: 1
    mlxconfig: "NUM_OF_VFS"
  - name: NIC mode
    value: NIC
    dmsPath: /nvidia/mode/config/mode
    valueType: string
    deviceId: "a2dc"
  - name: RoCE Adaptive Routing
    value: true
    dmsPath: /nvidia/roce/config/adaptive-routing
    valueType: bool
  - name: Programmable Congestion Control
    value: true
    dmsPath: /nvidia/cc/config/user-programmable
    valueType: bool
  - name: RoCE TX Scheduling Locality Mode
    value: TX_SCHED_LOCALITY_ACCUMULATIVE
    dmsPath: /nvidia/roce/config/tx-sched-locality-mode
    valueType: string
  - name: RoCE Multipath DSCP
    value: MULTIPATH_DSCP_DEFAULT
    dmsPath: /nvidia/roce/config/multipath-dscp
    valueType: string
  - name: CNP DSCP
    value: 0
    dmsPath: /interfaces/interface/nvidia/roce/config/rtt-resp-dscp
    valueType: int
  - name: CNP DSCP mode
    value: RTT_RESP_DSCP_DEFAULT
    dmsPath: /interfaces/interface/nvidia/roce/config/rtt-resp-dscp-mode
    valueType: string
  - name: RoCE CC Steering Ext
    value: ENABLED
    dmsPath: /nvidia/roce/config/cc-steering-ext
    valueType: string
runtimeConfig:
  roce:
    - name: Trust
      value: dscp
      dmsPath: /interfaces/interface/nvidia/qos/config/trust-mode
      valueType: string
      alternativeValue: QOS_TRUST_MODE_DSCP
    - name: PFC
      value: "00010000"
      dmsPath: /interfaces/interface/nvidia/qos/config/pfc
      valueType: string
    # TODO: figure out if NIC operator or RDMA cni needs to set the tos
    # - name: Type of Service
    #   value: 96
    #   dmsPath: /interfaces/interface/nvidia/roce/config/tos
    #   valueType: int
  adaptiveRouting:
    - name: Adaptive Retransmission
      value: true
      dmsPath: /interfaces/interface/nvidia/roce/config/adaptive-retransmission
      valueType: bool
    - name: Tx Window
      value: true
      dmsPath: /interfaces/interface/nvidia/roce/config/tx-window
      valueType: bool
    - name: Slow Restart
      value: false
      dmsPath: /interfaces/interface/nvidia/roce/config/slow-restart
      valueType: bool
    - name: Slow Restart Idle
      value: false
      dmsPath: /interfaces/interface/nvidia/roce/config/slow-restart-idle
      valueType: bool
    - name: Adaptive Routing Force
      value: true
      dmsPath: /interfaces/interface/nvidia/roce/config/adaptive-routing-force
      valueType: bool
  congestionControl:
    - name: Congestion Control on RP points
      value: true
      dmsPath: /interfaces/interface/nvidia/cc/config/priority/rp_enabled # priority[id=0..7]
      valueType: bool
      alternativeValue: "1"
    - name: Congestion Control on NP points
      value: true
      dmsPath: /interfaces/interface/nvidia/cc/config/priority/np_enabled # priority[id=0..7]
      valueType: bool
      alternativeValue: "1"
    - name: Congestion Control
      value: true
      dmsPath: /interfaces/interface/nvidia/cc/slot[id=0]/config/enabled
      valueType: bool
    - name: Congestion Control with Counters
      value: true
      dmsPath: /interfaces/interface/nvidia/cc/slot[id=0]/config/counter_enable
      valueType: bool
    - name: DCQCN
      value: false
      dmsPath: /interfaces/interface/nvidia/cc/slot[id=15]/config/enabled
      valueType: bool
    - name: Bandwidth
      value: 400
      dmsPath: /interfaces/interface/nvidia/cc/slot[id=0]/param[id=0]/config/value
      valueType: int
    - name: Responsiveness Alpha Factor
      value: 6553
      dmsPath: /interfaces/interface/nvidia/cc/slot[id=0]/param[id=1]/config/value
      valueType: int
    - name: Maximum Decrease Factor
      value: 63570
      dmsPath: /interfaces/interface/nvidia/cc/slot[id=0]/param[id=2]/config/value
      valueType: int
    - name: Maximum Increase Factor
      value: 69468
      dmsPath: /interfaces/interface/nvidia/cc/slot[id=0]/param[id=3]/config/value
      valueType: int
    - name: Additive Increase Step Size
      value: 36
      dmsPath: /interfaces/interface/nvidia/cc/slot[id=0]/param[id=4]/config/value
      valueType: int
    - name: High Additive Increase Step Size
      value: 1200
      dmsPath: /interfaces/interface/nvidia/cc/slot[id=0]/param[id=5]/config/value
      valueType: int
    - name: High Additive Increase Interval Period
      value: 7000000
      dmsPath: /interfaces/interface/nvidia/cc/slot[id=0]/param[id=6]/config/value
      valueType: int
    - name: Base Round Trip Time
      value: 15000
      dmsPath: /interfaces/interface/nvidia/cc/slot[id=0]/param[id=7]/config/value
      valueType: int
    - name: Maximum Queuing Delay
      value: 250000
      dmsPath: /interfaces/interface/nvidia/cc/slot[id=0]/param[id=8]/config/value
      valueType: int
    - name: Rate on First Congestion
      value: 524288
      dmsPath: /interfaces/interface/nvidia/cc/slot[id=0]/param[id=9]/config/value
      valueType: int
    - name: Delay Only
      value: 0
      dmsPath: /interfaces/interface/nvidia/cc/slot[id=0]/param[id=10]/config/value
      valueType: int
    - name: CNP Validity
      value: 1
      dmsPath: /interfaces/interface/nvidia/cc/slot[id=0]/param[id=11]/config/value
      valueType: int
    - name: Transmit Rate Decrement Step
      value: 0
      dmsPath: /interfaces/interface/nvidia/cc/slot[id=0]/param[id=12]/config/value
      valueType: int
    - name: Fixed Transmission Rate
      value: 0
      dmsPath: /interfaces/interface/nvidia/cc/slot[id=0]/param[id=13]/config/value
      valueType: int
    - name: Fast Scheduling Factor
      value: 2097152
      dmsPath: /interfaces/interface/nvidia/cc/slot[id=0]/param[id=14]/config/value
      valueType: int
    - name: Topology Awareness
      value: 1
      dmsPath: /interfaces/interface/nvidia/cc/slot[id=0]/param[id=15]/config/value
      valueType: int
    - name: Advanced Features
      value: 1
      dmsPath: /interfaces/interface/nvidia/cc/slot[id=0]/param[id=16]/config/value
      valueType: int
    - name: Troubleshooting Capabilities
      value: 0
      dmsPath: /interfaces/interface/nvidia/cc/slot[id=0]/param[id=17]/config/value
      valueType: int
  interPacketGap:
    pureL3:
      - name: Inter Packet Gap for no overlay
        value: 25
        dmsPath: /interfaces/interface/ethernet/nvidia/config/inter-packet-gap
        valueType: int
      - name: Shut down interface
        value: false
        dmsPath: /interfaces/interface/config/enabled
        valueType: bool
      - name: Bring up interface to apply IPG settings
        value: false
        dmsPath: /interfaces/interface/config/enabled
        valueType: bool
    l3EVPN:
      - name: Inter Packet Gap for L3 EVPN overlay
        value: 33
        dmsPath: /interfaces/interface/ethernet/nvidia/config/inter-packet-gap
        valueType: int
      - name: Shut down interface
        value: false
        dmsPath: /interfaces/interface/config/enabled
        valueType: bool
      - name: Bring up interface to apply IPG settings
        value: false
        dmsPath: /interfaces/interface/config/enabled
        valueType: bool
docaCCVersion: 3.1.0
useSoftwareCCAlgorithm: true

When spectrumXOptimized.enabled == true and spectrumXOptimized.version == “RA1.3” the following configuration parameters are applied:

Copy
Copied!
            

- name: Ethernet mode
  value: 2
  mlxconfig: "LINK_TYPE_P1"
- name: Enable SR-IOV
  value: 1
  mlxconfig: "SRIOV_EN"
- name: Set NUM_OF_VFS
  value: 1
  mlxconfig: "NUM_OF_VFS"
- name: NIC mode
  value: NIC
  dmsPath: /nvidia/mode/config/mode
  valueType: string
  deviceId: "a2dc"
- name: RoCE Adaptive Routing
  value: true
  dmsPath: /nvidia/roce/config/adaptive-routing
  valueType: bool
- name: Programmable Congestion Control
  value: true
  dmsPath: /nvidia/cc/config/user-programmable
  valueType: bool
- name: RoCE TX Scheduling Locality Mode
  value: TX_SCHED_LOCALITY_ACCUMULATIVE
  dmsPath: /nvidia/roce/config/tx-sched-locality-mode
  valueType: string
- name: RoCE Multipath DSCP
  value: MULTIPATH_DSCP_DEFAULT
  dmsPath: /nvidia/roce/config/multipath-dscp
  valueType: string
- name: CNP DSCP
  value: 0
  dmsPath: /interfaces/interface/nvidia/roce/config/rtt-resp-dscp
  valueType: int
- name: CNP DSCP mode
  value: RTT_RESP_DSCP_DEFAULT
  dmsPath: /interfaces/interface/nvidia/roce/config/rtt-resp-dscp-mode
  valueType: string
- name: RoCE CC Steering Ext
  value: ENABLED
  dmsPath: /nvidia/roce/config/cc-steering-ext
  valueType: string
runtimeConfig:
  roce:
    - name: Trust
      value: dscp
      dmsPath: /interfaces/interface/nvidia/qos/config/trust-mode
      valueType: string
      alternativeValue: QOS_TRUST_MODE_DSCP
    - name: PFC
      value: "00010000"
      dmsPath: /interfaces/interface/nvidia/qos/config/pfc
      valueType: string
    # TODO: figure out if NIC operator or RDMA cni needs to set the tos
    # - name: Type of Service
    #   value: 96
    #   dmsPath: /interfaces/interface/nvidia/roce/config/tos
    #   valueType: int
  adaptiveRouting:
    - name: Adaptive Retransmission
      value: true
      dmsPath: /interfaces/interface/nvidia/roce/config/adaptive-retransmission
      valueType: bool
    - name: Tx Window
      value: true
      dmsPath: /interfaces/interface/nvidia/roce/config/tx-window
      valueType: bool
    - name: Slow Restart
      value: false
      dmsPath: /interfaces/interface/nvidia/roce/config/slow-restart
      valueType: bool
    - name: Slow Restart Idle
      value: false
      dmsPath: /interfaces/interface/nvidia/roce/config/slow-restart-idle
      valueType: bool
    - name: Adaptive Routing Force
      value: true
      dmsPath: /interfaces/interface/nvidia/roce/config/adaptive-routing-force
      valueType: bool
  congestionControl:
    - name: Congestion Control on RP points
      value: true
      dmsPath: /interfaces/interface/nvidia/cc/config/priority/rp_enabled # priority[id=0..7]
      valueType: bool
      alternativeValue: "1"
    - name: Congestion Control on NP points
      value: true
      dmsPath: /interfaces/interface/nvidia/cc/config/priority/np_enabled # priority[id=0..7]
      valueType: bool
      alternativeValue: "1"
    - name: Congestion Control
      value: true
      dmsPath: /interfaces/interface/nvidia/cc/slot[id=0]/config/enabled
      valueType: bool
    - name: Congestion Control with Counters
      value: true
      dmsPath: /interfaces/interface/nvidia/cc/slot[id=0]/config/counter_enable
      valueType: bool
    - name: DCQCN
      value: false
      dmsPath: /interfaces/interface/nvidia/cc/slot[id=15]/config/enabled
      valueType: bool
    - name: Bandwidth
      value: 400
      dmsPath: /interfaces/interface/nvidia/cc/slot[id=0]/param[id=0]/config/value
      valueType: int
    - name: Responsiveness Alpha Factor
      value: 6553
      dmsPath: /interfaces/interface/nvidia/cc/slot[id=0]/param[id=1]/config/value
      valueType: int
    - name: Maximum Decrease Factor
      value: 63570
      dmsPath: /interfaces/interface/nvidia/cc/slot[id=0]/param[id=2]/config/value
      valueType: int
    - name: Maximum Increase Factor
      value: 69468
      dmsPath: /interfaces/interface/nvidia/cc/slot[id=0]/param[id=3]/config/value
      valueType: int
    - name: Additive Increase Step Size
      value: 36
      dmsPath: /interfaces/interface/nvidia/cc/slot[id=0]/param[id=4]/config/value
      valueType: int
    - name: High Additive Increase Step Size
      value: 1200
      dmsPath: /interfaces/interface/nvidia/cc/slot[id=0]/param[id=5]/config/value
      valueType: int
    - name: High Additive Increase Interval Period
      value: 7000000
      dmsPath: /interfaces/interface/nvidia/cc/slot[id=0]/param[id=6]/config/value
      valueType: int
    - name: Base Round Trip Time
      value: 15000
      dmsPath: /interfaces/interface/nvidia/cc/slot[id=0]/param[id=7]/config/value
      valueType: int
    - name: Maximum Queuing Delay
      value: 250000
      dmsPath: /interfaces/interface/nvidia/cc/slot[id=0]/param[id=8]/config/value
      valueType: int
    - name: Rate on First Congestion
      value: 524288
      dmsPath: /interfaces/interface/nvidia/cc/slot[id=0]/param[id=9]/config/value
      valueType: int
    - name: Delay Only
      value: 0
      dmsPath: /interfaces/interface/nvidia/cc/slot[id=0]/param[id=10]/config/value
      valueType: int
    - name: CNP Validity
      value: 1
      dmsPath: /interfaces/interface/nvidia/cc/slot[id=0]/param[id=11]/config/value
      valueType: int
    - name: Transmit Rate Decrement Step
      value: 0
      dmsPath: /interfaces/interface/nvidia/cc/slot[id=0]/param[id=12]/config/value
      valueType: int
    - name: Fixed Transmission Rate
      value: 0
      dmsPath: /interfaces/interface/nvidia/cc/slot[id=0]/param[id=13]/config/value
      valueType: int
    - name: Fast Scheduling Factor
      value: 2097152
      dmsPath: /interfaces/interface/nvidia/cc/slot[id=0]/param[id=14]/config/value
      valueType: int
    - name: Topology Awareness
      value: 1
      dmsPath: /interfaces/interface/nvidia/cc/slot[id=0]/param[id=15]/config/value
      valueType: int
    - name: Advanced Features
      value: 1
      dmsPath: /interfaces/interface/nvidia/cc/slot[id=0]/param[id=16]/config/value
      valueType: int
    - name: Troubleshooting Capabilities
      value: 0
      dmsPath: /interfaces/interface/nvidia/cc/slot[id=0]/param[id=17]/config/value
      valueType: int
  interPacketGap:
    pureL3:
      - name: Inter Packet Gap for no overlay
        value: 25
        dmsPath: /interfaces/interface/ethernet/nvidia/config/inter-packet-gap
        valueType: int
      - name: Shut down interface
        value: false
        dmsPath: /interfaces/interface/config/enabled
        valueType: bool
      - name: Bring up interface to apply IPG settings
        value: false
        dmsPath: /interfaces/interface/config/enabled
        valueType: bool
    l3EVPN:
      - name: Inter Packet Gap for L3 EVPN overlay
        value: 33
        dmsPath: /interfaces/interface/ethernet/nvidia/config/inter-packet-gap
        valueType: int
      - name: Shut down interface
        value: false
        dmsPath: /interfaces/interface/config/enabled
        valueType: bool
      - name: Bring up interface to apply IPG settings
        value: false
        dmsPath: /interfaces/interface/config/enabled
        valueType: bool
docaCCVersion: 3.1.0
useSoftwareCCAlgorithm: true
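
The RA1.3 parameters set the PFC value to "00010000", and the commented-out Type of Service entry uses the value 96. The Python sketch below shows one way to read these two numbers together. It rests on two assumptions that are not confirmed by this page: that the PFC string holds one flag per priority with priority 0 leftmost, and that DSCP values map to priorities in groups of eight. The helper names are hypothetical and are not part of the operator.

```python
# Sketch only: the bit order of the PFC string and the DSCP-to-priority
# mapping are assumptions, not taken from the operator's source.

def pfc_enabled_priorities(mask: str) -> list[int]:
    """Return the priorities whose flag is "1" in an 8-character PFC string."""
    if len(mask) != 8 or any(ch not in "01" for ch in mask):
        raise ValueError(f"expected 8 characters of 0/1, got {mask!r}")
    # Assumption: the leftmost character corresponds to priority 0.
    return [prio for prio, flag in enumerate(mask) if flag == "1"]


def tos_to_priority(tos: int) -> int:
    """Map an IP ToS byte to a priority via its DSCP field."""
    if not 0 <= tos <= 255:
        raise ValueError(f"ToS must fit in one byte, got {tos}")
    dscp = tos >> 2  # DSCP occupies the upper six bits of the ToS byte
    # Assumption: eight consecutive DSCP values share one priority.
    return dscp >> 3


lossless = pfc_enabled_priorities("00010000")
roce_priority = tos_to_priority(96)
print(lossless, roce_priority)
```

Under these assumptions, the PFC string enables flow control on priority 3 only, and a ToS of 96 (DSCP 24) also maps to priority 3, so traffic marked that way would land on the PFC-enabled priority when the trust mode is dscp.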

© Copyright 2025-2026, NVIDIA. Last updated on Feb 26, 2026