NVIDIA Network Operator v25.7.0

NIC Firmware Configuration

The NVIDIA NIC Configuration Operator provides a Kubernetes API (Custom Resource Definitions) for updating firmware and configuring NVIDIA NICs in a coordinated manner. It deploys a configuration daemon on each of the desired nodes to configure the NVIDIA NICs there. The NIC Configuration Operator uses the Maintenance Operator to prepare a node for maintenance before the actual configuration.

Warning

The NVIDIA NIC Configuration Operator does not support the FW reset flow for DPU mode. See the limitations for details.

For more information about the CRD API, refer to CRD API Reference.

Note

To validate and update firmware on NIC devices, the NIC Configuration Operator requires persistent storage to be set up in the cluster. One option is persistent NFS storage, for which the example from the CSI NFS Driver repository can be used. After the NFS server and the NFS CSI driver are deployed, the storage class should become available in the cluster. The name of that storage class is then passed when configuring the NIC Configuration Operator.

First, install the Network Operator Helm chart with the Maintenance Operator enabled, then deploy a NicClusterPolicy CR with the NIC Configuration Operator enabled:

values.yaml:

maintenanceOperator:
  enabled: true

nicclusterpolicy.yaml:

apiVersion: mellanox.com/v1alpha1
kind: NicClusterPolicy
metadata:
  name: nic-cluster-policy
spec:
  nicConfigurationOperator:
    operator:
      image: nic-configuration-operator
      repository: nvcr.io/nvidia/mellanox
      version: network-operator-v25.7.0
    configurationDaemon:
      image: nic-configuration-operator-daemon
      repository: nvcr.io/nvidia/mellanox
      version: network-operator-v25.7.0
    nicFirmwareStorage:
      create: true
      pvcName: nic-fw-storage-pvc
      # Name of the storage class is provided by the user
      storageClassName: nfs-csi
      availableStorageSize: 1Gi

Observe the NicDevice CRs detected in the cluster. The name of each CR is composed of the node name, the NIC type, and the NIC serial number:

> kubectl get nicdevices -n nvidia-network-operator
NAME                      AGE
node1-1015-mt1627x08307   1m
node1-101d-mt1952x03330   1m
node2-1015-mt1627x08305   1m
node2-101d-mt1952x03327   1m
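
The naming scheme is convenient when scripting against NicDevice CRs. The following minimal Python sketch, which is not part of the operator, builds and splits a name of this form. The function names are hypothetical, and the split assumes that the NIC type and the serial number never contain a hyphen (node names may).

```python
def nicdevice_name(node: str, nic_type: str, serial: str) -> str:
    """Join node name, NIC type and serial number as in the listing above."""
    return f"{node}-{nic_type}-{serial}"


def split_nicdevice_name(name: str) -> tuple:
    """Split a NicDevice CR name into (node, NIC type, serial number).

    Assumption: type and serial contain no hyphens, so the last two
    hyphen-separated fields are taken as type and serial.
    """
    node, nic_type, serial = name.rsplit("-", 2)
    return node, nic_type, serial


print(split_nicdevice_name("node1-101d-mt1952x03330"))
# ('node1', '101d', 'mt1952x03330')
```

Splitting from the right keeps node names that themselves contain hyphens intact.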

Discover more information about a specific device:

kubectl get nicdevice -n nvidia-network-operator node1-101d-mt1952x03327 -o yaml

apiVersion: configuration.net.nvidia.com/v1alpha1
kind: NicDevice
metadata:
  creationTimestamp: "2024-09-21T08:43:08Z"
  generation: 1
  name: node1-101d-mt1952x03327
  namespace: nvidia-network-operator
  ownerReferences:
    - apiVersion: v1
      kind: Node
      name: node1
      uid: 25c4f4e2-f7ba-4ba9-9a87-8056313ffc79
  resourceVersion: "1177095"
  uid: ac6763bf-67c6-4af5-81f8-1aad5da929bf
spec: {}
status:
  conditions:
    - type: FirmwareUpdateInProgress
      status: "False"
      reason: DeviceFirmwareSpecEmpty
      message: Device firmware spec is empty, cannot update or validate firmware
      lastTransitionTime: "2024-09-21T08:43:04Z"
    - type: ConfigUpdateInProgress
      status: "False"
      reason: DeviceConfigSpecEmpty
      message: Device configuration spec is empty, cannot update configuration
      lastTransitionTime: "2024-09-21T08:43:08Z"
  firmwareVersion: 22.39.1015
  node: cloud-dev-41
  partNumber: mcx623106ac-cdat
  ports:
    - networkInterface: enp3s0f0np0
      pci: "0000:03:00.0"
      rdmaInterface: mlx5_0
    - networkInterface: enp3s0f1np1
      pci: "0000:03:00.1"
      rdmaInterface: mlx5_1
  psid: mt_0000000436
  serialNumber: mt1952x03327
  type: 101d

Configure and apply the NicFirmwareSource CR

Deploy the NicFirmwareSource CR:

apiVersion: configuration.net.nvidia.com/v1alpha1
kind: NicFirmwareSource
metadata:
  name: connectx6-dx-firmware-22-44-1036
  namespace: nvidia-network-operator
  finalizers:
    - configuration.net.nvidia.com/nic-configuration-operator
spec:
  # a list of firmware binaries zip archives from the Mellanox website, can point to any URL accessible from the cluster
  binUrlSources:
    - https://www.mellanox.com/downloads/firmware/fw-ConnectX6Dx-rel-22_44_1036-MCX623106AC-CDA_Ax-UEFI-14.37.14-FlexBoot-3.7.500.signed.bin.zip
  # a URL to the BlueField Bundle (BFB) file, can point to any URL accessible from the cluster
  bfbUrlSource:
    - https://example.com/bf-fwbundle-3.1.0-77_25.07-prod.bfb

Note

The ConnectX firmware binaries can be downloaded from the NVIDIA Networking Firmware Downloads page. The URLs of the firmware binaries from the website can be directly provided in the binUrlSources field of the NicFirmwareSource CR.

Note

The BlueField Bundle (BFB) can be downloaded from the NVIDIA DOCA Downloads page. The file must first be made available at a URL that is accessible from the cluster; that URL is then provided in the bfbUrlSource field of the NicFirmwareSource CR.

Observe the NicFirmwareSource status:

> kubectl get nicfirmwaresource -n nvidia-network-operator connectx6-dx-firmware-22-44-1036 -o yaml
...
status:
  state: Success
  versions:
    22.44.1036:
      - mt_0000000436
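
The status maps each available firmware version to the PSIDs it applies to, so it can be checked against the psid reported in a NicDevice status. The following is a minimal Python sketch of that lookup, assuming the status has already been parsed into a dictionary (for example from `kubectl ... -o json`). The function name is hypothetical, and treating any state other than Success as "not usable" is an assumption, since only the Success state is shown here.

```python
from typing import Optional


def firmware_version_for_psid(status: dict, psid: str) -> Optional[str]:
    """Return the firmware version a NicFirmwareSource offers for a PSID.

    `status` is the parsed `.status` of the CR. None is returned when the
    state is not Success (assumption) or when the PSID is not listed.
    """
    if status.get("state") != "Success":
        return None
    for version, psids in status.get("versions", {}).items():
        if psid in psids:
            return version
    return None


# Status as shown in the example above.
status = {"state": "Success", "versions": {"22.44.1036": ["mt_0000000436"]}}
print(firmware_version_for_psid(status, "mt_0000000436"))  # 22.44.1036
```

A result of None for a device's PSID suggests that this firmware source does not carry a binary for that device.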

Configure and apply the NicFirmwareTemplate CR

Create a NicFirmwareTemplate CR that selects the target nodes and NICs and references the NicFirmwareSource CR by name:

apiVersion: configuration.net.nvidia.com/v1alpha1
kind: NicFirmwareTemplate
metadata:
  name: connectx6dx-config
  namespace: nvidia-network-operator
spec:
  nodeSelector:
    kubernetes.io/hostname: node1
  nicSelector:
    nicType: "101d"
  template:
    nicFirmwareSourceRef: connectx6-dx-firmware-22-44-1036
    updatePolicy: Update

The spec of the NicDevice CR is updated in accordance with the NicFirmwareTemplate and NicConfigurationTemplate CRs that match the device:

> kubectl get nicdevice -n nvidia-network-operator node1-101d-mt1952x03327 -o jsonpath='{.spec}' | yq -P
template:
  firmware:
    nicFirmwareSourceRef: connectx6-dx-firmware-22-44-1036
    updatePolicy: Update

The status conditions of the NicDevice CR reflect the state of the firmware update and report any errors that occur during the process:

> kubectl get nicdevice -n nvidia-network-operator node1-101d-mt1952x03327 -o jsonpath='{.status.conditions}' | yq -P
- type: FirmwareUpdateInProgress
  status: "False"
  reason: DeviceFirmwareConfigMatch
  message: Firmware matches the requested version
  observedGeneration: 4
  lastTransitionTime: "2024-09-21T08:42:23Z"

NIC Firmware Mismatch Notification

The NIC Configuration Operator sets the FirmwareConfigMatch status condition of the NicDevice CR based on the firmware currently installed on the NIC:

> kubectl get nicdevice -n nvidia-network-operator node1-101d-mt1952x03327 -o jsonpath='{.status.conditions}' | yq -P
- type: FirmwareConfigMatch
  status: "True"
  reason: DeviceFirmwareConfigMatch
  message: Device firmware '20.42.1000' matches to recommended version '20.42.1000'
  lastTransitionTime: "2024-09-21T08:43:10Z"

The FirmwareConfigMatch condition status is set to Unknown if the DOCA-OFED Driver is not installed. Otherwise, it indicates whether the current NIC firmware is the version recommended by the DOCA-OFED Driver. For example:

> kubectl get nicdevice -n nvidia-network-operator node1-101d-mt1952x03327 -o jsonpath='{.status.conditions}' | yq -P
- type: FirmwareConfigMatch
  status: "True"
  reason: DeviceFirmwareConfigMatch
  message: Device firmware '20.42.1000' matches to recommended version '20.42.1000'
  lastTransitionTime: "2024-11-08T09:19:41Z"
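
For monitoring scripts, the three possible outcomes of this condition can be reduced to a short label. The following is a minimal Python sketch that takes the parsed `.status.conditions` list of a NicDevice CR. The function name and the returned labels are hypothetical, and mapping a "False" status to "not recommended" is an assumption inferred from the description above.

```python
def describe_firmware_match(conditions: list) -> str:
    """Summarize the FirmwareConfigMatch condition of a NicDevice CR.

    Returns "recommended", "not recommended" (assumed meaning of "False"),
    or "unknown" when the status is Unknown or the condition is absent.
    """
    for cond in conditions:
        if cond.get("type") == "FirmwareConfigMatch":
            if cond.get("status") == "True":
                return "recommended"
            if cond.get("status") == "False":
                return "not recommended"
            return "unknown"
    return "unknown"


# Condition as shown in the example above.
conditions = [{
    "type": "FirmwareConfigMatch",
    "status": "True",
    "reason": "DeviceFirmwareConfigMatch",
    "message": "Device firmware '20.42.1000' matches to recommended version '20.42.1000'",
}]
print(describe_firmware_match(conditions))  # recommended
```

An "unknown" result is worth surfacing separately, since according to the text above it can mean that the DOCA-OFED Driver is not installed rather than that the firmware is wrong.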

Configure and apply the NicConfigurationTemplate CR

apiVersion: configuration.net.nvidia.com/v1alpha1
kind: NicConfigurationTemplate
metadata:
  name: connectx6-config
  namespace: nvidia-network-operator
spec:
  nodeSelector:
    feature.node.kubernetes.io/network-sriov.capable: "true"
  nicSelector:
    # nicType selector is mandatory the rest are optional. Only a single type can be specified.
    nicType: 101d
    pciAddresses:
      - "0000:03:00.0"
      - "0000:04:00.0"
    serialNumbers:
      - "mt1952x03327"
  resetToDefault: false # if set, template is ignored, device configuration should reset
  template:
    # numVfs and linkType fields are mandatory, the rest are optional
    numVfs: 2
    linkType: Ethernet
    pciPerformanceOptimized:
      enabled: true
      maxReadRequest: 4096
    roceOptimized:
      enabled: true
      qos:
        trust: dscp
        pfc: "0,0,0,1,0,0,0,0"
    gpuDirectOptimized:
      enabled: true
      env: Baremetal

Note

No more than one template of each kind (NicFirmwareTemplate or NicConfigurationTemplate) can be applied to a single device. If several templates of the same kind match a device, none of them is applied and an error event is emitted for the corresponding NicDevice CR.
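
Such conflicts can be caught before the templates are applied. The following is a minimal Python sketch, not the operator's implementation, that reports the template kinds for which more than one template selects a given device. The selector semantics are an assumption (every selector field that is present must match), the flattened `device` dictionary and the `second-config` template are hypothetical, and the function names are illustrative only.

```python
from collections import defaultdict


def template_matches(template: dict, device: dict) -> bool:
    """Return True if a template selects the device.

    Assumed semantics (not confirmed by the text above): every selector
    field that is present must match.
    """
    spec = template["spec"]
    for key, value in spec.get("nodeSelector", {}).items():
        if device["nodeLabels"].get(key) != value:
            return False
    nic = spec["nicSelector"]
    # Compare as strings: an unquoted YAML value such as 1015 parses as an int.
    if str(nic["nicType"]) != str(device["type"]):
        return False
    serials = nic.get("serialNumbers")
    if serials and device["serialNumber"] not in serials:
        return False
    pcis = nic.get("pciAddresses")
    if pcis and not any(addr in pcis for addr in device["pciAddresses"]):
        return False
    return True


def conflicting_templates(templates: list, device: dict) -> dict:
    """Map each template kind to its matching names when more than one matches."""
    matches = defaultdict(list)
    for template in templates:
        if template_matches(template, device):
            matches[template["kind"]].append(template["metadata"]["name"])
    return {kind: names for kind, names in matches.items() if len(names) > 1}


# Hypothetical flattened view of a device and the labels of its node.
device = {
    "nodeLabels": {"feature.node.kubernetes.io/network-sriov.capable": "true"},
    "type": "101d",
    "serialNumber": "mt1952x03327",
    "pciAddresses": ["0000:03:00.0", "0000:03:00.1"],
}
selector = {"nicType": "101d"}
templates = [
    {"kind": "NicConfigurationTemplate", "metadata": {"name": "connectx6-config"},
     "spec": {"nicSelector": selector}},
    {"kind": "NicConfigurationTemplate", "metadata": {"name": "second-config"},
     "spec": {"nicSelector": selector}},
    {"kind": "NicFirmwareTemplate", "metadata": {"name": "connectx6dx-config"},
     "spec": {"nodeSelector": {"kubernetes.io/hostname": "node1"},
              "nicSelector": selector}},
]
print(conflicting_templates(templates, device))
```

An empty result means that at most one template of each kind selects the device under the assumed matching rules.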

Note

To use the NIC Configuration Operator together with the SR-IOV Network Operator, the “mellanox” plugin must be disabled in the SR-IOV Network Operator.

For detailed information about firmware parameters and configuration settings, refer to Configuration Details.

The spec of the NicDevice CR is updated in accordance with the NicFirmwareTemplate and NicConfigurationTemplate CRs that match the device:

> kubectl get nicdevice -n nvidia-network-operator node1-101d-mt1952x03327 -o jsonpath='{.spec}' | yq -P
template:
  firmware:
    nicFirmwareSourceRef: connectx6-dx-firmware-22-44-1036
    updatePolicy: Update
  configuration:
    numVfs: 2
    linkType: Ethernet
    pciPerformanceOptimized:
      enabled: true
    roceOptimized:
      enabled: true
      qos:
        trust: dscp
        pfc: "0,0,0,1,0,0,0,0"
    gpuDirectOptimized:
      enabled: true
      env: Baremetal

Observe the status of the configuration update

The status conditions of the NicDevice CR reflect the state of the configuration update and report any errors that occur during the process:

> kubectl get nicdevice -n nvidia-network-operator node1-101d-mt1952x03327 -o jsonpath='{.status.conditions}' | yq -P
- type: FirmwareUpdateInProgress
  status: "False"
  reason: DeviceFirmwareConfigMatch
  message: Firmware matches the requested version
  observedGeneration: 4
  lastTransitionTime: "2024-09-21T08:42:23Z"
- type: ConfigUpdateInProgress
  status: "True"
  reason: UpdateStarted
  message: ""
  lastTransitionTime: "2024-09-21T08:43:08Z"

Note

If both Firmware update and configuration are applied to a single device, the firmware update should be performed first. The configuration update will be applied after the firmware update is completed.
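
Because of this ordering, a script that tracks progress can check the firmware condition before the configuration condition. The following is a minimal Python sketch that derives the current phase from the parsed `.status.conditions` list of a NicDevice CR. The function name and the phase labels are hypothetical; only the condition types and status values come from the examples on this page.

```python
def current_phase(conditions: list) -> str:
    """Return "firmware", "configuration" or "idle" for a NicDevice CR.

    The firmware condition is checked first, mirroring the rule that the
    firmware update precedes the configuration update.
    """
    by_type = {c["type"]: c["status"] for c in conditions}
    if by_type.get("FirmwareUpdateInProgress") == "True":
        return "firmware"
    if by_type.get("ConfigUpdateInProgress") == "True":
        return "configuration"
    return "idle"


# Condition types and statuses from the configuration update example.
conditions = [
    {"type": "FirmwareUpdateInProgress", "status": "False"},
    {"type": "ConfigUpdateInProgress", "status": "True"},
]
print(current_phase(conditions))  # configuration
```

The input can be obtained with `kubectl get nicdevice <name> -n nvidia-network-operator -o json` and parsed with the standard `json` module.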

© Copyright 2025, NVIDIA. Last updated on Aug 26, 2025.