NVIDIA Network Operator v26.4.0

Configuration Reference

The Launch Kit configuration file (typically cluster-config.yaml, produced by l8k discover and consumed by l8k generate) is YAML. This page documents every top-level section.

Copy
Copied!
            

networkOperator: selectedRelease: "26.4" version: v26.4.0 componentVersion: network-operator-v26.4.0 repository: nvcr.io/nvidia/cloud-native namespace: nvidia-network-operator imagePullSecrets: [] docsBaseURL: https://docs.nvidia.com/networking/display/kubernetes2610 docaDriver: enable: true version: doca3.4.0-26.04-0.8.6.0-0 unloadStorageModules: false enableNFSRDMA: false unloadThirdPartyRDMAModules: false skipPreflightChecks: false nvIpam: poolName: nv-ipam-pool startingSubnet: "192.168.2.0" mask: 24 offset: 1 sriov: ethernetMtu: 9000 infinibandMtu: 4000 numVfs: 8 priority: 90 resourceName: sriov_resource networkName: sriov-network hostdev: resourceName: hostdev-resource networkName: hostdev-network rdmaShared: resourceName: rdma_shared_resource hcaMax: 63 ipoib: networkName: ipoib-network macvlan: networkName: macvlan-network nicConfigurationOperator: deployNicInterfaceNameTemplate: true rdmaPrefix: "rdma_r%rail%" netdevPrefix: "eth_r%rail%" spectrumX: nicType: "1023" overlay: "none" rdmaPrefix: "roce_p%plane%_r%rail%" netdevPrefix: "eth_p%plane%_r%rail%" workload: manifest: "" profile: fabric: ethernet deployment: sriov multirail: false spectrumX: spcxVersion: "RA2.1" multiplaneMode: swplb numberOfPlanes: 4 ai: false clusterConfig: - identifier: "dgx-b200-nvidia-b200" machineType: "DGX-B200" productType: "NVIDIA-B200" capabilities: nodes: sriov: true rdma: true ib: false workerNodes: ["worker-0", "worker-1"] nodeSelector: nvidia.kubernetes-launch-kit.machine: "DGX-B200-NVIDIA-B200" thirdPartyRDMAModules: [] storageModules: [] linkType: Ethernet pfs: - deviceID: "1023" pciAddress: "0000:05:00.0" rdmaDevice: "mlx5_0" networkInterface: "net1" traffic: east-west rail: 0

Network Operator version, image registry, namespace, and pull secrets.

Field

Description

selectedRelease Pin to a release line. Supported: 25.10, 26.1, 26.4. Auto-fills version and image tags from an embedded catalog. Equivalent to the --network-operator-release flag.
version Explicit Network Operator version. Overrides the catalog when set.
componentVersion Tag for component images (CNI, device plugins, etc.).
repository Container registry (default: nvcr.io/nvidia/mellanox).
namespace Operator namespace (default: nvidia-network-operator).
imagePullSecrets List of secret names. Propagated to NicClusterPolicy.spec.global.imagePullSecrets and per-group NicNodePolicy sub-specs.
docsBaseURL Documentation URL embedded in generated annotations.

OFED driver configuration and kernel driver dependencies validation.

Field

Description

enable Include the OFED driver in generated manifests. Set to false to skip (or use --enable-doca-driver to flip).
version DOCA driver version tag.
unloadStorageModules Unload storage-over-RDMA modules (nvme_rdma, ib_isert, rpcrdma, …). Auto-set to true during discovery if such modules are detected.
unloadThirdPartyRDMAModules Unload third-party RDMA modules (rdma_rxe, qedr, bnxt_re, …). Auto-set to true during discovery if such modules are detected. Storage and third-party module lists are sourced from the doca-driver-build project.
enableNFSRDMA Enable NFS-over-RDMA support.
skipPreflightChecks Skip the kernel driver dependencies validation. Useful for environments where it’s known-good.

See Discover Workflow for how OFED-dependent modules are detected.

NV-IPAM configuration. Either provide an explicit subnets list or let Launch Kit auto-generate non-overlapping subnets per node group.

Field

Description

poolName Pool name used in IPPool CRs.
subnets Explicit list of {subnet, gateway} entries. Mutually exclusive with the auto-generation fields.
startingSubnet First subnet for auto-generation (e.g., 192.168.2.0).
mask Prefix length for auto-generated subnets.
offset Increment used between auto-generated subnets.

Profile-specific parameters — only the section for the selected profile is consumed.

Section

Field

Description

sriov ethernetMtu / infinibandMtu MTU values per fabric.
sriov numVfs Number of virtual functions per PF.
sriov priority SriovNetworkNodePolicy priority.
sriov resourceName / networkName Kubernetes resource and network names.
hostdev resourceName / networkName Kubernetes resource and network names for host-device.
rdmaShared resourceName Kubernetes resource name.
rdmaShared hcaMax Maximum HCAs per host (soft limit).
ipoib networkName IPoIB network name.
macvlan networkName MacVLAN network name.

Controls when NIC interface names are templated by the NIC Configuration Operator.

Field

Description

deployNicInterfaceNameTemplate “Enable when needed”. Templates are deployed when groups have cross-rail PCI conflicts or when names are otherwise ambiguous. See Heterogeneous Clusters.
rdmaPrefix RDMA device naming template (default: rdma_r%rail%).
netdevPrefix Netdev naming template (default: eth_r%rail%).

Spectrum-X-specific settings.

Field

Description

nicType NIC type device ID. 1021 = ConnectX-7 NIC; 1023 = ConnectX-8 SuperNIC; a2dc = BlueField-3 SuperNIC.
overlay Overlay mode.
rdmaPrefix RDMA device naming template with %plane% and %rail% substitutions.
netdevPrefix Netdev naming template with %plane% and %rail% substitutions.

Field

Description

manifest Path to a custom workload manifest. When set, Launch Kit patches it with network annotations, resource requests, and node affinity instead of generating an example DaemonSet. See Generate Workflow.

Profile selection (also overridable via CLI flags).

Field

Description

fabric ethernet or infiniband.
deployment sriov, rdma_shared, or host_device.
multirail Enable multirail.
spectrumX.spcxVersion Spectrum-X reference architecture (RA2.1 or RA2.2).
spectrumX.multiplaneMode Multiplane mode: hwplb, swplb, uniplane, none.
spectrumX.numberOfPlanes Number of planes.

Discovered node groups, populated by l8k discover. Each entry describes one group.

Field

Description

identifier Sanitised <machineType>-<gpuType> (e.g. dgx-b200-nvidia-h100-nvl) when both fields are resolved; group-0 / group-1 fallback when they aren’t. Used as the NicNodePolicy / SriovNetworkNodePolicy name suffix.
machineType / productType Hardware type strings (e.g., DGX-B200 / NVIDIA-B200).
capabilities.nodes.sriov / rdma / ib Boolean flags reflecting hardware capability.
workerNodes List of node names in this group.
nodeSelector Per-group selector. After l8k discover, this is {nvidia.kubernetes-launch-kit.machine: <machineType>-<gpuType>} — a label discovery writes onto every node in the group. When l8k generate auto-merges groups sharing a GPU type, the merged group falls back to {nvidia.kubernetes-launch-kit.gpu: <gpuType>} instead (different source machineTypes can’t share a single machine label). Discovery writes both labels onto every node, so the merged selector has a value to bind to. Configs from earlier l8k versions with old-style differential nodeSelectors are preserved as-is.
thirdPartyRDMAModules / storageModules OFED-dependent modules detected on the group.
presetApplied true when a topology preset matched (machineType, gpuType) and was applied.
presetDeviation List of field-level discrepancies between the matched preset and discovered hardware. Non-empty means the preset was applied but the cluster differs from the preset. Each entry has field (pciAddress / deviceID), expected, got, and detail. See Cluster Topology Presets “Validation and Deviations”.
linkType The discovered fabric for the group: Ethernet or InfiniBand. Populated by the fabric probe only when every east-west port produces a confirmed verdict (port ACTIVE plus, for IB, a subnet manager is present) and they agree. When omitted, discovery couldn’t prove the cluster’s fabric — downstream code should treat the absence as “unknown”. See Discover Workflow “Fabric Type Detection”.
pfs List of physical functions. Each entry has deviceID, pciAddress, rdmaDevice, networkInterface, traffic (east-west or north-south), and rail (sequential index for east-west PFs).

North-south PFs are listed for visibility but filtered out of generated manifests. See Overview and Discover Workflow.

Previous CLI Reference
Next AI Skills
© Copyright 2025-2026, NVIDIA. Last updated on Jun 14, 2026