NVIDIA Mission Control DGX SuperPOD Ethernet North-South Network Configuration Guide

NVLink Switch Planning, Design, and Deployment Guide using Base Command Manager (BCM)

This section provides an overview of the planning, design, and deployment process for the NVLink Switch using Base Command Manager (BCM).

NVLink Switch Planning

When planning the NVLink Switch deployment, consider the following:

  • Hardware Components and Models

  • Network and Connectivity Requirements

Hardware Components and Models

Confirm that you have the correct hardware models in the correct quantities. Use the following list as a reference:

  • Kubernetes Admin|User node servers - 3x

  • NVLink Switches (non-scaleout) - 9x per rack

  • Optical Transceivers and DAC/AOC cables

  • Fiber and Copper cables

Network and Connectivity Requirements

The following are the key network and connectivity requirements to consider when planning your NVLink Switch deployment:

  • Kubernetes Admin|User nodes to BTOR switch connectivity

    • Transceiver type, compatibility and HW order status

    • Electrical signaling/encoding (NRZ vs PAM4)

    • Speed/Bandwidth

    • IP Addressing

    • Logical Connectivity (Access and Bonded)

  • NVLink Switch to OOB Switch connectivity

    • Copper connections

    • Speed/Bandwidth

    • IP Addressing

    • Logical Connectivity (Access)

  • Routable IP Address Allocation

    • Kubernetes Admin|User node servers are allocated IP addresses from the Inband and OOB (ComE0) subnets.

    • NVLink Switches are allocated IP addresses from the OOB (ComE0) subnets.

  • Kubernetes Admin|User nodes and NVLink Switch Connectivity - To provision and manage NVLink Switch/NVLink.

  • Use the default partition

_images/ns-nvlinkswitch-plan-01.png

NVLink Switch Design

This section provides details on the NVLink Switch design, including:

  • Physical connectivity

  • Network allocation diagram

  • NVLink Switch integration with BCM verification

The following diagram shows how the NVLink Switch connects to the Rack Switch using the NIC-COM ports and BMC on the OOB network.

_images/ns-nvlinkswitch-plan-02.png

Physical Connectivity

The following is the reference design for NVLink Switch connectivity:

  • ComE1: 1G connection to the in-rack switch (SN2201 on RU 45)

  • ComE2: 1G connection to the in-rack switch (SN2201 on RU 45)

  • BMC: 1G connection to the in-rack switch (SN2201 on RU 44)
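
As a rough planning aid, the port counts implied by this reference design can be tallied per rack. The sketch below assumes 9 NVLink Switches per rack (the figure given in the hardware list) and one switch port for each ComE1, ComE2, and BMC link. The function name `oob_ports` is hypothetical and not part of BCM or NVOS.

```shell
# Hypothetical helper: tally the in-rack switch ports needed for NVLink Switches.
# Assumption: every NVLink Switch uses ComE1 + ComE2 (SN2201 on RU 45)
# and one BMC link (SN2201 on RU 44).
oob_ports() {
    racks=$1
    switches_per_rack=${2:-9}   # default: 9 NVLink Switches per rack
    come_ports=$((racks * switches_per_rack * 2))
    bmc_ports=$((racks * switches_per_rack))
    echo "come=$come_ports bmc=$bmc_ports total=$((come_ports + bmc_ports))"
}
```

For example, `oob_ports 1` prints `come=18 bmc=9 total=27`, and `oob_ports 4` gives the totals across four racks.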

Network Allocation Diagram

The following network workflow diagram shows how the ComE1, ComE2, and BMC connections described above are part of the larger OOB subnet that covers 4 x GB200 racks:

_images/ns-nvlinkswitch-plan-03.png

NVLink Switch Integration with BCM Verification

The following image shows the workflow diagram of NVLink Switch integration with BCM:

_images/ns-nvlinkswitch-plan-07.png

Verify the NVLink Switch Configuration

To verify that the NVLink Switch configuration is correct:

  1. Confirm OOB power control of the NVLink Switches using a power status check in BCM, similar to what was done for a GB200 compute tray.

  2. Verify that an NVLink Switch can be reached through SSH from the head node using the admin user. The password may have been initially set during the bcm-netautogen (also known as netautogen) process. A typical password is admin.

    ssh admin@<rack location>-NVSW-01
    

    Set the password to match the one entered in BCM for the NVLink Switch entry under accesssettings.

  3. Verify that the following NVLink Switch parameters are set correctly:

    • NV configuration mode

    • NV configuration file

    • FM config file

    _images/ns-nvlinkswitch-plan-04.png

    • cmsh > device > use <NVLink Switch> ztpsettings

_images/ns-nvlinkswitch-plan-05.png
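
The SSH check in step 2 can be repeated for every NVLink Switch in a rack. The sketch below only prints the commands to run; it does not connect to anything. It assumes the switches are named `<rack location>-NVSW-01` through `-NVSW-09`, which extends the single hostname shown above and may not match your naming scheme. The rack location `A05` used in the usage note and the function name `nvsw_ssh_cmds` are hypothetical.

```shell
# Hypothetical helper: print one ssh command per NVLink Switch in a rack.
# Assumption: hostnames follow <rack location>-NVSW-NN with NN = 01..count.
nvsw_ssh_cmds() {
    rack=$1
    count=${2:-9}   # default: 9 NVLink Switches per rack
    i=1
    while [ "$i" -le "$count" ]; do
        printf 'ssh admin@%s-NVSW-%02d\n' "$rack" "$i"
        i=$((i + 1))
    done
}
```

Running `nvsw_ssh_cmds A05` lists the commands from `ssh admin@A05-NVSW-01` to `ssh admin@A05-NVSW-09`, which can then be run one at a time from the head node.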

Updating the NVLink Switch to the Latest Firmware

If the NVLink Switch is running a version earlier than v25.02.2134, the custom script and the cm-lite-daemon will not be installed. However, once the NVLink Switch is upgraded to a newer version using ZTP, it will recognize the custom script option and install the cm-lite-daemon. This process will take approximately 2-4 minutes longer than a standard ZTP.
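
The version threshold above can be checked in a script. This is a minimal sketch, assuming the running version is available as a dotted string such as `v25.02.2134` and that GNU `sort -V` is available on the host doing the check. The function names are hypothetical, and how you obtain the version string from the switch is outside this sketch.

```shell
# Succeeds (exit 0) when version $1 is strictly earlier than version $2.
# Relies on `sort -V` (version sort) from GNU coreutils.
version_lt() {
    [ "$1" != "$2" ] &&
        [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# Succeeds when the given version is earlier than v25.02.2134, the threshold
# named in the text above. A leading "v" is stripped before comparing.
predates_custom_script_support() {
    version_lt "${1#v}" "25.02.2134"
}
```

For example, `predates_custom_script_support v25.02.2000` succeeds, while the same call with `v25.02.2134` or a later version fails.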

The following flow diagram shows how the NVLink Switch is updated to the latest firmware:

_images/ns-nvlinkswitch-plan-06.png

To update the NVLink Switch to the latest version, use the following steps:

  1. If the NVLink Switch comes pre-installed with NVOS from the factory and the factory password has already been reset, NVOS Zero Touch Provisioning (ZTP) will be disabled.

    • The NVOS-ZTP script will be executed once the device is reachable.

  2. Reset the NVLink Switch to factory defaults.

    1. Run the following command: nv action reset system factory-default force

    2. The NVLink Switch reboots automatically (takes ~2.3 minutes).

  3. Wait for the NVLink Switch to come back online.

  4. In the background, the following happens:

    1. The switch pulls the nvlink-nvos.json file, validates the format, and creates folders per section in /var/lib/ztp/sections.

    2. It then executes the sections one by one. For each section, it does the following:

      1. Verifies network connectivity

      2. Upgrades the NVOS image

      3. Adjusts the security settings and disables password hardening

      4. Applies the generated startup.yaml and patch configuration

      5. Executes the nvos-ztp.sh script to install cm-lite-daemon, which does the following:

        • Pulls packages from BCM

        • Registers certificates with BCM

        • Enables the service

  5. After the NVLink Switches are up and running within the rack, the system automatically performs a leader selection process.

    1. One of the 9 NVLink Switches is selected, and the following is performed on it:

      1. Apply fm_config

      2. Enable the cluster

  6. Once the cluster is enabled, the NVLink Switch will be ready to use.
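
Step 3 above (waiting for the NVLink Switch to come back online) can be scripted with a simple polling loop. The following is a minimal sketch, not an NVIDIA-provided tool: `wait_until` is a hypothetical helper that retries any command until it succeeds or a timeout expires, and the choice of reachability check is left to you.

```shell
# Hypothetical helper: retry a command until it succeeds or a timeout expires.
# Usage: wait_until <timeout_seconds> <interval_seconds> <command> [args...]
# The timeout is approximate: it counts sleep intervals only, not the time
# the command itself takes to run.
wait_until() {
    timeout=$1
    interval=$2
    shift 2
    elapsed=0
    while ! "$@"; do
        elapsed=$((elapsed + interval))
        if [ "$elapsed" -gt "$timeout" ]; then
            return 1    # gave up: the command never succeeded in time
        fi
        sleep "$interval"
    done
    return 0
}
```

One possible use from a Linux head node is `wait_until 600 10 ping -c 1 -W 2 A05-NVSW-01`, where `A05-NVSW-01` is a placeholder hostname. The 600-second timeout is an arbitrary choice that should be sized to cover both the reboot and the ZTP run.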

Copyright © 2024-2025, NVIDIA Corporation.

Last updated on Sep 26, 2025.