External NVIDIA Mission Control Extended North-South Network GB200 Deployment Guide

NVLink Switch Planning and Design

Confirm Accurate Hardware Models and Quantities

  • Kubernetes Admin|User node servers (x3)

  • NVLink Switches (non-scaleout) (x9 per rack)

  • Transceivers (optical) and cables (DAC / AOC)

  • Fiber and Copper cables

Network and Connectivity Requirements

Kubernetes Admin|User nodes to BTOR switch connectivity

  • Transceiver type, compatibility and HW order status

  • Electrical signaling/encoding (NRZ vs PAM4)

  • Speed/Bandwidth

  • IP Addressing

  • Logical Connectivity (Access and Bonded)

NVLink Switch to OOB Switch connectivity

  • Copper connections

  • Speed/Bandwidth

  • IP Addressing

  • Logical Connectivity (Access)

Routable IP Address Allocation

  • Kubernetes Admin|User node servers will have IP addresses allocated from the Inband and OOB (ComE0) subnets.

  • NVLink Switches will have IP addresses allocated from the OOB (ComE0) subnets.

  • Kubernetes Admin|User nodes need connectivity to the NVLink Switches in order to provision and manage the NVLink Switches and NVLink.

  • Use Default Partition

Figure 14 NVLink Switch Connectivity

NVLink Switch Design

This section describes the NVLink Switch design and connectivity.

Figure 15 NVLink Switch Design

Physical Connectivity

Reference Design - Connectivity

  • ComE1: 1G connection to the in-rack switch (SN2201 at RU 45)

  • ComE2: 1G connection to the in-rack switch (SN2201 at RU 45)

  • BMC: 1G connection to the in-rack switch (SN2201 at RU 44)

Network Allocation Breakout

Figure 16 NVLink Switch Network Allocation Breakout

  • ComE1, ComE2, and BMC are all part of the same larger OOB subnet, which covers four GB200 racks.
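As a rough sizing aid, the sketch below counts how many OOB addresses the NVLink Switches alone would consume in one such subnet, using the figures given in this section (four racks, nine NVLink Switches per rack, and the ComE1, ComE2, and BMC interfaces). It assumes one address per interface. The candidate subnet 192.0.2.0/24 is a placeholder from the documentation address range, not a value from this guide, and the other devices that share the OOB subnet are not counted, so treat the result as a lower bound.

```python
import ipaddress

RACKS_PER_OOB_SUBNET = 4                    # from the text: subnet covers 4 GB200 racks
NVLINK_SWITCHES_PER_RACK = 9                # from the hardware list: x9 per rack
OOB_INTERFACES = ("ComE1", "ComE2", "BMC")  # interfaces in the reference design


def nvlink_switch_oob_addresses(racks=RACKS_PER_OOB_SUBNET,
                                switches_per_rack=NVLINK_SWITCHES_PER_RACK,
                                interfaces=OOB_INTERFACES):
    """Addresses needed if every listed interface gets one OOB address (assumption)."""
    return racks * switches_per_rack * len(interfaces)


def smallest_ipv4_prefix(hosts):
    """Longest IPv4 prefix whose usable host count (size minus 2) is >= hosts."""
    host_bits = (hosts + 1).bit_length()  # smallest b with 2**b >= hosts + 2
    return 32 - host_bits


def subnet_can_hold(cidr, hosts):
    """True if the subnet has at least `hosts` usable addresses."""
    network = ipaddress.ip_network(cidr)
    return network.num_addresses - 2 >= hosts


needed = nvlink_switch_oob_addresses()
print("addresses needed:", needed)
print("smallest prefix: /%d" % smallest_ipv4_prefix(needed))
# 192.0.2.0/24 is a hypothetical candidate subnet, used only for illustration.
print("fits in 192.0.2.0/24:", subnet_can_hold("192.0.2.0/24", needed))
```

In practice the OOB subnet has to be sized for every device attached to it, so the real prefix will generally be shorter than the one this calculation suggests.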

NVLink Switch Integration with BCM Verification

The following image describes the workflow of the NVLink Switch Integration with BCM verification:

Figure 17 NVLink Switch Integration with BCM Verification

NVLink Switch Tray Check and Configuration

Note

If the NVLink Switch is running a version earlier than 25.02.2134, the custom script and cm-lite-daemon will not be installed. However, once the switch is upgraded to a newer version through ZTP, it will recognize the custom script option and install the cm-lite-daemon. This process will take approximately 2 to 4 minutes longer than a standard ZTP.
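The version threshold in the note can be checked with a simple comparison. The sketch below is a minimal illustration that assumes the version is available as a purely numeric dotted string such as 25.02.2134; real version strings may carry prefixes or suffixes that would need stripping first. The function names are hypothetical and are not part of BCM or NVOS.

```python
# Threshold from the note: versions earlier than 25.02.2134 do not install
# the custom script and cm-lite-daemon during ZTP.
MIN_CUSTOM_SCRIPT_VERSION = (25, 2, 2134)


def parse_dotted_version(text):
    """Convert a dotted numeric version such as '25.02.2134' to a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def supports_custom_script(version_text):
    """True if the version is at or above the threshold stated in the note."""
    return parse_dotted_version(version_text) >= MIN_CUSTOM_SCRIPT_VERSION


# The version strings below are made-up examples, not real release numbers.
for candidate in ("25.02.2000", "25.02.2134", "25.03.1000"):
    print(candidate, supports_custom_script(candidate))
```

Comparing integer tuples rather than raw strings avoids ordering mistakes when components have different widths (for example "2" versus "02").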

To check the NVLink Switch configuration, follow the steps below:

  1. Confirm OOB power control of the NVLink Switches by running a power status check in BCM, as was done for the GB200 compute trays.

  2. Verify that each NVLink Switch can be reached over SSH from the head node as the admin user. The password may have been initially set during the bcm-netautogen process. A typical password is admin.

    ssh admin@<rack location>-NVSW-01
    
  3. Set the password to match the one entered in BCM for the NVLink Switch entry under accesssettings.

  4. Verify that the NVLink Switch parameter settings are correct.

    Figure 18 NVLink Switch Parameter Settings

  5. Then check the ZTP settings: cmsh > device > use <nvswitch> ztpsettings

    Figure 19 NVLink Switch ZTP Settings
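The SSH reachability check in the steps above can be repeated for every switch in a rack. The sketch below is one possible way to do that from the head node. It assumes the nine switches in a rack are named <rack location>-NVSW-01 through -NVSW-09, which extrapolates from the single hostname shown above and may not match your naming scheme. It only tests whether TCP port 22 accepts a connection; it does not log in or verify the password.

```python
import socket


def nvlink_switch_hostnames(rack_location, count=9):
    """Build hostnames like '<rack location>-NVSW-01' (naming pattern is an assumption)."""
    return [f"{rack_location}-NVSW-{index:02d}" for index in range(1, count + 1)]


def ssh_port_open(host, port=22, timeout=3.0):
    """Return True if a TCP connection to the SSH port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers DNS failures, refused connections, and timeouts
        return False


def unreachable_switches(rack_location):
    """List the switch hostnames in a rack whose SSH port could not be reached."""
    return [host for host in nvlink_switch_hostnames(rack_location)
            if not ssh_port_open(host)]
```

For example, calling unreachable_switches("A05") on the head node, where A05 is a hypothetical rack location, returns the hostnames that still need attention; an empty list means all nine answered on port 22.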

To verify the settings of the NVLink Switch Configuration:

  1. If the NVLink Switch comes pre-installed with NVOS from the factory and the factory password has already been reset, NVOS Zero Touch Provisioning (ZTP) will be disabled. The NVOS-ZTP script is executed once the device is reachable.

  2. Reset the NVLink Switch to its factory defaults:

    nv action reset system factory-default force
    

    The NVLink Switch reboots automatically (this takes ~2.3 minutes).

  3. Wait for the switch to come up automatically.

  4. On the backend, the following happens:

    • The nvlink-nvos.json file is pulled. The system validates its format and creates the folders for the defined sections in /var/lib/ztp/sections.

    • The sections are processed one at a time, and the following actions are performed:

      • Verify network connectivity.

      • Upgrade the NVOS image.

      • Adjust the security settings by disabling password hardening.

      • Apply the generated startup.yaml and patch configuration.

      • Execute the nvos-ztp.sh script to install cm-lite-daemon. This script does the following:

        • Pulls packages from BCM

        • Registers certificates with BCM

        • Enables the service

  5. Once all the NVLink Switches in the rack are up, the system begins the automatic leader selection process.

  6. One of the nine NVLink Switches is selected, fm_config is applied, and the cluster is then enabled.
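To make the ordering of the backend ZTP actions easier to follow, the sketch below models them as an ordered list that is processed one entry at a time. It is a conceptual illustration only, not the NVOS ZTP implementation: the handler mechanism is hypothetical, and stopping at the first failed action is an assumption of this sketch rather than documented behavior. The action names are taken from the list above.

```python
# Action names follow the ZTP steps listed above; everything else is illustrative.
ZTP_ACTIONS = [
    "verify network connectivity",
    "upgrade NVOS image",
    "disable password hardening",
    "apply startup.yaml and patch configuration",
    "run nvos-ztp.sh to install cm-lite-daemon",
]


def run_ztp_actions(actions, handlers):
    """Run actions in order; return (completed_actions, failed_action_or_None).

    `handlers` maps an action name to a zero-argument callable returning True on
    success. Actions without a handler are treated as successful (assumption).
    """
    completed = []
    for name in actions:
        handler = handlers.get(name, lambda: True)
        if not handler():
            return completed, name  # assumption: stop at the first failure
        completed.append(name)
    return completed, None


# Simulate a run in which the image upgrade fails (hypothetical scenario).
done, failed = run_ztp_actions(ZTP_ACTIONS, {"upgrade NVOS image": lambda: False})
print("completed:", done)
print("failed at:", failed)
```

A model like this can help when reading ZTP logs: the last completed action tells you which stage to investigate next.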

NVLink Switch Workflow Diagram

The NVLink Switch firmware upgrade process is separate and is not part of ZTP.

Figure 20 NVLink Switch Workflow Diagram

Copyright © 2024-2025, NVIDIA Corporation.

Last updated on Sep 30, 2025.