Glossary of Terms#

This glossary contains definitions of key terms and concepts used in the DGX SuperPOD Ethernet North-South Network Configuration Guide.

AOC#

Active Optical Cable - A type of high-speed data transmission cable that uses fiber optic technology with integrated optical transceivers.

B200#

B200 - A DGX system model used in SuperPOD deployments. Each DGX B200 system contains eight Blackwell GPUs and is smaller in scale than the GB200 system.

BCM#

Base Command Manager - NVIDIA’s cluster management software that handles deployment, configuration, and management of DGX SuperPOD systems.

BGP#

Border Gateway Protocol - A standardized exterior gateway protocol used to exchange routing and reachability information between autonomous systems on the Internet. In SuperPOD deployments, BGP is used for routing between customer edge devices and cluster border switches.

BTOR#

Border Top of Rack - Edge switches that provide connectivity between the cluster and external customer networks, typically with uplinks to customer edge devices.

CIDR#

Classless Inter-Domain Routing - A method for allocating IP addresses and IP routing that allows more efficient use of IP addresses than the older classful network addressing architecture.
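
As a brief illustration, CIDR notation can be explored with Python's standard `ipaddress` module. The prefix `10.0.0.0/22` used here is a hypothetical example and is not taken from any SuperPOD address plan.

```python
import ipaddress

# Hypothetical example prefix; not from any SuperPOD address plan.
network = ipaddress.ip_network("10.0.0.0/22")

prefix_length = network.prefixlen        # 22 network bits
total_addresses = network.num_addresses  # 2 ** (32 - 22) addresses
netmask = str(network.netmask)           # dotted-decimal form of the /22 mask

# Split the /22 into four /24 subnets.
subnets = [str(subnet) for subnet in network.subnets(new_prefix=24)]

print(prefix_length, total_addresses, netmask)
print(subnets)
```

The prefix length, rather than a fixed address class, determines how many addresses a block contains, which is what lets a block be sized to the actual need.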

COMe#

Computer on Module - A type of single-board computer designed to plug into a carrier board, baseboard, or backplane for expanded I/O functionality.

cm-lite-daemon#

Cluster Manager Lite Daemon - A lightweight BCM management service used to automate and monitor NVSwitch setup and integration.

Cumulus Linux#

Cumulus Linux - A network operating system that runs on white-box switches and routers. Version 5.11 is currently recommended for SuperPOD deployments.

DAC#

Direct Attach Cable - A type of high-speed cable assembly used in data centers for short-distance connections between network equipment.

DGX#

DGX - NVIDIA’s line of AI supercomputing systems designed for deep learning and AI workloads, including GB200 and B200 models.

DHCP#

Dynamic Host Configuration Protocol - A network management protocol used to automate the process of configuring devices on IP networks.

EVPN#

Ethernet VPN - A BGP-based control plane for layer 2 and layer 3 VPN services over an IP/MPLS network.

Factory File#

Factory File - A file containing component-level MAC address, interface, serial number, and part number information for devices, provided by the manufacturing partner.

Flow#

Flow - More precisely called “Connection Type”. It describes the specific type of network connection or traffic pattern between network devices or endpoints.

FTOR#

Fabric Top of Rack - Switches used for fabric management and InBand connectivity, typically using SN2201 switches for UFM and NMX servers.

GB200#

GB200 - A high-performance DGX system model featuring Grace Blackwell architecture, designed for AI and HPC workloads.

HCL#

Hardware Compatibility List - A list of hardware components that have been tested and verified to work with specific software or system configurations.

HCA#

Host Channel Adapter - The network adapter that connects a host (server) to an InfiniBand fabric, comparable to a network interface card (NIC) in an Ethernet network.

InfiniBand#

InfiniBand - A high-speed networking standard used for high-performance computing, providing low latency and high bandwidth connectivity.

IPMI#

Intelligent Platform Management Interface - A set of computer interface specifications for autonomous monitoring and recovery of computer systems.

L1#

Layer 1 - The physical layer of the OSI model, referring to the electrical and physical specifications of the data connection.

L2#

Layer 2 - The data link layer of the OSI model, responsible for node-to-node delivery of data frames.

L3#

Layer 3 - The network layer of the OSI model, responsible for packet forwarding including routing through intermediate routers.

MAC#

Media Access Control - A sublayer of the data link layer. A MAC address is a unique identifier assigned to a network interface for communications at the data link layer of a network segment.
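
MAC addresses are often written in different notations depending on the source (colon-separated, hyphen-separated, or dotted groups of four hex digits). A minimal sketch of normalizing them to one form is shown below. The helper name `normalize_mac` and the sample addresses are hypothetical and are not part of any NVIDIA tool.

```python
import re

def normalize_mac(raw: str) -> str:
    """Return a 48-bit MAC address as lowercase, colon-separated octets."""
    # Remove the common separators (colon, hyphen, dot) and surrounding spaces.
    digits = re.sub(r"[:.\-]", "", raw.strip()).lower()
    # A 48-bit address is exactly 12 hexadecimal digits.
    if not re.fullmatch(r"[0-9a-f]{12}", digits):
        raise ValueError(f"not a 48-bit MAC address: {raw!r}")
    return ":".join(digits[i:i + 2] for i in range(0, 12, 2))

# Hypothetical sample addresses in two different notations.
print(normalize_mac("02-00-5E-10-00-01"))
print(normalize_mac("0200.5e10.0001"))
```

Normalizing before comparison avoids false mismatches when the same interface is listed in two notations.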

netautogen#

Network Auto Generate - A SuperPOD network configuration tool, included in BCM 11, that automatically generates network configurations from point-to-point (P2P) connectivity and site information. The tool is also known as bcm-netautogen.

NMX#

NVIDIA Management eXtension - Management servers used in SuperPOD deployments for cluster orchestration and control plane functions.

North-South#

North-South - Network traffic flow between the data center (cluster) and external networks or clients, as opposed to East-West traffic within the cluster.

NVLink Switch#

NVLink Switch - NVIDIA’s high-bandwidth, low-latency switch technology designed to interconnect GPUs in multi-GPU systems and clusters.

NVIS#

NVIDIA Infrastructure Services - A team responsible for the cabling and connectivity of the SuperPOD deployment.

NVOS#

NVIDIA Operating System - The operating system that runs on NVLink Switch devices.

NVUE#

NVIDIA User Experience - A command-line interface and API for configuring and managing Cumulus Linux switches.

OOB#

Out of Band - A management network that provides access to network infrastructure devices through a dedicated management interface, separate from the main data network.

P2P#

Point-to-Point - Direct connections between two network devices or endpoints. In this guide, P2P refers to the cabling and connectivity plan.

PAM4#

Pulse Amplitude Modulation 4-level - A modulation scheme used in high-speed data transmission that encodes two bits per symbol.
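
To make "two bits per symbol" concrete, the sketch below maps bit pairs onto four amplitude levels. The normalized levels (-3, -1, +1, +3) and the Gray-coded mapping are illustrative assumptions, and the function name `pam4_encode` is hypothetical.

```python
# Assumed Gray-coded mapping: adjacent levels differ by exactly one bit.
GRAY_PAM4_LEVELS = {(0, 0): -3, (0, 1): -1, (1, 1): 1, (1, 0): 3}

def pam4_encode(bits):
    """Map a bit sequence onto PAM4 symbols, two bits per symbol."""
    if len(bits) % 2:
        raise ValueError("PAM4 needs an even number of bits")
    return [GRAY_PAM4_LEVELS[(bits[i], bits[i + 1])]
            for i in range(0, len(bits), 2)]

# Eight bits become four symbols, half as many symbols as bits.
symbols = pam4_encode([0, 0, 0, 1, 1, 1, 1, 0])
print(symbols)  # [-3, -1, 1, 3]
```

Because each symbol carries two bits, a PAM4 link transfers twice the data of a two-level scheme at the same symbol rate.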

PDU#

Power Distribution Unit - A device fitted with multiple outputs designed to distribute electric power to computing equipment within a data center rack.

POD#

Point of Deployment - A logical grouping of resources in a SuperPOD deployment, typically containing multiple racks and associated networking equipment.

SIB#

Site Information Blueprint - A document containing detailed deployment information including network requirements, cabling plans, and site-specific configurations.

SN2201#

SN2201 - A 48-port 1 Gigabit Ethernet switch model used for out-of-band management in SuperPOD deployments.

SN5600#

SN5600 - A 64-port 800 Gigabit Ethernet switch model used for high-speed data connectivity in SuperPOD deployments.

SPINE#

SPINE - Aggregation switches in a leaf-spine network topology that provide connectivity between leaf switches (TORs).

Splunk#

Splunk - A data analytics platform used for searching, monitoring, and analyzing machine-generated data from IT infrastructure, applications, and security systems. In SuperPOD deployments, Splunk is commonly used for log aggregation, performance monitoring, and operational intelligence across cluster components.

STOR#

Storage Leaf - Switches dedicated to connecting storage appliances and systems to the network fabric.

SuperPOD#

SuperPOD - NVIDIA’s reference architecture for large-scale AI computing clusters, combining DGX systems with optimized networking and storage.

SuperSPINE#

SuperSPINE - Higher-level spine switches used in large multi-POD deployments to interconnect multiple spine layers.

TOR#

Top of Rack - Switches positioned at the top of server racks that provide network connectivity to servers and other equipment in the rack.

UFM#

Unified Fabric Manager - NVIDIA’s InfiniBand fabric management software for monitoring, managing, and optimizing InfiniBand networks.

VNI#

VXLAN Network Identifier - A 24-bit segment ID used in VXLAN to identify the virtual network segment.

VRF#

Virtual Routing and Forwarding - A technology that allows multiple instances of a routing table to coexist within the same router simultaneously.

VTEP#

VXLAN Tunnel Endpoint - The entity in VXLAN that originates and/or terminates VXLAN tunnels.

VXLAN#

Virtual Extensible LAN - A network virtualization technology that uses a VLAN-like encapsulation technique to encapsulate layer 2 Ethernet frames within layer 4 UDP packets.
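
The VNI and VXLAN entries can be tied together with a small sketch of the 8-byte VXLAN header defined in RFC 7348, which is carried in a UDP datagram ahead of the encapsulated Ethernet frame. The function names and the example VNI `10100` are hypothetical. Only the header is built here, not a full packet.

```python
import struct

VXLAN_UDP_PORT = 4789        # IANA-assigned UDP destination port for VXLAN
VXLAN_FLAG_VNI_VALID = 0x08  # "I" flag: the VNI field is valid
MAX_VNI = 2**24 - 1          # the VNI is a 24-bit field

def build_vxlan_header(vni: int) -> bytes:
    """Pack the 8-byte VXLAN header for the given VNI."""
    if not 0 <= vni <= MAX_VNI:
        raise ValueError(f"VNI out of 24-bit range: {vni}")
    # Word 1: flags (8 bits) followed by 24 reserved bits.
    # Word 2: VNI (24 bits) followed by 8 reserved bits.
    return struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)

def parse_vni(header: bytes) -> int:
    """Extract the VNI from the first 8 bytes of a VXLAN header."""
    _, second_word = struct.unpack("!II", header[:8])
    return second_word >> 8

header = build_vxlan_header(10100)  # hypothetical example VNI
print(header.hex(), parse_vni(header), MAX_VNI + 1)
```

The 24-bit field allows 16,777,216 distinct segments, compared with the 12-bit VLAN ID used by 802.1Q.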

ZTP#

Zero Touch Provisioning - An automated method for configuring network devices when they boot up for the first time, without manual intervention.