Glossary of Terms

This glossary contains definitions of key terms and concepts used in the DGX SuperPOD Ethernet North-South Network Configuration Guide.

AOC: Active Optical Cable - A type of high-speed data transmission cable that uses fiber optic technology with integrated optical transceivers.
B200: B200 - A DGX system model that is part of the SuperPOD deployment, smaller than the GB200 system.
BCM: Base Command Manager - NVIDIA’s cluster management software that handles deployment, configuration, and management of DGX SuperPOD systems.
BGP: Border Gateway Protocol - A standardized exterior gateway protocol used to exchange routing and reachability information between autonomous systems on the Internet. In this guide, it is used for routing between customer edge devices and cluster border switches.
BTOR: Border Top of Rack - Edge switches that provide connectivity between the cluster and external customer networks, typically with uplinks to customer edge devices.
CIDR: Classless Inter-Domain Routing - A method for allocating IP addresses and IP routing that allows more efficient use of IP addresses than the older classful network addressing architecture.
COMe: Computer on Module - A type of single-board computer designed to plug into a carrier board, baseboard, or backplane for expanded I/O functionality.
cm-lite-daemon: Configuration Management Lite Daemon - A lightweight configuration management service used to automate and monitor NVSwitch setup and integration.
Cumulus Linux: Cumulus Linux - A network operating system that runs on modern white-box switches and routers. Version 5.11 is currently recommended for SuperPOD deployments.
DAC: Direct Attach Cable - A type of high-speed cable assembly used in data centers for short-distance connections between network equipment.
DGX: DGX - NVIDIA’s line of AI supercomputing systems designed for deep learning and AI workloads, including GB200 and B200 models.
DHCP: Dynamic Host Configuration Protocol - A network management protocol used to automate the process of configuring devices on IP networks.
EVPN: Ethernet VPN - A BGP-based control plane for layer 2 and layer 3 VPN services over an IP/MPLS network.
Factory File: Factory File - A file containing component-level MAC address, interface, serial number, and part number information for devices, provided by the manufacturing partner.
Flow: Flow - More precisely called “Connection Type”: the specific type of network connection or traffic pattern between network devices or endpoints.
FTOR: Fabric Top of Rack - Switches used for fabric management and InBand connectivity, typically using SN2201 switches for UFM and NMX servers.
GB200: GB200 - A high-performance DGX system model featuring Grace Blackwell architecture, designed for AI and HPC workloads.
HCL: Hardware Compatibility List - A list of hardware components that have been tested and verified to work with specific software or system configurations.
HCA: Host Channel Adapter - The InfiniBand network adapter installed in a host, which connects that host to the InfiniBand fabric.
InfiniBand: InfiniBand - A high-speed networking standard used for high-performance computing, providing low latency and high bandwidth connectivity.
IPMI: Intelligent Platform Management Interface - A set of computer interface specifications for autonomous monitoring and recovery of computer systems.
L1: Layer 1 - The physical layer of the OSI model, referring to the electrical and physical specifications of the data connection.
L2: Layer 2 - The data link layer of the OSI model, responsible for node-to-node delivery of data frames.
L3: Layer 3 - The network layer of the OSI model, responsible for packet forwarding including routing through intermediate routers.
MAC: Media Access Control - The sublayer of the data link layer that controls access to the network medium. A MAC address is a unique identifier assigned to a network interface for communications at the data link layer of a network segment.
netautogen: Network Auto Generate - A SuperPOD network configuration tool, included in BCM 11, that automatically generates network configurations based on point-to-point (P2P) connectivity and site information. The tool is also known as bcm-netautogen.
NMX: NVIDIA Management eXtension - Management servers used in SuperPOD deployments for cluster orchestration and control plane functions.
North-South: North-South - Network traffic flow between the data center (cluster) and external networks or clients, as opposed to East-West traffic within the cluster.
NVLink Switch: NVLink Switch - NVIDIA’s high-bandwidth, low-latency switch technology designed to interconnect GPUs in multi-GPU systems and clusters.
NVIS: NVIDIA Infrastructure Services - A team responsible for the cabling and connectivity of the SuperPOD deployment.
NVOS: NVIDIA Operating System - The operating system that runs on NVLink Switch devices.
NVUE: NVIDIA User Experience - A command-line interface and API for configuring and managing Cumulus Linux switches.
OOB: Out of Band - A management network that provides access to network infrastructure devices through a dedicated management interface, separate from the main data network.
P2P: Point-to-Point - Direct connections between two network devices or endpoints, referring to the cabling and connectivity plan.
PAM4: Pulse Amplitude Modulation 4-level - A modulation scheme used in high-speed data transmission that encodes two bits per symbol.
PDU: Power Distribution Unit - A device fitted with multiple outputs designed to distribute electric power to computing equipment within a data center rack.
POD: Point of Deployment - A logical grouping of resources in a SuperPOD deployment, typically containing multiple racks and associated networking equipment.
SIB: Site Information Blueprint - A document containing detailed deployment information including network requirements, cabling plans, and site-specific configurations.
SN2201: SN2201 - A 48-port 1 Gigabit Ethernet switch model used for out-of-band management in SuperPOD deployments.
SN5600: SN5600 - A 64-port 800 Gigabit Ethernet switch model used for high-speed data connectivity in SuperPOD deployments.
SPINE: SPINE - Aggregation switches in a leaf-spine network topology that provide connectivity between leaf switches (TORs).
Splunk: Splunk - A data analytics platform used for searching, monitoring, and analyzing machine-generated data from IT infrastructure, applications, and security systems. In SuperPOD deployments, Splunk is commonly used for log aggregation, performance monitoring, and operational intelligence across cluster components.
STOR: Storage Leaf - Switches dedicated to connecting storage appliances and systems to the network fabric.
SuperPOD: SuperPOD - NVIDIA’s reference architecture for large-scale AI computing clusters, combining DGX systems with optimized networking and storage.
SuperSPINE: SuperSPINE - Higher-level spine switches used in large multi-POD deployments to interconnect multiple spine layers.
TOR: Top of Rack - Switches positioned at the top of server racks that provide network connectivity to servers and other equipment in the rack.
UFM: Unified Fabric Manager - NVIDIA’s InfiniBand fabric management software for monitoring, managing, and optimizing InfiniBand networks.
VNI: VXLAN Network Identifier - A 24-bit segment ID used in VXLAN to identify the virtual network segment.
VRF: Virtual Routing and Forwarding - A technology that allows multiple instances of a routing table to coexist within the same router simultaneously.
VTEP: VXLAN Tunnel Endpoint - The entity in VXLAN that originates and/or terminates VXLAN tunnels.
VXLAN: Virtual Extensible LAN - A network virtualization technology that uses a VLAN-like encapsulation technique to encapsulate layer 2 Ethernet frames within layer 4 UDP packets.
ZTP: Zero Touch Provisioning - An automated method for configuring network devices when they boot up for the first time, without manual intervention.
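
A few of the numeric definitions in this glossary (a CIDR prefix, the 24-bit VNI, and PAM4's two bits per symbol) can be checked with a short Python sketch. This is an illustration only, not part of the SuperPOD tooling: the network 10.0.0.0/22 and the helper name pam4_symbols are hypothetical, and the PAM4 mapping deliberately ignores the Gray coding and line encoding that real links typically apply.

```python
import ipaddress

# CIDR: a prefix length replaces the fixed classful boundaries.
# 10.0.0.0/22 is a hypothetical example network, not taken from the guide.
net = ipaddress.ip_network("10.0.0.0/22")
address_count = net.num_addresses   # 2 ** (32 - 22) = 1024
netmask = str(net.netmask)          # "255.255.252.0"

# VNI: a 24-bit segment ID, so the ID space has 2 ** 24 values.
VNI_BITS = 24
vni_count = 2 ** VNI_BITS           # 16777216


def pam4_symbols(data: bytes) -> list[int]:
    """Split each byte into four 2-bit symbols (0..3), most significant first.

    PAM4 has four amplitude levels, so each symbol carries log2(4) = 2 bits.
    """
    symbols = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            symbols.append((byte >> shift) & 0b11)
    return symbols
```

For example, the byte 0x1B (binary 00011011) maps to the four symbols 0, 1, 2, 3, so one byte always occupies four PAM4 symbols.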