NVIDIA NVOS User Manual for InfiniBand Switches v25.02.6007

Overview

These pages are intended for network administrators who are responsible for configuring and managing NVOS platforms.

The following table lists the documents referenced in this user manual.

Document Name

Description

System Hardware User Manual

This document contains hardware descriptions, LED assignments, and hardware specifications.

Switch Product Release Notes

Refer to the release notes for the relevant switch system or series.

Term

Description

AAA

Authentication, Authorization, and Accounting. A security framework for controlling and tracking user access within a computer network. Authentication verifies user credentials; Authorization grants or denies privileges; Accounting tracks user activities and resource consumption.

ACL

Access Control List. A set of filtering rules applied to interfaces or ports that control network traffic based on source/destination IP, port number, or protocol type.

ARP

Address Resolution Protocol. Maps IP addresses to physical MAC addresses for communication within a local area network (LAN).

ASIC

Application-Specific Integrated Circuit. A custom-designed chip optimized for specific functions such as packet forwarding in network switches.

BIOS

Basic Input/Output System. Firmware that initializes and tests hardware components during system boot and loads the operating system.

BMC

Baseboard Management Controller. A service processor that enables out-of-band hardware monitoring and management (temperature, fan speed, power).

CLI

Command-Line Interface. A text-based interface used to configure, manage, and monitor the switch through command input.

Core Dump

A file containing memory and process state at the time of a crash, used for debugging.

Crashdump

Diagnostic data collected automatically after a system or process failure.

CPLD

Complex Programmable Logic Device. A programmable logic device used for control logic, often handling board-level management and power sequencing.

DHCP

Dynamic Host Configuration Protocol. Automatically assigns IP addresses and network configuration parameters to devices on a network.

DNS

Domain Name System. Translates human-readable domain names into corresponding IP addresses for network communication.

ERoT

External Root of Trust. A secure cryptographic hardware component that verifies system integrity and authenticates firmware during boot.

EEPROM

Electrically Erasable Programmable Read-Only Memory. Non-volatile memory used to store firmware or configuration data that must persist across power cycles.

Ethernet

A family of networking technologies used for LAN communication, defining physical and data link layer protocols.

FPGA

Field Programmable Gate Array. A reconfigurable integrated circuit that can be programmed post-manufacturing to implement custom logic or accelerate specific functions.

Fabric

The interconnected topology of switches and links forming a unified network for data transfer, commonly used in high-performance environments.

FRU

Field Replaceable Unit. A hardware component that can be replaced without special tools, typically including fans, power supplies, and management modules.

FTP / TFTP / SFTP

File Transfer Protocols. Protocols used to transfer files between systems. FTP runs over TCP; TFTP is a simplified protocol that runs over connectionless UDP; SFTP provides encrypted transfer over SSH.

gNMI

gRPC Network Management Interface. A gRPC-based protocol for network configuration and streaming telemetry, supporting real-time monitoring.

GUI

Graphical User Interface. A visual interface allowing users to interact with and configure the system via icons, menus, and windows rather than command lines.

Host

A device connected to a network that can send, receive, and process data with other network devices.

ICMP

Internet Control Message Protocol. Used to send control messages and error notifications between network devices, commonly used for ping tests.

I2C

Inter-Integrated Circuit. A low-speed serial communication bus used internally on circuit boards for connecting components such as sensors, EEPROMs, or CPLDs.

LACP

Link Aggregation Control Protocol. A protocol for combining multiple physical network links into a single logical interface for redundancy and performance.

LAG

Link Aggregation Group. A logical grouping of multiple physical links to increase bandwidth and provide redundancy.

Loopback Interface

A virtual interface used for testing, routing, and management reachability.

MAC

Media Access Control Address. A unique hardware identifier assigned to each network interface for communication at the data link layer.

MTU

Maximum Transmission Unit. The largest payload size (in bytes) that can be sent in a single packet without fragmentation.

NTP

Network Time Protocol. Synchronizes system clocks of devices across a network to maintain accurate timekeeping.

NVRAM

Non-Volatile Random Access Memory. Memory that retains data even when power is turned off, often used for configuration storage.

NVLink

A high-speed interconnect developed by NVIDIA that allows direct communication between GPUs or between GPU and CPU.

NVOS

NVIDIA Operating System. Provides system and network management functionality for NVIDIA NVLink and InfiniBand switches.

Network Adapter

A hardware component that enables communication between a computer or device and a network.

Overlay Network

A virtual network built on top of a physical underlay to abstract or isolate traffic, e.g., VXLAN-based networks.

PCIe

Peripheral Component Interconnect Express. A high-speed interface standard for connecting hardware components like NICs and GPUs.

QoS

Quality of Service. A set of mechanisms that prioritize or limit network traffic based on policies to ensure performance for critical applications.

REST API

Representational State Transfer API. A web-based interface that allows external systems to configure and monitor devices using HTTP/HTTPS requests.

Routing Table

A data table stored in a switch or router that lists paths to particular network destinations.

SA

Subnet Agent. A process running on each node that communicates with the Subnet Manager (SM) to maintain fabric topology and routing data.

SCP

Secure Copy Protocol. A secure file transfer method that uses SSH to copy files between local and remote hosts.

SM

Subnet Manager. The central process that initializes and maintains the InfiniBand or NVLink subnet, assigns local IDs (LIDs), and manages routing.

SNMP

Simple Network Management Protocol. Used for monitoring and managing network devices and collecting operational data.

SNTP

Simple Network Time Protocol. A simplified version of NTP for basic clock synchronization.

SPI

Serial Peripheral Interface. A synchronous serial communication interface used for short-distance communication between microcontrollers and peripherals.

SSH

Secure Shell. A cryptographic network protocol enabling secure remote login and command execution between devices.

syslog

A standard protocol for sending system log or event messages to a centralized log server for monitoring and analysis.

TACACS+

Terminal Access Controller Access-Control System Plus. A protocol that provides centralized authentication, authorization, and accounting for network device access.

TPM

Trusted Platform Module. A hardware chip that provides hardware-based security, cryptographic key storage, and secure boot validation.

Underlay Network

The physical network that provides IP connectivity for overlay networks or tunnels.

VRF

Virtual Routing and Forwarding. Enables multiple routing tables to coexist on the same switch, allowing network segmentation and traffic isolation.

VLAN

Virtual Local Area Network. A logical grouping of devices that communicate as if on the same LAN, even if physically separated.

Watchdog Timer

A hardware or software timer that resets the system if it becomes unresponsive.

ZTP

Zero-Touch Provisioning. Automates initial device configuration by allowing the switch to fetch and apply configuration files and firmware on first boot.

Feature

Detail

Software management

  • Software updates

  • Firmware updates

Logging

  • System log

  • Tech-support

Management interface

  • IP

  • DHCP

  • Hostname

Chassis management

  • Monitoring environmental controls

Network management interfaces

  • OpenAPI

Security

  • Password hardening

Cables & transceivers

  • Transceiver info

Feature

Detail

Subnet manager

  • OpenSM

IB port management

  • IB port interfaces

IB Fabric

  • IB devices

© Copyright 2025, NVIDIA. Last updated on Nov 16, 2025