NVIDIA NEO v2.7
NVIDIA MELLANOX NEO DOCUMENTATION

Introduction

Mellanox NEO® is a powerful platform for managing scale-out computing networks. Mellanox NEO enables data center operators to efficiently provision, monitor and operate the modern data center fabric.

Mellanox NEO serves as an interface to the fabric, thereby extending existing tool capabilities into monitoring and provisioning the data center network. Mellanox NEO uses an extensive set of REST APIs to allow access to fabric-related data and provisioning activities.

Mellanox NEO eliminates the complexity of fabric management. It automates the configuration of devices, provides deep visibility into traffic and health, and provides early detection of errors and failures.

Mellanox NEO incorporates a monitoring mechanism that can be combined with Mellanox Care®, a support program that offers 24/7 fabric management services to monitor network health. This mechanism traps network events and issues regular notifications to Mellanox’s Network Operations Center (Mellanox NOC). Special Mellanox personnel analyze the details of the reported events and take action according to the service level agreement (SLA).

Mellanox NEO presents the following benefits:

  • Reduces complexity of fabric management

  • Provides in-depth visibility into traffic and health information

  • Supports integration, automation, and SDN programmable fabrics through its network API

  • Provides historical health and performance graphs

  • Generates preventive maintenance and “soft degradation” alerts

  • Quickly troubleshoots topology and connectivity issues

  • Integrates and streamlines fabric information for your IT systems

  • Combined with Mellanox Care, produces regular event notifications to Mellanox NOC for 24/7 health monitoring

Central Management Console

Mellanox NEO provides network and device management functions via one central console. Its centralized dashboard can be used to monitor, troubleshoot, configure and optimize the system via a single interface.

In-Depth Visibility and Control

Mellanox NEO includes an advanced granular monitoring engine that provides real-time access to switches, enabling cluster-wide health and performance monitoring, real-time identification of problems and failures, and quick problem resolution via granular threshold-based alerts and its utilization dashboard.

Quick Resolution of Problems

Mellanox NEO provides comprehensive information from switches, showing errors and traffic issues such as congestion. The information is presented concisely over a unified dashboard and configurable monitoring sessions. The monitored data can be correlated per job and customer, and threshold-based alarms can be set.

Open Architecture

Mellanox NEO provides an advanced REST interface and an SDK package that integrate with external management tools. This combination enables data center administrators to consolidate management dashboards while seamlessly sharing information among the various management applications, synchronizing overall resource scheduling, and simplifying provisioning and administration.

Mellanox NEO as Network API

Mellanox NEO serves as an interface to the fabric, thereby extending existing tool capabilities into monitoring and provisioning the data center network. Mellanox NEO uses an extensive set of REST APIs to allow access to fabric-related data and provisioning activities.

The interface can provide external tools with the fabric topology, device health and performance status, alerts, and device and fabric management actions. This allows taking advantage of existing tools and enhancing them, as well as building new DevOps oriented management frameworks.
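As a rough illustration of how an external tool might consume such an interface, the sketch below builds an HTTPS GET request and summarizes a device-health payload. The host name, the resource path, and the JSON record shape are all hypothetical placeholders, not documented NEO endpoints or schemas; the real values are defined in the NEO SDK User Manual. The request is only constructed, never sent.

```python
import json
import urllib.request

# Hypothetical values: the host name and resource path are placeholders,
# not documented NEO endpoints. Take the real ones from the NEO SDK User Manual.
NEO_HOST = "neo.example.com"
TOPOLOGY_PATH = "/api/placeholder/topology"


def build_get_request(host: str, path: str) -> urllib.request.Request:
    """Build (but do not send) an HTTPS GET request that asks for JSON."""
    url = f"https://{host}{path}"
    return urllib.request.Request(
        url, headers={"Accept": "application/json"}, method="GET"
    )


def summarize_devices(payload: str) -> dict:
    """Count devices per health state in a JSON list of device records.

    The record shape ({"name": ..., "health": ...}) is assumed for illustration.
    """
    counts = {}
    for device in json.loads(payload):
        state = device.get("health", "unknown")
        counts[state] = counts.get(state, 0) + 1
    return counts


request = build_get_request(NEO_HOST, TOPOLOGY_PATH)

# Made-up sample response, standing in for what a REST call might return.
sample = (
    '[{"name": "sw1", "health": "ok"},'
    ' {"name": "sw2", "health": "warning"},'
    ' {"name": "sw3", "health": "ok"}]'
)
summary = summarize_devices(sample)
```

Keeping request construction separate from response handling makes it straightforward to swap in the actual endpoint and authentication scheme once they are known.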

For further information on the Mellanox NEO API, refer to the NEO SDK User Manual.

The Mellanox NEO architecture includes a controller and service providers (Device Manager, Provisioning, Fabric Manager, Monitoring and Access Credentials Manager). The controller transfers information from the service providers to the user, controls the service providers, and verifies their status. It queries and fetches information from providers and performs operations such as:

  • Storing a list of supported logs for each provider

  • Pausing, resuming, resetting, and fetching a specific log

  • Maintaining a connection with a provider
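The log-related operations listed above can be pictured with a small in-memory model. This is a toy sketch, not NEO's implementation: the class and method names are invented for illustration, and it assumes that a paused log simply stops recording new lines until it is resumed.

```python
from dataclasses import dataclass, field


@dataclass
class ProviderLog:
    """State of one log exposed by a provider (illustrative model)."""
    lines: list = field(default_factory=list)
    paused: bool = False


class LogController:
    """Toy controller that tracks the supported logs of each provider."""

    def __init__(self):
        # provider name -> {log name -> ProviderLog}
        self._logs = {}

    def register(self, provider, log_names):
        """Store the list of logs a provider supports."""
        self._logs[provider] = {name: ProviderLog() for name in log_names}

    def supported_logs(self, provider):
        return sorted(self._logs[provider])

    def append(self, provider, log, line):
        """Record a line unless the log is paused (assumed semantics)."""
        entry = self._logs[provider][log]
        if not entry.paused:
            entry.lines.append(line)

    def pause(self, provider, log):
        self._logs[provider][log].paused = True

    def resume(self, provider, log):
        self._logs[provider][log].paused = False

    def reset(self, provider, log):
        """Discard everything recorded so far."""
        self._logs[provider][log].lines.clear()

    def fetch(self, provider, log):
        """Return a copy of the recorded lines."""
        return list(self._logs[provider][log].lines)


controller = LogController()
controller.register("monitoring", ["events", "errors"])
controller.append("monitoring", "events", "port 1 up")
controller.pause("monitoring", "events")
controller.append("monitoring", "events", "ignored while paused")
controller.resume("monitoring", "events")
controller.append("monitoring", "events", "port 2 up")
fetched = controller.fetch("monitoring", "events")
```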


Mellanox NEO Graphical User Interface

The Mellanox NEO Web GUI is the front end of the application. It communicates with the Mellanox NEO REST API to retrieve and display the relevant information.

Mellanox NEO Controller

The Mellanox NEO controller is the central component enabling data collection from all the service providers. The collected data is maintained in a central repository. The controller exposes a RESTful API that allows retrieving any type of information and running any type of supported action.

Mellanox NEO Providers

The Mellanox NEO providers listed below are the data sources for the controller. Each provider is an independent process (service) which communicates with the controller.

  • Device Management Provider

  • Provisioning Provider

  • Monitoring Provider

  • Access Credentials Provider

  • IP Discovery Provider

  • Telemetry Provider

  • Ethernet Connectivity (LLDP) Discovery Provider

  • IB Provider

  • Solution Provider

  • Virtualization Provider

  • Host Manager Provider

  • Performance Provider

Mellanox NEO utilizes the following communication protocols:

Protocol    Purpose

HTTPS       Collecting chassis data regarding Mellanox devices and Windows servers

SNMP        Collecting connectivity data, monitoring data and general data from switches

SSH         Switch/Linux provisioning
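When planning network access between the NEO server and managed devices, the protocol list above can be expressed as data. In the sketch below, the purposes come from that list, while the port numbers and transports are the standard IANA defaults for each protocol. They are an assumption here, since a given deployment may use different ports.

```python
# Purposes are taken from the protocol list above. Ports and transports are
# the IANA well-known defaults, not values confirmed for a NEO deployment.
PROTOCOLS = {
    "HTTPS": {
        "port": 443,
        "transport": "tcp",
        "purpose": "Chassis data from Mellanox devices and Windows servers",
    },
    "SNMP": {
        "port": 161,
        "transport": "udp",
        "purpose": "Connectivity, monitoring and general data from switches",
    },
    "SSH": {
        "port": 22,
        "transport": "tcp",
        "purpose": "Switch/Linux provisioning",
    },
}


def allow_rules(protocols):
    """Return sorted 'transport/port' strings, e.g. for a firewall checklist."""
    return sorted(
        f"{info['transport']}/{info['port']}" for info in protocols.values()
    )


rules = allow_rules(PROTOCOLS)
```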

© Copyright 2023, NVIDIA. Last updated on Nov 14, 2023.