Known Issues

NVIDIA MELLANOX NEO DOCUMENTATION

This section lists the known issues in this version of Mellanox NEO® with available workarounds.

Ref. #

Issue

2388786

Description: Network map periodically refreshes. When that happens, selected items become unselected.

Workaround: N/A

Keywords: Network Map

Detected in version: 2.7

2412055

Description: The What Just Happened "Export to CSV" button is missing from the UI.

Workaround: N/A

Keywords: WJH, WebUI

Detected in version: 2.7

2187852

Description: Any change made to the "RoCE" sub-category under "Events Policy" in NEO 2.4 or earlier is lost upon NEO upgrade. The user must reconfigure it after the upgrade.

Workaround: N/A

Keywords: RoCE, upgrade

Detected in version: 2.7

2107670

Description: When working with NEO, any switch configuration performed directly on the switch (via the switch CLI) might conflict with the NEO configuration and interfere with NEO switch management and configuration.

Workaround: N/A

Keywords: NEO, CLI, configuration, conflict, overwrite

Detected in version: 2.7

2225134

Description: In Streaming Settings, the events "TTL value is too small" and "Packet size is larger than MTU" (under Forwarding > L3) are always streamed even if configured otherwise.

Workaround: N/A

Keywords: Streaming, MTU, TTL

Detected in version: 2.7

2147008

Description: RoCE service does not display MLAG port-channel traffic.

Workaround: N/A

Keywords: RoCE, MPo

Detected in version: 2.7

2211606

Description: The NEO dashboards, the WJH dashboard in particular, may at times become slow to respond.

Workaround: N/A

Keywords: WJH, dashboard, slow

Detected in version: 2.7

2248362

Description: Upgrading Mellanox NEO to software version 2.7 rebuilds the database and removes all Telemetry and WJH data.

Workaround: N/A

Keywords: WJH, database

Detected in version: 2.7

2239514

Description: Any task with a snapshot created on top of it in NEO version 2.4 (or older) is displayed under "Telemetry" → "Snapshots", not under "Tasks".

Workaround: N/A

Keywords: Telemetry, task, snapshot

Detected in version: 2.7

2239662

Description: Any collector added to a telemetry session in NEO version 2.4 (or older) is detached from the session after Mellanox NEO upgrade.

Workaround: Reattach the collector after the software upgrade.

Keywords: Collector, session, upgrade

Detected in version: 2.7

2245329

Description: The WJH buffer drop trap_probability is the probability of extracting a packet from the stream of packets that hit an exception. If the packet rate is low, the extraction rate can deviate from the configured value.

Workaround: N/A

Keywords: WJH

Detected in version: 2.7
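
The deviation described for trap_probability can be illustrated with a small simulation. This is only a sketch: it assumes each packet is extracted independently with the configured probability, which the source does not confirm, and the function name observed_extraction_rate is hypothetical.

```python
import random


def observed_extraction_rate(packet_count, trap_probability, seed=0):
    """Fraction of packets extracted when each one is kept independently
    with probability trap_probability (an assumed sampling model)."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    sampled = sum(1 for _ in range(packet_count) if rng.random() < trap_probability)
    return sampled / packet_count


# Compare a low packet count with a high one for the same configured value.
for count in (10, 100, 100_000):
    rate = observed_extraction_rate(count, trap_probability=0.1)
    print(f"{count:>7} packets: observed rate {rate:.4f} (configured 0.1000)")
```

With only a handful of packets the observed rate can land far from the configured value, while a large packet count brings it close.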

2098905

Description: NEO OpenStack integration is supported only over HTTPS protocol (HTTP is not supported).

Workaround: N/A

Keywords: HTTPS, OpenStack

Detected in version: 2.6

2119441

Description: NEO supports up to 10 MPOs in a single "apply MLAG" service.

Workaround: If more MPOs are required, the service must be updated and re-applied.

Keywords: MLAG, MPO

Detected in version: 2.6
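
When planning a deployment with more than 10 MPOs, it may help to split them into groups ahead of time, one group per update-and-re-apply cycle. The sketch below only does the grouping; batch_mpos and the integer MPO IDs are hypothetical and it does not call any NEO API.

```python
def batch_mpos(mpo_ids, max_per_apply=10):
    """Split a list of MPO IDs into groups of at most max_per_apply items."""
    if max_per_apply < 1:
        raise ValueError("max_per_apply must be at least 1")
    return [mpo_ids[i:i + max_per_apply] for i in range(0, len(mpo_ids), max_per_apply)]


# 23 MPOs need three service applications: 10, 10, and 3.
for round_number, group in enumerate(batch_mpos(list(range(1, 24))), start=1):
    print(f"apply round {round_number}: {len(group)} MPOs -> {group}")
```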

-

Description: Switch systems running Onyx 3.9.0300 may reach high CPU utilization when interoperating with NEO.

Workaround: Run the command "ssh server login record-period 1" in order to avoid this.

Keywords: Onyx, high CPU utilization

Detected in version: 2.6

2126093

Description: Mellanox Onyx® switches support up to 64 buffer histogram samplings.

Workaround: N/A

Keywords: Histogram, buffer events, telemetry

Detected in version: 2.6

2118673

Description: The following 3rd party systems are not supported by Mellanox NEO®: Arista, Brocade, Cisco, Juniper, and HP.

It is not possible to add new systems from these vendors to NEO. However, if NEO is upgraded from an older version where these switch systems have been added, then they can be presented and managed without issue. NEO is still able to detect these 3rd party systems by IP range scanning or by LLDP.

Workaround: N/A

Keywords: 3rd party, switch systems

Detected in version: 2.6

2107670

Description: Switch configuration performed directly on the switches (via switch CLI) may conflict with Mellanox NEO configuration and interfere with NEO switch management and configuration.

Workaround: Only use NEO for managing configuration over the managed switches (avoid manual configuration on the managed switches).

Keywords: Configuration, CLI

Detected in version: 2.6

1918927

Description: In-band migration is not supported.

Workaround: N/A

Keywords: VLAN provisioning

Detected in version: 2.6

1912682

Description: Telemetry Agent does not provide telemetry information on split ports if they are configured while the agent is running.

Workaround: Restart Telemetry Agent.

Keywords: Telemetry Agent, split ports

Detected in version: 2.6

1952279

Description: Every new device added to NEO must have a unique management IPv4 address; otherwise, the displayed device data might be corrupted.

Workaround: N/A

Keywords: Management Elements

Detected in version: 2.6
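
Before adding devices, a device list can be checked for duplicate management IPv4 addresses. The sketch below assumes the inventory is available as a simple name-to-address mapping; that data shape and the function name find_duplicate_mgmt_ips are hypothetical and not part of NEO.

```python
import ipaddress
from collections import defaultdict


def find_duplicate_mgmt_ips(devices):
    """Return {address: [device names]} for every IPv4 address that is
    assigned to more than one device. Raises ValueError on invalid addresses."""
    by_ip = defaultdict(list)
    for name, addr in devices.items():
        by_ip[ipaddress.IPv4Address(addr)].append(name)
    return {str(ip): sorted(names) for ip, names in by_ip.items() if len(names) > 1}


# Hypothetical inventory: host1 reuses the address of sw1.
inventory = {"sw1": "10.0.0.1", "sw2": "10.0.0.2", "host1": "10.0.0.1"}
print(find_duplicate_mgmt_ips(inventory))
```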

2082427

Description: When configuring LLDP on the host interface, LLDP must be configured to publish the host management IPv4 address. Otherwise, the host is not presented correctly in Mellanox NEO.

Workaround: N/A

Keywords: LLDP

Detected in version: 2.6

2061726

Description: When adding a Cumulus switch to Mellanox NEO, the initial discovery results in SNMP failure. Once SNMP is configured on the switch, it returns to normal status (i.e. "OK").

Workaround: N/A

Keywords: Cumulus, SNMP

Detected in version: 2.6

Description: NEO v2.5.1 supports up to 50 managed switches. If Mellanox NEO is managing more than 20 switches, it is recommended to use an SSD; otherwise, NEO performance issues are expected.

Workaround: N/A

Keywords: Managed switches, performance

Detected in version: 2.5.1

Description: Running the Telemetry Agent configuration provisioning templates attempts to restart the Agent but fails to start it.

Affected Telemetry Agent Provisioning Templates:

  • Agent-Active-Ports-Update

  • Agent-Interval-Factor-Change

  • Agent-Port-Channel-Discovery

Workaround:

Edit the template in NEO:

  1. Click the “Edit” option.

  2. Replace the command docker exec neo-agent "/etc/init.d/telemetryd restart" with the command fae docker cmd "restart neo-agent"

  3. Click the “Apply” button to save the changes.

Keywords: Telemetry Agent Provisioning Templates

Detected in version: 2.5.1
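
If the template text is edited outside the NEO UI (for example, after copying it from the editor), the substitution in step 2 amounts to a plain string replacement. This is a sketch under that assumption; patch_template and the sample template line are hypothetical, and only the two commands come from the workaround above.

```python
# Commands taken verbatim from the workaround text.
OLD_COMMAND = 'docker exec neo-agent "/etc/init.d/telemetryd restart"'
NEW_COMMAND = 'fae docker cmd "restart neo-agent"'


def patch_template(template_text):
    """Replace the failing restart command with the working one."""
    return template_text.replace(OLD_COMMAND, NEW_COMMAND)


# Hypothetical template content containing the old command.
sample = 'enable\n' + OLD_COMMAND + '\n'
print(patch_template(sample))
```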

Description: Running WJH on a Cumulus switch is not supported. (Up to the release of NEO v2.5, no Cumulus version that supports WJH existed.)

Workaround: N/A

Keywords: WJH, Support, Cumulus

Detected in version: 2.5

Description: Running WJH and Threshold Events telemetry sessions on Onyx switches is supported only for Onyx version 3.8.2004 or newer.

Workaround: N/A

Keywords: WJH, Threshold events, Onyx Version

Detected in version: 2.5

1922607

Description: If a device (Linux host or switch) is removed from NEO while some Mellanox switches are running telemetry, then all the telemetry sessions running on these switches will be stopped.

Workaround: Manually disable and enable telemetry sessions using NEO (Telemetry → Streaming) in order to reactivate the required telemetry sessions.

Keywords: Telemetry, Device, Remove, Switch, Session, Stop

Detected in version: 2.5

1917681

Description: NEO monitoring over SNMP is not supported for Cumulus switches (due to a known issue in Cumulus switch).

Workaround: N/A

Keywords: SNMP, Monitoring, Cumulus

Detected in version: 2.5

1917323

Description: If a switch is unresponsive, NEO will not display a continuous graph of the monitoring data.

Workaround: N/A

Keywords: Unresponsive, Switch, Continuous, Monitoring

Detected in version: 2.5

1920182

Description: General device information (Memory and CPU) might be displayed with a delay of 2-4 minutes after the device has been added to NEO.

Workaround: N/A

Keywords: General information, Memory, CPU, Delay

Detected in version: 2.5

1920520

Description: On Cumulus switches, if non-ASCII characters are used in the switch configuration files, creating configuration backups and network snapshots, or restoring from them, might fail.

Workaround: N/A

Keywords: Cumulus, “non-ascii”, Characters, Configuration

Detected in version: 2.5
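
To find the offending characters before taking a backup, configuration text can be scanned for anything outside the ASCII range. The sketch below works on text that has already been read from a configuration file; find_non_ascii and the sample text are hypothetical.

```python
def find_non_ascii(text):
    """Return (line number, column, character) for each non-ASCII character.
    Line and column numbers start at 1."""
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ord(ch) > 127:  # ASCII covers code points 0..127
                hits.append((line_no, col, ch))
    return hits


# Hypothetical configuration text with one accented character.
sample = "hostname leaf1\ndescription caf\u00e9 uplink\n"
for line_no, col, ch in find_non_ascii(sample):
    print(f"line {line_no}, column {col}: {ch!r}")
```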

1920601

Description: When editing MLAG port channels via the MLAG wizard, configuration changes might fail if telemetry was configured prior to the change (e.g., via the Bring-Up Wizard).

Workaround: N/A

Keywords: MLAG, Editing, Configuration, Telemetry

Detected in version: 2.5

1887761

Description: The Telemetry Agent does not publish telemetry data for an MLAG port channel in the following cases:

  • The MLAG port channel belongs to the MLAG slave switch

  • The MLAG port channel was disabled and then re-enabled on the MLAG master switch

In these cases, telemetry data is published for the physical ports (the MLAG port channel members).

Workaround: N/A

Keywords: MLAG, Telemetry, MLAG port channel, disabled, enabled

Detected in version: 2.5

Description: When upgrading NEO v2.4 to NEO v2.5, due to the transition from Graphite to InfluxDB, historical counter data kept in Graphite is not transferred to InfluxDB.

Workaround: N/A

Keywords: Upgrade, Graphite, InfluxDB, counters

Detected in version: 2.5

1848870

Description: General information (CPU and Memory information) for Cumulus switches managed by NEO is not displayed in the NEO interface until it is exposed by the switch. For more information, please refer to Exposing CPU and Memory Information via SNMP.

Workaround: N/A

Keywords: CPU and Memory information, Cumulus switches

Detected in version: 2.5

Description: RoCE Service configuration is not supported for Onyx versions prior to 3.6.5000.

Workaround: Upgrade the switch to the latest Onyx version.

Keywords: Services, RoCE, Onyx

Detected in version: 2.5

Description: RoCE Service configuration cleanup is not supported for services upgraded from older NEO versions.

Workaround: Remove the old service and recreate it with the latest NEO.

Keywords: Services, Clean-up, RoCE, upgrade

Detected in version: 2.5

Description: RoCE Service configuration cleanup is supported only for Onyx and Cumulus switches.

Workaround: For other switch types, remove the configuration manually using the switch CLI.

Keywords: Services, Clean-up, RoCE

Detected in version: 2.5

Description: Service configuration clean-up is supported only for RoCE service.

Workaround: For other service types, remove the configuration manually using switch CLI.

Keywords: Services, Clean-up

Detected in version: 2.5

1329530

Description: Manual HA takeover or failover might take up to 60 seconds (depending on the machine NEO is running on). During that time, triggering additional failover or takeover operations might cause the original action to fail.

Workaround: Wait at least 60 seconds between HA operations (failover or takeover).

Keywords: HA, failover, takeover

Detected in version: 2.4

1600868

Description: Configuring RoCE on a host bond interface is currently not supported.

Workaround: Configure RoCE on the bond slaves.

Keywords: Bond, RoCE

Detected in version: 2.3

1582800

Description: Running too many live monitoring sessions at a high frequency for a specific switch may overload the switch's JSON API and result in timeouts.

Workaround: Run fewer live monitoring sessions in parallel.

Keywords: JSON, timeout, live monitoring

Detected in version: 2.3

Description: When WJH is enabled on the Telemetry Agent, WJH on the Onyx switch is disabled (the user is not able to view WJH details via the Onyx switch CLI) and vice versa.

Workaround: N/A

Keywords: What Just Happened, WJH, Onyx, Telemetry Agent

Detected in version: 2.3

2090123

Description: WJH is supported by NEO only for Spectrum switches running Onyx v3.7.1134 or newer.

Workaround: N/A

Keywords: What Just Happened, WJH, Dropped Packets

Detected in version: 2.3
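
Several entries on this page state a minimum Onyx version. A version check along those lines could look like the sketch below. It assumes Onyx versions are dot-separated numeric fields that can be compared field by field, which the source does not state; the function names are hypothetical, and the minimum versions are copied from the entries on this page.

```python
# Minimum Onyx versions as stated in the known issues on this page.
MINIMUM_ONYX = {
    "WJH via NEO": "3.7.1134",
    "WJH and Threshold Events telemetry sessions": "3.8.2004",
    "RoCE service configuration": "3.6.5000",
    "MLAG service": "3.6.4000",
    "Apply Config": "3.6.2000",
}


def parse_onyx_version(version):
    """Turn '3.7.1134' or 'v3.7.1134' into (3, 7, 1134) for numeric comparison."""
    return tuple(int(part) for part in version.lstrip("v").split("."))


def meets_minimum(version, minimum):
    """True if version is equal to or newer than minimum."""
    return parse_onyx_version(version) >= parse_onyx_version(minimum)


def supported_features(version):
    """List the features whose stated minimum version is satisfied."""
    return [name for name, minimum in MINIMUM_ONYX.items() if meets_minimum(version, minimum)]


print(supported_features("3.6.4500"))
```

Comparing integer tuples rather than raw strings avoids ordering mistakes such as treating "3.10" as older than "3.9".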

Description: NEO-Host installation is supported only for Linux hosts, using one of the following HCAs: ConnectX-4 / ConnectX-4 Lx / ConnectX-5.

Workaround: N/A

Keywords: NEO-Host

Detected in version: 2.3

1578231

Description: NEO telemetry agent can stream Routing Table information up to 20K records, and MAC table information up to 800 records.

Workaround: N/A

Keywords: Telemetry Agent, Routing Table, MAC Table

Detected in version: 2.3

1417273

Description: System icons are not shown in the Edge and Safari browsers.

Workaround: N/A

Keywords: Network Map, Edge, Safari

Detected in version: 2.2

1504128

Description: The network path calculation requires that all switches along the path have the same SSH credentials. Otherwise, the calculation will fail.

Workaround: N/A

Keywords: Network Path, SSH, Credentials

1484291

Description: The telemetry agent cannot be stopped on switches running Onyx OS v3.6.8100.

Workaround: Do not deploy the telemetry agent on Onyx OS v3.6.8100.

Keywords: Telemetry Agent, Onyx

1421369

Description: The "In Packets rate" calculated counter shows an incorrect value for Cumulus switches only, due to an issue with the switch (the Unicast RX Packets counter always returns a value of zero).

Workaround: N/A

Keywords: In Packets Rate, Cumulus, Unicast RX Packets

1498434

Description: For connections made up of multiple links, the network path calculation displays the transmitted bandwidth utilization according to the maximum value among the aggregated links.

Workaround: N/A

Keywords: Network Path, Bandwidth, Utilization

1332120

Description: Telemetry Agent does not support split ports.

Workaround: N/A

Keywords: Telemetry Agent, Split Port

1328501

Description: In the MLAG service, the bond is configured with the default gateway.

Workaround: Configure a different static route to the relevant ports.

Keywords: MLAG, Bond

1316429

Description: Port live monitoring only works from a certain Onyx version.

Workaround: N/A

Keywords: Telemetry, Live Monitoring

1327385

Description: Upgrade procedure (from an older version to 2.1.0) does not include Events Policy and RoCE Service.

Workaround: N/A

Keywords: Upgrade

1309655

Description: The telemetry session interval cannot be changed.

Workaround: N/A

Keywords: Telemetry Agent

1298137

Description: When loading images whose names differ only by tag, the name of the first image becomes empty due to an issue in Red Hat Docker.

Workaround: N/A

Keywords: Docker, Container

1272497

Description: There is no validation for the maximum ECN value in RoCE Service. The maximum allowed ECN value is dynamic and depends on switch type, current memory state, etc.

Workaround: N/A

Keywords: RoCE Service

1277047

Description: Configuring one of the IPL ports in MLAG service to 'switchport mode trunk' fails the service.

Workaround: Reset switchport mode before adding the port to the IPL.

Keywords: MLAG Service

1302777

Description: Switch reboot stops a telemetry agent session (if running).

Workaround: After switch reboot, manually restart the telemetry session.

Keywords: Telemetry

1071652

Description: For optimal UI functionality, the LastPass browser add-on should either be disabled or not installed.

Workaround: N/A

Keywords: UI, LastPass

Description: A NEO-Host package installation is required for successful provisioning of RoCE through the new RoCE service.

Workaround: Install NEO-Host either on the predefined "Linux-without-Neo-Host-installed" group or on a specific host.

Keywords: RoCE Service

1064979

Description: The MLAG service is supported in MLNX Onyx (MLNX_OS) starting from v3.6.4000.

Workaround: Make sure to upgrade your Onyx version to v3.6.4000 or above.

Keywords: MLAG Service

Description: When using SNMPv3 with sha authentication and priv=aes128 option, the switch will become unreachable due to timeout.

Workaround: For Mellanox PPC switches, use md5 authentication with a priv=des option.

Keywords: Authentication

Description: Mellanox NEO Client (browser) might fail to connect to the NEO server if the iptables service is running.

Workaround: Make sure to disable the iptables service before running NEO installation.

Keywords: Installation

Description: VLANs and LAGs information may not be displayed as part of device information for non-Mellanox devices.

Workaround: N/A

Keywords: 3rd Party Systems Support

Description: Linux/Windows host provisioning via NEO is non-persistent.

Workaround: N/A

Keywords: Host Provisioning

Description: NEO start-up will fail if the machine's local time zone is not configured.

Workaround: Make sure the local time zone is configured on the machine where NEO is installed (i.e., the /etc/localtime file exists).

Keywords: NEO Start-Up
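
The workaround's condition can be checked before starting NEO. The sketch below only tests for the presence of /etc/localtime, as the workaround describes; the function name local_time_zone_configured is hypothetical, and the path parameter exists only to make the check easy to try against other files.

```python
import os


def local_time_zone_configured(localtime_path="/etc/localtime"):
    """True if the time zone file named in the workaround exists."""
    return os.path.exists(localtime_path)


if local_time_zone_configured():
    print("Local time zone is configured.")
else:
    print("/etc/localtime is missing; configure the time zone before starting NEO.")
```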

Description: Apply Config operation is only available for switches with Onyx v3.6.2000 and above.

Workaround: N/A

Keywords: Configuration Management

Description: Cable information is only supported for Mellanox Onyx switch ports.

Workaround: N/A

Keywords: Cable Information

951789

Description: Performance tests are only supported for ConnectX-4 and ConnectX-5 family adapter cards.

Workaround: N/A

Keywords: Performance Check

Description: A performance check can be performed only on two Linux hosts running MLNX_OFED_LINUX-3.3-1.0.4.0 or higher.

Workaround: N/A

Keywords: Performance Check

Description: RoCE configuration on hosts is non-persistent.

Workaround: N/A

Keywords: RoCE Service

© Copyright 2023, NVIDIA. Last updated on Nov 14, 2023.