NVIDIA Cumulus Linux

Cumulus NetQ 3.0

Cumulus® NetQ is a highly scalable, modern network operations tool set that uses telemetry for deep troubleshooting, visibility, and automated workflows from a single GUI, reducing maintenance time and network downtime. It combines the ability to easily upgrade, configure, and deploy network elements with a full suite of operations capabilities, such as visibility, troubleshooting, validation, trace, and comparative look-back functionality.

This documentation is intended for network administrators who are responsible for deploying, configuring, monitoring and troubleshooting the network in their data center or campus environment. NetQ 3.x offers the ability to easily monitor and manage your network infrastructure and operational health. The documentation provides instructions and information about monitoring individual components of the network, the network as a whole, and the NetQ software applications using the NetQ command line interface (NetQ CLI), NetQ (graphical) user interface (NetQ UI), and NetQ Admin UI.

Cumulus NetQ Deployment Guide

This guide is intended for network administrators who are responsible for installation, setup, and maintenance of Cumulus NetQ in their data center environment. NetQ offers the ability to monitor and manage your data center network infrastructure and operational health with simple tools based on open source Linux. This guide provides instructions and information about installing NetQ core capabilities, configuring optional capabilities, and upgrading an existing NetQ installation. This guide assumes you have already installed Cumulus Linux on your network switches and you are ready to add these NetQ capabilities.

For information about monitoring and troubleshooting your network, refer to the Cumulus NetQ CLI User Guide or the Cumulus NetQ UI User Guide.

Before you get started, you should review the release notes for this version.

Cumulus NetQ Overview

Cumulus® NetQ is a highly scalable, modern network operations tool set that provides visibility into and troubleshooting of your overlay and underlay networks in real time. NetQ delivers actionable insights and operational intelligence about the health of your data center - from the container, virtual machine, or host, all the way to the switch and port. NetQ correlates configuration and operational status, and instantly identifies and tracks state changes while simplifying management for the entire Linux-based data center. With NetQ, network operations change from a manual, reactive, box-by-box approach to an automated, informed, and agile one.

Cumulus NetQ performs three primary functions:

  - Data collection: real-time and historical telemetry and network state information
  - Data analytics: deep processing of the data
  - Data visualization: rich graphical user interface (GUI) for actionable insight

NetQ is available as an on-site or in-cloud deployment.

Unlike other network operations tools, NetQ delivers significant operational improvements to your network management and maintenance processes. It simplifies the data center network by providing real-time visibility into hardware and software status, and it eliminates the guesswork from investigating issues by analyzing and presenting detailed, focused data.

Demystify Overlay Networks

While overlay networks provide significant advantages in network management, troubleshooting issues that occur in the overlay one box at a time is difficult. You cannot correlate which events (configuration changes, power outages, and so forth) might have caused problems in the network or when they occurred, and only a sampling of data is available for your analysis. By contrast, with Cumulus NetQ deployed, you have a network-wide view of the overlay network, you can correlate events with what is happening now or in the past, and you have real-time data to fill out the complete picture of your network health and operation.

In summary:

Without NetQ                                 With NetQ
-------------------------------------------  --------------------------------------------------------------------
Difficult to debug overlay network           View network-wide status of overlay network
Hard to find out what happened in the past   View historical activity with time-machine view
Periodically sampled data                    Real-time collection of telemetry data for a more complete data set

Protect Network Integrity with NetQ Validation

Network configuration changes can cause numerous trouble tickets because you cannot test a new configuration before deploying it. When the tickets start pouring in, you are stuck with a large amount of data collected and stored in multiple tools, making it difficult at best to correlate the events with the required resolution. Isolating faults that occurred in the past is challenging. By contrast, with Cumulus NetQ deployed, you can proactively verify a configuration change, as inconsistencies and misconfigurations are caught prior to deployment. And historical data is readily available to correlate past events with current issues.

In summary:

Without NetQ                                 With NetQ
-------------------------------------------  --------------------------------------------------------------------
Reactive to trouble tickets                  Catch inconsistencies and misconfigurations prior to deployment
                                             with integrity checks/validation
Large amount of data and multiple tools to   Correlate network status, all in one place
correlate the logs/events with the issues
Periodically sampled data                    Readily available historical data for viewing and correlating
                                             changes in the past with current issues

Troubleshoot Issues Across the Network

Troubleshooting networks is challenging in the best of times, but trying to do so manually, one box at a time, while digging through a series of long and unwieldy logs makes the job harder than it needs to be. Cumulus NetQ provides rolled-up and correlated network status on a regular basis, enabling you to get to the root of the problem quickly, whether it occurred recently or over a week ago. The graphical user interface presents this information visually to speed the analysis.

In summary:

Without NetQ                                 With NetQ
-------------------------------------------  --------------------------------------------------------------------
Large amount of data and multiple tools to   Rolled-up and correlated network status; view events and status
correlate the logs/events with the issues    together
Past events are lost                         Historical data gathered and stored for comparison with current
                                             network state
Manual, box-by-box troubleshooting           View issues on all devices all at once, pointing to the source of
                                             the problem

Track Connectivity with NetQ Trace

Conventional trace tools traverse only the data path looking for problems, and do so on a node-to-node basis. For paths with a small number of hops that might be fine, but in larger networks it can become extremely time consuming. Cumulus NetQ verifies both the data and control paths, providing additional information, and it discovers misconfigurations along all of the hops in one pass, speeding the time to resolution.

In summary:

Without NetQ                                        With NetQ
--------------------------------------------------  -------------------------------------------------------------
Trace covers only data path; hard to check          Both data and control paths are verified
control path
View portion of entire path                         View all paths between devices all at once to find problem
                                                    paths
Node-to-node check on misconfigurations             View any misconfigurations along all hops from source to
                                                    destination

Cumulus NetQ Components

Cumulus NetQ contains the following applications and key components:

While these functions apply to both the on-site and in-cloud solutions, where the functions reside varies, as shown here.

NetQ interfaces with event notification applications and third-party analytics tools.

Each of the NetQ components used to gather, store, and process data about the network state is described here.

NetQ Agents

NetQ Agents are software installed and running on every monitored node in the network - including Cumulus® Linux® switches, Linux bare-metal hosts, and virtual machines. The NetQ Agents push network data regularly and event information immediately to the NetQ Platform.

Switch Agents

The NetQ Agents running on Cumulus Linux switches gather the following network data via Netlink:

for the following protocols:

The NetQ Agent is supported on Cumulus Linux 3.3.2 and later.

Host Agents

The NetQ Agents running on hosts gather the same information as that for switches, plus the following network data:

The NetQ Agent obtains container information by listening to the Kubernetes orchestration tool.

The NetQ Agent is supported on hosts running Ubuntu 16.04, Red Hat® Enterprise Linux 7, and CentOS 7 Operating Systems.

NetQ Core

The NetQ core performs the data collection, storage, and processing for delivery to the various user interfaces. It consists of a collection of scalable components running entirely within a single server. The NetQ software queries this server rather than individual devices, enabling greater scalability of the system. Each of these components is described briefly here.

Data Aggregation

The data aggregation component collects data coming from all of the NetQ Agents. It then filters, compresses, and forwards the data to the streaming component. The server monitors for missing messages and also monitors the NetQ Agents themselves, providing alarms when appropriate. In addition to the telemetry data collected from the NetQ Agents, the aggregation component collects information from the switches and hosts, such as vendor, model, version, and basic operational state.

Data Stores

Two types of data stores are used in the NetQ product. The first stores the raw data, data aggregations, and discrete events needed for quick response to data requests. The second stores data based on correlations, transformations and processing of the raw data.

Real-time Streaming

The streaming component processes the incoming raw data from the aggregation server in real time. It reads the metrics and stores them as a time series, and triggers alarms based on anomaly detection, thresholds, and events.

Network Services

The network services component monitors the operation of protocols and services, both individually and network-wide, and stores status details.

User Interfaces

NetQ data is available through several user interfaces:

  - NetQ CLI: the NetQ command line interface
  - NetQ UI: the NetQ graphical user interface
  - NetQ RESTful API

The CLI and UI query the RESTful API for the data they present. Standard integrations with third-party notification tools can also be configured.
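As a purely illustrative sketch of how a tool might query the API directly - the endpoint path and payload shape here are assumptions, so consult the NetQ API reference for the exact contract - a login request to the API gateway (port 32708, listed in the port tables later in this guide) might look like:

cumulus@host:~$ curl --insecure -X POST "https://<netq-platform>:32708/netq/auth/v1/login" \
      -H "Content-Type: application/json" \
      -d '{"username": "<user>", "password": "<password>"}'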

Data Center Network Deployments

There are three deployment types commonly used for network management in the data center:

  - Out-of-band management deployment
  - In-band management deployment
  - High availability deployment

A summary of each type is provided here.

Cumulus NetQ operates over layer 3, and can be used in both layer 2 bridged and layer 3 routed environments. Cumulus Networks recommends layer 3 routed environments whenever possible.

Out-of-Band Management Deployment

Cumulus Networks recommends deploying NetQ on an out-of-band (OOB) management network to separate network management traffic from standard network data traffic, but it is not required. This figure shows a sample CLOS-based network fabric design for a data center using an OOB management network overlaid on top, where NetQ is deployed.

The physical network hardware includes:

The diagram shows physical connections (in the form of grey lines) between Spine 01 and four Leaf devices and two Exit devices, and Spine 02 and the same four Leaf devices and two Exit devices. Leaf 01 and Leaf 02 are connected to each other over a peerlink and act as an MLAG pair for Server 01 and Server 02. Leaf 03 and Leaf 04 are connected to each other over a peerlink and act as an MLAG pair for Server 03 and Server 04. The Edge is connected to both Exit devices, and the Internet node is connected to Exit 01.

Data Center Network Example

The physical management hardware includes:

These switches are connected to each of the physical network devices through a virtual network overlay, shown with purple lines.

In-band Management Deployment

While not the preferred deployment method, you might choose to implement NetQ within your data network. In this scenario, there is no overlay and all traffic to and from the NetQ Agents and the NetQ Platform traverses the data paths along with your regular network traffic. The roles of the switches in the CLOS network are the same, except that the NetQ Platform performs the aggregation function that the OOB management switch performed. If your network goes down, you might not have access to the NetQ Platform for troubleshooting.

High Availability Deployment

NetQ supports a high availability deployment for users who prefer a solution in which the collected data and processing provided by the NetQ Platform remain available through alternate equipment should the platform fail for any reason. In this configuration, three NetQ Platforms are deployed, with one as the master and two as workers (or replicas). Data from the NetQ Agents is sent to all three platforms so that if the master NetQ Platform fails, one of the replicas automatically becomes the master and continues to store and provide the telemetry data. This example is based on an OOB management configuration, modified to support high availability for NetQ.

Cumulus NetQ Operation

In either in-band or out-of-band deployments, NetQ offers network-wide configuration and device management, proactive monitoring capabilities, and performance diagnostics for complete management of your network. Each component of the solution provides a critical element to make this possible.

The NetQ Agent

From a software perspective, a network switch has software associated with the hardware platform, the operating system, and communications. For data centers, the software on a Cumulus Linux network switch would be similar to the diagram shown here.

The NetQ Agent interacts with the various components and software on switches and hosts and provides the gathered information to the NetQ Platform. You can view the data using the NetQ CLI or UI.

The NetQ Agent polls the user space applications for information about the performance of the various routing protocols and services running on the switch. Cumulus Networks supports the BGP and OSPF protocols of Free Range Routing (FRR), as well as static addressing. Cumulus Linux also supports LLDP and MSTP among other protocols, and a variety of services such as systemd and sensors. For hosts, the NetQ Agent also polls for the performance of containers managed with Kubernetes. All of this information is used to provide the current health of the network and to verify that it is configured and operating correctly.

For example, if the NetQ Agent learns that an interface has gone down, a new BGP neighbor has been configured, or a container has moved, it provides that information to the NetQ Platform. That information can then be used to notify users of the operational state change through various channels. By default, data is logged in the database, but you can use the CLI (netq show events) or configure the Event Service in NetQ to send the information to a third-party notification application as well. NetQ supports PagerDuty and Slack integrations.
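For example, you can list recent BGP events directly from the CLI (a minimal sketch; the type option filters events by component):

cumulus@switch:~$ netq show events type bgp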

The NetQ Agent interacts with the Netlink communications between the Linux kernel and the user space, listening for changes to the network state, configurations, routes and MAC addresses. NetQ uses this information to enable notifications about these changes so that network operators and administrators can respond quickly when changes are not expected or favorable.

For example, if a new route is added or a MAC address is removed, the NetQ Agent records these changes and sends that information to the NetQ Platform. Based on the configuration of the Event Service, these changes can be sent to a variety of destinations for end user response.

The NetQ Agent also interacts with the hardware platform to obtain performance information about various physical components, such as fans and power supplies, on the switch. Operational states and temperatures are measured and reported, along with cabling information to enable management of the hardware and cabling, and proactive maintenance.

For example, if thermal sensors in the switch indicate that it is becoming very warm, alarms of various severity levels are generated. These are then communicated through notifications according to the Event Service configuration.

The NetQ Platform

Once the collected data is sent to and stored in the NetQ database, you can:

  - Validate configurations and identify misconfigurations in your current network
  - Monitor communication paths throughout the network
  - View historical state and configuration information
  - Manage network events and notifications

Validate Configurations

The NetQ CLI enables validation of your network health through two sets of commands: netq check and netq show. These extract information from the Network Services component and the Event Service. The Network Services component continually validates the connectivity and configuration of the devices and protocols running on the network. The netq check and netq show commands display the status of the various components and services on a network-wide basis and across the complete software stack. For example, you can perform a network-wide check on all BGP sessions with a single netq check bgp command. The command lists any devices that have misconfigurations or other operational errors in seconds. When errors or misconfigurations are present, the netq show bgp command displays the BGP configuration on each device so that you can compare and contrast the devices, looking for potential causes. netq check and netq show commands are available for numerous components and services, as shown in the following table.

Component or Service   Check   Show      Component or Service   Check   Show
--------------------   -----   ----      --------------------   -----   ----
Agents                 X       X         LLDP                           X
BGP                    X       X         MACs                           X
CLAG (MLAG)            X       X         MTU                    X
Events                         X         NTP                    X       X
EVPN                   X       X         OSPF                   X       X
Interfaces             X       X         Sensors                X       X
Inventory                      X         Services                       X
IPv4/v6                        X         VLAN                   X       X
Kubernetes                     X         VXLAN                  X       X
License                        X
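For example, a typical validation workflow pairs the two commands (a sketch using MLAG; any component with both commands in the table works the same way):

cumulus@switch:~$ netq check clag          # network-wide validation of all MLAG sessions
cumulus@switch:~$ netq show clag           # per-device detail for comparing configurations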

Monitor Communication Paths

The trace engine validates the available communication paths between two network devices. The corresponding netq trace command enables you to view all of the paths between the two devices and whether there are any breaks in the paths. This example shows two successful paths between server12 and leaf11, both with an MTU of 9152. The first command shows the output in path-by-path tabular mode. The second command shows the same output as a tree.

cumulus@switch:~$ netq trace 10.0.0.13 from 10.0.0.21
Number of Paths: 2
Number of Paths with Errors: 0
Number of Paths with Warnings: 0
Path MTU: 9152
Id  Hop Hostname    InPort          InTun, RtrIf    OutRtrIf, Tun   OutPort
--- --- ----------- --------------- --------------- --------------- ---------------
1   1   server12                                                    bond1.1002
    2   leaf12      swp8                            vlan1002        peerlink-1
    3   leaf11      swp6            vlan1002                        vlan1002
--- --- ----------- --------------- --------------- --------------- ---------------
2   1   server12                                                    bond1.1002
    2   leaf11      swp8                                            vlan1002
--- --- ----------- --------------- --------------- --------------- ---------------
 
 
cumulus@switch:~$ netq trace 10.0.0.13 from 10.0.0.21 pretty
Number of Paths: 2
Number of Paths with Errors: 0
Number of Paths with Warnings: 0
Path MTU: 9152
 hostd-12 bond1.1002 -- swp8 leaf12 <vlan1002> peerlink-1 -- swp6 <vlan1002> leaf11 vlan1002
          bond1.1002 -- swp8 leaf11 vlan1002

This output is read as:

If the MTU does not match across the network, or any of the paths or parts of the paths have issues, that data is called out in the summary at the top of the output and shown in red along the paths, giving you a starting point for troubleshooting.

View Historical State and Configuration

All of the check, show, and trace commands can be run for the current status and for a prior point in time. For example, this is useful when you receive messages from the night before but are not seeing any problems now. You can use the netq check command to look for configuration or operational issues around the time the messages are timestamped. Then use the netq show commands to see how the devices in question were configured at that time or whether there were any changes in a given timeframe. Optionally, you can use the netq trace command to see what the connectivity looked like between any problematic nodes at that time. In this example, problems occurred on spine01, leaf04, and server03 last night; the network administrator received notifications and wants to investigate. The commands that follow determine the cause of a BGP error on spine01. Note that the commands use the around option to see the results for last night, and that they can be run from any switch in the network.

cumulus@switch:~$ netq check bgp around 30m
Total Nodes: 25, Failed Nodes: 3, Total Sessions: 220 , Failed Sessions: 24,
Hostname          VRF             Peer Name         Peer Hostname     Reason                                        Last Changed
----------------- --------------- ----------------- ----------------- --------------------------------------------- -------------------------
exit-1            DataVrf1080     swp6.2            firewall-1        BGP session with peer firewall-1 swp6.2: AFI/ 1d:2h:6m:21s
                                                                      SAFI evpn not activated on peer              
exit-1            DataVrf1080     swp7.2            firewall-2        BGP session with peer firewall-2 (swp7.2 vrf  1d:1h:59m:43s
                                                                      DataVrf1080) failed,                         
                                                                      reason: Peer not configured                  
exit-1            DataVrf1081     swp6.3            firewall-1        BGP session with peer firewall-1 swp6.3: AFI/ 1d:2h:6m:21s
                                                                      SAFI evpn not activated on peer              
exit-1            DataVrf1081     swp7.3            firewall-2        BGP session with peer firewall-2 (swp7.3 vrf  1d:1h:59m:43s
                                                                      DataVrf1081) failed,                         
                                                                      reason: Peer not configured                  
exit-1            DataVrf1082     swp6.4            firewall-1        BGP session with peer firewall-1 swp6.4: AFI/ 1d:2h:6m:21s
                                                                      SAFI evpn not activated on peer              
exit-1            DataVrf1082     swp7.4            firewall-2        BGP session with peer firewall-2 (swp7.4 vrf  1d:1h:59m:43s
                                                                      DataVrf1082) failed,                         
                                                                      reason: Peer not configured                  
exit-1            default         swp6              firewall-1        BGP session with peer firewall-1 swp6: AFI/SA 1d:2h:6m:21s
                                                                      FI evpn not activated on peer                
exit-1            default         swp7              firewall-2        BGP session with peer firewall-2 (swp7 vrf de 1d:1h:59m:43s
...
 
cumulus@switch:~$ netq exit-1 show bgp
Matching bgp records:
Hostname          Neighbor                     VRF             ASN        Peer ASN   PfxRx        Last Changed
----------------- ---------------------------- --------------- ---------- ---------- ------------ -------------------------
exit-1            swp3(spine-1)                default         655537     655435     27/24/412    Fri Feb 15 17:20:00 2019
exit-1            swp3.2(spine-1)              DataVrf1080     655537     655435     14/12/0      Fri Feb 15 17:20:00 2019
exit-1            swp3.3(spine-1)              DataVrf1081     655537     655435     14/12/0      Fri Feb 15 17:20:00 2019
exit-1            swp3.4(spine-1)              DataVrf1082     655537     655435     14/12/0      Fri Feb 15 17:20:00 2019
exit-1            swp4(spine-2)                default         655537     655435     27/24/412    Fri Feb 15 17:20:00 2019
exit-1            swp4.2(spine-2)              DataVrf1080     655537     655435     14/12/0      Fri Feb 15 17:20:00 2019
exit-1            swp4.3(spine-2)              DataVrf1081     655537     655435     14/12/0      Fri Feb 15 17:20:00 2019
exit-1            swp4.4(spine-2)              DataVrf1082     655537     655435     13/12/0      Fri Feb 15 17:20:00 2019
exit-1            swp5(spine-3)                default         655537     655435     28/24/412    Fri Feb 15 17:20:00 2019
exit-1            swp5.2(spine-3)              DataVrf1080     655537     655435     14/12/0      Fri Feb 15 17:20:00 2019
exit-1            swp5.3(spine-3)              DataVrf1081     655537     655435     14/12/0      Fri Feb 15 17:20:00 2019
exit-1            swp5.4(spine-3)              DataVrf1082     655537     655435     14/12/0      Fri Feb 15 17:20:00 2019
exit-1            swp6(firewall-1)             default         655537     655539     73/69/-      Fri Feb 15 17:22:10 2019
exit-1            swp6.2(firewall-1)           DataVrf1080     655537     655539     73/69/-      Fri Feb 15 17:22:10 2019
exit-1            swp6.3(firewall-1)           DataVrf1081     655537     655539     73/69/-      Fri Feb 15 17:22:10 2019
exit-1            swp6.4(firewall-1)           DataVrf1082     655537     655539     73/69/-      Fri Feb 15 17:22:10 2019
exit-1            swp7                         default         655537     -          NotEstd      Fri Feb 15 17:28:48 2019
exit-1            swp7.2                       DataVrf1080     655537     -          NotEstd      Fri Feb 15 17:28:48 2019
exit-1            swp7.3                       DataVrf1081     655537     -          NotEstd      Fri Feb 15 17:28:48 2019
exit-1            swp7.4                       DataVrf1082     655537     -          NotEstd      Fri Feb 15 17:28:48 2019

Manage Network Events

The NetQ notifier manages the events it receives from the NetQ Agents for devices and components, protocols, and services. The notifier enables you to capture and filter events to manage the behavior of your network. This is especially useful when an interface or routing protocol goes down and you want to get it back up and running as quickly as possible, preferably before anyone notices or complains. You can improve resolution time significantly by creating filters that focus on topics appropriate for a particular group of users. You can easily create filters around events related to BGP and MLAG session states, interfaces, links, NTP and other services, fans, power supplies, and physical sensor measurements.

For example, for operators responsible for routing, you can create an integration with a notification application that notifies them of routing issues as they occur. This is an example of a Slack message received on a netq-notifier channel indicating that the BGP session on switch leaf04 interface swp2 has gone down.
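Setting up such an integration is a three-step configuration in the NetQ CLI: define a channel, a rule, and a filter. The following is a minimal sketch; the channel, rule, and filter names and the webhook URL are placeholders, so check the notification configuration topic for the exact options:

cumulus@switch:~$ netq add notification channel slack netq-notifier webhook https://hooks.slack.com/services/<your-webhook-token>
cumulus@switch:~$ netq add notification rule bgpHostname key hostname value leaf04       # match events from leaf04
cumulus@switch:~$ netq add notification filter notify-bgp rule bgpHostname channel netq-notifier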

Timestamps in NetQ

Every event or entry in the NetQ database is stored with a timestamp of when the event was captured by the NetQ Agent on the switch or server. This timestamp is based on the switch or server time where the NetQ Agent is running, and is pushed in UTC format. It is important to ensure that all devices are NTP synchronized to prevent events from being displayed out of order or not displayed at all when looking for events that occurred at a particular time or within a time window.
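For example, you can verify synchronization across all devices with the NTP validation command (a quick sketch; any node that is not in sync is flagged in the output):

cumulus@switch:~$ netq check ntp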

Interface state, IP addresses, routes, ARP/ND table (IP neighbor) entries, and MAC table entries carry a timestamp that represents the time the event happened (such as when a route is deleted or an interface comes up), except the first time the NetQ Agent runs. If the network has been running and stable when a NetQ Agent is brought up for the first time, then this time reflects when the Agent was started. Subsequent changes to these objects are captured with an accurate time of when the event happened.

Data that is captured and saved based on polling, and just about all other data in the NetQ database, including control plane state (such as BGP or MLAG), carries a timestamp of when the information was captured rather than when the event actually happened, though NetQ compensates for this when the extracted data provides additional information for computing a more precise event time. For example, BGP uptime can be used in conjunction with the timestamp to determine when the event actually happened.

When retrieving the timestamp, command outputs display the time in three ways:

  - Standard date and time, for example Fri Feb 15 17:20:00 2019
  - Elapsed time in days, hours, minutes, and seconds, for example 1d:2h:6m:21s
  - Epoch time in JSON output, for example 1549995208.3039999008

This example shows the difference between the timestamp displays.

cumulus@switch:~$ netq show bgp
Matching bgp records:
Hostname          Neighbor                     VRF             ASN        Peer ASN   PfxRx        Last Changed
----------------- ---------------------------- --------------- ---------- ---------- ------------ -------------------------
exit-1            swp3(spine-1)                default         655537     655435     27/24/412    Fri Feb 15 17:20:00 2019
exit-1            swp3.2(spine-1)              DataVrf1080     655537     655435     14/12/0      Fri Feb 15 17:20:00 2019
exit-1            swp3.3(spine-1)              DataVrf1081     655537     655435     14/12/0      Fri Feb 15 17:20:00 2019
exit-1            swp3.4(spine-1)              DataVrf1082     655537     655435     14/12/0      Fri Feb 15 17:20:00 2019
exit-1            swp4(spine-2)                default         655537     655435     27/24/412    Fri Feb 15 17:20:00 2019
exit-1            swp4.2(spine-2)              DataVrf1080     655537     655435     14/12/0      Fri Feb 15 17:20:00 2019
exit-1            swp4.3(spine-2)              DataVrf1081     655537     655435     14/12/0      Fri Feb 15 17:20:00 2019
exit-1            swp4.4(spine-2)              DataVrf1082     655537     655435     13/12/0      Fri Feb 15 17:20:00 2019
...
 
cumulus@switch:~$ netq show agents
Matching agents records:
Hostname          Status           NTP Sync Version                              Sys Uptime                Agent Uptime              Reinitialize Time          Last Changed
----------------- ---------------- -------- ------------------------------------ ------------------------- ------------------------- -------------------------- -------------------------
leaf01            Fresh            yes      2.0.0-cl3u11~1549993210.e902a94      2h:32m:33s                2h:26m:19s                2h:26m:19s                 Tue Feb 12 18:13:28 2019
leaf02            Fresh            yes      2.0.0-cl3u11~1549993210.e902a94      2h:32m:33s                2h:26m:14s                2h:26m:14s                 Tue Feb 12 18:13:33 2019
leaf11            Fresh            yes      2.0.0-ub16.04u11~1549993314.e902a94  2h:32m:28s                2h:25m:49s                2h:25m:49s                 Tue Feb 12 18:17:32 2019
leaf12            Fresh            yes      2.0.0-rh7u11~1549992132.c42c08f      2h:32m:0s                 2h:25m:44s                2h:25m:44s                 Tue Feb 12 18:17:36 2019
leaf21            Fresh            yes      2.0.0-ub16.04u11~1549993314.e902a94  2h:32m:28s                2h:25m:39s                2h:25m:39s                 Tue Feb 12 18:17:42 2019
leaf22            Fresh            yes      2.0.0-rh7u11~1549992132.c42c08f      2h:32m:0s                 2h:25m:35s                2h:25m:35s                 Tue Feb 12 18:17:46 2019
spine01           Fresh            yes      2.0.0-cl3u11~1549993210.e902a94      2h:32m:33s                2h:27m:11s                2h:27m:11s                 Tue Feb 12 18:13:06 2019
spine02           Fresh            yes      2.0.0-cl3u11~1549993210.e902a94      2h:32m:33s                2h:27m:6s                 2h:27m:6s                  Tue Feb 12 18:13:11 2019
...
 
cumulus@switch:~$ netq show agents json
{
    "agents":[
        {
            "status":"Fresh",
            "lastChanged":1549995208.3039999008,
            "reinitializeTime":1549995146.0,
            "hostname":"leaf01",
            "version":"2.0.0-cl3u11~1549993210.e902a94",
            "sysUptime":1549994772.0,
            "ntpSync":"yes",
            "agentUptime":1549995146.0
        },
        {
            "status":"Fresh",
            "lastChanged":1549995213.3399999142,
            "reinitializeTime":1549995151.0,
            "hostname":"leaf02",
            "version":"2.0.0-cl3u11~1549993210.e902a94",
            "sysUptime":1549994772.0,
            "ntpSync":"yes",
            "agentUptime":1549995151.0
        },
        {
            "status":"Fresh",
            "lastChanged":1549995434.3559999466,
            "reinitializeTime":1549995157.0,
            "hostname":"leaf11",
            "version":"2.0.0-ub16.04u11~1549993314.e902a94",
            "sysUptime":1549994772.0,
            "ntpSync":"yes",
            "agentUptime":1549995157.0
        },
        {
            "status":"Fresh",
            "lastChanged":1549995439.3770000935,
            "reinitializeTime":1549995164.0,
            "hostname":"leaf12",
            "version":"2.0.0-rh7u11~1549992132.c42c08f",
            "sysUptime":1549994809.0,
            "ntpSync":"yes",
            "agentUptime":1549995164.0
        },
        {
            "status":"Fresh",
            "lastChanged":1549995452.6830000877,
            "reinitializeTime":1549995176.0,
            "hostname":"leaf21",
            "version":"2.0.0-ub16.04u11~1549993314.e902a94",
            "sysUptime":1549994777.0,
            "ntpSync":"yes",
            "agentUptime":1549995176.0
        },
        {
            "status":"Fresh",
            "lastChanged":1549995456.4500000477,
            "reinitializeTime":1549995181.0,
            "hostname":"leaf22",
            "version":"2.0.0-rh7u11~1549992132.c42c08f",
            "sysUptime":1549994805.0,
            "ntpSync":"yes",
            "agentUptime":1549995181.0
        },
        {
            "status":"Fresh",
            "lastChanged":1549995186.3090000153,
            "reinitializeTime":1549995094.0,
            "hostname":"spine01",
            "version":"2.0.0-cl3u11~1549993210.e902a94",
            "sysUptime":1549994772.0,
            "ntpSync":"yes",
            "agentUptime":1549995094.0
        },
        {
            "status":"Fresh",
            "lastChanged":1549995191.4530000687,
            "reinitializeTime":1549995099.0,
            "hostname":"spine02",
            "version":"2.0.0-cl3u11~1549993210.e902a94",
            "sysUptime":1549994772.0,
            "ntpSync":"yes",
            "agentUptime":1549995099.0
        },
...

If a NetQ Agent is restarted on a device, the timestamps for existing objects are not updated to reflect the restart; they are preserved relative to the original start time of the Agent. A rare exception is if the device is rebooted while the Agent is stopped and restarted; in this case, the timestamps are once again relative to the start time of the Agent.

Exporting NetQ Data

Data from the NetQ Platform can be exported in a couple of ways:

  - Use the json option with any netq show command to output the results in JSON format for parsing in other applications
  - Use the Export function on full-screen cards in the NetQ UI

Example Using the CLI

You can check the state of BGP on your network with netq check bgp:

cumulus@leaf01:~$ netq check bgp
Total Nodes: 25, Failed Nodes: 3, Total Sessions: 220 , Failed Sessions: 24,
Hostname          VRF             Peer Name         Peer Hostname     Reason                                        Last Changed
----------------- --------------- ----------------- ----------------- --------------------------------------------- -------------------------
exit01            DataVrf1080     swp6.2            firewall01        BGP session with peer firewall01 swp6.2: AFI/ Tue Feb 12 18:11:16 2019
                                                                      SAFI evpn not activated on peer              
exit01            DataVrf1080     swp7.2            firewall02        BGP session with peer firewall02 (swp7.2 vrf  Tue Feb 12 18:11:27 2019
                                                                      DataVrf1080) failed,                         
                                                                      reason: Peer not configured                  
exit01            DataVrf1081     swp6.3            firewall01        BGP session with peer firewall01 swp6.3: AFI/ Tue Feb 12 18:11:16 2019
                                                                      SAFI evpn not activated on peer              
exit01            DataVrf1081     swp7.3            firewall02        BGP session with peer firewall02 (swp7.3 vrf  Tue Feb 12 18:11:27 2019
                                                                      DataVrf1081) failed,                         
                                                                      reason: Peer not configured                  
...

When you show the output in JSON format, this same command looks like this:

cumulus@leaf01:~$ netq check bgp json
{
    "failedNodes":[
        {
            "peerHostname":"firewall01",
            "lastChanged":1549995080.0,
            "hostname":"exit01",
            "peerName":"swp6.2",
            "reason":"BGP session with peer firewall01 swp6.2: AFI/SAFI evpn not activated on peer",
            "vrf":"DataVrf1080"
        },
        {
            "peerHostname":"firewall02",
            "lastChanged":1549995449.7279999256,
            "hostname":"exit01",
            "peerName":"swp7.2",
            "reason":"BGP session with peer firewall02 (swp7.2 vrf DataVrf1080) failed, reason: Peer not configured",
            "vrf":"DataVrf1080"
        },
        {
            "peerHostname":"firewall01",
            "lastChanged":1549995080.0,
            "hostname":"exit01",
            "peerName":"swp6.3",
            "reason":"BGP session with peer firewall01 swp6.3: AFI/SAFI evpn not activated on peer",
            "vrf":"DataVrf1081"
        },
        {
            "peerHostname":"firewall02",
            "lastChanged":1549995449.7349998951,
            "hostname":"exit01",
            "peerName":"swp7.3",
            "reason":"BGP session with peer firewall02 (swp7.3 vrf DataVrf1081) failed, reason: Peer not configured",
            "vrf":"DataVrf1081"
        },
...
 
    ],
    "summary": {
        "checkedNodeCount": 25,
        "failedSessionCount": 24,
        "failedNodeCount": 3,
        "totalSessionCount": 220
    }
}
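Because the json option writes plain JSON to standard output, the results are easy to save or post-process with standard tools. A small sketch (assumes jq is installed on the workstation):

cumulus@leaf01:~$ netq check bgp json > bgp-check.json        # save a snapshot for export
cumulus@leaf01:~$ netq check bgp json | jq '.summary'         # extract just the summary counts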

Example Using the UI

Open the full screen Switch Inventory card, select the data to export, and click Export.

Important File Locations

The primary configuration file for all Cumulus NetQ tools, netq.yml, resides in /etc/netq by default.

Log files are stored in /var/log/ by default.
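For reference, /etc/netq/netq.yml is a small YAML file with stanzas for the NetQ Agent and the NetQ CLI. The following is a representative sketch only - keys and defaults can vary by release, and the server addresses are placeholders:

netq-agent:
  server: <netq-platform-ip>    # where the agent sends its telemetry
  port: 31980                   # NetQ Agent communication port (see the port tables)
netq-cli:
  server: <netq-platform-ip>    # API endpoint the CLI queries
  port: 32708                   # API gateway port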

Refer to Investigate NetQ Issues for a complete listing of configuration files and logs for use in issue resolution.

Install NetQ

The complete Cumulus NetQ solution contains several components that must be installed, including the NetQ applications, the database, and the NetQ Agents. NetQ can be deployed in two arrangements:

  - Hosted entirely on premises (the NetQ On-premises Appliance or VM)
  - Hosted primarily in the cloud (the NetQ Cloud Appliance or VM)

The NetQ Agents reside on the switches and hosts being monitored in your network.

For the on-premises solution, the NetQ Agents collect and transmit data from the switches and/or hosts back to the NetQ On-premises Appliance or Virtual Machine running the NetQ Platform, which in turn processes and stores the data in its database. This data is then provided for display through several user interfaces.

For the cloud solution, the NetQ Agent function is exactly the same, but the Agents send their collected data to the NetQ Collector, which contains only the aggregation and forwarding application. The NetQ Collector then transmits this data to the Cumulus Networks cloud-based infrastructure for further processing and storage. This data is then provided for display through the same user interfaces as the on-premises solution. In this solution, the browser interface can be pointed to the local NetQ Cloud Appliance or VM, or directly to netq.cumulusnetworks.com.

Installation Choices

There are several choices you must make to determine which steps to perform to install the NetQ solution. First, you must determine whether you intend to deploy the solution fully on your premises or deploy the cloud solution. Second, you must decide whether you are going to deploy a virtual machine on your own hardware or use one of the Cumulus NetQ appliances. Third, you must determine whether you want to install the software on a single server or as a server cluster. Finally, if you have an existing on-premises solution and want to save your existing NetQ data, you must back up that data before installing the new software.

The documentation walks you through these choices and then provides the instructions specific to your selections.

Installation Workflow Summary

No matter how you answer the questions above, the installation workflow can be summarized as follows:

  1. Prepare physical server or virtual machine.
  2. Install the software (NetQ Platform or NetQ Collector).
  3. Install and configure NetQ Agents on switches and hosts.
  4. Install and configure NetQ CLI on switches and hosts (optional, but useful).
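For steps 3 and 4, the commands on a Cumulus Linux switch follow this general pattern (a condensed sketch; <platform-ip> is a placeholder for your NetQ Platform or Collector address, and the appropriate package repository must already be configured as described in the agent installation topics):

cumulus@switch:~$ sudo apt-get update
cumulus@switch:~$ sudo apt-get install netq-agent netq-apps        # agent (step 3) and CLI (step 4)
cumulus@switch:~$ netq config add agent server <platform-ip>       # point the agent at the platform
cumulus@switch:~$ netq config restart agent
cumulus@switch:~$ netq config add cli server <platform-ip>         # point the CLI at the platform
cumulus@switch:~$ netq config restart cli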

Install NetQ System Platform

This topic walks you through the NetQ System Platform installation decisions and then provides installation steps based on those choices. If you are already comfortable with your installation choices, you may use the matrix in Install NetQ Quick Start to go directly to the installation steps.

To install NetQ 3.0.x, you must first decide whether you want to install the NetQ Platform in an on-premises or cloud deployment. Both deployment options provide secure access to data and features useful for monitoring and troubleshooting your network, and each has its benefits.

It is common to select an on-premises deployment model if you want to host all required hardware and software at your location, and you have the in-house skill set to install, configure, and maintain it - including performing data backups, acquiring and maintaining hardware and software, and handling integration and license management. This model is also a good choice if you want very limited or no access to the Internet from the switches and hosts in your network. Some companies simply want complete control of their network, with no outside involvement.

If, however, you want to host only a small server on your premises and leave the details to Cumulus Networks, then a cloud deployment might be the right choice for you. With a cloud deployment, a small local server connects to the NetQ Cloud service over selected ports or through a proxy server. The local server performs only data aggregation and forwarding; the majority of the NetQ applications, and all data storage, are hosted in the cloud. Cumulus handles the backups and maintenance of the application and storage. This model is often chosen when it is untenable to support deployment in-house, or when you need the flexibility to scale quickly while also reducing capital expenses.

Click the deployment model you want to use to continue with installation:

Install NetQ as an On-premises Deployment

On-premises deployments of NetQ can use a single server or a server cluster. In either case, you can use either the Cumulus NetQ Appliance or your own server running a KVM or VMware Virtual Machine (VM). This topic walks you through the installation for each of these on-premises options.

The next installation step is to decide whether you are deploying a single server or a server cluster. Both options provide the same services and features. The biggest difference is in the number of servers to be deployed and in the continued availability of services running on those servers should hardware failures occur.

A single server is easier to set up, configure, and manage, but it can limit your ability to scale your network monitoring quickly. A server cluster is a bit more complicated, but it limits potential downtime and increases availability by having more than one server that can run the software and store the data.

Select the standalone single-server arrangement for smaller, simpler deployments. Be sure to consider the capabilities and resources needed on this server to support the size of your final deployment.

Select the server cluster arrangement to obtain scalability and high availability for your network. You can configure one master node and up to nine worker nodes.

Click the server arrangement you want to use to begin installation:

Install NetQ as a Cloud Deployment

Cloud deployments of NetQ can use a single server or a server cluster on site. The NetQ database remains in the cloud either way. You can use either the Cumulus NetQ Cloud Appliance or your own server running a KVM or VMware Virtual Machine (VM). This topic walks you through the installation for each of these cloud options.

The next installation step is to decide whether you are deploying a single server or a server cluster. Both options provide the same services and features. The biggest difference is in the number of servers to be deployed and in the continued availability of services running on those servers should hardware failures occur.

A single server is easier to set up, configure, and manage, but it can limit your ability to scale your network monitoring quickly. A server cluster is a bit more complicated, but it limits potential downtime and increases availability by having more than one server that can run the software and store the data.

Click the server arrangement you want to use to begin installation:

Set Up Your KVM Virtual Machine for a Single On-premises Server

Follow these steps to set up and configure your VM on a single server in an on-premises deployment:

  1. Verify that your system meets the VM requirements.

    Resource                  Minimum Requirement
    ------------------------  ----------------------------------------------------------------
    Processor                 Eight (8) virtual CPUs
    Memory                    64 GB RAM
    Local disk storage        256 GB SSD with minimum disk IOPS of 1000 for a standard 4kb
                              block size (Note: This must be an SSD; use of other storage
                              options can lead to system instability and is not supported.)
    Network interface speed   1 Gb NIC
    Hypervisor                KVM/QCOW (QEMU Copy on Write) image for servers running CentOS,
                              Ubuntu, and RedHat operating systems
  2. Confirm that the needed ports are open for communications.

    You must open the following ports on your NetQ Platform:
    Port or Protocol Number   Protocol      Component Access
    -----------------------   -----------   --------------------------------------
    4                         IP Protocol   Calico networking (IP-in-IP Protocol)
    22                        TCP           SSH
    179                       TCP           Calico networking (BGP)
    443                       TCP           NetQ UI
    2379                      TCP           etcd datastore
    4789                      UDP           Calico networking (VxLAN)
    6443                      TCP           kube-apiserver
    8443                      TCP           Admin UI
    31980                     TCP           NetQ Agent communication
    31982                     TCP           NetQ Agent SSL communication
    32708                     TCP           API Gateway

    Port 32666 is no longer used for the NetQ UI.

  3. Download the NetQ Platform image.

    1. On the MyMellanox Downloads page, select NetQ from the Software -> Cumulus Software list.
    2. Click 3.0 from the Version list, and then select 3.0.0 from the submenu.
    3. Select KVM from the HyperVisor/Platform list.

    4. Scroll down to view the image, and click Download. This downloads the NetQ-3.0.0.tgz installation package.

  4. Set up and configure your VM.

    Open your hypervisor and set up your VM. You can use this example for reference or use your own hypervisor instructions.

    KVM Example Configuration

    This example shows the VM setup process for a system with Libvirt and KVM/QEMU installed.

    1. Confirm that the SHA256 checksum matches the one posted on the Cumulus Downloads website to ensure the image download has not been corrupted.

      $ sha256sum ./Downloads/cumulus-netq-server-3.0.0-ts-amd64-qemu.qcow2
      58EC6D6B4F2C6D377B3CD7C6E36792C6E2C89B06069561C50F316EA01F8A2ED2 ./Downloads/cumulus-netq-server-3.0.0-ts-amd64-qemu.qcow2
    2. Copy the QCOW2 image to a directory where you want to run it.

      Tip: Copy, rather than move, the original QCOW2 image that you downloaded, so you can avoid downloading it again should you need to repeat this process.

      $ sudo mkdir /vms
      $ sudo cp ./Downloads/cumulus-netq-server-3.0.0-ts-amd64-qemu.qcow2 /vms/ts.qcow2
    3. Create the VM.

      For a Direct VM, where the VM uses a MACVLAN interface to sit on the host interface for its connectivity:

      $ virt-install --name=netq_ts --vcpus=8 --memory=65536 --os-type=linux --os-variant=generic --disk path=/vms/ts.qcow2,format=qcow2,bus=virtio,cache=none --network=type=direct,source=eth0,model=virtio --import --noautoconsole

      Replace the disk path value with the location where the QCOW2 image resides. Replace the network source value (eth0 in the example above) with the name of the host interface that connects the VM to the external network.

      Or, for a Bridged VM, where the VM attaches to a bridge that has already been set up to allow for external access:

      $ virt-install --name=netq_ts --vcpus=8 --memory=65536 --os-type=linux --os-variant=generic --disk path=/vms/ts.qcow2,format=qcow2,bus=virtio,cache=none --network=bridge=br0,model=virtio --import --noautoconsole

      Replace the network bridge value (br0 in the example above) with the name of the (pre-existing) bridge interface that connects the VM to the external network.

      Make note of the name used during install as this is needed in a later step.

    4. Watch the boot process in another terminal window.
      $ virsh console netq_ts
  5. Verify the platform is ready for installation. Fix any errors indicated before installing the NetQ software.

    cumulus@hostname:~$ sudo opta-check
  6. Change the hostname for the VM from the default value.

    The default hostname for the NetQ Virtual Machines is ubuntu. Change the hostname to fit your naming conventions while meeting Internet and Kubernetes naming standards.

    Kubernetes requires that hostnames are composed of a sequence of labels concatenated with dots. For example, “en.wikipedia.org” is a hostname. Each label must be from 1 to 63 characters long. The entire hostname, including the delimiting dots, has a maximum of 253 ASCII characters.

    The Internet standards (RFCs) for protocols specify that labels may contain only the ASCII letters a through z (in lower case), the digits 0 through 9, and the hyphen-minus character ('-').

    Use the following command:

    cumulus@hostname:~$ sudo hostnamectl set-hostname NEW_HOSTNAME

    Add the same NEW_HOSTNAME value to /etc/hosts on your VM for the localhost entry. Example:

    127.0.0.1 localhost NEW_HOSTNAME
  7. Run the Bootstrap CLI. Be sure to replace the eth0 interface used in this example with the interface on the server used to listen for NetQ Agents.

    cumulus@:~$ netq bootstrap master interface eth0 tarball /mnt/installables/netq-bootstrap-3.0.0.tgz

    Allow about five to ten minutes for this to complete, and only then continue to the next step.

    If this step fails for any reason, you can run netq bootstrap reset [purge-db|keep-db] and then try again.

    If you have changed the IP address or hostname of the NetQ On-premises VM after this step, you need to re-register this address with NetQ as follows:

    Reset the VM, indicating whether you want to purge any NetQ DB data or keep it.

    cumulus@hostname:~$ netq bootstrap reset [purge-db|keep-db]

    Re-run the Bootstrap CLI. This example uses interface eth0. Replace this with your updated IP address, hostname or interface using the interface or ip-addr option.

    cumulus@:~$ netq bootstrap master interface eth0 tarball /mnt/installables/netq-bootstrap-3.0.0.tgz

The final step is to install and activate the Cumulus NetQ software. You can do this using the Admin UI or the CLI.
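With the CLI, the activation of a standalone on-premises server takes a single command of this general shape (a sketch; confirm the exact bundle filename and options in the install topic for your release):

cumulus@hostname:~$ netq install standalone full interface eth0 bundle /mnt/installables/NetQ-3.0.0.tgz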

Click the installation and activation method you want to use to complete installation:

Set Up Your KVM Virtual Machine for a Single Cloud Server

Follow these steps to set up and configure your VM on a single server in a cloud deployment:

  1. Verify that your system meets the VM requirements.

    Resource                  Minimum Requirement
    ------------------------  ----------------------------------------------------------------
    Processor                 Four (4) virtual CPUs
    Memory                    8 GB RAM
    Local disk storage        For NetQ 3.2.x and later: 64 GB
                              For NetQ 3.1 and earlier: 32 GB
    Network interface speed   1 Gb NIC
    Hypervisor                KVM/QCOW (QEMU Copy on Write) image for servers running CentOS,
                              Ubuntu, and RedHat operating systems
  2. Confirm that the needed ports are open for communications.

    You must open the following ports on your NetQ Platform:
    Port or Protocol Number   Protocol      Component Access
    -----------------------   -----------   --------------------------------------
    4                         IP Protocol   Calico networking (IP-in-IP Protocol)
    22                        TCP           SSH
    179                       TCP           Calico networking (BGP)
    443                       TCP           NetQ UI
    2379                      TCP           etcd datastore
    4789                      UDP           Calico networking (VxLAN)
    6443                      TCP           kube-apiserver
    8443                      TCP           Admin UI
    31980                     TCP           NetQ Agent communication
    31982                     TCP           NetQ Agent SSL communication
    32708                     TCP           API Gateway

    Port 32666 is no longer used for the NetQ UI.

  3. Download the NetQ Platform image.

    1. On the MyMellanox Downloads page, select NetQ from the Software -> Cumulus Software list.
    2. Click 3.0 from the Version list, and then select 3.0.0 from the submenu.
    3. Select KVM (Cloud) from the HyperVisor/Platform list.

    4. Scroll down to view the image, and click Download. This downloads the NetQ-3.0.0-opta.tgz installation package.

  4. Set up and configure your VM.

    Open your hypervisor and set up your VM. You can use this example for reference or use your own hypervisor instructions.

    KVM Example Configuration

    This example shows the VM setup process for a system with Libvirt and KVM/QEMU installed.

    1. Confirm that the SHA256 checksum matches the one posted on the Cumulus Downloads website to ensure the image download has not been corrupted.

      $ sha256sum ./Downloads/cumulus-netq-server-3.0.0-ts-amd64-qemu.qcow2
      599C3AA617937156D38A2205B4D111F83EBCFD63EDA7A791060375B30CB1DA90 ./Downloads/cumulus-netq-server-3.0.0-ts-amd64-qemu.qcow2
    2. Copy the QCOW2 image to a directory where you want to run it.

      Tip: Copy, rather than move, the original QCOW2 image that you downloaded, so you can avoid downloading it again should you need to repeat this process.

      $ sudo mkdir /vms
      $ sudo cp ./Downloads/cumulus-netq-server-3.0.0-ts-amd64-qemu.qcow2 /vms/ts.qcow2
    3. Create the VM.

      For a Direct VM, where the VM uses a MACVLAN interface to sit on the host interface for its connectivity:

      $ virt-install --name=netq_ts --vcpus=4 --memory=8192 --os-type=linux --os-variant=generic --disk path=/vms/ts.qcow2,format=qcow2,bus=virtio,cache=none --network=type=direct,source=eth0,model=virtio --import --noautoconsole

      Replace the disk path value with the location where the QCOW2 image resides. Replace the network source value (eth0 in the example above) with the name of the host interface that connects the VM to the external network.

      Or, for a Bridged VM, where the VM attaches to a bridge that has already been set up to allow for external access:

      $ virt-install --name=netq_ts --vcpus=4 --memory=8192 --os-type=linux --os-variant=generic --disk path=/vms/ts.qcow2,format=qcow2,bus=virtio,cache=none --network=bridge=br0,model=virtio --import --noautoconsole

      Replace the network bridge value (br0 in the example above) with the name of the (pre-existing) bridge interface that connects the VM to the external network.

      Make note of the name used during install as this is needed in a later step.

    4. Watch the boot process in another terminal window.
      $ virsh console netq_ts
  5. Verify the platform is ready for installation. Fix any errors indicated before installing the NetQ software.

    cumulus@hostname:~$ sudo opta-check-cloud
  6. Change the hostname for the VM from the default value.

    The default hostname for the NetQ Virtual Machines is ubuntu. Change the hostname to fit your naming conventions while meeting Internet and Kubernetes naming standards.

    Kubernetes requires that hostnames are composed of a sequence of labels concatenated with dots. For example, “en.wikipedia.org” is a hostname. Each label must be from 1 to 63 characters long. The entire hostname, including the delimiting dots, has a maximum of 253 ASCII characters.

    The Internet standards (RFCs) for protocols specify that labels may contain only the ASCII letters a through z (in lower case), the digits 0 through 9, and the hyphen-minus character ('-').

    Use the following command:

    cumulus@hostname:~$ sudo hostnamectl set-hostname NEW_HOSTNAME

    Add the same NEW_HOSTNAME value to /etc/hosts on your VM for the localhost entry. Example:

    127.0.0.1 localhost NEW_HOSTNAME
  7. Run the Bootstrap CLI. Be sure to replace the eth0 interface used in this example with the interface on the server used to listen for NetQ Agents.

    cumulus@:~$ netq bootstrap master interface eth0 tarball /mnt/installables/netq-bootstrap-3.0.0.tgz

    Allow about five to ten minutes for this to complete, and only then continue to the next step.

    If this step fails for any reason, you can run netq bootstrap reset and then try again.

    If you have changed the IP address or hostname of the NetQ Cloud VM after this step, you need to re-register this address with NetQ as follows:

    Reset the VM.

    cumulus@hostname:~$ netq bootstrap reset

    Re-run the Bootstrap CLI. This example uses interface eth0. Replace this with your updated IP address, hostname or interface using the interface or ip-addr option.

    cumulus@:~$ netq bootstrap master interface eth0 tarball /mnt/installables/netq-bootstrap-3.0.0.tgz
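    For illustration, the same bootstrap using the ip-addr option mentioned above, with a hypothetical address (verify the exact syntax in the NetQ CLI reference):

    cumulus@:~$ netq bootstrap master ip-addr 10.20.16.248 tarball /mnt/installables/netq-bootstrap-3.0.0.tgz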

The final step is to install and activate the Cumulus NetQ software. You can do this using the Admin UI or the CLI.

Click the installation and activation method you want to use to complete installation:

Set Up Your KVM Virtual Machine for an On-premises Server Cluster

First configure the VM on the master node, and then configure the VM on each worker node.

Follow these steps to set up and configure your VM on a cluster of servers in an on-premises deployment:

  1. Verify that your master node meets the VM requirements.

    Resource                | Minimum Requirement
    Processor               | Eight (8) virtual CPUs
    Memory                  | 64 GB RAM
    Local disk storage      | 256 GB SSD with minimum disk IOPS of 1000 for a standard 4kb block size (Note: This must be an SSD; use of other storage options can lead to system instability and are not supported.)
    Network interface speed | 1 Gb NIC
    Hypervisor              | KVM/QCOW (QEMU Copy on Write) image for servers running CentOS, Ubuntu and RedHat operating systems
  2. Confirm that the needed ports are open for communications.

    You must open the following ports on your NetQ Platforms:
    Port or Protocol Number | Protocol    | Component Access
    4                       | IP Protocol | Calico networking (IP-in-IP Protocol)
    22                      | TCP         | SSH
    179                     | TCP         | Calico networking (BGP)
    443                     | TCP         | NetQ UI
    2379                    | TCP         | etcd datastore
    4789                    | UDP         | Calico networking (VxLAN)
    6443                    | TCP         | kube-apiserver
    8443                    | TCP         | Admin UI
    31980                   | TCP         | NetQ Agent communication
    31982                   | TCP         | NetQ Agent SSL communication
    32708                   | TCP         | API Gateway

    Additionally, for internal cluster communication, you must open these ports:

    Port  | Protocol | Component Access
    8080  | TCP      | Admin API
    5000  | TCP      | Docker registry
    8472  | UDP      | Flannel port for VXLAN
    6443  | TCP      | Kubernetes API server
    10250 | TCP      | kubelet health probe
    2379  | TCP      | etcd
    2380  | TCP      | etcd
    7072  | TCP      | Kafka JMX monitoring
    9092  | TCP      | Kafka client
    7071  | TCP      | Cassandra JMX monitoring
    7000  | TCP      | Cassandra cluster communication
    9042  | TCP      | Cassandra client
    7073  | TCP      | Zookeeper JMX monitoring
    2888  | TCP      | Zookeeper cluster communication
    3888  | TCP      | Zookeeper cluster communication
    2181  | TCP      | Zookeeper client

    Port 32666 is no longer used for the NetQ UI.
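    As a quick sanity check before bootstrapping, you can confirm that none of these ports is already claimed by another service on the platform. A sketch using ss from iproute2 (the port list here is a subset; adjust it to the tables above; no output means no conflicts):

    cumulus@hostname:~$ sudo ss -tlnup | grep -E ':(443|2379|6443|8443|31980|31982|32708)\b'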

  3. Download the NetQ Platform image.

    1. On the MyMellanox Downloads page, select NetQ from the Software -> Cumulus Software list.
    2. Click 3.0 from the Version list, and then select 3.0.0 from the submenu.
    3. Select KVM from the HyperVisor/Platform list.

    4. Scroll down to view the image, and click Download. This downloads the NetQ-3.0.0.tgz installation package.

  4. Set up and configure your VM.

    Open your hypervisor and set up your VM. You can use this example for reference or use your own hypervisor instructions.

    KVM Example Configuration

    This example shows the VM setup process for a system with Libvirt and KVM/QEMU installed.

    1. Confirm that the SHA256 checksum matches the one posted on the Cumulus Downloads website to ensure the image download has not been corrupted.

      $ sha256sum ./Downloads/cumulus-netq-server-3.0.0-ts-amd64-qemu.qcow2
      58EC6D6B4F2C6D377B3CD7C6E36792C6E2C89B06069561C50F316EA01F8A2ED2 ./Downloads/cumulus-netq-server-3.0.0-ts-amd64-qemu.qcow2
    2. Copy the QCOW2 image to a directory where you want to run it.

      Tip: Copy, rather than move, the downloaded QCOW2 image so that you do not have to download it again if you need to repeat this process.

      $ sudo mkdir /vms
      $ sudo cp ./Downloads/cumulus-netq-server-3.0.0-ts-amd64-qemu.qcow2 /vms/ts.qcow2
    3. Create the VM.

      For a Direct VM, where the VM uses a MACVLAN interface to sit on the host interface for its connectivity:

      $ virt-install --name=netq_ts --vcpus=8 --memory=65536 --os-type=linux --os-variant=generic --disk path=/vms/ts.qcow2,format=qcow2,bus=virtio,cache=none --network=type=direct,source=eth0,model=virtio --import --noautoconsole

      Replace the disk path value with the location where the QCOW2 image resides. Replace the network source value (eth0 in this example) with the name of the host interface that connects the VM to the external network.

      Or, for a Bridged VM, where the VM attaches to a bridge that has already been set up to allow external access:

      $ virt-install --name=netq_ts --vcpus=8 --memory=65536 --os-type=linux --os-variant=generic --disk path=/vms/ts.qcow2,format=qcow2,bus=virtio,cache=none --network=bridge=br0,model=virtio --import --noautoconsole

      Replace the network bridge value (br0 in this example) with the name of the pre-existing bridge interface that connects the VM to the external network.

      Make note of the name used during install as this is needed in a later step.

    4. Watch the boot process in another terminal window.
      $ virsh console netq_ts
  5. Verify the master node is ready for installation. Fix any errors indicated before installing the NetQ software.

    cumulus@hostname:~$ sudo opta-check
  6. Change the hostname for the VM from the default value.

    The default hostname for the NetQ Virtual Machines is ubuntu. Change the hostname to fit your naming conventions while meeting Internet and Kubernetes naming standards.

    Kubernetes requires that hostnames are composed of a sequence of labels concatenated with dots. For example, “en.wikipedia.org” is a hostname. Each label must be from 1 to 63 characters long. The entire hostname, including the delimiting dots, has a maximum of 253 ASCII characters.

    The Internet standards (RFCs) for protocols specify that labels may contain only the ASCII letters a through z (in lower case), the digits 0 through 9, and the hyphen-minus character ('-').

    Use the following command:

    cumulus@hostname:~$ sudo hostnamectl set-hostname NEW_HOSTNAME

    Add the same NEW_HOSTNAME value to /etc/hosts on your VM for the localhost entry. Example:

    127.0.0.1 localhost NEW_HOSTNAME
  7. Run the Bootstrap CLI on the master node. Be sure to replace the eth0 interface used in this example with the interface on the server used to listen for NetQ Agents.

    cumulus@:~$ netq bootstrap master interface eth0 tarball /mnt/installables/netq-bootstrap-3.0.0.tgz

    Allow about five to ten minutes for this to complete, and only then continue to the next step.

    If this step fails for any reason, you can run netq bootstrap reset [purge-db|keep-db] and then try again.

    If you have changed the IP address or hostname of the NetQ On-premises VM after this step, you need to re-register this address with NetQ as follows:

    Reset the VM, indicating whether you want to purge any NetQ DB data or keep it.

    cumulus@hostname:~$ netq bootstrap reset [purge-db|keep-db]

    Re-run the Bootstrap CLI. This example uses interface eth0. Replace this with your updated IP address, hostname or interface using the interface or ip-addr option.

    cumulus@:~$ netq bootstrap master interface eth0 tarball /mnt/installables/netq-bootstrap-3.0.0.tgz
  8. Verify that your first worker node meets the VM requirements, as described in Step 1.

  9. Confirm that the needed ports are open for communications, as described in Step 2.

  10. Open your hypervisor and set up the VM in the same manner as for the master node.

    Make a note of the private IP address you assign to the worker node. It is needed for later installation steps.
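    One way to find the address is from the worker VM's console, using the same ip utility shown elsewhere in this guide (interface name and output are illustrative):

    cumulus@hostname:~$ ip -4 -brief addr show eth0
    eth0             UP             192.168.1.51/24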

  11. Verify the worker node is ready for installation. Fix any errors indicated before installing the NetQ software.

    cumulus@hostname:~$ sudo opta-check
  12. Change the hostname for the VM from the default value.

    The default hostname for the NetQ Virtual Machines is ubuntu. Change the hostname to fit your naming conventions while meeting Internet and Kubernetes naming standards.

    Kubernetes requires that hostnames are composed of a sequence of labels concatenated with dots. For example, “en.wikipedia.org” is a hostname. Each label must be from 1 to 63 characters long. The entire hostname, including the delimiting dots, has a maximum of 253 ASCII characters.

    The Internet standards (RFCs) for protocols specify that labels may contain only the ASCII letters a through z (in lower case), the digits 0 through 9, and the hyphen-minus character ('-').

    Use the following command:

    cumulus@hostname:~$ sudo hostnamectl set-hostname NEW_HOSTNAME

    Add the same NEW_HOSTNAME value to /etc/hosts on your VM for the localhost entry. Example:

    127.0.0.1 localhost NEW_HOSTNAME
  13. Run the Bootstrap CLI on the worker node.

    cumulus@:~$ netq bootstrap worker tarball /mnt/installables/netq-bootstrap-3.0.0.tgz master-ip <master-ip> [password <text-password>]

    Allow about five to ten minutes for this to complete, and only then continue to the next step.

    If this step fails for any reason, you can run netq bootstrap reset [purge-db|keep-db] on the new worker node and then try again.
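    For illustration, the worker bootstrap command above with a hypothetical master address of 192.168.1.50:

    cumulus@:~$ netq bootstrap worker tarball /mnt/installables/netq-bootstrap-3.0.0.tgz master-ip 192.168.1.50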

  14. Repeat Steps 8 through 13 for each additional worker node you want in your cluster.

The final step is to install and activate the Cumulus NetQ software. You can do this using the Admin UI or the CLI.

Click the installation and activation method you want to use to complete installation:

Set Up Your KVM Virtual Machine for a Cloud Server Cluster

First configure the VM on the master node, and then configure the VM on each worker node.

Follow these steps to set up and configure your VM on a cluster of servers in a cloud deployment:

  1. Verify that your master node meets the VM requirements.

    Resource                | Minimum Requirement
    Processor               | Four (4) virtual CPUs
    Memory                  | 8 GB RAM
    Local disk storage      | For NetQ 3.2.x and later: 64 GB; for NetQ 3.1 and earlier: 32 GB
    Network interface speed | 1 Gb NIC
    Hypervisor              | KVM/QCOW (QEMU Copy on Write) image for servers running CentOS, Ubuntu and RedHat operating systems
  2. Confirm that the needed ports are open for communications.

    You must open the following ports on your NetQ Platforms:
    Port or Protocol Number | Protocol    | Component Access
    4                       | IP Protocol | Calico networking (IP-in-IP Protocol)
    22                      | TCP         | SSH
    179                     | TCP         | Calico networking (BGP)
    443                     | TCP         | NetQ UI
    2379                    | TCP         | etcd datastore
    4789                    | UDP         | Calico networking (VxLAN)
    6443                    | TCP         | kube-apiserver
    8443                    | TCP         | Admin UI
    31980                   | TCP         | NetQ Agent communication
    31982                   | TCP         | NetQ Agent SSL communication
    32708                   | TCP         | API Gateway

    Additionally, for internal cluster communication, you must open these ports:

    Port  | Protocol | Component Access
    8080  | TCP      | Admin API
    5000  | TCP      | Docker registry
    8472  | UDP      | Flannel port for VXLAN
    6443  | TCP      | Kubernetes API server
    10250 | TCP      | kubelet health probe
    2379  | TCP      | etcd
    2380  | TCP      | etcd
    7072  | TCP      | Kafka JMX monitoring
    9092  | TCP      | Kafka client
    7071  | TCP      | Cassandra JMX monitoring
    7000  | TCP      | Cassandra cluster communication
    9042  | TCP      | Cassandra client
    7073  | TCP      | Zookeeper JMX monitoring
    2888  | TCP      | Zookeeper cluster communication
    3888  | TCP      | Zookeeper cluster communication
    2181  | TCP      | Zookeeper client

    Port 32666 is no longer used for the NetQ UI.

  3. Download the NetQ Platform image.

    1. On the MyMellanox Downloads page, select NetQ from the Software -> Cumulus Software list.
    2. Click 3.0 from the Version list, and then select 3.0.0 from the submenu.
    3. Select KVM (Cloud) from the HyperVisor/Platform list.

    4. Scroll down to view the image, and click Download. This downloads the NetQ-3.0.0-opta.tgz installation package.

  4. Set up and configure your VM.

    Open your hypervisor and set up your VM. You can use this example for reference or use your own hypervisor instructions.

    KVM Example Configuration

    This example shows the VM setup process for a system with Libvirt and KVM/QEMU installed.

    1. Confirm that the SHA256 checksum matches the one posted on the Cumulus Downloads website to ensure the image download has not been corrupted.

      $ sha256sum ./Downloads/cumulus-netq-server-3.0.0-ts-amd64-qemu.qcow2
      599C3AA617937156D38A2205B4D111F83EBCFD63EDA7A791060375B30CB1DA90 ./Downloads/cumulus-netq-server-3.0.0-ts-amd64-qemu.qcow2
    2. Copy the QCOW2 image to a directory where you want to run it.

      Tip: Copy, rather than move, the downloaded QCOW2 image so that you do not have to download it again if you need to repeat this process.

      $ sudo mkdir /vms
      $ sudo cp ./Downloads/cumulus-netq-server-3.0.0-ts-amd64-qemu.qcow2 /vms/ts.qcow2
    3. Create the VM.

      For a Direct VM, where the VM uses a MACVLAN interface to sit on the host interface for its connectivity:

      $ virt-install --name=netq_ts --vcpus=4 --memory=8192 --os-type=linux --os-variant=generic --disk path=/vms/ts.qcow2,format=qcow2,bus=virtio,cache=none --network=type=direct,source=eth0,model=virtio --import --noautoconsole

      Replace the disk path value with the location where the QCOW2 image resides. Replace the network source value (eth0 in this example) with the name of the host interface that connects the VM to the external network.

      Or, for a Bridged VM, where the VM attaches to a bridge that has already been set up to allow external access:

      $ virt-install --name=netq_ts --vcpus=4 --memory=8192 --os-type=linux --os-variant=generic --disk path=/vms/ts.qcow2,format=qcow2,bus=virtio,cache=none --network=bridge=br0,model=virtio --import --noautoconsole

      Replace the network bridge value (br0 in this example) with the name of the pre-existing bridge interface that connects the VM to the external network.

      Make note of the name used during install as this is needed in a later step.

    4. Watch the boot process in another terminal window.
      $ virsh console netq_ts
  5. Verify the master node is ready for installation. Fix any errors indicated before installing the NetQ software.

    cumulus@hostname:~$ sudo opta-check-cloud
  6. Change the hostname for the VM from the default value.

    The default hostname for the NetQ Virtual Machines is ubuntu. Change the hostname to fit your naming conventions while meeting Internet and Kubernetes naming standards.

    Kubernetes requires that hostnames are composed of a sequence of labels concatenated with dots. For example, “en.wikipedia.org” is a hostname. Each label must be from 1 to 63 characters long. The entire hostname, including the delimiting dots, has a maximum of 253 ASCII characters.

    The Internet standards (RFCs) for protocols specify that labels may contain only the ASCII letters a through z (in lower case), the digits 0 through 9, and the hyphen-minus character ('-').

    Use the following command:

    cumulus@hostname:~$ sudo hostnamectl set-hostname NEW_HOSTNAME

    Add the same NEW_HOSTNAME value to /etc/hosts on your VM for the localhost entry. Example:

    127.0.0.1 localhost NEW_HOSTNAME
  7. Run the Bootstrap CLI. Be sure to replace the eth0 interface used in this example with the interface on the server used to listen for NetQ Agents.

    cumulus@:~$ netq bootstrap master interface eth0 tarball /mnt/installables/netq-bootstrap-3.0.0.tgz

    Allow about five to ten minutes for this to complete, and only then continue to the next step.

    If this step fails for any reason, you can run netq bootstrap reset and then try again.

    If you have changed the IP address or hostname of the NetQ Cloud VM after this step, you need to re-register this address with NetQ as follows:

    Reset the VM.

    cumulus@hostname:~$ netq bootstrap reset

    Re-run the Bootstrap CLI. This example uses interface eth0. Replace this with your updated IP address, hostname or interface using the interface or ip-addr option.

    cumulus@:~$ netq bootstrap master interface eth0 tarball /mnt/installables/netq-bootstrap-3.0.0.tgz
  8. Verify that your first worker node meets the VM requirements, as described in Step 1.

  9. Confirm that the needed ports are open for communications, as described in Step 2.

  10. Open your hypervisor and set up the VM in the same manner as for the master node.

    Make a note of the private IP address you assign to the worker node. It is needed for later installation steps.

  11. Verify the worker node is ready for installation. Fix any errors indicated before installing the NetQ software.

    cumulus@hostname:~$ sudo opta-check-cloud
  12. Change the hostname for the VM from the default value.

    The default hostname for the NetQ Virtual Machines is ubuntu. Change the hostname to fit your naming conventions while meeting Internet and Kubernetes naming standards.

    Kubernetes requires that hostnames are composed of a sequence of labels concatenated with dots. For example, “en.wikipedia.org” is a hostname. Each label must be from 1 to 63 characters long. The entire hostname, including the delimiting dots, has a maximum of 253 ASCII characters.

    The Internet standards (RFCs) for protocols specify that labels may contain only the ASCII letters a through z (in lower case), the digits 0 through 9, and the hyphen-minus character ('-').

    Use the following command:

    cumulus@hostname:~$ sudo hostnamectl set-hostname NEW_HOSTNAME

    Add the same NEW_HOSTNAME value to /etc/hosts on your VM for the localhost entry. Example:

    127.0.0.1 localhost NEW_HOSTNAME
  13. Run the Bootstrap CLI on the worker node.

    cumulus@:~$ netq bootstrap worker tarball /mnt/installables/netq-bootstrap-3.0.0.tgz master-ip <master-ip> [password <text-password>]

    Allow about five to ten minutes for this to complete, and only then continue to the next step.

    If this step fails for any reason, you can run netq bootstrap reset on the new worker node and then try again.

  14. Repeat Steps 8 through 13 for each additional worker node you want in your cluster.

The final step is to install and activate the Cumulus NetQ software. You can do this using the Admin UI or the CLI.

Click the installation and activation method you want to use to complete installation:

Set Up Your VMware Virtual Machine for a Single On-premises Server

Follow these steps to set up and configure your VM on a single server in an on-premises deployment:

  1. Verify that your system meets the VM requirements.

    Resource                | Minimum Requirement
    Processor               | Eight (8) virtual CPUs
    Memory                  | 64 GB RAM
    Local disk storage      | 256 GB SSD with minimum disk IOPS of 1000 for a standard 4kb block size (Note: This must be an SSD; use of other storage options can lead to system instability and are not supported.)
    Network interface speed | 1 Gb NIC
    Hypervisor              | VMware ESXi™ 6.5 or later (OVA image) for servers running Cumulus Linux, CentOS, Ubuntu and RedHat operating systems
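    If you have SSH access to the ESXi shell, you can confirm that the host meets the 6.5 minimum before deploying the OVA. The vmware -vl command is a standard ESXi shell command; the output below is illustrative and the build number is elided:

    [root@esxi:~] vmware -vl
    VMware ESXi 6.5.0 build-nnnnnnn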
  2. Confirm that the needed ports are open for communications.

    You must open the following ports on your NetQ Platform:
    Port or Protocol Number | Protocol    | Component Access
    4                       | IP Protocol | Calico networking (IP-in-IP Protocol)
    22                      | TCP         | SSH
    179                     | TCP         | Calico networking (BGP)
    443                     | TCP         | NetQ UI
    2379                    | TCP         | etcd datastore
    4789                    | UDP         | Calico networking (VxLAN)
    6443                    | TCP         | kube-apiserver
    8443                    | TCP         | Admin UI
    31980                   | TCP         | NetQ Agent communication
    31982                   | TCP         | NetQ Agent SSL communication
    32708                   | TCP         | API Gateway

    Port 32666 is no longer used for the NetQ UI.

  3. Download the NetQ Platform image.

    1. On the MyMellanox Downloads page, select NetQ from the Software -> Cumulus Software list.
    2. Click 3.0 from the Version list, and then select 3.0.0 from the submenu.
    3. Select VMware from the HyperVisor/Platform list.

    4. Scroll down to view the image, and click Download. This downloads the NetQ-3.0.0.tgz installation package.

  4. Set up and configure your VM.

    Open your hypervisor and set up your VM. You can use this example for reference or use your own hypervisor instructions.

    VMware Example Configuration

    This example shows the VM setup process using an OVA file with VMware ESXi.
    1. Enter the address of the hardware in your browser.

    2. Log in to VMware using credentials with root access.

    3. Click Storage in the Navigator to verify you have an SSD installed.

    4. Click Create/Register VM at the top of the right pane.

    5. Select Deploy a virtual machine from an OVF or OVA file, and click Next.

    6. Provide a name for the VM, for example NetQ.

      Tip: Make note of the name used during install as this is needed in a later step.

    7. Drag and drop the NetQ Platform image file you downloaded in Step 3 above.

    8. Click Next.

    9. Select the storage type and data store for the image to use, then click Next. In this example, only one is available.

    10. Accept the default deployment options or modify them according to your network needs. Click Next when you are finished.

    11. Review the configuration summary. Click Back to change any of the settings, or click Finish to continue with the creation of the VM.

      The progress of the request is shown in the Recent Tasks window at the bottom of the application. This may take some time, so continue with your other work until the upload finishes.

    12. Once completed, view the full details of the VM and hardware.

  5. Verify the platform is ready for installation. Fix any errors indicated before installing the NetQ software.

    cumulus@hostname:~$ sudo opta-check
  6. Change the hostname for the VM from the default value.

    The default hostname for the NetQ Virtual Machines is ubuntu. Change the hostname to fit your naming conventions while meeting Internet and Kubernetes naming standards.

    Kubernetes requires that hostnames are composed of a sequence of labels concatenated with dots. For example, “en.wikipedia.org” is a hostname. Each label must be from 1 to 63 characters long. The entire hostname, including the delimiting dots, has a maximum of 253 ASCII characters.

    The Internet standards (RFCs) for protocols specify that labels may contain only the ASCII letters a through z (in lower case), the digits 0 through 9, and the hyphen-minus character ('-').

    Use the following command:

    cumulus@hostname:~$ sudo hostnamectl set-hostname NEW_HOSTNAME

    Add the same NEW_HOSTNAME value to /etc/hosts on your VM for the localhost entry. Example:

    127.0.0.1 localhost NEW_HOSTNAME
  7. Run the Bootstrap CLI. Be sure to replace the eth0 interface used in this example with the interface on the server used to listen for NetQ Agents.

    cumulus@:~$ netq bootstrap master interface eth0 tarball /mnt/installables/netq-bootstrap-3.0.0.tgz

    Allow about five to ten minutes for this to complete, and only then continue to the next step.

    If this step fails for any reason, you can run netq bootstrap reset [purge-db|keep-db] and then try again.

    If you have changed the IP address or hostname of the NetQ On-premises VM after this step, you need to re-register this address with NetQ as follows:

    Reset the VM, indicating whether you want to purge any NetQ DB data or keep it.

    cumulus@hostname:~$ netq bootstrap reset [purge-db|keep-db]

    Re-run the Bootstrap CLI. This example uses interface eth0. Replace this with your updated IP address, hostname or interface using the interface or ip-addr option.

    cumulus@:~$ netq bootstrap master interface eth0 tarball /mnt/installables/netq-bootstrap-3.0.0.tgz
The final step is to install and activate the Cumulus NetQ software. You can do this using the Admin UI or the CLI.

Click the installation and activation method you want to use to complete installation:

Set Up Your VMware Virtual Machine for a Single Cloud Server

Follow these steps to set up and configure your VM for a cloud deployment:

    1. Verify that your system meets the VM requirements.

      Resource                | Minimum Requirement
      Processor               | Four (4) virtual CPUs
      Memory                  | 8 GB RAM
      Local disk storage      | For NetQ 3.2.x and later: 64 GB; for NetQ 3.1 and earlier: 32 GB
      Network interface speed | 1 Gb NIC
      Hypervisor              | VMware ESXi™ 6.5 or later (OVA image) for servers running Cumulus Linux, CentOS, Ubuntu and RedHat operating systems
    2. Confirm that the needed ports are open for communications.

      You must open the following ports on your NetQ Platform:
      Port or Protocol Number | Protocol    | Component Access
      4                       | IP Protocol | Calico networking (IP-in-IP Protocol)
      22                      | TCP         | SSH
      179                     | TCP         | Calico networking (BGP)
      443                     | TCP         | NetQ UI
      2379                    | TCP         | etcd datastore
      4789                    | UDP         | Calico networking (VxLAN)
      6443                    | TCP         | kube-apiserver
      8443                    | TCP         | Admin UI
      31980                   | TCP         | NetQ Agent communication
      31982                   | TCP         | NetQ Agent SSL communication
      32708                   | TCP         | API Gateway

      Port 32666 is no longer used for the NetQ UI.

    3. Download the NetQ Platform image.

      1. On the MyMellanox Downloads page, select NetQ from the Software -> Cumulus Software list.
      2. Click 3.0 from the Version list, and then select 3.0.0 from the submenu.
      3. Select VMware (Cloud) from the HyperVisor/Platform list.

      4. Scroll down to view the image, and click Download. This downloads the NetQ-3.0.0-opta.tgz installation package.

    4. Set up and configure your VM.

      Open your hypervisor and set up your VM. You can use this example for reference or use your own hypervisor instructions.

      VMware Example Configuration

      This example shows the VM setup process using an OVA file with VMware ESXi.
      1. Enter the address of the hardware in your browser.

      2. Log in to VMware using credentials with root access.

      3. Click Storage in the Navigator to verify you have an SSD installed.

      4. Click Create/Register VM at the top of the right pane.

      5. Select Deploy a virtual machine from an OVF or OVA file, and click Next.

      6. Provide a name for the VM, for example NetQ.

        Tip: Make note of the name used during install as this is needed in a later step.

      7. Drag and drop the NetQ Platform image file you downloaded in Step 3 above.

      8. Click Next.

      9. Select the storage type and data store for the image to use, then click Next. In this example, only one is available.

      10. Accept the default deployment options or modify them according to your network needs. Click Next when you are finished.

      11. Review the configuration summary. Click Back to change any of the settings, or click Finish to continue with the creation of the VM.

        The progress of the request is shown in the Recent Tasks window at the bottom of the application. This may take some time, so continue with your other work until the upload finishes.

      12. Once completed, view the full details of the VM and hardware.

    5. Verify the platform is ready for installation. Fix any errors indicated before installing the NetQ software.

    cumulus@hostname:~$ sudo opta-check-cloud
    6. Change the hostname for the VM from the default value.

    The default hostname for the NetQ Virtual Machines is ubuntu. Change the hostname to fit your naming conventions while meeting Internet and Kubernetes naming standards.

    Kubernetes requires that hostnames are composed of a sequence of labels concatenated with dots. For example, “en.wikipedia.org” is a hostname. Each label must be from 1 to 63 characters long. The entire hostname, including the delimiting dots, has a maximum of 253 ASCII characters.

    The Internet standards (RFCs) for protocols specify that labels may contain only the ASCII letters a through z (in lower case), the digits 0 through 9, and the hyphen-minus character ('-').

    Use the following command:

    cumulus@hostname:~$ sudo hostnamectl set-hostname NEW_HOSTNAME

    Add the same NEW_HOSTNAME value to /etc/hosts on your VM for the localhost entry. Example:

    127.0.0.1 localhost NEW_HOSTNAME
    7. Run the Bootstrap CLI. Be sure to replace the eth0 interface used in this example with the interface on the server used to listen for NetQ Agents.

    cumulus@:~$ netq bootstrap master interface eth0 tarball /mnt/installables/netq-bootstrap-3.0.0.tgz

    Allow about five to ten minutes for this to complete, and only then continue to the next step.

    If this step fails for any reason, you can run netq bootstrap reset and then try again.

    If you have changed the IP address or hostname of the NetQ Cloud VM after this step, you need to re-register this address with NetQ as follows:

    Reset the VM.

    cumulus@hostname:~$ netq bootstrap reset

    Re-run the Bootstrap CLI. This example uses interface eth0. Replace this with your updated IP address, hostname or interface using the interface or ip-addr option.

    cumulus@:~$ netq bootstrap master interface eth0 tarball /mnt/installables/netq-bootstrap-3.0.0.tgz
The final step is to install and activate the Cumulus NetQ software. You can do this using the Admin UI or the CLI.

Click the installation and activation method you want to use to complete installation:

Set Up Your VMware Virtual Machine for an On-premises Server Cluster

First configure the VM on the master node, and then configure the VM on each worker node.

Follow these steps to set up and configure your VM cluster for an on-premises deployment:

    1. Verify that your master node meets the VM requirements.

      Resource                | Minimum Requirement
      Processor               | Eight (8) virtual CPUs
      Memory                  | 64 GB RAM
      Local disk storage      | 256 GB SSD with minimum disk IOPS of 1000 for a standard 4kb block size (Note: This must be an SSD; use of other storage options can lead to system instability and are not supported.)
      Network interface speed | 1 Gb NIC
      Hypervisor              | VMware ESXi™ 6.5 or later (OVA image) for servers running Cumulus Linux, CentOS, Ubuntu and RedHat operating systems
    2. Confirm that the needed ports are open for communications.

      You must open the following ports on your NetQ Platforms:
      Port or Protocol Number | Protocol    | Component Access
      4                       | IP Protocol | Calico networking (IP-in-IP Protocol)
      22                      | TCP         | SSH
      179                     | TCP         | Calico networking (BGP)
      443                     | TCP         | NetQ UI
      2379                    | TCP         | etcd datastore
      4789                    | UDP         | Calico networking (VxLAN)
      6443                    | TCP         | kube-apiserver
      8443                    | TCP         | Admin UI
      31980                   | TCP         | NetQ Agent communication
      31982                   | TCP         | NetQ Agent SSL communication
      32708                   | TCP         | API Gateway

      Additionally, for internal cluster communication, you must open these ports:

      Port  | Protocol | Component Access
      8080  | TCP      | Admin API
      5000  | TCP      | Docker registry
      8472  | UDP      | Flannel port for VXLAN
      6443  | TCP      | Kubernetes API server
      10250 | TCP      | kubelet health probe
      2379  | TCP      | etcd
      2380  | TCP      | etcd
      7072  | TCP      | Kafka JMX monitoring
      9092  | TCP      | Kafka client
      7071  | TCP      | Cassandra JMX monitoring
      7000  | TCP      | Cassandra cluster communication
      9042  | TCP      | Cassandra client
      7073  | TCP      | Zookeeper JMX monitoring
      2888  | TCP      | Zookeeper cluster communication
      3888  | TCP      | Zookeeper cluster communication
      2181  | TCP      | Zookeeper client

      Port 32666 is no longer used for the NetQ UI.

    3. Download the NetQ Platform image.

      1. On the MyMellanox Downloads page, select NetQ from the Software -> Cumulus Software list.
      2. Click 3.0 from the Version list, and then select 3.0.0 from the submenu.
      3. Select VMware from the HyperVisor/Platform list.

      4. Scroll down to view the image, and click Download. This downloads the NetQ-3.0.0.tgz installation package.

    4. Set up and configure your VM.

      Open your hypervisor and set up your VM. You can use this example for reference or use your own hypervisor instructions.

      VMware Example Configuration

      This example shows the VM setup process using an OVA file with VMware ESXi.
      1. Enter the address of the hardware in your browser.

      2. Log in to VMware using credentials with root access.

      3. Click Storage in the Navigator to verify you have an SSD installed.

      4. Click Create/Register VM at the top of the right pane.

      5. Select Deploy a virtual machine from an OVF or OVA file, and click Next.

      6. Provide a name for the VM, for example NetQ.

        Tip: Make note of the name used during install as this is needed in a later step.

      7. Drag and drop the NetQ Platform image file you downloaded in Step 3 above.

      8. Click Next.

      9. Select the storage type and data store for the image to use, then click Next. In this example, only one is available.

      10. Accept the default deployment options or modify them according to your network needs. Click Next when you are finished.

      11. Review the configuration summary. Click Back to change any of the settings, or click Finish to continue with the creation of the VM.

        The progress of the request is shown in the Recent Tasks window at the bottom of the application. This may take some time, so continue with your other work until the upload finishes.

      12. Once completed, view the full details of the VM and hardware.

    5. Verify the master node is ready for installation. Fix any errors indicated before installing the NetQ software.

    cumulus@hostname:~$ sudo opta-check
    6. Change the hostname for the VM from the default value.

    The default hostname for the NetQ Virtual Machines is ubuntu. Change the hostname to fit your naming conventions while meeting Internet and Kubernetes naming standards.

    Kubernetes requires that hostnames are composed of a sequence of labels concatenated with dots. For example, “en.wikipedia.org” is a hostname. Each label must be from 1 to 63 characters long. The entire hostname, including the delimiting dots, has a maximum of 253 ASCII characters.

    The Internet standards (RFCs) for protocols specify that labels may contain only the ASCII letters a through z (in lower case), the digits 0 through 9, and the hyphen-minus character ('-').

    Use the following command:

    cumulus@hostname:~$ sudo hostnamectl set-hostname NEW_HOSTNAME

    Add the same NEW_HOSTNAME value to /etc/hosts on your VM for the localhost entry. Example:

    127.0.0.1 localhost NEW_HOSTNAME
    7. Run the Bootstrap CLI. Be sure to replace the eth0 interface used in this example with the interface on the server used to listen for NetQ Agents.

    cumulus@:~$ netq bootstrap master interface eth0 tarball /mnt/installables/netq-bootstrap-3.0.0.tgz

    Allow about five to ten minutes for this to complete, and only then continue to the next step.

    If this step fails for any reason, you can run netq bootstrap reset [purge-db|keep-db] and then try again.

    If you have changed the IP address or hostname of the NetQ On-premises VM after this step, you need to re-register this address with NetQ as follows:

    Reset the VM, indicating whether you want to purge any NetQ DB data or keep it.

    cumulus@hostname:~$ netq bootstrap reset [purge-db|keep-db]

    Re-run the Bootstrap CLI. This example uses interface eth0. Replace this with your updated IP address, hostname or interface using the interface or ip-addr option.

    cumulus@:~$ netq bootstrap master interface eth0 tarball /mnt/installables/netq-bootstrap-3.0.0.tgz
    8. Verify that your first worker node meets the VM requirements, as described in Step 1.

    9. Confirm that the needed ports are open for communications, as described in Step 2.

    10. Open your hypervisor and set up the VM in the same manner as for the master node.

    Make a note of the private IP address you assign to the worker node. It is needed for later installation steps.

    11. Verify the worker node is ready for installation. Fix any errors indicated before installing the NetQ software.

      cumulus@hostname:~$ sudo opta-check
    12. Change the hostname for the VM from the default value.

    The default hostname for the NetQ Virtual Machines is ubuntu. Change the hostname to fit your naming conventions while meeting Internet and Kubernetes naming standards.

    Kubernetes requires that hostnames are composed of a sequence of labels concatenated with dots. For example, “en.wikipedia.org” is a hostname. Each label must be from 1 to 63 characters long. The entire hostname, including the delimiting dots, has a maximum of 253 ASCII characters.

    The Internet standards (RFCs) for protocols specify that labels may contain only the ASCII letters a through z (in lower case), the digits 0 through 9, and the hyphen-minus character ('-').

    Use the following command:

    cumulus@hostname:~$ sudo hostnamectl set-hostname NEW_HOSTNAME

    Add the same NEW_HOSTNAME value to /etc/hosts on your VM for the localhost entry. Example:

    127.0.0.1 localhost NEW_HOSTNAME
    13. Run the Bootstrap CLI on the worker node.

    cumulus@:~$ netq bootstrap worker tarball /mnt/installables/netq-bootstrap-3.0.0.tgz master-ip <master-ip> [password <text-password>]

    Allow about five to ten minutes for this to complete, and only then continue to the next step.

    If this step fails for any reason, you can run netq bootstrap reset [purge-db|keep-db] on the new worker node and then try again.

    14. Repeat Steps 8 through 13 for each additional worker node you want in your cluster.

The final step is to install and activate the Cumulus NetQ software. You can do this using the Admin UI or the CLI.

Click the installation and activation method you want to use to complete installation:

Set Up Your VMware Virtual Machine for a Cloud Server Cluster

First configure the VM on the master node, and then configure the VM on each worker node.

Follow these steps to set up and configure your VM on a cluster of servers in a cloud deployment:

    1. Verify that your master node meets the VM requirements.

      Resource                | Minimum Requirement
      Processor               | Four (4) virtual CPUs
      Memory                  | 8 GB RAM
      Local disk storage      | For NetQ 3.2.x and later: 64 GB; for NetQ 3.1 and earlier: 32 GB
      Network interface speed | 1 Gb NIC
      Hypervisor              | VMware ESXi™ 6.5 or later (OVA image) for servers running Cumulus Linux, CentOS, Ubuntu and RedHat operating systems
    2. Confirm that the needed ports are open for communications.

      You must open the following ports on your NetQ Platforms:
      Port or Protocol Number | Protocol    | Component Access
      4                       | IP Protocol | Calico networking (IP-in-IP Protocol)
      22                      | TCP         | SSH
      179                     | TCP         | Calico networking (BGP)
      443                     | TCP         | NetQ UI
      2379                    | TCP         | etcd datastore
      4789                    | UDP         | Calico networking (VxLAN)
      6443                    | TCP         | kube-apiserver
      8443                    | TCP         | Admin UI
      31980                   | TCP         | NetQ Agent communication
      31982                   | TCP         | NetQ Agent SSL communication
      32708                   | TCP         | API Gateway

      Additionally, for internal cluster communication, you must open these ports:

      Port  | Protocol | Component Access
      8080  | TCP      | Admin API
      5000  | TCP      | Docker registry
      8472  | UDP      | Flannel port for VXLAN
      6443  | TCP      | Kubernetes API server
      10250 | TCP      | kubelet health probe
      2379  | TCP      | etcd
      2380  | TCP      | etcd
      7072  | TCP      | Kafka JMX monitoring
      9092  | TCP      | Kafka client
      7071  | TCP      | Cassandra JMX monitoring
      7000  | TCP      | Cassandra cluster communication
      9042  | TCP      | Cassandra client
      7073  | TCP      | Zookeeper JMX monitoring
      2888  | TCP      | Zookeeper cluster communication
      3888  | TCP      | Zookeeper cluster communication
      2181  | TCP      | Zookeeper client

      Port 32666 is no longer used for the NetQ UI.

    3. Download the NetQ Platform image.

      1. On the MyMellanox Downloads page, select NetQ from the Software -> Cumulus Software list.
      2. Click 3.0 from the Version list, and then select 3.0.0 from the submenu.
      3. Select VMware (Cloud) from the HyperVisor/Platform list.

      4. Scroll down to view the image, and click Download. This downloads the NetQ-3.0.0-opta.tgz installation package.

    4. Set up and configure your VM.

      Open your hypervisor and set up your VM. You can use this example for reference or use your own hypervisor instructions.

      VMware Example Configuration

      This example shows the VM setup process using an OVA file with VMware ESXi.
      1. Enter the address of the hardware in your browser.

      2. Log in to VMware using credentials with root access.

      3. Click Storage in the Navigator to verify you have an SSD installed.

      4. Click Create/Register VM at the top of the right pane.

      5. Select Deploy a virtual machine from an OVF or OVA file, and click Next.

      6. Provide a name for the VM, for example NetQ.

        Tip: Make note of the name used during install as this is needed in a later step.

      7. Drag and drop the NetQ Platform image file you downloaded in Step 3 above.

      8. Click Next.

      9. Select the storage type and data store for the image to use, then click Next. In this example, only one is available.

      10. Accept the default deployment options or modify them according to your network needs. Click Next when you are finished.

      11. Review the configuration summary. Click Back to change any of the settings, or click Finish to continue with the creation of the VM.

        The progress of the request is shown in the Recent Tasks window at the bottom of the application. This may take some time, so continue with your other work until the upload finishes.

      12. Once completed, view the full details of the VM and hardware.

    5. Verify the master node is ready for installation. Fix any errors indicated before installing the NetQ software.

    cumulus@hostname:~$ sudo opta-check-cloud
    6. Change the hostname for the VM from the default value.

    The default hostname for the NetQ Virtual Machines is ubuntu. Change the hostname to fit your naming conventions while meeting Internet and Kubernetes naming standards.

    Kubernetes requires that hostnames are composed of a sequence of labels concatenated with dots. For example, “en.wikipedia.org” is a hostname. Each label must be from 1 to 63 characters long. The entire hostname, including the delimiting dots, has a maximum of 253 ASCII characters.

    The Internet standards (RFCs) for protocols specify that labels may contain only the ASCII letters a through z (in lower case), the digits 0 through 9, and the hyphen-minus character ('-').

    Use the following command:

    cumulus@hostname:~$ sudo hostnamectl set-hostname NEW_HOSTNAME

    Add the same NEW_HOSTNAME value to /etc/hosts on your VM for the localhost entry. Example:

    127.0.0.1 localhost NEW_HOSTNAME
    7. Run the Bootstrap CLI. Be sure to replace the eth0 interface used in this example with the interface on the server used to listen for NetQ Agents.

    cumulus@:~$ netq bootstrap master interface eth0 tarball /mnt/installables/netq-bootstrap-3.0.0.tgz

    Allow about five to ten minutes for this to complete, and only then continue to the next step.

    If this step fails for any reason, you can run netq bootstrap reset and then try again.

    If you have changed the IP address or hostname of the NetQ Cloud VM after this step, you need to re-register this address with NetQ as follows:

    Reset the VM.

    cumulus@hostname:~$ netq bootstrap reset

    Re-run the Bootstrap CLI. This example uses interface eth0. Replace this with your updated IP address, hostname or interface using the interface or ip-addr option.

    cumulus@:~$ netq bootstrap master interface eth0 tarball /mnt/installables/netq-bootstrap-3.0.0.tgz
    8. Verify that your first worker node meets the VM requirements, as described in Step 1.

    9. Confirm that the needed ports are open for communications, as described in Step 2.

    10. Open your hypervisor and set up the VM in the same manner as for the master node.

    Make a note of the private IP address you assign to the worker node. It is needed for later installation steps.

    11. Verify the worker node is ready for installation. Fix any errors indicated before installing the NetQ software.

    cumulus@hostname:~$ sudo opta-check-cloud
    12. Change the hostname for the VM from the default value.

    The default hostname for the NetQ Virtual Machines is ubuntu. Change the hostname to fit your naming conventions while meeting Internet and Kubernetes naming standards.

    Kubernetes requires that hostnames are composed of a sequence of labels concatenated with dots. For example, “en.wikipedia.org” is a hostname. Each label must be from 1 to 63 characters long. The entire hostname, including the delimiting dots, has a maximum of 253 ASCII characters.

    The Internet standards (RFCs) for protocols specify that labels may contain only the ASCII letters a through z (in lower case), the digits 0 through 9, and the hyphen-minus character ('-').

    Use the following command:

    cumulus@hostname:~$ sudo hostnamectl set-hostname NEW_HOSTNAME

    Add the same NEW_HOSTNAME value to /etc/hosts on your VM for the localhost entry. Example:

    127.0.0.1 localhost NEW_HOSTNAME
    13. Run the Bootstrap CLI on the worker node.

    cumulus@:~$ netq bootstrap worker tarball /mnt/installables/netq-bootstrap-3.0.0.tgz master-ip <master-ip> [password <text-password>]

    Allow about five to ten minutes for this to complete, and only then continue to the next step.

    If this step fails for any reason, you can run netq bootstrap reset on the new worker node and then try again.

    14. Repeat Steps 8 through 13 for each additional worker node you want in your cluster.

The final step is to install and activate the Cumulus NetQ software. You can do this using the Admin UI or the CLI.

Click the installation and activation method you want to use to complete installation:

Install the NetQ On-premises Appliance

This topic describes how to prepare your single NetQ On-premises Appliance for installation of the NetQ Platform software.

    Inside the box that was shipped to you, you’ll find:

    For more detail about hardware specifications (including LED layouts and FRUs like the power supply or fans, and accessories like included cables) or safety and environmental information, refer to the user manual and quick reference guide.

    Install the Appliance

    After you unbox the appliance:
    1. Mount the appliance in the rack.
    2. Connect it to power following the procedures described in your appliance's user manual.
    3. Connect the Ethernet cable to the 1G management port (eno1).
    4. Power on the appliance.

    If your network runs DHCP, you can configure NetQ over the network. If DHCP is not enabled, configure the appliance using the provided console cable.

    Configure the Password, Hostname and IP Address

    Change the password and specify the hostname and IP address for the appliance before installing the NetQ software.

    1. Log in to the appliance using the default login credentials:

      • Username: cumulus
      • Password: CumulusLinux!
    2. Change the password using the passwd command:

      cumulus@hostname:~$ passwd
      Changing password for cumulus.
      (current) UNIX password: CumulusLinux!
      Enter new UNIX password:
      Retype new UNIX password:
      passwd: password updated successfully
      
    3. The default hostname for the NetQ On-premises Appliance is netq-appliance. Change the hostname to fit your naming conventions while meeting Internet and Kubernetes naming standards.

      Kubernetes requires that hostnames are composed of a sequence of labels concatenated with dots. For example, “en.wikipedia.org” is a hostname. Each label must be from 1 to 63 characters long. The entire hostname, including the delimiting dots, has a maximum of 253 ASCII characters.

      The Internet standards (RFCs) for protocols specify that labels may contain only the ASCII letters a through z (in lower case), the digits 0 through 9, and the hyphen-minus character ('-').

      Use the following command:

      cumulus@hostname:~$ sudo hostnamectl set-hostname NEW_HOSTNAME
      
    4. Identify the IP address.

      The appliance contains two Ethernet ports. Port eno1 is dedicated to out-of-band management; this is where NetQ Agents send the telemetry data collected from your monitored switches and hosts. By default, eno1 uses DHCPv4 to obtain its IP address. You can view the assigned IP address using the following command:

      cumulus@hostname:~$ ip -4 -brief addr show eno1
      eno1             UP             10.20.16.248/24
      

      Alternatively, you can configure the interface with a static IP address by editing the /etc/netplan/01-ethernet.yaml Ubuntu Netplan configuration file.

      For example, to set network interface eno1 to the static IP address 192.168.1.222 with gateway 192.168.1.1 and DNS servers 8.8.8.8 and 8.8.4.4:

      # This file describes the network interfaces available on your system
      # For more information, see netplan(5).
      network:
          version: 2
          renderer: networkd
          ethernets:
              eno1:
                  dhcp4: no
                  addresses: [192.168.1.222/24]
                  gateway4: 192.168.1.1
                  nameservers:
                      addresses: [8.8.8.8,8.8.4.4]
      

      Apply the settings.

      cumulus@hostname:~$ sudo netplan apply
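
      Netplan can also stage the change with an automatic rollback if you lose connectivity, which is useful when you are reconfiguring the interface you are logged in through; it reverts the change unless you confirm it within the timeout:

      cumulus@hostname:~$ sudo netplan try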
      

    Verify NetQ Software and Appliance Readiness

    Now that the appliance is up and running, verify that the software is available and the appliance is ready for installation.

    1. Verify that the needed packages are present and of the correct release, version 3.0.0 and update 27 or later.

      cumulus@hostname:~$ dpkg -l | grep netq
      ii  netq-agent   3.0.0-ub18.04u27~1588242914.9fb5b87_amd64   Cumulus NetQ Telemetry Agent for Ubuntu
      ii  netq-apps    3.0.0-ub18.04u27~1588242914.9fb5b87_amd64   Cumulus NetQ Fabric Validation Application for Ubuntu
    2. Verify the installation images are present and of the correct release, version 3.0.0.

      cumulus@hostname:~$ cd /mnt/installables/
      cumulus@hostname:/mnt/installables$ ls
      NetQ-3.0.0.tgz  netq-bootstrap-3.0.0.tgz
    3. Verify the appliance is ready for installation. Fix any errors indicated before installing the NetQ software.

      cumulus@hostname:~$ sudo opta-check
    4. Run the Bootstrap CLI. Be sure to replace the eno1 interface used in this example with the interface or IP address on the appliance used to listen for NetQ Agents.

      cumulus@:~$ netq bootstrap master interface eno1 tarball /mnt/installables/netq-bootstrap-3.0.0.tgz

      Allow about five to ten minutes for this to complete, and only then continue to the next step.

      If this step fails for any reason, you can run netq bootstrap reset [purge-db|keep-db] and then try again.

      If you have changed the IP address or hostname of the NetQ On-premises Appliance after this step, you need to re-register this address with NetQ as follows:

      Reset the appliance, indicating whether you want to purge any NetQ DB data or keep it.

      cumulus@hostname:~$ netq bootstrap reset [purge-db|keep-db]

      Re-run the Bootstrap CLI on the appliance. This example uses interface eno1. Replace this with your updated IP address, hostname or interface using the interface or ip-addr option.

      cumulus@:~$ netq bootstrap master interface eno1 tarball /mnt/installables/netq-bootstrap-3.0.0.tgz

    The final step is to install and activate the Cumulus NetQ software. You can do this using the Admin UI or the NetQ CLI.

    Click the installation and activation method you want to use to complete installation:

    Install the NetQ Cloud Appliance

    This topic describes how to prepare your single NetQ Cloud Appliance for installation of the NetQ Collector software.

    Inside the box that was shipped to you, you’ll find:

    If you’re looking for hardware specifications (including LED layouts and FRUs like the power supply or fans and accessories like included cables) or safety and environmental information, check out the appliance’s user manual.

    Install the Appliance

    After you unbox the appliance:
    1. Mount the appliance in the rack.
    2. Connect it to power following the procedures described in your appliance's user manual.
    3. Connect the Ethernet cable to the 1G management port (eno1).
    4. Power on the appliance.

    If your network runs DHCP, you can configure NetQ over the network. If DHCP is not enabled, configure the appliance using the provided console cable.

    Configure the Password, Hostname and IP Address

    1. Log in to the appliance using the default login credentials:

      • Username: cumulus
      • Password: CumulusLinux!
    2. Change the password using the passwd command:

      cumulus@hostname:~$ passwd
      Changing password for cumulus.
      (current) UNIX password: CumulusLinux!
      Enter new UNIX password:
      Retype new UNIX password:
      passwd: password updated successfully
      
    3. The default hostname for the NetQ Cloud Appliance is netq-appliance. Change the hostname to fit your naming conventions while meeting Internet and Kubernetes naming standards.

      Kubernetes requires that hostnames are composed of a sequence of labels concatenated with dots. For example, “en.wikipedia.org” is a hostname. Each label must be from 1 to 63 characters long. The entire hostname, including the delimiting dots, has a maximum of 253 ASCII characters.

      The Internet standards (RFCs) for protocols specify that labels may contain only the ASCII letters a through z (in lower case), the digits 0 through 9, and the hyphen-minus character ('-').

      Use the following command:

      cumulus@hostname:~$ sudo hostnamectl set-hostname NEW_HOSTNAME
      
    4. Identify the IP address.

      The appliance contains two Ethernet ports. Port eno1 is dedicated to out-of-band management; this is where NetQ Agents send the telemetry data collected from your monitored switches and hosts. By default, eno1 uses DHCPv4 to obtain its IP address. You can view the assigned IP address using the following command:

      cumulus@hostname:~$ ip -4 -brief addr show eno1
      eno1             UP             10.20.16.248/24
      

      Alternatively, you can configure the interface with a static IP address by editing the /etc/netplan/01-ethernet.yaml Ubuntu Netplan configuration file.

      For example, to set network interface eno1 to the static IP address 192.168.1.222 with gateway 192.168.1.1 and DNS servers 8.8.8.8 and 8.8.4.4:

      # This file describes the network interfaces available on your system
      # For more information, see netplan(5).
      network:
          version: 2
          renderer: networkd
          ethernets:
              eno1:
                  dhcp4: no
                  addresses: [192.168.1.222/24]
                  gateway4: 192.168.1.1
                  nameservers:
                      addresses: [8.8.8.8,8.8.4.4]
      

      Apply the settings.

      cumulus@hostname:~$ sudo netplan apply
      

    Verify NetQ Software and Appliance Readiness

    Now that the appliance is up and running, verify that the software is available and the appliance is ready for installation.

    1. Verify that the needed packages are present and of the correct release, version 3.0.0 and update 27 or later.

      cumulus@hostname:~$ dpkg -l | grep netq
      ii  netq-agent   3.0.0-ub18.04u27~1588242914.9fb5b87_amd64   Cumulus NetQ Telemetry Agent for Ubuntu
      ii  netq-apps    3.0.0-ub18.04u27~1588242914.9fb5b87_amd64   Cumulus NetQ Fabric Validation Application for Ubuntu
    2. Verify the installation images are present and of the correct release, version 3.0.0.

      cumulus@hostname:~$ cd /mnt/installables/
      cumulus@hostname:/mnt/installables$ ls
      NetQ-3.0.0-opta.tgz  netq-bootstrap-3.0.0.tgz
    3. Verify the appliance is ready for installation. Fix any errors indicated before installing the NetQ software.

      cumulus@hostname:~$ sudo opta-check-cloud
    4. Run the Bootstrap CLI. Be sure to replace the eno1 interface used in this example with the interface or IP address on the appliance used to listen for NetQ Agents.

      cumulus@:~$ netq bootstrap master interface eno1 tarball /mnt/installables/netq-bootstrap-3.0.0.tgz

      Allow about five to ten minutes for this to complete, and only then continue to the next step.

      If this step fails for any reason, you can run netq bootstrap reset and then try again.

      If you have changed the IP address or hostname of the NetQ Cloud Appliance after this step, you need to re-register this address with NetQ as follows:

      Reset the appliance.

      cumulus@hostname:~$ netq bootstrap reset

      Re-run the Bootstrap CLI on the appliance. This example uses interface eno1. Replace this with your updated IP address, hostname or interface using the interface or ip-addr option.

      cumulus@hostname:~$ netq bootstrap master interface eno1 tarball /mnt/installables/netq-bootstrap-3.0.0.tgz
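
      If you prefer to reference the appliance by address rather than interface, a sketch using the ip-addr option mentioned above (hypothetical address):

      cumulus@hostname:~$ netq bootstrap master ip-addr 10.20.16.248 tarball /mnt/installables/netq-bootstrap-3.0.0.tgz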

    The final step is to install and activate the Cumulus NetQ software. You can do this using the Admin UI or the NetQ CLI.

    Refer to Install NetQ Using the Admin UI or Install NetQ Using the CLI to complete the installation.

    Install a NetQ On-premises Appliance Cluster

    This topic describes how to prepare your cluster of NetQ On-premises Appliances for installation of the NetQ Platform software.

    Inside each box that was shipped to you, you’ll find:

    For more detail about hardware specifications (including LED layouts and FRUs like the power supply or fans, and accessories like included cables) or safety and environmental information, refer to the user manual and quick reference guide.

    Install Each Appliance

    After you unbox the appliance:
    1. Mount the appliance in the rack.
    2. Connect it to power following the procedures described in your appliance's user manual.
    3. Connect the Ethernet cable to the 1G management port (eno1).
    4. Power on the appliance.

    If your network runs DHCP, you can configure NetQ over the network. If DHCP is not enabled, configure the appliance using the console cable provided.

    Configure the Password, Hostname and IP Address

    Change the password and specify the hostname and IP address for each appliance before installing the NetQ software.

    1. Log in to the appliance that will be your master node using the default login credentials:

      • Username: cumulus
      • Password: CumulusLinux!
    2. Change the password using the passwd command:

      cumulus@hostname:~$ passwd
      Changing password for <user>.
      (current) UNIX password:
      Enter new UNIX password:
      Retype new UNIX password:
      passwd: password updated successfully
      
    3. The default hostname for the NetQ On-premises Appliance is netq-appliance. Change the hostname to fit your naming conventions while meeting Internet and Kubernetes naming standards.

      Kubernetes requires that hostnames are composed of a sequence of labels concatenated with dots. For example, “en.wikipedia.org” is a hostname. Each label must be from 1 to 63 characters long. The entire hostname, including the delimiting dots, has a maximum of 253 ASCII characters.

      The Internet standards (RFCs) for protocols specify that labels may contain only the ASCII letters a through z (in lower case), the digits 0 through 9, and the hyphen-minus character ('-').

      Use the following command:

      cumulus@hostname:~$ sudo hostnamectl set-hostname NEW_HOSTNAME
      
    4. Identify the IP address.

      The appliance contains two Ethernet ports. Port eno1 is dedicated to out-of-band management. This is where NetQ Agents send the telemetry data collected from your monitored switches and hosts. By default, eno1 uses DHCPv4 to get its IP address. You can view the assigned IP address using the following command:

      cumulus@hostname:~$ ip -4 -brief addr show eno1
      eno1             UP             10.20.16.248/24
      

      Alternatively, you can configure the interface with a static IP address by editing the /etc/netplan/01-ethernet.yaml Ubuntu Netplan configuration file.

      For example, to set your network interface eno1 to a static IP address of 192.168.1.222 with gateway 192.168.1.1 and DNS servers 8.8.8.8 and 8.8.4.4:

      # This file describes the network interfaces available on your system
      # For more information, see netplan(5).
      network:
          version: 2
          renderer: networkd
          ethernets:
              eno1:
                  dhcp4: no
                  addresses: [192.168.1.222/24]
                  gateway4: 192.168.1.1
                  nameservers:
                      addresses: [8.8.8.8,8.8.4.4]
      

      Apply the settings.

      cumulus@hostname:~$ sudo netplan apply
      
    5. Repeat these steps for each of the worker node appliances.

    Verify NetQ Software and Appliance Readiness

    Now that the appliances are up and running, verify that the software is available and the appliance is ready for installation.

    1. On the master node, verify that the needed packages are present and of the correct release, version 3.0.0 and update 27 or later.

      cumulus@hostname:~$ dpkg -l | grep netq
      ii  netq-agent   3.0.0-ub18.04u27~1588242914.9fb5b87_amd64   Cumulus NetQ Telemetry Agent for Ubuntu
      ii  netq-apps    3.0.0-ub18.04u27~1588242914.9fb5b87_amd64   Cumulus NetQ Fabric Validation Application for Ubuntu
    2. Verify the installation images are present and of the correct release, version 3.0.0.

      cumulus@hostname:~$ cd /mnt/installables/
      cumulus@hostname:/mnt/installables$ ls
      NetQ-3.0.0.tgz  netq-bootstrap-3.0.0.tgz
    3. Verify the master node is ready for installation. Fix any errors indicated before installing the NetQ software.

      cumulus@hostname:~$ sudo opta-check
    4. Run the Bootstrap CLI. Be sure to replace the eno1 interface used in this example with the interface or IP address on the appliance used to listen for NetQ Agents.

      cumulus@hostname:~$ netq bootstrap master interface eno1 tarball /mnt/installables/netq-bootstrap-3.0.0.tgz

      Allow about five to ten minutes for this to complete, and only then continue to the next step.

      If this step fails for any reason, you can run netq bootstrap reset [purge-db|keep-db] and then try again.

      If you have changed the IP address or hostname of the NetQ On-premises Appliance after this step, you need to re-register this address with NetQ as follows:

      Reset the appliance, indicating whether you want to purge any NetQ DB data or keep it.

      cumulus@hostname:~$ netq bootstrap reset [purge-db|keep-db]

      Re-run the Bootstrap CLI on the appliance. This example uses interface eno1. Replace this with your updated IP address, hostname or interface using the interface or ip-addr option.

      cumulus@hostname:~$ netq bootstrap master interface eno1 tarball /mnt/installables/netq-bootstrap-3.0.0.tgz
    5. On one of your worker nodes, verify that the needed packages are present and of the correct release, version 3.0.0 and update 27 or later.

      cumulus@hostname:~$ dpkg -l | grep netq
      ii  netq-agent   3.0.0-ub18.04u27~1588242914.9fb5b87_amd64   Cumulus NetQ Telemetry Agent for Ubuntu
      ii  netq-apps    3.0.0-ub18.04u27~1588242914.9fb5b87_amd64   Cumulus NetQ Fabric Validation Application for Ubuntu
    6. Configure the IP address, hostname, and password using the same steps as for the master node. Refer to Configure the Password, Hostname and IP Address.

      Make a note of the private IP addresses you assign to the master and worker nodes. They are needed for the later installation steps.

    7. Verify that the needed packages are present and of the correct release, version 3.0.0 and update 27 or later.

      cumulus@hostname:~$ dpkg -l | grep netq
      ii  netq-agent   3.0.0-ub18.04u27~1588242914.9fb5b87_amd64   Cumulus NetQ Telemetry Agent for Ubuntu
      ii  netq-apps    3.0.0-ub18.04u27~1588242914.9fb5b87_amd64   Cumulus NetQ Fabric Validation Application for Ubuntu
    8. Verify that the needed files are present and of the correct release.

      cumulus@hostname:~$ cd /mnt/installables/
      cumulus@hostname:/mnt/installables$ ls
      NetQ-3.0.0.tgz  netq-bootstrap-3.0.0.tgz
    9. Verify the appliance is ready for installation. Fix any errors indicated before installing the NetQ software.

      cumulus@hostname:~$ sudo opta-check
    10. Run the Bootstrap CLI on the worker node.

      cumulus@hostname:~$ netq bootstrap worker tarball /mnt/installables/netq-bootstrap-3.0.0.tgz master-ip <master-ip> [password <text-password>]
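
      For example, with a hypothetical master node address (the password argument is optional):

      cumulus@hostname:~$ netq bootstrap worker tarball /mnt/installables/netq-bootstrap-3.0.0.tgz master-ip 192.168.1.10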

      Allow about five to ten minutes for this to complete, and only then continue to the next step.

    11. Repeat Steps 5-10 for each additional worker node (NetQ On-premises Appliance).

    The final step is to install and activate the Cumulus NetQ software on each appliance in your cluster. You can do this using the Admin UI or the NetQ CLI.

    Refer to Install NetQ Using the Admin UI or Install NetQ Using the CLI to complete the installation.

    Install a NetQ Cloud Appliance Cluster

    This topic describes how to prepare your cluster of NetQ Cloud Appliances for installation of the NetQ Collector software.

    Inside each box that was shipped to you, you’ll find:

    For more detail about hardware specifications (including LED layouts and FRUs like the power supply or fans, and accessories like included cables) or safety and environmental information, refer to the user manual.

    Install Each Appliance

    After you unbox the appliance:
    1. Mount the appliance in the rack.
    2. Connect it to power following the procedures described in your appliance's user manual.
    3. Connect the Ethernet cable to the 1G management port (eno1).
    4. Power on the appliance.

    If your network runs DHCP, you can configure NetQ over the network. If DHCP is not enabled, configure the appliance using the console cable provided.

    Configure the Password, Hostname and IP Address

    Change the password and specify the hostname and IP address for each appliance before installing the NetQ software.

    1. Log in to the appliance that will be your master node using the default login credentials:

      • Username: cumulus
      • Password: CumulusLinux!
    2. Change the password using the passwd command:

      cumulus@hostname:~$ passwd
      Changing password for <user>.
      (current) UNIX password:
      Enter new UNIX password:
      Retype new UNIX password:
      passwd: password updated successfully
      
    3. The default hostname for the NetQ Cloud Appliance is netq-appliance. Change the hostname to fit your naming conventions while meeting Internet and Kubernetes naming standards.

      Kubernetes requires that hostnames are composed of a sequence of labels concatenated with dots. For example, “en.wikipedia.org” is a hostname. Each label must be from 1 to 63 characters long. The entire hostname, including the delimiting dots, has a maximum of 253 ASCII characters.

      The Internet standards (RFCs) for protocols specify that labels may contain only the ASCII letters a through z (in lower case), the digits 0 through 9, and the hyphen-minus character ('-').

      Use the following command:

      cumulus@hostname:~$ sudo hostnamectl set-hostname NEW_HOSTNAME
      
    4. Identify the IP address.

      The appliance contains two Ethernet ports. Port eno1 is dedicated to out-of-band management. This is where NetQ Agents send the telemetry data collected from your monitored switches and hosts. By default, eno1 uses DHCPv4 to get its IP address. You can view the assigned IP address using the following command:

      cumulus@hostname:~$ ip -4 -brief addr show eno1
      eno1             UP             10.20.16.248/24
      

      Alternatively, you can configure the interface with a static IP address by editing the /etc/netplan/01-ethernet.yaml Ubuntu Netplan configuration file.

      For example, to set your network interface eno1 to a static IP address of 192.168.1.222 with gateway 192.168.1.1 and DNS servers 8.8.8.8 and 8.8.4.4:

      # This file describes the network interfaces available on your system
      # For more information, see netplan(5).
      network:
          version: 2
          renderer: networkd
          ethernets:
              eno1:
                  dhcp4: no
                  addresses: [192.168.1.222/24]
                  gateway4: 192.168.1.1
                  nameservers:
                      addresses: [8.8.8.8,8.8.4.4]
      

      Apply the settings.

      cumulus@hostname:~$ sudo netplan apply
      
    5. Repeat these steps for each of the worker node appliances.

    Verify NetQ Software and Appliance Readiness

    Now that the appliances are up and running, verify that the software is available and each appliance is ready for installation.

    1. On the master NetQ Cloud Appliance, verify that the needed packages are present and of the correct release, version 3.0.0 and update 27 or later.

      cumulus@hostname:~$ dpkg -l | grep netq
      ii  netq-agent   3.0.0-ub18.04u27~1588242914.9fb5b87_amd64   Cumulus NetQ Telemetry Agent for Ubuntu
      ii  netq-apps    3.0.0-ub18.04u27~1588242914.9fb5b87_amd64   Cumulus NetQ Fabric Validation Application for Ubuntu
    2. Verify the installation images are present and of the correct release, version 3.0.0.

      cumulus@hostname:~$ cd /mnt/installables/
      cumulus@hostname:/mnt/installables$ ls
      NetQ-3.0.0-opta.tgz  netq-bootstrap-3.0.0.tgz
    3. Verify the master NetQ Cloud Appliance is ready for installation. Fix any errors indicated before installing the NetQ software.

      cumulus@hostname:~$ sudo opta-check-cloud
    4. Run the Bootstrap CLI. Be sure to replace the eno1 interface used in this example with the interface or IP address on the appliance used to listen for NetQ Agents.

      cumulus@hostname:~$ netq bootstrap master interface eno1 tarball /mnt/installables/netq-bootstrap-3.0.0.tgz

      Allow about five to ten minutes for this to complete, and only then continue to the next step.

      If this step fails for any reason, you can run netq bootstrap reset and then try again.

      If you have changed the IP address or hostname of the NetQ Cloud Appliance after this step, you need to re-register this address with NetQ as follows:

      Reset the appliance.

      cumulus@hostname:~$ netq bootstrap reset

      Re-run the Bootstrap CLI on the appliance. This example uses interface eno1. Replace this with your updated IP address, hostname or interface using the interface or ip-addr option.

      cumulus@hostname:~$ netq bootstrap master interface eno1 tarball /mnt/installables/netq-bootstrap-3.0.0.tgz
    5. On one of your worker NetQ Cloud Appliances, verify that the needed packages are present and of the correct release, version 3.0.0 and update 27 or later.

      cumulus@hostname:~$ dpkg -l | grep netq
      ii  netq-agent   3.0.0-ub18.04u27~1588242914.9fb5b87_amd64   Cumulus NetQ Telemetry Agent for Ubuntu
      ii  netq-apps    3.0.0-ub18.04u27~1588242914.9fb5b87_amd64   Cumulus NetQ Fabric Validation Application for Ubuntu
    6. Configure the IP address, hostname, and password using the same steps as for the master node. Refer to Configure the Password, Hostname and IP Address.

      Make a note of the private IP addresses you assign to the master and worker nodes. They are needed for later installation steps.

    7. Verify that the needed packages are present and of the correct release, version 3.0.0 and update 27 or later.

      cumulus@hostname:~$ dpkg -l | grep netq
      ii  netq-agent   3.0.0-ub18.04u27~1588242914.9fb5b87_amd64   Cumulus NetQ Telemetry Agent for Ubuntu
      ii  netq-apps    3.0.0-ub18.04u27~1588242914.9fb5b87_amd64   Cumulus NetQ Fabric Validation Application for Ubuntu
    8. Verify that the needed files are present and of the correct release.

      cumulus@hostname:~$ cd /mnt/installables/
      cumulus@hostname:/mnt/installables$ ls
      NetQ-3.0.0-opta.tgz  netq-bootstrap-3.0.0.tgz
    9. Verify the appliance is ready for installation. Fix any errors indicated before installing the NetQ software.

      cumulus@hostname:~$ sudo opta-check-cloud
    10. Run the Bootstrap CLI on the worker node.

      cumulus@hostname:~$ netq bootstrap worker tarball /mnt/installables/netq-bootstrap-3.0.0.tgz master-ip <master-ip> [password <text-password>]

      Allow about five to ten minutes for this to complete, and only then continue to the next step.

      If this step fails for any reason, you can run netq bootstrap reset on the new worker node and then try again.

    11. Repeat Steps 5-10 for each additional worker NetQ Cloud Appliance.

    The final step is to install and activate the Cumulus NetQ software on each appliance in your cluster. You can do this using the Admin UI or the CLI.

    Refer to Install NetQ Using the Admin UI or Install NetQ Using the CLI to complete the installation.

    Prepare Your Existing NetQ Appliances for a NetQ 3.0 Deployment

    This topic describes how to prepare an existing NetQ Appliance running NetQ 2.4.x or earlier before installing NetQ 3.0.x. The steps are the same for both the on-premises and cloud appliances; the only difference is the software you download for each platform. On completion of the steps included here, you will be ready to perform a fresh installation of NetQ 3.0.x.

    To prepare your appliance:

    1. Verify that your appliance is a supported hardware model.

    2. For on-premises solutions using the NetQ On-premises Appliance, optionally back up your NetQ data.

      1. Run the backup script to create a backup file in /opt/<backup-directory>.

        Be sure to replace the backup-directory option with the name of the directory you want to use for the backup file. This location must be off the appliance so the file is not overwritten during these preparation steps.

        cumulus@<hostname>:~$ ./backuprestore.sh --backup --localdir /opt/<backup-directory>
        
      2. Verify the backup file has been created.

        cumulus@<hostname>:~$ cd /opt/<backup-directory>
        cumulus@<hostname>:/opt/<backup-directory>$ ls
        netq_master_snapshot_2020-01-09_07_24_50_UTC.tar.gz
        
    3. Install Ubuntu 18.04 LTS.

      Follow the instructions here to install Ubuntu.

      Note these tips when installing:

      • Ignore the instructions for MAAS.

      • Install the Ubuntu OS on the SSD disk. Select the Micron SSD (~900 GB) at step 9 of those instructions.

      • Set the default username to cumulus and password to CumulusLinux!.

      • When prompted, select Install SSH server.

    4. Configure networking.

      Ubuntu uses Netplan for network configuration. You can give your appliance an IP address using DHCP or a static address.

      • To use DHCP, create and/or edit the /etc/netplan/01-ethernet.yaml Netplan configuration file:

        # This file describes the network interfaces available on your system
        # For more information, see netplan(5).
        network:
            version: 2
            renderer: networkd
            ethernets:
                eno1:
                    dhcp4: yes
        
      • Apply the settings.

        $ sudo netplan apply
        
      • To use a static address, create and/or edit the /etc/netplan/01-ethernet.yaml Netplan configuration file.

        In this example the interface, eno1, is given a static IP address of 192.168.1.222 with a gateway at 192.168.1.1 and DNS servers at 8.8.8.8 and 8.8.4.4.

        # This file describes the network interfaces available on your system
        # For more information, see netplan(5).
        network:
            version: 2
            renderer: networkd
            ethernets:
                eno1:
                    dhcp4: no
                    addresses: [192.168.1.222/24]
                    gateway4: 192.168.1.1
                    nameservers:
                        addresses: [8.8.8.8,8.8.4.4]
        
      • Apply the settings.

        $ sudo netplan apply
        
    5. Update the Ubuntu repository.

      1. Add the Cumulus Networks repository signing key to the local apt keyring.

        root@ubuntu:~# wget -O- https://apps3.cumulusnetworks.com/setup/cumulus-apps-deb.pubkey | apt-key add -
        
      2. Add the Ubuntu 18.04 repository.

        Create the file /etc/apt/sources.list.d/cumulus-apps-deb-bionic.list and add the following line:

        root@ubuntu:~# vi /etc/apt/sources.list.d/cumulus-apps-deb-bionic.list
        ...
        deb [arch=amd64] https://apps3.cumulusnetworks.com/repos/deb bionic netq-latest
        ...
        

        The use of netq-latest in this example means that a pull from the repository always retrieves the latest version of NetQ, even when a major version update has been made. If you want to pin the repository to a specific version, such as netq-2.2, use that instead.
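
        For example, an illustrative repository line pinned to a specific release would look like this:

        deb [arch=amd64] https://apps3.cumulusnetworks.com/repos/deb bionic netq-2.2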

    6. Install Python.

      Run the following commands:

      root@ubuntu:~# apt-get update
      root@ubuntu:~# apt-get install python python2.7 python-apt python3-lib2to3 python3-distutils
      
    7. Obtain the latest NetQ Agent and CLI package.

      Run the following commands:

      root@ubuntu:~# apt-get update
      root@ubuntu:~# apt-get install netq-agent netq-apps
      
    8. Download the bootstrap and NetQ installation tarballs.

      Download the software from the MyMellanox downloads page.

      1. Select NetQ from the Product list.

      2. Select 3.0 from the Version list, and then select 3.0.0 from the submenu.

      3. Select Bootstrap from the Hypervisor/Platform list. Note that the bootstrap file is the same for both appliances.

      4. Scroll down and click Download.

      5. Select Appliance for the NetQ On-premises Appliance or Appliance (Cloud) for the NetQ Cloud Appliance from the Hypervisor/Platform list.

        Make sure you select the right install choice based on whether you are preparing the on-premises or cloud version of the appliance.

      6. Scroll down and click Download.

      7. Copy these two files, netq-bootstrap-3.0.0.tgz and either NetQ-3.0.0.tgz (on-premises) or NetQ-3.0.0-opta.tgz (cloud), to the /mnt/installables/ directory on the appliance.

      8. Verify that the needed files are present and of the correct release. This example shows on-premises files. The only difference for cloud files is that it should list NetQ-3.0.0-opta.tgz instead of NetQ-3.0.0.tgz.

        cumulus@<hostname>:~$ dpkg -l | grep netq
        ii  netq-agent   3.0.0-ub18.04u27~1588242914.9fb5b87_amd64   Cumulus NetQ Telemetry Agent for Ubuntu
        ii  netq-apps    3.0.0-ub18.04u27~1588242914.9fb5b87_amd64   Cumulus NetQ Fabric Validation Application for Ubuntu
        
        cumulus@<hostname>:~$ cd /mnt/installables/
        cumulus@<hostname>:/mnt/installables$ ls
        NetQ-3.0.0.tgz  netq-bootstrap-3.0.0.tgz
        
      9. Disable automatic apt updates and MOTD news by running the following commands.

        sudo systemctl disable apt-{daily,daily-upgrade}.{service,timer}
        sudo systemctl stop apt-{daily,daily-upgrade}.{service,timer}
        sudo systemctl disable motd-news.{service,timer}
        sudo systemctl stop motd-news.{service,timer}
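
        To confirm the units are now disabled, you can optionally run:

        sudo systemctl is-enabled apt-{daily,daily-upgrade}.timer motd-news.timer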
        
    9. Run the Bootstrap CLI.

      Run the bootstrap CLI on your appliance. Be sure to replace the eth0 interface used in this example with the interface or IP address on the appliance used to listen for NetQ Agents.
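
      A minimal invocation, following the same pattern as the master bootstrap commands shown earlier in this guide:

      cumulus@<hostname>:~$ netq bootstrap master interface eth0 tarball /mnt/installables/netq-bootstrap-3.0.0.tgz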

    If you are creating a server cluster, you need to prepare each of those appliances as well. Repeat these steps if you are using a previously deployed appliance or refer to Install NetQ System Platform for a new appliance.

    You are now ready to install the NetQ Software. Refer to Install NetQ Using the Admin UI (recommended) or Install NetQ Using the CLI.

    Install NetQ Using the Admin UI

    You can now install the NetQ software using the Admin UI.

    This is the final set of steps for installing NetQ. If you have not already performed the installation preparation steps, go to Install NetQ System Platform before continuing here.

    To install NetQ:

    1. Log in to your NetQ On-premises Appliance, NetQ Cloud Appliance, the master node of your cluster, or VM.

      In your browser address field, enter https://<hostname-or-ipaddr>:8443.

      This opens the Admin UI.

    2. Step through the UI.

      Having made your installation choices during the preparation steps, you can quickly select the correct path through the UI.

      1. Select your deployment type.

        Choose which type of deployment model you want to use. Both options provide secure access to data and features useful for monitoring and troubleshooting your network.

      2. Select your install method.

        Choose between restoring data from a previous version of NetQ or performing a fresh installation.

        Fresh Install: Continue with Step 3.

        Maintain Existing Data (on-premises only): If you have created a backup of your NetQ data, choose this option.

        If you are moving from a standalone to a server cluster arrangement, you can only restore your data one time. After the data has been converted to the cluster schema, it cannot be returned to the single server format.

      3. Select your server arrangement.

        Select whether you want to deploy your infrastructure as a single stand-alone server or as a cluster of servers. One master and two worker nodes are supported for the cluster deployment.

        Select arrangement

        If you select a server cluster, use the private IP addresses that you used when setting up the worker nodes to add those nodes.

        Add worker nodes to a server cluster

      4. Install NetQ software.

        You install the NetQ software using the installation files (NetQ-3.0.0.tgz for on-premises deployments or NetQ-3.0.0-opta.tgz for cloud deployments) that you downloaded and stored previously.

        Enter the appropriate filename in the field provided.

      5. Activate NetQ.

        This final step activates the software and enables you to view the health of your NetQ system. For cloud deployments, you must enter your configuration key.

        On-premises activation

        Cloud activation

      6. View the system health.

        When the installation and activation is complete, the NetQ System Health dashboard is visible for tracking the status of key components in the system. Single server deployments display two cards, one for the server, and one for Kubernetes pods. Server cluster deployments display additional cards, including one each for the Cassandra database, Kafka, and Zookeeper services.

        On-premises deployment

    Install NetQ Using the CLI

    You can now install the NetQ software using the NetQ CLI.

    This is the final set of steps for installing NetQ. If you have not already performed the installation preparation steps, go to Install NetQ System Platform before continuing here.

    To install NetQ:

    1. Log in to your NetQ platform server, NetQ Appliance, NetQ Cloud Appliance or the master node of your cluster.

    2. Install the software.

      For a standalone on-premises deployment, run the following command on your NetQ platform server or NetQ Appliance:

      cumulus@hostname:~$ netq install standalone full interface eth0 bundle /mnt/installables/NetQ-3.0.0.tgz
      

      You can specify the IP address instead of the interface name here: use ip-addr <IP address> in place of interface <ifname> above.

      Run the netq show opta-health command to verify all applications are operating properly. Please allow 10-15 minutes for all applications to come up and report their status.

      cumulus@hostname:~$ netq show opta-health
      Application                                            Status    Namespace      Restarts    Timestamp
      -----------------------------------------------------  --------  -------------  ----------  ------------------------
      cassandra-rc-0-w7h4z                                   READY     default        0           Fri Apr 10 16:08:38 2020
      cp-schema-registry-deploy-6bf5cbc8cc-vwcsx             READY     default        0           Fri Apr 10 16:08:38 2020
      kafka-broker-rc-0-p9r2l                                READY     default        0           Fri Apr 10 16:08:38 2020
      kafka-connect-deploy-7799bcb7b4-xdm5l                  READY     default        0           Fri Apr 10 16:08:38 2020
      netq-api-gateway-deploy-55996ff7c8-w4hrs               READY     default        0           Fri Apr 10 16:08:38 2020
      netq-app-address-deploy-66776ccc67-phpqk               READY     default        0           Fri Apr 10 16:08:38 2020
      netq-app-admin-oob-mgmt-server                         READY     default        0           Fri Apr 10 16:08:38 2020
      netq-app-bgp-deploy-7dd4c9d45b-j9bfr                   READY     default        0           Fri Apr 10 16:08:38 2020
      netq-app-clagsession-deploy-69564895b4-qhcpr           READY     default        0           Fri Apr 10 16:08:38 2020
      netq-app-configdiff-deploy-ff54c4cc4-7rz66             READY     default        0           Fri Apr 10 16:08:38 2020
      ...
      

      If any of the applications or services display Status as DOWN after 30 minutes, open a support ticket and attach the output of the opta-support command.

      For an on-premises server cluster, run the following command on your master node, using the IP addresses of your worker nodes:

      cumulus@<hostname>:~$ netq install cluster full interface eth0 bundle /mnt/installables/NetQ-3.0.0.tgz workers <worker-1-ip> <worker-2-ip>
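
      For example, with hypothetical worker node addresses:

      cumulus@<hostname>:~$ netq install cluster full interface eth0 bundle /mnt/installables/NetQ-3.0.0.tgz workers 192.168.1.11 192.168.1.12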
      

      You can specify the IP address instead of the interface name here: use ip-addr <IP address> in place of interface eth0 above.

      Run the netq show opta-health command to verify all applications are operating properly. Please allow 10-15 minutes for all applications to come up and report their status.

      cumulus@hostname:~$ netq show opta-health
      Application                                            Status    Namespace      Restarts    Timestamp
      -----------------------------------------------------  --------  -------------  ----------  ------------------------
      cassandra-rc-0-w7h4z                                   READY     default        0           Fri Apr 10 16:08:38 2020
      cp-schema-registry-deploy-6bf5cbc8cc-vwcsx             READY     default        0           Fri Apr 10 16:08:38 2020
      kafka-broker-rc-0-p9r2l                                READY     default        0           Fri Apr 10 16:08:38 2020
      kafka-connect-deploy-7799bcb7b4-xdm5l                  READY     default        0           Fri Apr 10 16:08:38 2020
      netq-api-gateway-deploy-55996ff7c8-w4hrs               READY     default        0           Fri Apr 10 16:08:38 2020
      netq-app-address-deploy-66776ccc67-phpqk               READY     default        0           Fri Apr 10 16:08:38 2020
      netq-app-admin-oob-mgmt-server                         READY     default        0           Fri Apr 10 16:08:38 2020
      netq-app-bgp-deploy-7dd4c9d45b-j9bfr                   READY     default        0           Fri Apr 10 16:08:38 2020
      netq-app-clagsession-deploy-69564895b4-qhcpr           READY     default        0           Fri Apr 10 16:08:38 2020
      netq-app-configdiff-deploy-ff54c4cc4-7rz66             READY     default        0           Fri Apr 10 16:08:38 2020
      ...
      

      If any of the applications or services display Status as DOWN after 30 minutes, open a support ticket and attach the output of the opta-support command.

      For a standalone cloud deployment, run the following command on your NetQ Cloud Appliance with the config-key sent by Cumulus Networks in an email titled “A new site has been added to your Cumulus NetQ account.”

      cumulus@<hostname>:~$ netq install opta standalone full interface eth0 bundle /mnt/installables/NetQ-3.0.0-opta.tgz config-key <your-config-key-from-email> proxy-host <proxy-hostname> proxy-port <proxy-port>
      

      You can specify the IP address instead of the interface name here: use ip-addr <IP address> in place of interface eth0 above.

      Run the netq show opta-health command to verify all applications are operating properly.

      cumulus@hostname:~$ netq show opta-health
      OPTA is healthy
      

      For a cloud server cluster, run the following commands on your master NetQ Cloud Appliance with the config-key sent by Cumulus Networks in an email titled “A new site has been added to your Cumulus NetQ account.”

      cumulus@<hostname>:~$ netq install opta cluster full interface eth0 bundle /mnt/installables/NetQ-3.0.0-opta.tgz config-key <your-config-key-from-email> workers <worker-1-ip> <worker-2-ip> proxy-host <proxy-hostname> proxy-port <proxy-port>
      

      You can specify the IP address instead of the interface name here: use ip-addr <IP address> in place of interface eth0 above.

      Run the netq show opta-health command to verify all applications are operating properly.

      cumulus@hostname:~$ netq show opta-health
      OPTA is healthy
      

    Install NetQ Quick Start

    If you know how you would answer the key installation questions, you can go directly to the instructions for those choices using the table below.

    Do not skip the normal installation flow until you have performed this process multiple times and are fully familiar with it.

    Deployment Type | Server Arrangement | System                        | Hypervisor | Installation Instructions
    On premises     | Single server      | Cumulus NetQ Appliance        | NA         | Start Install
    On premises     | Single server      | Own Hardware plus VM          | KVM        | Start Install
    On premises     | Single server      | Own Hardware plus VM          | VMware     | Start Install
    On premises     | Server cluster     | Cumulus NetQ Appliance        | NA         | Start Install
    On premises     | Server cluster     | Own Hardware plus VM          | KVM        | Start Install
    On premises     | Server cluster     | Own Hardware plus VM          | VMware     | Start Install
    Cloud           | Single server      | Cumulus NetQ Cloud Appliance  | NA         | Start Install
    Cloud           | Single server      | Own Hardware plus VM          | KVM        | Start Install
    Cloud           | Single server      | Own Hardware plus VM          | VMware     | Start Install
    Cloud           | Server cluster     | Cumulus NetQ Cloud Appliance  | NA         | Start Install
    Cloud           | Server cluster     | Own Hardware plus VM          | KVM        | Start Install
    Cloud           | Server cluster     | Own Hardware plus VM          | VMware     | Start Install

    Install NetQ Agents

    After installing your Cumulus NetQ 3.0.0 software, you should install the corresponding NetQ 3.0.0 Agent on each switch and server you want to monitor. There are important features and fixes included in the NetQ Agent with each release.

    Use the instructions in the following sections based on the OS installed on the switch or server.

    Install and Configure the NetQ Agent on Cumulus Linux Switches

    After installing your Cumulus NetQ software, you should install the NetQ 3.0.0 Agents on each switch you want to monitor. NetQ Agents can be installed on switches running:

    • Cumulus Linux 3.3.2 through 3.7.x
    • Cumulus Linux 4.0.0 and later

    Prepare for NetQ Agent Installation on a Cumulus Linux Switch

    For switches running Cumulus Linux, you need to:

    • Verify NTP is installed and configured
    • Obtain and install the NetQ Agent software package

    If your network uses a proxy server for external connections, you should first configure a global proxy so apt-get can access the software package in the Cumulus Networks repository.

    Verify NTP is Installed and Configured

    Verify that NTP is running on the switch. The switch must be in time synchronization with the NetQ Platform or NetQ Appliance to enable useful statistical analysis.

    cumulus@switch:~$ sudo systemctl status ntp
    [sudo] password for cumulus:
    ● ntp.service - LSB: Start NTP daemon
            Loaded: loaded (/etc/init.d/ntp; bad; vendor preset: enabled)
            Active: active (running) since Fri 2018-06-01 13:49:11 EDT; 2 weeks 6 days ago
              Docs: man:systemd-sysv-generator(8)
            CGroup: /system.slice/ntp.service
                    └─2873 /usr/sbin/ntpd -p /var/run/ntpd.pid -g -c /var/lib/ntp/ntp.conf.dhcp -u 109:114
    

    If NTP is not installed, install and configure it before continuing. If NTP is installed but not running, enable and start the service (sudo systemctl enable ntp followed by sudo systemctl start ntp).

    If you are running NTP in your out-of-band management network with VRF, specify the VRF (ntp@<vrf-name> versus just ntp) in the above commands.
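
    For example, to check NTP running in a management VRF named mgmt (assuming that VRF name):

    cumulus@switch:~$ sudo systemctl status ntp@mgmt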

    Obtain NetQ Agent Software Package

    To install the NetQ Agent you need to install netq-agent on each switch or host. This is available from the Cumulus Networks repository.

    To obtain the NetQ Agent package:

    For Cumulus Linux 3.x, edit the /etc/apt/sources.list file to add the repository for Cumulus NetQ.

    Note that NetQ has a separate repository from Cumulus Linux.

    cumulus@switch:~$ sudo nano /etc/apt/sources.list
    ...
    deb http://apps3.cumulusnetworks.com/repos/deb CumulusLinux-3 netq-3.0
    ...
    

    The repository deb http://apps3.cumulusnetworks.com/repos/deb CumulusLinux-3 netq-latest can be used if you want to always retrieve the latest posted version of NetQ.

    For Cumulus Linux 4.x, add the repository:

    cumulus@switch:~$ sudo nano /etc/apt/sources.list
    ...
    deb http://apps3.cumulusnetworks.com/repos/deb CumulusLinux-4 netq-3.0
    ...
    

    The repository deb http://apps3.cumulusnetworks.com/repos/deb CumulusLinux-4 netq-latest can be used if you want to always retrieve the latest posted version of NetQ.

    Add the apps3.cumulusnetworks.com authentication key to Cumulus Linux:

    cumulus@switch:~$ wget -qO - https://apps3.cumulusnetworks.com/setup/cumulus-apps-deb.pubkey | sudo apt-key add -
    

    Install the NetQ Agent on Cumulus Linux Switch

    After completing the preparation steps, you can successfully install the agent onto your switch.

    To install the NetQ Agent:

    1. Update the local apt repository, then install the NetQ software on the switch.

      cumulus@switch:~$ sudo apt-get update
      cumulus@switch:~$ sudo apt-get install netq-agent
      
    2. Verify you have the correct version of the Agent.

      cumulus@switch:~$ dpkg-query -W -f '${Package}\t${Version}\n' netq-agent
      
      You should see version 3.0.0 and update 27 or later in the results. For example:
      • Cumulus Linux 3.3.2-3.7.x
        • netq-agent_3.0.0-cl3u27~1588048439.0e20d33_armel.deb
        • netq-agent_3.0.0-cl3u27~1588242701.9fb5b87_amd64.deb
      • Cumulus Linux 4.0.0 and later
        • netq-agent_3.0.0-cl4u27~1588048918.0e20d335_armel.deb
        • netq-agent_3.0.0-cl4u27~1588242814.9fb5b87d_amd64.deb
    3. Restart rsyslog so log files are sent to the correct destination.

      cumulus@switch:~$ sudo systemctl restart rsyslog.service
      
    4. Continue with NetQ Agent configuration in the next section.

    Configure the NetQ Agent on a Cumulus Linux Switch

    After the NetQ Agent and CLI have been installed on the switches you want to monitor, the NetQ Agents must be configured to obtain useful and relevant data.

    The NetQ Agent is aware of and communicates through the designated VRF. If you do not specify one, the default VRF (named default) is used. If you later change the VRF configured for the NetQ Agent (using a lifecycle management configuration profile, for example), you might cause the NetQ Agent to lose communication.

    Two methods are available for configuring a NetQ Agent:

    • Edit the configuration file on the switch, or
    • Use the NetQ CLI

    Configure NetQ Agents Using a Configuration File

    You can configure the NetQ Agent in the netq.yml configuration file contained in the /etc/netq/ directory.

    1. Open the netq.yml file using your text editor of choice. For example:

      cumulus@switch:~$ sudo nano /etc/netq/netq.yml
      
    2. Locate the netq-agent section, or add it.

    3. Set the parameters for the agent as follows:

      • port: 31980 (default configuration)
      • server: IP address of the NetQ Appliance or VM where the agent should send its collected data
      • vrf: default (default) or one that you specify

      Your configuration should be similar to this:

      netq-agent:
          port: 31980
          server: 127.0.0.1
          vrf: default
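
      After saving the file, restart the agent so the change takes effect:

      cumulus@switch:~$ sudo netq config restart agent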
      

    Configure NetQ Agents Using the NetQ CLI

    If the CLI is configured, you can use it to configure the NetQ Agent to send telemetry data to the NetQ Appliance or VM. To configure the NetQ CLI, refer to Install and Configure the NetQ CLI on Cumulus Linux Switches.

    If you intend to use VRF, refer to Configure the Agent to Use VRF. If you intend to specify a port for communication, refer to Configure the Agent to Communicate over a Specific Port.

    Use the following command to configure the NetQ Agent:

    netq config add agent server <text-opta-ip> [port <text-opta-port>] [vrf <text-vrf-name>]
    

    This example uses an IP address of 192.168.1.254 and the default port and VRF for the NetQ Appliance or VM.

    cumulus@switch:~$ sudo netq config add agent server 192.168.1.254
    Updated agent server 192.168.1.254 vrf default. Please restart netq-agent (netq config restart agent).
    cumulus@switch:~$ sudo netq config restart agent
    

    Configure Advanced NetQ Agent Settings on a Cumulus Linux Switch

    A couple of additional options are available for configuring the NetQ Agent. If you are using VRF, you can configure the agent to communicate over a specific VRF. You can also configure the agent to use a particular port.

    Configure the Agent to Use a VRF

    While optional, Cumulus strongly recommends that you configure NetQ Agents to communicate with the NetQ Appliance or VM only via a VRF, including a management VRF. To do so, you need to specify the VRF name when configuring the NetQ Agent. For example, if the management VRF is configured and you want the agent to communicate with the NetQ Appliance or VM over it, configure the agent like this:

    cumulus@leaf01:~$ sudo netq config add agent server 192.168.1.254 vrf mgmt
    cumulus@leaf01:~$ sudo netq config restart agent
    

    Configure the Agent to Communicate over a Specific Port

    By default, NetQ uses port 31980 for communication between the NetQ Appliance or VM and NetQ Agents. If you want the NetQ Agent to communicate with the NetQ Appliance or VM via a different port, you need to specify the port number when configuring the NetQ Agent, like this:

    cumulus@leaf01:~$ sudo netq config add agent server 192.168.1.254 port 7379
    cumulus@leaf01:~$ sudo netq config restart agent
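
    The vrf and port options can also be combined in a single command, per the syntax shown above; for example (hypothetical values):

    cumulus@leaf01:~$ sudo netq config add agent server 192.168.1.254 port 7379 vrf mgmt
    cumulus@leaf01:~$ sudo netq config restart agent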
    

    Install and Configure the NetQ Agent on Ubuntu Servers

    After installing your Cumulus NetQ software, you should install the NetQ 3.0.0 Agent on each server you want to monitor. NetQ Agents can be installed on servers running:

    • Ubuntu 16.04
    • Ubuntu 18.04

    Prepare for NetQ Agent Installation on an Ubuntu Server

    For servers running Ubuntu OS, you need to:

    • Verify the required service packages are installed and running
    • Verify the server is running lldpd
    • Install and configure a network time server, if needed
    • Obtain the NetQ Agent software package

    If your network uses a proxy server for external connections, you should first configure a global proxy so apt-get can access the agent package on the Cumulus Networks repository.

    Verify Service Package Versions

    The following packages, while not required for installation of the NetQ Agent, must be installed and running for proper operation of the NetQ Agent on an Ubuntu server:

    Verify the Server is Running lldpd

    Make sure you are running lldpd, not lldpad. lldpd is required for the installation, but Ubuntu does not include it by default.

    To install this package, run the following commands:

    root@ubuntu:~# sudo apt-get update
    root@ubuntu:~# sudo apt-get install lldpd
    root@ubuntu:~# sudo systemctl enable lldpd.service
    root@ubuntu:~# sudo systemctl start lldpd.service
    

    Install and Configure Network Time Server

    If NTP is not already installed and configured, follow these steps:

    1. Install NTP on the server, if not already installed. Servers must be in time synchronization with the NetQ Platform or NetQ Appliance to enable useful statistical analysis.

      root@ubuntu:~# sudo apt-get install ntp
      
    2. Configure the network time server.

      1. Open the /etc/ntp.conf file in your text editor of choice.

      2. Under the Server section, specify the NTP server IP address or hostname.
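
        For example, a minimal server entry (hypothetical address):

        server 192.168.0.254 iburst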

      3. Enable and start the NTP service.

        root@ubuntu:~# sudo systemctl enable ntp
        root@ubuntu:~# sudo systemctl start ntp
        

      If you are running NTP in your out-of-band management network with VRF, specify the VRF (ntp@<vrf-name> versus just ntp) in the above commands.

      4. Verify NTP is operating correctly. Look for an asterisk (*) or a plus sign (+) that indicates the clock is synchronized.

        root@ubuntu:~# ntpq -pn
        remote           refid            st t when poll reach   delay   offset  jitter
        ==============================================================================
        +173.255.206.154 132.163.96.3     2 u   86  128  377   41.354    2.834   0.602
        +12.167.151.2    198.148.79.209   3 u  103  128  377   13.395   -4.025   0.198
        2a00:7600::41    .STEP.          16 u    - 1024    0    0.000    0.000   0.000
        *129.250.35.250 249.224.99.213   2 u  101  128  377   14.588   -0.299   0.243
        
      Alternatively, you can use chrony instead of ntpd:

      1. Install chrony if needed.

        root@ubuntu:~# sudo apt install chrony
        
      2. Start the chrony service.

        root@ubuntu:~# sudo /usr/local/sbin/chronyd
        
      3. Verify it installed successfully.

        root@ubuntu:~# chronyc activity
        200 OK
        8 sources online
        0 sources offline
        0 sources doing burst (return to online)
        0 sources doing burst (return to offline)
        0 sources with unknown address
        
      4. View the time servers chrony is using.

        root@ubuntu:~# chronyc sources
        210 Number of sources = 8
        
        MS Name/IP address         Stratum Poll Reach LastRx Last sample
        ===============================================================================
        ^+ golem.canonical.com           2   6   377    39  -1135us[-1135us] +/-   98ms
        ^* clock.xmission.com            2   6   377    41  -4641ns[ +144us] +/-   41ms
        ^+ ntp.ubuntu.net              2   7   377   106   -746us[ -573us] +/-   41ms
        ...
        

        Open the chrony.conf configuration file (by default at /etc/chrony/chrony.conf) and edit it if needed.

        Example with individual servers specified:

        server golem.canonical.com iburst
        server clock.xmission.com iburst
        server ntp.ubuntu.com iburst
        driftfile /var/lib/chrony/drift
        makestep 1.0 3
        rtcsync
        

        Example when using a pool of servers:

        pool pool.ntp.org iburst
        driftfile /var/lib/chrony/drift
        makestep 1.0 3
        rtcsync
        
      5. View the server chrony is currently tracking.

        root@ubuntu:~# chronyc tracking
        Reference ID    : 5BBD59C7 (golem.canonical.com)
        Stratum         : 3
        Ref time (UTC)  : Mon Feb 10 14:35:18 2020
        System time     : 0.0000046340 seconds slow of NTP time
        Last offset     : -0.000123459 seconds
        RMS offset      : 0.007654410 seconds
        Frequency       : 8.342 ppm slow
        Residual freq   : -0.000 ppm
        Skew            : 26.846 ppm
        Root delay      : 0.031207654 seconds
        Root dispersion : 0.001234590 seconds
        Update interval : 115.2 seconds
        Leap status     : Normal
        

    Obtain NetQ Agent Software Package

    To install the NetQ Agent you need to install netq-agent on each server. This is available from the Cumulus Networks repository.

    To obtain the NetQ Agent package:

    1. Add the Cumulus Networks repository signing key to the local apt keyring.
    root@ubuntu:~# sudo wget -O- https://apps3.cumulusnetworks.com/setup/cumulus-apps-deb.pubkey | apt-key add -
    
    2. Add the Ubuntu repository:

      For Ubuntu 16.04, create the file /etc/apt/sources.list.d/cumulus-apps-deb-xenial.list and add the following line:

      root@ubuntu:~# vi /etc/apt/sources.list.d/cumulus-apps-deb-xenial.list
      ...
      deb [arch=amd64] https://apps3.cumulusnetworks.com/repos/deb xenial netq-latest
      ...
      

      For Ubuntu 18.04, create the file /etc/apt/sources.list.d/cumulus-apps-deb-bionic.list and add the following line:

      root@ubuntu:~# vi /etc/apt/sources.list.d/cumulus-apps-deb-bionic.list
      ...
      deb [arch=amd64] https://apps3.cumulusnetworks.com/repos/deb bionic netq-latest
      ...
      

      The use of netq-latest in these examples means that a pull from the repository always retrieves the latest version of NetQ, even when a major version update has been made. If you want to pin the repository to a specific version, such as netq-2.4, use that instead.

    Install NetQ Agent on an Ubuntu Server

    After completing the preparation steps, you can successfully install the agent software onto your server.

    To install the NetQ Agent:

    1. Install the software packages on the server.

      root@ubuntu:~# sudo apt-get update
      root@ubuntu:~# sudo apt-get install netq-agent
      
    2. Verify you have the correct version of the Agent.

      root@ubuntu:~# dpkg-query -W -f '${Package}\t${Version}\n' netq-agent
      
      You should see version 3.0.0 and update 27 or later in the results. For example:
      • netq-agent_3.0.0-ub18.04u27~1588242914.9fb5b87_amd64.deb
      • netq-agent_3.0.0-ub16.04u27~1588242914.9fb5b87_amd64.deb
    3. Restart rsyslog so log files are sent to the correct destination.

    root@ubuntu:~# sudo systemctl restart rsyslog.service
    
    4. Continue with NetQ Agent configuration in the next section.

    Configure the NetQ Agent on an Ubuntu Server

    After the NetQ Agent and CLI have been installed on the servers you want to monitor, the NetQ Agents must be configured to obtain useful and relevant data.

    The NetQ Agent is aware of and communicates through the designated VRF. If you do not specify one, the default VRF (named default) is used. If you later change the VRF configured for the NetQ Agent (using a lifecycle management configuration profile, for example), you might cause the NetQ Agent to lose communication.

    Two methods are available for configuring a NetQ Agent:

    • Edit the configuration file on the server, or
    • Use the NetQ CLI

    Configure the NetQ Agents Using a Configuration File

    You can configure the NetQ Agent in the netq.yml configuration file contained in the /etc/netq/ directory.

    1. Open the netq.yml file using your text editor of choice. For example:
    root@ubuntu:~# sudo nano /etc/netq/netq.yml
    
    2. Locate the netq-agent section, or add it.

    3. Set the parameters for the agent as follows:

      • port: 31980 (default) or one that you specify
      • server: IP address of the NetQ Appliance or VM where the agent should send its collected data
      • vrf: default (default) or one that you specify

    Your configuration should be similar to this:

    netq-agent:
        port: 31980
        server: 127.0.0.1
        vrf: default
    

    Configure NetQ Agents Using the NetQ CLI

    If the CLI is configured, you can use it to configure the NetQ Agent to send telemetry data to the NetQ Server or Appliance. If it is not configured, refer to Configure the NetQ CLI on an Ubuntu Server and then return here.

    If you intend to use VRF, skip to Configure the Agent to Use VRF. If you intend to specify a port for communication, skip to Configure the Agent to Communicate over a Specific Port.

    Use the following command to configure the NetQ Agent:

    netq config add agent server <text-opta-ip> [port <text-opta-port>] [vrf <text-vrf-name>]
    

    This example uses an IP address of 192.168.1.254 and the default port and VRF for the NetQ hardware.

    root@ubuntu:~# sudo netq config add agent server 192.168.1.254
    Updated agent server 192.168.1.254 vrf default. Please restart netq-agent (netq config restart agent).
    root@ubuntu:~# sudo netq config restart agent
    

    Configure Advanced NetQ Agent Settings

    A couple of additional options are available for configuring the NetQ Agent. If you are using VRF, you can configure the agent to communicate over a specific VRF. You can also configure the agent to use a particular port.

    Configure the NetQ Agent to Use a VRF

    While optional, Cumulus strongly recommends that you configure NetQ Agents to communicate with the NetQ Platform only via a VRF, including a management VRF. To do so, you need to specify the VRF name when configuring the NetQ Agent. For example, if the management VRF is configured and you want the agent to communicate with the NetQ Platform over it, configure the agent like this:

    root@ubuntu:~# sudo netq config add agent server 192.168.1.254 vrf mgmt
    root@ubuntu:~# sudo netq config restart agent
    

    Configure the NetQ Agent to Communicate over a Specific Port

    By default, NetQ uses port 31980 for communication between the NetQ Platform and NetQ Agents. If you want the NetQ Agent to communicate with the NetQ Platform via a different port, you need to specify the port number when configuring the NetQ Agent like this:

    root@ubuntu:~# sudo netq config add agent server 192.168.1.254 port 7379
    root@ubuntu:~# sudo netq config restart agent
    

    Install and Configure the NetQ Agent on RHEL and CentOS Servers

    After installing your Cumulus NetQ software, you should install the NetQ 3.0.0 Agents on each server you want to monitor. NetQ Agents can be installed on servers running:

    • Red Hat Enterprise Linux 7
    • CentOS 7

    Prepare for NetQ Agent Installation on a RHEL or CentOS Server

    For servers running RHEL or CentOS, you need to:

    • Verify the required service packages are installed and running
    • Verify the server is running lldpd and has wget installed
    • Install and configure NTP, if needed
    • Obtain the NetQ Agent software package

    If your network uses a proxy server for external connections, you should first configure a global proxy so yum can access the software package in the Cumulus Networks repository.

    Verify Service Package Versions

    The following packages, while not required for installation of the NetQ Agent, must be installed and running for proper operation of the NetQ Agent on a Red Hat or CentOS server:

    Verify the Server is Running lldpd and wget

    Make sure you are running lldpd, not lldpad. CentOS does not include lldpd by default, nor does it include wget, which is required for the installation.

    To install these packages, run the following commands:

    root@rhel7:~# sudo yum -y install epel-release
    root@rhel7:~# sudo yum -y install lldpd
    root@rhel7:~# sudo systemctl enable lldpd.service
    root@rhel7:~# sudo systemctl start lldpd.service
    root@rhel7:~# sudo yum install wget
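
    You can confirm lldpd is active before continuing:

    root@rhel7:~# sudo systemctl status lldpd.service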
    

    Install and Configure NTP

    If NTP is not already installed and configured, follow these steps:

    1. Install NTP on the server. Servers must be in time synchronization with the NetQ Platform or NetQ Appliance to enable useful statistical analysis.

      root@rhel7:~# sudo yum install ntp
      
    2. Configure the NTP server.

      1. Open the /etc/ntp.conf file in your text editor of choice.

      2. Under the Server section, specify the NTP server IP address or hostname.

    3. Enable and start the NTP service.

      root@rhel7:~# sudo systemctl enable ntpd
      root@rhel7:~# sudo systemctl start ntpd
      

      If you are running NTP in your out-of-band management network with VRF, specify the VRF (ntp@<vrf-name> versus just ntp) in the above commands.

    4. Verify NTP is operating correctly. Look for an asterisk (*) or a plus sign (+) that indicates the clock is synchronized.

      root@rhel7:~# ntpq -pn
      remote           refid            st t when poll reach   delay   offset  jitter
      ==============================================================================
      +173.255.206.154 132.163.96.3     2 u   86  128  377   41.354    2.834   0.602
      +12.167.151.2    198.148.79.209   3 u  103  128  377   13.395   -4.025   0.198
      2a00:7600::41    .STEP.          16 u    - 1024    0    0.000    0.000   0.000
      *129.250.35.250 249.224.99.213   2 u  101  128  377   14.588   -0.299   0.243
      

    Obtain NetQ Agent Software Package

    To install the NetQ Agent you need to install netq-agent on each switch or host. This is available from the Cumulus Networks repository.

    To obtain the NetQ Agent package:

    1. Reference and update the local yum repository.

      root@rhel7:~# sudo rpm --import https://apps3.cumulusnetworks.com/setup/cumulus-apps-rpm.pubkey
      root@rhel7:~# sudo wget -O- https://apps3.cumulusnetworks.com/setup/cumulus-apps-rpm-el7.repo > /etc/yum.repos.d/cumulus-host-el.repo
      
    2. Edit /etc/yum.repos.d/cumulus-host-el.repo to set the enabled=1 flag for the two NetQ repositories.

      root@rhel7:~# vi /etc/yum.repos.d/cumulus-host-el.repo
      ...
      [cumulus-arch-netq-3.0]
      name=Cumulus netq packages
      baseurl=https://apps3.cumulusnetworks.com/repos/rpm/el/7/netq-3.0/$basearch
      gpgcheck=1
      enabled=1
      [cumulus-noarch-netq-3.0]
      name=Cumulus netq architecture-independent packages
      baseurl=https://apps3.cumulusnetworks.com/repos/rpm/el/7/netq-3.0/noarch
      gpgcheck=1
      enabled=1
      ...
      

    Install NetQ Agent on a RHEL or CentOS Server

    After completing the preparation steps, you can successfully install the agent software onto your server.

    To install the NetQ Agent:

    1. Install the Bash completion and NetQ packages on the server.

      root@rhel7:~# sudo yum -y install bash-completion
      root@rhel7:~# sudo yum install netq-agent
      
    2. Verify you have the correct version of the Agent.

      root@rhel7:~# rpm -qa | grep -i netq
      
      You should see version 3.0.0 and update 27 or later in the results. For example:
      • netq-agent_3.0.0-rh7u27~1588244328.9fb5b87.x86_64.rpm
    3. Restart rsyslog so log files are sent to the correct destination.

      root@rhel7:~# sudo systemctl restart rsyslog
      
    4. Continue with NetQ Agent Configuration in the next section.

    Configure the NetQ Agent on a RHEL or CentOS Server

    After the NetQ Agent and CLI have been installed on the servers you want to monitor, the NetQ Agents must be configured to obtain useful and relevant data.

    The NetQ Agent is aware of and communicates through the designated VRF. If you do not specify one, the default VRF (named default) is used. If you later change the VRF configured for the NetQ Agent (using a lifecycle management configuration profile, for example), you might cause the NetQ Agent to lose communication.

    Two methods are available for configuring a NetQ Agent:

    • Edit the netq.yml configuration file on the server
    • Use the NetQ CLI

    Configure the NetQ Agents Using a Configuration File

    You can configure the NetQ Agent in the netq.yml configuration file contained in the /etc/netq/ directory.

    1. Open the netq.yml file using your text editor of choice. For example:

      root@rhel7:~# sudo nano /etc/netq/netq.yml
      
    2. Locate the netq-agent section, or add it.

    3. Set the parameters for the agent as follows:

      • port: 31980 (default) or one that you specify
      • server: IP address of the NetQ server or appliance where the agent should send its collected data
      • vrf: default (default) or one that you specify

      Your configuration should be similar to this:

      netq-agent:
        port: 31980
        server: 127.0.0.1
        vrf: default
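
      After saving the file, restart the agent so the new settings take effect, using the same restart command shown later in this topic:

      root@rhel7:~# sudo netq config restart agent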
      

    Configure NetQ Agents Using the NetQ CLI

    If the CLI is configured, you can use it to configure the NetQ Agent to send telemetry data to the NetQ Server or Appliance. If it is not configured, refer to Configure the NetQ CLI on a RHEL or CentOS Server and then return here.

    If you intend to use VRF, skip to Configure the NetQ Agent to Use a VRF. If you intend to specify a port for communication, skip to Configure the NetQ Agent to Communicate over a Specific Port.

    Use the following command to configure the NetQ Agent:

    netq config add agent server <text-opta-ip> [port <text-opta-port>] [vrf <text-vrf-name>]
    

    This example uses an IP address of 192.168.1.254 and the default port and VRF.

    root@rhel7:~# sudo netq config add agent server 192.168.1.254
    Updated agent server 192.168.1.254 vrf default. Please restart netq-agent (netq config restart agent).
    root@rhel7:~# sudo netq config restart agent
    

    Configure Advanced NetQ Agent Settings

    Two additional options are available for configuring the NetQ Agent. If you are using VRF, you can configure the agent to communicate over a specific VRF. You can also configure the agent to use a particular port.

    Configure the NetQ Agent to Use a VRF

    While optional, Cumulus strongly recommends that you configure NetQ Agents to communicate with the NetQ Platform only via a VRF, including a management VRF. To do so, you need to specify the VRF name when configuring the NetQ Agent. For example, if the management VRF is configured and you want the agent to communicate with the NetQ Platform over it, configure the agent like this:

    root@rhel7:~# sudo netq config add agent server 192.168.1.254 vrf mgmt
    root@rhel7:~# sudo netq config restart agent
    

    Configure the NetQ Agent to Communicate over a Specific Port

    By default, NetQ uses port 31980 for communication between the NetQ Platform and NetQ Agents. If you want the NetQ Agent to communicate with the NetQ Platform via a different port, you need to specify the port number when configuring the NetQ Agent like this:

    root@rhel7:~# sudo netq config add agent server 192.168.1.254 port 7379
    root@rhel7:~# sudo netq config restart agent
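    
    To confirm the server, port, and VRF settings the agent is using, you can run the netq config show agent command (the exact output format varies by release):
    
    root@rhel7:~# netq config show agent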
    

    Install NetQ CLI

    When installing NetQ 3.0.x, installing the NetQ CLI on your NetQ Appliances or VMs, or on monitored switches and hosts, is optional. However, it gives you access to new features and important bug fixes, and lets you manage your network from multiple points in the network.

    Use the instructions in the following sections based on the OS installed on the switch or server.

    Install and Configure the NetQ CLI on Cumulus Linux Switches

    After installing your Cumulus NetQ software and the NetQ 3.0.0 Agent on each switch you want to monitor, you can also install the NetQ CLI on switches running:

    • Cumulus Linux 3.3.2-3.7.x
    • Cumulus Linux 4.0.0 and later

    Install the NetQ CLI on a Cumulus Linux Switch

    A simple process installs the NetQ CLI on a Cumulus Linux switch.

    To install the NetQ CLI you need to install netq-apps on each switch. This is available from the Cumulus Networks repository.

    If your network uses a proxy server for external connections, you should first configure a global proxy so apt-get can access the software package in the Cumulus Networks repository.

    To obtain the NetQ CLI package:

    1. Edit the /etc/apt/sources.list file to add the repository for Cumulus NetQ.

      Note that NetQ has a separate repository from Cumulus Linux.

      For Cumulus Linux 3.x:

      cumulus@switch:~$ sudo nano /etc/apt/sources.list
      ...
      deb http://apps3.cumulusnetworks.com/repos/deb CumulusLinux-3 netq-3.0
      ...

      For Cumulus Linux 4.x:

      cumulus@switch:~$ sudo nano /etc/apt/sources.list
      ...
      deb http://apps3.cumulusnetworks.com/repos/deb CumulusLinux-4 netq-3.0
      ...

      The repository deb http://apps3.cumulusnetworks.com/repos/deb CumulusLinux-3 netq-latest (or CumulusLinux-4 netq-latest on Cumulus Linux 4.x) can be used if you want to always retrieve the latest posted version of NetQ.

    2. Update the local apt repository and install the software on the switch.

      cumulus@switch:~$ sudo apt-get update
      cumulus@switch:~$ sudo apt-get install netq-apps
      
    3. Verify you have the correct version of the CLI.

      cumulus@switch:~$ dpkg-query -W -f '${Package}\t${Version}\n' netq-apps
      
      You should see version 3.0.0 and update 27 or later in the results. For example:
      • Cumulus Linux 3.3.2-3.7.x
        • netq-apps_3.0.0-cl3u27~1588048439.0e20d33_armel.deb
        • netq-apps_3.0.0-cl3u27~1588242701.9fb5b87_amd64.deb
      • Cumulus Linux 4.0.0 and later
        • netq-apps_3.0.0-cl4u27~1588048918.0e20d335_armel.deb
        • netq-apps_3.0.0-cl4u27~1588242814.9fb5b87d_amd64.deb
    4. Continue with NetQ CLI configuration in the next section.

    Configure the NetQ CLI on a Cumulus Linux Switch

    Two methods are available for configuring the NetQ CLI on a switch:

    • Use the NetQ CLI
    • Edit the netq.yml configuration file

    Configure NetQ CLI Using the CLI

    The steps to configure the CLI are different depending on whether the NetQ software has been installed for an on-premises or cloud deployment. Follow the instructions for your deployment type.

    Use the following command to configure the CLI:

    netq config add cli server <text-gateway-dest> [vrf <text-vrf-name>] [port <text-gateway-port>]
    

    Restart the CLI afterward to activate the configuration.

    This example uses an IP address of 192.168.1.0 and the default port and VRF.

    cumulus@switch:~$ sudo netq config add cli server 192.168.1.0
    cumulus@switch:~$ sudo netq config restart cli
    

    If you have a server cluster deployed, use the IP address of the master server.

    To access and configure the CLI on your NetQ Cloud Appliance or VM, you must have your username and password to access the NetQ UI to generate AuthKeys. These keys provide authorized access (access key) and user authentication (secret key). Your credentials and NetQ Cloud addresses were provided by Cumulus Networks via an email titled Welcome to Cumulus NetQ!

    To generate AuthKeys:

    1. In your Internet browser, enter netq.cumulusnetworks.com into the address field to open the NetQ UI login page.

    2. Enter your username and password.

    3. Click the Main Menu and select Management in the Admin column.

    4. Click Manage on the User Accounts card.

    5. Select your user and click the icon above the table to generate the keys.

    6. Copy these keys to a safe place.

    The secret key is only shown once. If you do not copy these, you will need to regenerate them and reconfigure CLI access.

    You can also save these keys to a YAML file for easy reference, and to avoid having to type or copy the key values. You can:

    • store the file wherever you like, for example in /home/cumulus/ or /etc/netq
    • name the file whatever you like, for example credentials.yml, creds.yml, or keys.yml

    However, the file must have the following format:

    access-key: <user-access-key-value-here>
    secret-key: <user-secret-key-value-here>
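    
    Because this file contains credentials, you may also want to restrict its permissions. This is a general precaution, not a NetQ requirement; the path and filename below are the examples given above:
    
    cumulus@switch:~$ chmod 600 /home/cumulus/credentials.yml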
    

    1. Now that you have your AuthKeys, use the following command to configure the CLI:

      netq config add cli server <text-gateway-dest> [access-key <text-access-key> secret-key <text-secret-key> premises <text-premises-name> | cli-keys-file <text-key-file> premises <text-premises-name>] [vrf <text-vrf-name>] [port <text-gateway-port>]
      
    2. Restart the CLI afterward to activate the configuration.

      This example uses the individual access key, a premises of datacenterwest, and the default Cloud address, port and VRF. Be sure to replace the key values with your generated keys if you are using this example on your server.

      cumulus@switch:~$ sudo netq config add cli server api.netq.cumulusnetworks.com access-key 123452d9bc2850a1726f55534279dd3c8b3ec55e8b25144d4739dfddabe8149e secret-key /vAGywae2E4xVZg8F+HtS6h6yHliZbBP6HXU3J98765= premises datacenterwest
      Successfully logged into NetQ cloud at api.netq.cumulusnetworks.com:443
      Updated cli server api.netq.cumulusnetworks.com vrf default port 443. Please restart netqd (netq config restart cli)
      
      cumulus@switch:~$ sudo netq config restart cli
      Restarting NetQ CLI... Success!
      

      This example uses an optional keys file. Be sure to replace the keys filename and path with the full path and name of your keys file, and the datacenterwest premises name with your premises name if you are using this example on your server.

      cumulus@switch:~$ sudo netq config add cli server api.netq.cumulusnetworks.com cli-keys-file /home/netq/nq-cld-creds.yml premises datacenterwest
      Successfully logged into NetQ cloud at api.netq.cumulusnetworks.com:443
      Updated cli server api.netq.cumulusnetworks.com vrf default port 443. Please restart netqd (netq config restart cli)
      
      cumulus@switch:~$ sudo netq config restart cli
      Restarting NetQ CLI... Success!
      

    If you have multiple premises and want to query data from a different premises than you originally configured, rerun the netq config add cli server command with the desired premises name. You can only view the data for one premises at a time with the CLI.

    Configure NetQ CLI Using a Configuration File

    You can configure the NetQ CLI in the netq.yml configuration file contained in the /etc/netq/ directory.

    1. Open the netq.yml file using your text editor of choice. For example:

      cumulus@switch:~$ sudo nano /etc/netq/netq.yml
      
    2. Locate the netq-cli section, or add it.

    3. Set the parameters for the CLI.

      For an on-premises deployment, specify the following parameters:

      • netq-user: User who can access the CLI
      • server: IP address of the NetQ server or NetQ Appliance
      • port: 32708 (default)

      Your YAML configuration file should be similar to this:

      netq-cli:
        netq-user: admin@company.com
        port: 32708
        server: 192.168.0.254

      For a cloud deployment, specify the following parameters:

      • netq-user: User who can access the CLI
      • server: api.netq.cumulusnetworks.com
      • port: 443 (default)
      • premises: Name of premises you want to query

      Your YAML configuration file should be similar to this:

      netq-cli:
        netq-user: admin@company.com
        port: 443
        premises: datacenterwest
        server: api.netq.cumulusnetworks.com
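
      After editing netq.yml, restart the CLI so it picks up the new values, using the restart command shown earlier in this topic:

      cumulus@switch:~$ sudo netq config restart cli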
      

    Install and Configure the NetQ CLI on Ubuntu Servers

    After installing your Cumulus NetQ software, you can also install the NetQ CLI on servers running:

    • Ubuntu 16.04
    • Ubuntu 18.04

    Prepare for NetQ CLI Installation on an Ubuntu Server

    For servers running Ubuntu OS, you need to:

    • Verify the minimum required service packages are installed
    • Verify the server is running lldpd
    • Install and configure a network time server, if needed
    • Obtain the NetQ CLI software package

    If your network uses a proxy server for external connections, you should first configure a global proxy so apt-get can access the software package in the Cumulus Networks repository.

    Verify Service Package Versions

    Before you install the NetQ CLI on an Ubuntu server, make sure the following packages are installed and running these minimum versions:

    Verify the Server is Running lldpd

    Make sure you are running lldpd, not lldpad. Ubuntu does not include lldpd by default, but it is required for the installation.

    To install this package, run the following commands:

    root@ubuntu:~# sudo apt-get update
    root@ubuntu:~# sudo apt-get install lldpd
    root@ubuntu:~# sudo systemctl enable lldpd.service
    root@ubuntu:~# sudo systemctl start lldpd.service
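    
    You can confirm the daemon is running before continuing; a quick check with standard systemd tooling:
    
    root@ubuntu:~# sudo systemctl status lldpd.service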
    

    Install and Configure Network Time Server

    If NTP is not already installed and configured, follow these steps:

    1. Install NTP on the server, if not already installed. Servers must be in time synchronization with the NetQ Platform or NetQ Appliance to enable useful statistical analysis.

      root@ubuntu:~# sudo apt-get install ntp
      
    2. Configure the network time server.

      1. Open the /etc/ntp.conf file in your text editor of choice.

      2. Under the Server section, specify the NTP server IP address or hostname.

    3. Enable and start the NTP service.

      root@ubuntu:~# sudo systemctl enable ntp
      root@ubuntu:~# sudo systemctl start ntp
      

      If you are running NTP in your out-of-band management network with VRF, specify the VRF (ntp@<vrf-name> versus just ntp) in the above commands.

    4. Verify NTP is operating correctly. Look for an asterisk (*) or a plus sign (+) that indicates the clock is synchronized.

        root@ubuntu:~# ntpq -pn
        remote           refid            st t when poll reach   delay   offset  jitter
        ==============================================================================
        +173.255.206.154 132.163.96.3     2 u   86  128  377   41.354    2.834   0.602
        +12.167.151.2    198.148.79.209   3 u  103  128  377   13.395   -4.025   0.198
        2a00:7600::41    .STEP.          16 u    - 1024    0    0.000    0.000   0.000
        *129.250.35.250 249.224.99.213   2 u  101  128  377   14.588   -0.299   0.243
        
        
      Alternatively, you can use chrony as the time server:

      1. Install chrony if needed.

        root@ubuntu:~# sudo apt install chrony
        
      2. Start the chrony service.

        root@ubuntu:~# sudo /usr/local/sbin/chronyd
        
      3. Verify it installed successfully.

        root@ubuntu:~# chronyc activity
        200 OK
        8 sources online
        0 sources offline
        0 sources doing burst (return to online)
        0 sources doing burst (return to offline)
        0 sources with unknown address
        
      4. View the time servers chrony is using.

        root@ubuntu:~# chronyc sources
        210 Number of sources = 8
        
        MS Name/IP address         Stratum Poll Reach LastRx Last sample
        ===============================================================================
        ^+ golem.canonical.com           2   6   377    39  -1135us[-1135us] +/-   98ms
        ^* clock.xmission.com            2   6   377    41  -4641ns[ +144us] +/-   41ms
        ^+ ntp.ubuntu.net              2   7   377   106   -746us[ -573us] +/-   41ms
        ...
        

      5. Open the chrony.conf configuration file (by default at /etc/chrony/chrony.conf) and edit if needed.

        Example with individual servers specified:

        server golem.canonical.com iburst
        server clock.xmission.com iburst
        server ntp.ubuntu.com iburst
        driftfile /var/lib/chrony/drift
        makestep 1.0 3
        rtcsync
        

        Example when using a pool of servers:

        pool pool.ntp.org iburst
        driftfile /var/lib/chrony/drift
        makestep 1.0 3
        rtcsync
        
      6. View the server chrony is currently tracking.

        root@ubuntu:~# chronyc tracking
        Reference ID    : 5BBD59C7 (golem.canonical.com)
        Stratum         : 3
        Ref time (UTC)  : Mon Feb 10 14:35:18 2020
        System time     : 0.0000046340 seconds slow of NTP time
        Last offset     : -0.000123459 seconds
        RMS offset      : 0.007654410 seconds
        Frequency       : 8.342 ppm slow
        Residual freq   : -0.000 ppm
        Skew            : 26.846 ppm
        Root delay      : 0.031207654 seconds
        Root dispersion : 0.001234590 seconds
        Update interval : 115.2 seconds
        Leap status     : Normal
        

    Obtain NetQ CLI Software Package

    To install the NetQ CLI you need to install netq-apps on each server. This is available from the Cumulus Networks repository.

    To obtain the NetQ CLI package:

    1. Reference and update the local apt repository.

      root@ubuntu:~# sudo wget -O- https://apps3.cumulusnetworks.com/setup/cumulus-apps-deb.pubkey | sudo apt-key add -
      
    2. Add the Ubuntu repository:

      For Ubuntu 16.04 (Xenial), create the file /etc/apt/sources.list.d/cumulus-apps-deb-xenial.list and add the following line:

      root@ubuntu:~# vi /etc/apt/sources.list.d/cumulus-apps-deb-xenial.list
      ...
      deb [arch=amd64] https://apps3.cumulusnetworks.com/repos/deb xenial netq-latest
      ...

      For Ubuntu 18.04 (Bionic), create the file /etc/apt/sources.list.d/cumulus-apps-deb-bionic.list and add the following line:

      root@ubuntu:~# vi /etc/apt/sources.list.d/cumulus-apps-deb-bionic.list
      ...
      deb [arch=amd64] https://apps3.cumulusnetworks.com/repos/deb bionic netq-latest
      ...
      

      The use of netq-latest in these examples means that an apt-get update always retrieves the latest version of NetQ, even when a major version update has been made. If you want to pin the repository to a specific version, such as netq-3.0, use that version tag instead.
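
      For example, to pin an Ubuntu 18.04 server to the NetQ 3.0 release rather than tracking netq-latest, the repository line would instead read:

      deb [arch=amd64] https://apps3.cumulusnetworks.com/repos/deb bionic netq-3.0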

    Install NetQ CLI on an Ubuntu Server

    A simple process installs the NetQ CLI on an Ubuntu server.

    1. Install the CLI software on the server.

      root@ubuntu:~# sudo apt-get update
      root@ubuntu:~# sudo apt-get install netq-apps
      
    2. Verify you have the correct version of the CLI.

      root@ubuntu:~# dpkg-query -W -f '${Package}\t${Version}\n' netq-apps
      
      You should see version 3.0.0 and update 27 or later in the results. For example:
      • netq-apps_3.0.0-ub18.04u27~1588242914.9fb5b87_amd64.deb
      • netq-apps_3.0.0-ub16.04u27~1588242914.9fb5b87_amd64.deb
    3. Continue with NetQ CLI configuration in the next section.

    Configure the NetQ CLI on an Ubuntu Server

    Two methods are available for configuring the NetQ CLI on an Ubuntu server:

    • Use the NetQ CLI
    • Edit the netq.yml configuration file

    Configure NetQ CLI Using the CLI

    The steps to configure the CLI are different depending on whether the NetQ software has been installed for an on-premises or cloud deployment. Follow the instructions for your deployment type.

    Use the following command to configure the CLI:

    netq config add cli server <text-gateway-dest> [vrf <text-vrf-name>] [port <text-gateway-port>]
    

    Restart the CLI afterward to activate the configuration.

    This example uses an IP address of 192.168.1.0 and the default port and VRF.

    root@ubuntu:~# sudo netq config add cli server 192.168.1.0
    root@ubuntu:~# sudo netq config restart cli
    

    If you have a server cluster deployed, use the IP address of the master server.

    To access and configure the CLI on your NetQ Platform or NetQ Cloud Appliance, you must have your username and password to access the NetQ UI to generate AuthKeys. These keys provide authorized access (access key) and user authentication (secret key). Your credentials and NetQ Cloud addresses were provided by Cumulus Networks via an email titled Welcome to Cumulus NetQ!

    To generate AuthKeys:

    1. In your Internet browser, enter netq.cumulusnetworks.com into the address field to open the NetQ UI login page.

    2. Enter your username and password.

    3. From the Main Menu, select Management in the Admin column.

    4. Click Manage on the User Accounts card.

    5. Select your user and click the icon above the table to generate the keys.

    6. Copy these keys to a safe place.

    The secret key is only shown once. If you do not copy these, you will need to regenerate them and reconfigure CLI access.

    You can also save these keys to a YAML file for easy reference, and to avoid having to type or copy the key values. You can:

    • store the file wherever you like, for example in /home/cumulus/ or /etc/netq
    • name the file whatever you like, for example credentials.yml, creds.yml, or keys.yml

    However, the file must have the following format:

    access-key: <user-access-key-value-here>
    secret-key: <user-secret-key-value-here>
    

    1. Now that you have your AuthKeys, use the following command to configure the CLI:

      netq config add cli server <text-gateway-dest> [access-key <text-access-key> secret-key <text-secret-key> premises <text-premises-name> | cli-keys-file <text-key-file> premises <text-premises-name>] [vrf <text-vrf-name>] [port <text-gateway-port>]
      
    2. Restart the CLI afterward to activate the configuration.

      This example uses the individual access key, a premises of datacenterwest, and the default Cloud address, port and VRF. Be sure to replace the key values with your generated keys if you are using this example on your server.

      root@ubuntu:~# sudo netq config add cli server api.netq.cumulusnetworks.com access-key 123452d9bc2850a1726f55534279dd3c8b3ec55e8b25144d4739dfddabe8149e secret-key /vAGywae2E4xVZg8F+HtS6h6yHliZbBP6HXU3J98765= premises datacenterwest
      Successfully logged into NetQ cloud at api.netq.cumulusnetworks.com:443
      Updated cli server api.netq.cumulusnetworks.com vrf default port 443. Please restart netqd (netq config restart cli)
      
      root@ubuntu:~# sudo netq config restart cli
      Restarting NetQ CLI... Success!
      

      This example uses an optional keys file. Be sure to replace the keys filename and path with the full path and name of your keys file, and the datacenterwest premises name with your premises name if you are using this example on your server.

      root@ubuntu:~# sudo netq config add cli server api.netq.cumulusnetworks.com cli-keys-file /home/netq/nq-cld-creds.yml premises datacenterwest
      Successfully logged into NetQ cloud at api.netq.cumulusnetworks.com:443
      Updated cli server api.netq.cumulusnetworks.com vrf default port 443. Please restart netqd (netq config restart cli)
      
      root@ubuntu:~# sudo netq config restart cli
      Restarting NetQ CLI... Success!
      

    Rerun this command if you have multiple premises and want to query a different premises.

    Configure NetQ CLI Using a Configuration File

    You can configure the NetQ CLI in the netq.yml configuration file contained in the /etc/netq/ directory.

    1. Open the netq.yml file using your text editor of choice. For example:

      root@ubuntu:~# sudo nano /etc/netq/netq.yml
      
    2. Locate the netq-cli section, or add it.

    3. Set the parameters for the CLI.

      For an on-premises deployment, specify the following parameters:

      • netq-user: User who can access the CLI
      • server: IP address of the NetQ server or NetQ Appliance
      • port: 32708 (default)

      Your YAML configuration file should be similar to this:

      netq-cli:
        netq-user: admin@company.com
        port: 32708
        server: 192.168.0.254

      For a cloud deployment, specify the following parameters:

      • netq-user: User who can access the CLI
      • server: api.netq.cumulusnetworks.com
      • port: 443 (default)
      • premises: Name of premises you want to query

      Your YAML configuration file should be similar to this:

      netq-cli:
        netq-user: admin@company.com
        port: 443
        premises: datacenterwest
        server: api.netq.cumulusnetworks.com
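
      After editing netq.yml, restart the CLI so it picks up the new values:

      root@ubuntu:~# sudo netq config restart cli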
      

    Install and Configure the NetQ CLI on RHEL and CentOS Servers

    After installing your Cumulus NetQ software and the NetQ 3.0.0 Agents on each switch you want to monitor, you can also install the NetQ CLI on servers running:

    • Red Hat RHEL 7
    • CentOS 7

    Prepare for NetQ CLI Installation on a RHEL or CentOS Server

    For servers running RHEL or CentOS, you need to:

    • Verify the minimum required service packages are installed
    • Verify the server is running lldpd and wget
    • Install and configure NTP, if needed
    • Obtain the NetQ CLI software package

    If your network uses a proxy server for external connections, you should first configure a global proxy so yum can access the software package in the Cumulus Networks repository.

    Verify Service Package Versions

    Before you install the NetQ CLI on a Red Hat or CentOS server, make sure the following packages are installed and running these minimum versions:

    Verify the Server is Running lldpd and wget

    Make sure you are running lldpd, not lldpad. CentOS does not include lldpd by default, nor does it include wget; both are required for the installation.

    To install this package, run the following commands:

    root@rhel7:~# sudo yum -y install epel-release
    root@rhel7:~# sudo yum -y install lldpd
    root@rhel7:~# sudo systemctl enable lldpd.service
    root@rhel7:~# sudo systemctl start lldpd.service
    root@rhel7:~# sudo yum install wget
    

    Install and Configure NTP

    If NTP is not already installed and configured, follow these steps:

    1. Install NTP on the server. Servers must be in time synchronization with the NetQ Appliance or VM to enable useful statistical analysis.

      root@rhel7:~# sudo yum install ntp
      
    2. Configure the NTP server.

      1. Open the /etc/ntp.conf file in your text editor of choice.

      2. Under the Server section, specify the NTP server IP address or hostname.

    3. Enable and start the NTP service.

      root@rhel7:~# sudo systemctl enable ntpd
      root@rhel7:~# sudo systemctl start ntpd
      

      If you are running NTP in your out-of-band management network with VRF, specify the VRF (ntpd@<vrf-name> versus just ntpd) in the above commands.
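
      As above, a minimal sketch assuming the management VRF is named mgmt and a corresponding systemd template unit exists:

      root@rhel7:~# sudo systemctl enable ntpd@mgmt
      root@rhel7:~# sudo systemctl start ntpd@mgmt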

    4. Verify NTP is operating correctly. Look for an asterisk (*) or a plus sign (+) that indicates the clock is synchronized.

      root@rhel7:~# ntpq -pn
      remote           refid            st t when poll reach   delay   offset  jitter
      ==============================================================================
      +173.255.206.154 132.163.96.3     2 u   86  128  377   41.354    2.834   0.602
      +12.167.151.2    198.148.79.209   3 u  103  128  377   13.395   -4.025   0.198
      2a00:7600::41    .STEP.          16 u    - 1024    0    0.000    0.000   0.000
      *129.250.35.250 249.224.99.213   2 u  101  128  377   14.588   -0.299   0.243
      

    Install NetQ CLI on a RHEL or CentOS Server

    A simple process installs the NetQ CLI on a RHEL or CentOS server.

    1. Reference and update the local yum repository and key.

      root@rhel7:~# sudo rpm --import https://apps3.cumulusnetworks.com/setup/cumulus-apps-rpm.pubkey
      root@rhel7:~# sudo wget -O- https://apps3.cumulusnetworks.com/setup/cumulus-apps-rpm-el7.repo > /etc/yum.repos.d/cumulus-host-el.repo
      
    2. Edit /etc/yum.repos.d/cumulus-host-el.repo to set the enabled=1 flag for the two NetQ repositories.

      root@rhel7:~# vi /etc/yum.repos.d/cumulus-host-el.repo
      ...
      [cumulus-arch-netq-3.0]
      name=Cumulus netq packages
      baseurl=https://apps3.cumulusnetworks.com/repos/rpm/el/7/netq-3.0/$basearch
      gpgcheck=1
      enabled=1
      [cumulus-noarch-netq-3.0]
      name=Cumulus netq architecture-independent packages
      baseurl=https://apps3.cumulusnetworks.com/repos/rpm/el/7/netq-3.0/noarch
      gpgcheck=1
      enabled=1
      ...
      
    3. Install the Bash completion and CLI software on the server.

      root@rhel7:~# sudo yum -y install bash-completion
      root@rhel7:~# sudo yum install netq-apps
      
    4. Verify you have the correct version of the CLI.

      root@rhel7:~# rpm -q netq-apps
      
      You should see version 3.0.0 and update 27 or later in the results. For example:
      • netq-apps_3.0.0-rh7u27~1588244328.9fb5b87.x86_64.rpm
    5. Continue with the next section.

    Configure the NetQ CLI on a RHEL or CentOS Server

    Two methods are available for configuring the NetQ CLI on a RHEL or CentOS server:

    • Use the NetQ CLI
    • Edit the netq.yml configuration file

    Configure NetQ CLI Using the CLI

    The steps to configure the CLI are different depending on whether the NetQ software has been installed for an on-premises or cloud deployment. Follow the instructions for your deployment type.

    Use the following command to configure the CLI:

    netq config add cli server <text-gateway-dest> [vrf <text-vrf-name>] [port <text-gateway-port>]
    

    Restart the CLI afterward to activate the configuration.

    This example uses an IP address of 192.168.1.0 and the default port and VRF.

    root@rhel7:~# sudo netq config add cli server 192.168.1.0
    root@rhel7:~# sudo netq config restart cli
    

    If you have a server cluster deployed, use the IP address of the master server.

    To access and configure the CLI on your NetQ Platform or NetQ Cloud Appliance, you must have your username and password to access the NetQ UI to generate AuthKeys. These keys provide authorized access (access key) and user authentication (secret key). Your credentials and NetQ Cloud addresses were provided by Cumulus Networks via an email titled Welcome to Cumulus NetQ!

    To generate AuthKeys:

    1. In your Internet browser, enter netq.cumulusnetworks.com into the address field to open the NetQ UI login page.

    2. Enter your username and password.

    3. From the Main Menu, select Management in the Admin column.

    4. Click Manage on the User Accounts card.

    5. Select your user and click the icon above the table to generate the keys.

    6. Copy these keys to a safe place.

    The secret key is only shown once. If you do not copy these, you will need to regenerate them and reconfigure CLI access.

    You can also save these keys to a YAML file for easy reference, and to avoid having to type or copy the key values. You can:

    • store the file wherever you like, for example in /home/cumulus/ or /etc/netq
    • name the file whatever you like, for example credentials.yml, creds.yml, or keys.yml

    However, the file must have the following format:

    access-key: <user-access-key-value-here>
    secret-key: <user-secret-key-value-here>
    

    1. Now that you have your AuthKeys, use the following command to configure the CLI:

      netq config add cli server <text-gateway-dest> [access-key <text-access-key> secret-key <text-secret-key> premises <text-premises-name> | cli-keys-file <text-key-file> premises <text-premises-name>] [vrf <text-vrf-name>] [port <text-gateway-port>]
      
    2. Restart the CLI afterward to activate the configuration.

      This example uses the individual access key, a premises of datacenterwest, and the default Cloud address, port and VRF. Be sure to replace the key values with your generated keys if you are using this example on your server.

      root@rhel7:~# sudo netq config add cli server api.netq.cumulusnetworks.com access-key 123452d9bc2850a1726f55534279dd3c8b3ec55e8b25144d4739dfddabe8149e secret-key /vAGywae2E4xVZg8F+HtS6h6yHliZbBP6HXU3J98765= premises datacenterwest
      Successfully logged into NetQ cloud at api.netq.cumulusnetworks.com:443
      Updated cli server api.netq.cumulusnetworks.com vrf default port 443. Please restart netqd (netq config restart cli)
      
      root@rhel7:~# sudo netq config restart cli
      Restarting NetQ CLI... Success!
      

      This example uses an optional keys file. Be sure to replace the keys filename and path with the full path and name of your keys file, and the datacenterwest premises name with your premises name if you are using this example on your server.

      root@rhel7:~# sudo netq config add cli server api.netq.cumulusnetworks.com cli-keys-file /home/netq/nq-cld-creds.yml premises datacenterwest
      Successfully logged into NetQ cloud at api.netq.cumulusnetworks.com:443
      Updated cli server api.netq.cumulusnetworks.com vrf default port 443. Please restart netqd (netq config restart cli)
      
      root@rhel7:~# sudo netq config restart cli
      Restarting NetQ CLI... Success!
      

    Rerun this command if you have multiple premises and want to query a different premises.

    Configure NetQ CLI Using a Configuration File

    You can configure the NetQ CLI in the netq.yml configuration file contained in the /etc/netq/ directory.

    1. Open the netq.yml file using your text editor of choice. For example:

      root@rhel7:~# sudo nano /etc/netq/netq.yml
      
    2. Locate the netq-cli section, or add it.

    3. Set the parameters for the CLI.

      For an on-premises deployment, specify the following parameters:

      • netq-user: User who can access the CLI
      • server: IP address of the NetQ server or NetQ Appliance
      • port: 32708 (default)

      Your YAML configuration file should be similar to this:

      netq-cli:
        netq-user: admin@company.com
        port: 32708
        server: 192.168.0.254

      For a cloud deployment, specify the following parameters:

      • netq-user: User who can access the CLI
      • server: api.netq.cumulusnetworks.com
      • port: 443 (default)
      • premises: Name of premises you want to query

      Your YAML configuration file should be similar to this:

      netq-cli:
        netq-user: admin@company.com
        port: 443
        premises: datacenterwest
        server: api.netq.cumulusnetworks.com
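
      After editing netq.yml, restart the CLI so it picks up the new values:

      root@rhel7:~# sudo netq config restart cli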
      

    Remove the NetQ Agent and CLI

    If you need to remove the NetQ agent and/or the NetQ CLI from a Cumulus Linux switch or Linux host, follow the steps below.

    Remove the Agent and CLI from a Cumulus Linux Switch or Ubuntu Host

    Use the apt-get purge command to remove the NetQ agent or CLI package from a Cumulus Linux switch or an Ubuntu host.

    cumulus@switch:~$ sudo apt-get update
    cumulus@switch:~$ sudo apt-get purge netq-agent netq-apps
    Reading package lists... Done
    Building dependency tree
    Reading state information... Done
    The following packages will be REMOVED:
      netq-agent* netq-apps*
    0 upgraded, 0 newly installed, 2 to remove and 0 not upgraded.
    After this operation, 310 MB disk space will be freed.
    Do you want to continue? [Y/n] Y
    Creating pre-apt snapshot... 2 done.
    (Reading database ... 42026 files and directories currently installed.)
    Removing netq-agent (3.0.0-cl3u27~1587646213.c5bc079) ...
    /usr/sbin/policy-rc.d returned 101, not running 'stop netq-agent.service'
    Purging configuration files for netq-agent (3.0.0-cl3u27~1587646213.c5bc079) ...
    dpkg: warning: while removing netq-agent, directory '/etc/netq/config.d' not empty so not removed
    Removing netq-apps (3.0.0-cl3u27~1587646213.c5bc079) ...
    /usr/sbin/policy-rc.d returned 101, not running 'stop netqd.service'
    Purging configuration files for netq-apps (3.0.0-cl3u27~1587646213.c5bc079) ...
    dpkg: warning: while removing netq-apps, directory '/etc/netq' not empty so not removed
    Processing triggers for man-db (2.7.0.2-5) ...
    grep: extra.services.enabled: No such file or directory
    Creating post-apt snapshot... 3 done.
    

    If you only want to remove the agent or the CLI, but not both, specify just the relevant package in the apt-get purge command.

    To verify the packages have been removed from the switch or host, run:

    cumulus@switch:~$ dpkg-query -l netq-agent
    dpkg-query: no packages found matching netq-agent
    cumulus@switch:~$ dpkg-query -l netq-apps
    dpkg-query: no packages found matching netq-apps
    

    Remove the Agent and CLI from a RHEL7 or CentOS Host

    Use the yum remove command to remove the NetQ agent or CLI package from a RHEL7 or CentOS host.

    root@rhel7:~# sudo yum remove netq-agent netq-apps
    Loaded plugins: fastestmirror
    Resolving Dependencies
    --> Running transaction check
    ---> Package netq-agent.x86_64 0:3.0.0-rh7u27~1588050478.0e20d33 will be erased
    --> Processing Dependency: netq-agent >= 3.0.0 for package: cumulus-netq-3.0.0-rh7u27~1588054943.10fa7f6.x86_64
    --> Running transaction check
    ---> Package cumulus-netq.x86_64 0:3.0.0-rh7u27~1588054943.10fa7f6 will be erased
    --> Finished Dependency Resolution
    
    Dependencies Resolved
    
    ...
    
    Removed:
      netq-agent.x86_64 0:3.0.0-rh7u27~1588050478.0e20d33
    
    Dependency Removed:
      cumulus-netq.x86_64 0:3.0.0-rh7u27~1588054943.10fa7f6
    
    Complete!
    
    

    If you only want to remove the agent or the CLI, but not both, specify just the relevant package in the yum remove command.

    To verify the packages have been removed from the host, run:

    root@rhel7:~# rpm -q netq-agent
    package netq-agent is not installed
    root@rhel7:~# rpm -q netq-apps
    package netq-apps is not installed
    

    Upgrade NetQ

    This topic describes how to upgrade from your current NetQ 2.4.x installation to the NetQ 3.0.0 release to take advantage of new capabilities and bug fixes (refer to the release notes).

    You must upgrade your NetQ On-premises or Cloud Appliance(s) or Virtual Machines (VMs). While NetQ 2.x Agents are compatible with NetQ 3.0.0, upgrading NetQ Agents is always recommended. If you want access to new and updated commands, you can upgrade the CLI on your physical servers or VMs, and monitored switches and hosts as well.

    To complete the upgrade for either an on-premises or a cloud deployment:

    Upgrade NetQ Appliances and Virtual Machines

    The first step in upgrading your NetQ 2.4.x installation to NetQ 3.0.0 is to upgrade either the NetQ Platform software running on your NetQ On-premises Appliance(s) or VM(s), or the NetQ Collector software running on your NetQ Cloud Appliance(s) or VM(s).

    Prepare for Upgrade

    A few steps are required to prepare for the upgrade of your NetQ software:

    Optionally, you can choose to back up your NetQ Data before performing the upgrade.

    To complete the preparation:

    1. For on-premises deployments only, optionally back up your NetQ 2.4.x data. Refer to Back Up Your NetQ Data.

    2. Download the relevant software.

      1. Go to the MyMellanox downloads page, and select NetQ from the Product list.

      2. Select 3.0 from the Version list, and then click 3.0.0 in the submenu.

      3. Select the relevant software from the HyperVisor/Platform list:

        If you are upgrading NetQ Platform software for a NetQ On-premises Appliance or VM, select Appliance to download the NetQ-3.0.0.tgz file. If you are upgrading NetQ Collector software for a NetQ Cloud Appliance or VM, select Appliance (Cloud) to download the NetQ-3.0.0-opta.tgz file.

      4. Scroll down and click Download on the relevant image card.

        You can ignore the note on the image card because, unlike during installation, you do not need to download the bootstrap file for an upgrade.

    3. Copy the file to the /mnt/installables/ directory on your appliance or VM.

    4. Update /etc/apt/sources.list.d/cumulus-netq.list to netq-3.0 as follows:

      cat /etc/apt/sources.list.d/cumulus-netq.list
      deb [arch=amd64] https://apps3.cumulusnetworks.com/repos/deb bionic netq-3.0
      
    5. Update the NetQ debian packages using the following three commands.

      cumulus@<hostname>:~$ sudo dpkg --remove --force-remove-reinstreq cumulus-netq netq-apps netq-agent 2>/dev/null
      [sudo] password for cumulus:
      (Reading database ... 71621 files and directories currently installed.)
      Removing netq-apps (2.4.1-ub18.04u26~1581351889.c5ec3e5) ...
      Removing netq-agent (2.4.1-ub18.04u26~1581351889.c5ec3e5) ...
      Processing triggers for man-db (2.8.3-2ubuntu0.1) ...
      
      cumulus@<hostname>:~$ sudo apt-get update
      Get:1 http://apps3.cumulusnetworks.com/repos/deb bionic InRelease [13.8 kB]
      Get:2 http://apps3.cumulusnetworks.com/repos/deb bionic/netq-3.0 amd64 Packages [758 B]
      Hit:3 http://archive.ubuntu.com/ubuntu bionic InRelease
      Get:4 http://security.ubuntu.com/ubuntu bionic-security InRelease [88.7 kB]
      Get:5 http://archive.ubuntu.com/ubuntu bionic-updates InRelease [88.7 kB]
      ...
      Get:24 http://archive.ubuntu.com/ubuntu bionic-backports/universe Translation-en [1900 B]
      Fetched 4651 kB in 3s (1605 kB/s)
      Reading package lists... Done
      
      cumulus@<hostname>:~$ sudo apt-get install -y netq-agent netq-apps
      Reading package lists... Done
      Building dependency tree
      Reading state information... Done
      ...
      The following NEW packages will be installed:
      netq-agent netq-apps
      ...
      Fetched 39.8 MB in 3s (13.5 MB/s)
      ...
      Unpacking netq-agent (3.0.0-ub18.04u27~1588242914.9fb5b87) ...
      ...
      Unpacking netq-apps (3.0.0-ub18.04u27~1588242914.9fb5b87) ...
      Setting up netq-apps (3.0.0-ub18.04u27~1588242914.9fb5b87) ...
      Setting up netq-agent (3.0.0-ub18.04u27~1588242914.9fb5b87) ...
      Processing triggers for rsyslog (8.32.0-1ubuntu4) ...
      Processing triggers for man-db (2.8.3-2ubuntu0.1) ...
      

    Now that you have all of the software components prepared, you can upgrade your NetQ On-premises Appliance or VM, or your NetQ Cloud Appliance or VM, using the NetQ Admin UI, as described in the next section. Alternatively, you can upgrade using the NetQ CLI; see Upgrade Using the NetQ CLI.

    Upgrade Using the NetQ Admin UI

    Upgrading your NetQ On-premises or Cloud Appliance(s) or VMs is simple using the Admin UI.

    To upgrade your NetQ software:

    1. Upgrade the Admin UI application.

      For an on-premises deployment:

      cumulus@<hostname>:~$ netq bootstrap master upgrade /mnt/installables/NetQ-3.0.0.tgz
      2020-04-28 15:39:37.016710: master-node-installer: Extracting tarball /mnt/installables/NetQ-3.0.0.tgz
      2020-04-28 15:44:48.188658: master-node-installer: Upgrading NetQ Admin container
      2020-04-28 15:47:35.667579: master-node-installer: Removing old images
      -----------------------------------------------
      Successfully bootstrap-upgraded the master node
      
      For a cloud deployment:

      cumulus@<hostname>:~$ netq bootstrap master upgrade /mnt/installables/NetQ-3.0.0-opta.tgz
      
    2. Open the Admin UI by entering https://<hostname-or-ipaddress>:8443 in your browser address field.

    3. Click Upgrade.

      On-premises deployment (cloud deployment only has Node and Pod cards)

    4. Enter NetQ-3.0.0.tgz (on-premises) or NetQ-3.0.0-opta.tgz (cloud) and click the button that appears.

      The button is only visible after you enter your tar file information.

    5. Monitor the progress. Click a job to monitor each of its steps.

      The following example is for an on-premises upgrade. The jobs for a cloud upgrade are slightly different.

    6. When the upgrade completes, return to the Health dashboard.

    7. You can verify that you are on the correct version by viewing what is listed under the Cumulus logo.

    Upgrade Using the NetQ CLI

    Upgrading your NetQ On-premises or Cloud Appliance(s) or VMs is simple using the NetQ CLI.

    To upgrade:

    1. Run the appropriate netq upgrade command.

      For an on-premises deployment:

      netq upgrade bundle /mnt/installables/NetQ-3.0.0.tgz
      
      For a cloud deployment:

      netq upgrade bundle /mnt/installables/NetQ-3.0.0-opta.tgz
      
    2. After the upgrade completes, confirm it was successful.

      cumulus@<hostname>:~$ cat /etc/app-release
      BOOTSTRAP_VERSION=3.0.0
      APPLIANCE_MANIFEST_HASH=d40ca38672
      APPLIANCE_VERSION=3.0.0
      

    Upgrade NetQ Agents

    Cumulus Networks strongly recommends that you upgrade your NetQ Agents when you install or upgrade to a new release. If you are using NetQ Agent 2.4.0 update 24 or earlier, you must upgrade to ensure proper operation.

    Upgrade NetQ Agents on Cumulus Linux Switches

    The following instructions are applicable to both Cumulus Linux 3.x and 4.x, and for both on-premises and cloud deployments.

    To upgrade the NetQ Agent:

    1. Log in to your switch or host.

    2. Update and install the new NetQ Debian package.

      sudo apt-get update
      sudo apt-get install -y netq-agent
      
      
    3. Restart the NetQ Agent.

      netq config restart agent
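      
      To confirm the new agent version is installed, you can re-run the package query used in Verify NetQ Agent Version below:
      
      cumulus@switch:~$ dpkg-query -W -f '${Package}\t${Version}\n' netq-agent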
      

    Refer to Install and Configure the NetQ Agent on Cumulus Linux Switches to complete the upgrade.

    Upgrade NetQ Agents on Ubuntu Servers

    The following instructions are applicable to both NetQ Platform and NetQ Appliances running Ubuntu 16.04 or 18.04 in on-premises and cloud deployments.

    To upgrade the NetQ Agent:

    1. Log in to your NetQ Platform or Appliance.

    2. Update your NetQ repository.

    root@ubuntu:~# sudo apt-get update
    
    3. Install the agent software.

      root@ubuntu:~# sudo apt-get install -y netq-agent
      
    4. Restart the NetQ Agent.

      root@ubuntu:~# netq config restart agent
    

    Refer to Install and Configure the NetQ Agent on Ubuntu Servers to complete the upgrade.

    Upgrade NetQ Agents on RHEL or CentOS Servers

    The following instructions are applicable to both on-premises and cloud deployments.

    To upgrade the NetQ Agent:

    1. Log in to your NetQ Platform.

    2. Update your NetQ repository.

    root@rhel7:~# sudo yum update
    
    3. Install the agent software.

      root@rhel7:~# sudo yum install netq-agent
      
    4. Restart the NetQ Agent.

      root@rhel7:~# netq config restart agent
    

    Refer to Install and Configure the NetQ Agent on RHEL and CentOS Servers to complete the upgrade.

    Verify NetQ Agent Version

    You can verify the version of the agent software you have deployed as described in the following sections.

    For Switches Running Cumulus Linux 3.x or 4.x

    Run the following command to view the NetQ Agent version.

    cumulus@switch:~$ dpkg-query -W -f '${Package}\t${Version}\n' netq-agent
    
    You should see version 3.0.0 and update 27 or later in the results. For example:

    netq-agent    3.0.0-cl3u27~1587646213.c5bc079

    If you see an older version, refer to Upgrade NetQ Agents on Cumulus Linux Switches.

    For Servers Running Ubuntu 16.04 or 18.04

    Run the following command to view the NetQ Agent version.

    root@ubuntu:~# dpkg-query -W -f '${Package}\t${Version}\n' netq-agent
    
    You should see version 3.0.0 and update 27 or later in the results. For example:

    netq-agent    3.0.0-ub18.04u27~1588242914.9fb5b87

    If you see an older version, refer to Upgrade NetQ Agents on Ubuntu Servers.

    For Servers Running RHEL7 or CentOS

    Run the following command to view the NetQ Agent version.

    root@rhel7:~# rpm -q netq-agent
    
    You should see version 3.0.0 and update 27 or later in the results. For example:

    netq-agent-3.0.0-rh7u27~1588050478.0e20d33.x86_64

    If you see an older version, refer to Upgrade NetQ Agents on RHEL or CentOS Servers.

    Upgrade NetQ CLI

    While it is not required to upgrade the NetQ CLI on your monitored switches and hosts when you upgrade to NetQ 3.0.0, doing so gives you access to new features and important bug fixes. Refer to the release notes for details.

    To upgrade the NetQ CLI:

    1. Log in to your switch or host.

    2. Update and install the new NetQ package.

      On Cumulus Linux switches and Ubuntu hosts:

      sudo apt-get update
      sudo apt-get install -y netq-apps
      
      On RHEL or CentOS hosts:

      sudo yum update
      sudo yum install netq-apps
      
    3. Restart the CLI.

      netq config restart cli
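      
      To confirm the server, port, and VRF settings the upgraded CLI is using, you can run the netq config show cli command (the exact output format varies by release):
      
      netq config show cli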
      

    To complete the upgrade, refer to the relevant configuration topic:

    Back Up and Restore NetQ

    It is recommended that you back up your NetQ data according to your company policy. Typically this means backing up after key configuration changes and on a regular schedule.

    These topics describe how to back up and restore your NetQ data for the NetQ On-premises Appliance and VMs.

    These procedures do not apply to your NetQ Cloud Appliance or VM. Data backup is handled automatically with the NetQ cloud service.

    Back Up Your NetQ Data

    NetQ data is stored in a Cassandra database. A backup is performed by running scripts provided with the software and located in the /usr/sbin directory. When a backup is performed, a single tar file is created. The file is stored on a local drive that you specify and is named netq_master_snapshot_<timestamp>.tar.gz. Currently, only one backup file is supported; it includes the entire set of data tables and is replaced each time a new backup is created.

    If the rollback option is selected during the lifecycle management upgrade process (the default behavior), a backup is created automatically.

    To manually create a backup:

    1. If you are backing up data from NetQ 2.4.0 or earlier, or you upgraded from NetQ 2.4.0 to 2.4.1, obtain an updated backuprestore script. If you installed NetQ 2.4.1 as a fresh install, you can skip this step. Replace <version> in these commands with 2.4.1 or later release version.

      cumulus@switch:~$ tar -xvzf  /mnt/installables/NetQ-<version>.tgz  -C /tmp/ ./netq-deploy-<version>.tgz
      
      cumulus@switch:~$ tar -xvzf /tmp/netq-deploy-<version>.tgz   -C /usr/sbin/ --strip-components 1 --wildcards backuprestore/*.sh
      
    2. Run the backup script to create a backup file in /opt/<backup-directory> being sure to replace the backup-directory option with the name of the directory you want to use for the backup file.

      cumulus@switch:~$ ./backuprestore.sh --backup --localdir /opt/<backup-directory>
      

      You can abbreviate the backup and localdir options of this command to -b and -l to reduce typing. If the backup directory identified does not already exist, the script creates the directory during the backup process.
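
      For example, the abbreviated form of the backup command, assuming a backup directory of /opt/backups:

      cumulus@switch:~$ ./backuprestore.sh -b -l /opt/backups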

      This is a sample of what you see as the script is running:

      [Fri 26 Jul 2019 02:35:35 PM UTC] - Received Inputs for backup ...
      [Fri 26 Jul 2019 02:35:36 PM UTC] - Able to find cassandra pod: cassandra-0
      [Fri 26 Jul 2019 02:35:36 PM UTC] - Continuing with the procedure ...
      [Fri 26 Jul 2019 02:35:36 PM UTC] - Removing the stale backup directory from cassandra pod...
      [Fri 26 Jul 2019 02:35:36 PM UTC] - Able to successfully cleanup up /opt/backuprestore from cassandra pod ...
      [Fri 26 Jul 2019 02:35:36 PM UTC] - Copying the backup script to cassandra pod ....
      /opt/backuprestore/createbackup.sh: line 1: cript: command not found
      [Fri 26 Jul 2019 02:35:48 PM UTC] - Able to exeute /opt/backuprestore/createbackup.sh script on cassandra pod
      [Fri 26 Jul 2019 02:35:48 PM UTC] - Creating local directory:/tmp/backuprestore/ ...  
      Directory /tmp/backuprestore/ already exists..cleaning up
      [Fri 26 Jul 2019 02:35:48 PM UTC] - Able to copy backup from cassandra pod  to local directory:/tmp/backuprestore/ ...
      [Fri 26 Jul 2019 02:35:48 PM UTC] - Validate the presence of backup file in directory:/tmp/backuprestore/
      [Fri 26 Jul 2019 02:35:48 PM UTC] - Able to find backup file:netq_master_snapshot_2019-07-26_14_35_37_UTC.tar.gz
      [Fri 26 Jul 2019 02:35:48 PM UTC] - Backup finished successfully!
      
    3. Verify the backup file has been created.

      cumulus@switch:~$ cd /opt/<backup-directory>
      cumulus@switch:~/opt/<backup-directory># ls
      netq_master_snapshot_2019-06-04_07_24_50_UTC.tar.gz
      

    To create a scheduled backup, add ./backuprestore.sh --backup --localdir /opt/<backup-directory> to an existing cron job, or create a new one.
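
    A minimal sketch of such a cron entry, assuming /opt/backups as the backup directory and a nightly schedule (adjust both for your environment); the script location in /usr/sbin is described above:

    # Run the NetQ backup script every day at 1:00 AM
    0 1 * * * /usr/sbin/backuprestore.sh --backup --localdir /opt/backups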

    Restore Your NetQ Data

    You can restore NetQ data using the backup file you created in Back Up Your NetQ Data. You can restore your instance to the same NetQ Platform or NetQ Appliance or to a new platform or appliance. You do not need to stop the server where the backup file resides to perform the restoration, but logins to the NetQ UI will fail during the restoration process. The restore option of the backup script copies the data from the backup file to the database, decompresses it, verifies the restoration, and starts all necessary services. You should not see any data loss as a result of a restore operation.

    To restore NetQ on the same hardware where the backup file resides:

    1. If you are restoring data from NetQ 2.4.0 or earlier, or you upgraded from NetQ 2.4.0 to 2.4.1, obtain an updated backuprestore script. If you installed NetQ 2.4.1 as a fresh install, you can skip this step. Replace <version> in these commands with 2.4.1 or later release version.

      cumulus@switch:~$ tar -xvzf  /mnt/installables/NetQ-<version>.tgz  -C /tmp/ ./netq-deploy-<version>.tgz
      
      cumulus@switch:~$ tar -xvzf /tmp/netq-deploy-<version>.tgz   -C /usr/sbin/ --strip-components 1 --wildcards backuprestore/*.sh
      
    2. Run the restore script being sure to replace the backup-directory option with the name of the directory where the backup file resides.

      cumulus@switch:~$ ./backuprestore.sh --restore --localdir /opt/<backup-directory>
      

      You can abbreviate the restore and localdir options of this command to -r and -l to reduce typing.
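
      For example, the abbreviated form of the restore command, assuming the backup file resides in /opt/backups:

      cumulus@switch:~$ ./backuprestore.sh -r -l /opt/backups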

      This is a sample of what you see while the script is running:

      [Fri 26 Jul 2019 02:37:49 PM UTC] - Received Inputs for restore ...
      
      WARNING: Restore procedure wipes out the existing contents of Database.
        Once the Database is restored you loose the old data and cannot be recovered.
      "Do you like to continue with Database restore:[Y(yes)/N(no)]. (Default:N)"
      

      You must answer the above question to continue the restoration. After entering Y or yes, the output continues as follows:

      [Fri 26 Jul 2019 02:37:50 PM UTC] - Able to find cassandra pod: cassandra-0
      [Fri 26 Jul 2019 02:37:50 PM UTC] - Continuing with the procedure ...
      [Fri 26 Jul 2019 02:37:50 PM UTC] - Backup local directory:/tmp/backuprestore/ exists....
      [Fri 26 Jul 2019 02:37:50 PM UTC] - Removing any stale restore directories ...
      Copying the file for restore to cassandra pod ....
      [Fri 26 Jul 2019 02:37:50 PM UTC] - Able to copy the local directory contents to cassandra pod in /tmp/backuprestore/.
      [Fri 26 Jul 2019 02:37:50 PM UTC] - copying the script to cassandra pod in dir:/tmp/backuprestore/....
      Executing the Script for restoring the backup ...
      /tmp/backuprestore//createbackup.sh: line 1: cript: command not found
      [Fri 26 Jul 2019 02:40:12 PM UTC] - Able to exeute /tmp/backuprestore//createbackup.sh script on cassandra pod
      [Fri 26 Jul 2019 02:40:12 PM UTC] - Restore finished successfully!
      

    To restore NetQ on new hardware:

    1. Copy the backup file from /opt/<backup-directory> on the older hardware to the backup directory on the new hardware.

    2. Run the restore script on the new hardware, being sure to replace the backup-directory option with the name of the directory where the backup file resides.

      cumulus@switch:~$ ./backuprestore.sh --restore --localdir /opt/<backup-directory>
      

    Configuration Updates

    After installation or upgrade of NetQ is complete, there are a few additional configuration tasks that might be required.

    Add More Nodes to Your Server Cluster

    Installation of NetQ with a server cluster sets up the master and two worker nodes. To expand your cluster to include up to a total of nine worker nodes, use the Admin UI.

    To add more worker nodes:

    1. Prepare the nodes. Refer to the relevant server cluster instructions in Install NetQ System Platform.

    2. Open the Admin UI by entering https://<master-hostname-or-ipaddress>:8443 in your browser address field.

      This opens the Health dashboard for NetQ.

    3. Click Cluster to view your current configuration.

      On-premises deployment

      This opens the Cluster dashboard, with the details about each node in the cluster.

    4. Click Add Worker Node.

    5. Enter the private IP address of the node you want to add.

    6. Click Add.

      Monitor the progress of the three jobs by clicking the icon next to each job.

      On completion, a card for the new node is added to the Cluster dashboard.

      If the addition fails for any reason, download the log file, run netq bootstrap reset on this new worker node, and then try again.

    7. Repeat this process to add more worker nodes as needed.

    Update Your Cloud Activation Key

    The cloud activation key is the one used to access the Cloud services, not the authorization keys used for configuring the CLI. It is provided by Cumulus Networks when your premises is set up and is called the config-key.

    There are occasions where you might want to update your cloud service activation key. For example, if you mistyped the key during installation and now your existing key does not work, or you received a new key for your premises from Cumulus Networks.

    Update the activation key using the Admin UI or NetQ CLI:

    1. Open the Admin UI by entering https://<master-hostname-or-ipaddress>:8443 in your browser address field.

    2. Click Settings.

    3. Click Activation.

    4. Click Edit.

    5. Enter your new configuration key in the designated text box.

    6. Click Apply.

    Run the following command on your standalone or master NetQ Cloud Appliance or VM replacing text-opta-key with your new key.

    cumulus@<hostname>:~$ netq install [standalone|cluster] activate-job config-key <text-opta-key>
    

    Cumulus NetQ Integration Guide

    After you have completed the installation of Cumulus NetQ, you may want to configure some of the additional capabilities that NetQ offers or integrate it with third-party software or hardware.

    The following topics describe these integrations.

    Integrate NetQ with Notification Applications

    After you have installed the NetQ applications package and the NetQ Agents, you may want to configure some of the additional capabilities that NetQ offers. This topic describes how to integrate NetQ with an event notification application.

    Integrate NetQ with an Event Notification Application

    To take advantage of the numerous event messages generated and processed by NetQ, you must integrate with third-party event notification applications. You can integrate NetQ with Syslog, PagerDuty, and Slack tools, and you can integrate with one or more of these applications simultaneously.

    Each network protocol and service in the NetQ Platform receives the raw data stream from the NetQ Agents, processes the data and delivers events to the Notification function. Notification then stores, filters and sends messages to any configured notification applications. Filters are based on rules you create. You must have at least one rule per filter. A select set of events can be triggered by a user-configured threshold.

    You may choose to implement a proxy server (that sits between the NetQ Platform and the integration channels) that receives, processes and distributes the notifications rather than having them sent directly to the integration channel. If you use such a proxy, you must configure NetQ with the proxy information.

    In either case, notifications are generated for the following types of events:

    Network Protocols
    • BGP status and session state
    • CLAG (MLAG) status and session state
    • EVPN status and session state
    • LLDP status
    • OSPF status and session state
    • VLAN status and session state *
    • VXLAN status and session state *
    Interfaces
    • Link status
    • Ports and cables status
    • MTU status
    Services
    • NetQ Agent status
    • PTM
    • SSH *
    • NTP status*
    Traces
    • On-demand trace status
    • Scheduled trace status
    Sensors
    • Fan status
    • PSU (power supply unit) status
    • Temperature status
    System Software
    • Configuration File changes
    • Running Configuration File changes
    • Cumulus Linux License status
    • Cumulus Linux Support status
    • Software Package status
    • Operating System version
    System Hardware
    • Physical resources status
    • BTRFS status
    • SSD utilization status
    • Threshold Crossing Alerts (TCAs)

    * This type of event can only be viewed in the CLI with this release.

    Refer to the Events Reference for descriptions and examples of these events.

    Event Message Format

    Messages have the following structure: <message-type><timestamp><opid><hostname><severity><message>

    Element         Description
    message type    Category of event: agent, bgp, clag, clsupport, configdiff, evpn, license, link, lldp, mtu, node, ntp, ospf, packageinfo, ptm, resource, runningconfigdiff, sensor, services, ssdutil, tca, trace, version, vlan, or vxlan
    timestamp       Date and time the event occurred
    opid            Identifier of the service or process that generated the event
    hostname        Hostname of the network device where the event occurred
    severity        Severity level in which the given event is classified: debug, error, info, warning, or critical
    message         Text description of the event

    For example, a hypothetical MTU event from a switch named leaf01 might be rendered as follows (illustrative values only):
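
    mtu 2019-07-26T14:37:50Z netq-agent leaf01 error MTU mismatch detected on interface swp1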

    To set up the integrations, you must configure NetQ with at least one channel, one rule, and one filter. To refine what messages you want to view and where to send them, you can add more rules and filters and set thresholds on supported event types. You can also configure a proxy server to receive, process, and forward the messages. All of this is accomplished using the NetQ CLI, in the order described in the following sections.

    Notification Commands Overview

    The NetQ Command Line Interface (CLI) is used to filter and send notifications to third-party tools based on severity, service, event-type, and device. You can use TAB completion or the help option to assist when needed.

    The command syntax for standard events is:

    ##Channels
    netq add notification channel slack <text-channel-name> webhook <text-webhook-url> [severity info|severity warning|severity error|severity debug] [tag <text-slack-tag>]
    netq add notification channel pagerduty <text-channel-name> integration-key <text-integration-key> [severity info|severity warning|severity error|severity debug]
     
    ##Rules and Filters
    netq add notification rule <text-rule-name> key <text-rule-key> value <text-rule-value>
    netq add notification filter <text-filter-name> [severity info|severity warning|severity error|severity debug] [rule <text-rule-name-anchor>] [channel <text-channel-name-anchor>] [before <text-filter-name-anchor>|after <text-filter-name-anchor>]
     
    ##Management
    netq del notification channel <text-channel-name-anchor>
    netq del notification filter <text-filter-name-anchor>
    netq del notification rule <text-rule-name-anchor>
    netq show notification [channel|filter|rule] [json]
    

    The command syntax for events with user-configurable thresholds is:

    ##Rules
    netq add tca event_id <event-name> scope <regex-filter> [severity <critical|info>] threshold <value>
    
    ##Management
    netq add tca tca_id <tca-rule-name> is_active <true|false>
    netq add tca tca_id <tca-rule-name> channel drop <channel-name>
    netq del tca tca_id <tca-rule-name>
    netq show tca [tca_id <tca-rule-name>]
    

    The command syntax for a server proxy is:

    ##Proxy
    netq add notification proxy <text-proxy-hostname> [port <text-proxy-port>]
    netq show notification proxy
    netq del notification proxy
    

    The various command options are described in the following sections where they are used.

    Configure Basic NetQ Event Notification

    The simplest configuration you can create is one that sends all events generated by all interfaces to a single notification application. This is described here. For more granular configurations and examples, refer to Configure Advanced NetQ Event Notifications.

    A notification configuration must contain one channel, one rule, and one filter. Creating the configuration follows this path (a combined example appears after this list):

    1. Add a channel (slack, pagerduty, syslog)
    2. Add a rule that accepts all interface events
    3. Add a filter that associates this rule with the newly created channel
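
    For example, the complete basic configuration for a syslog channel takes just three commands, each of which is shown individually in the sections that follow:

    cumulus@switch:~$ netq add notification channel syslog syslog-netq-events hostname syslog-server port 514
    cumulus@switch:~$ netq add notification rule all-ifs key ifname value ALL
    cumulus@switch:~$ netq add notification filter notify-all-ifs rule all-ifs channel syslog-netq-events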

    Create Your Channel

    For PagerDuty:

    Configure a channel using the integration key for your PagerDuty setup. Then verify the configuration.

    cumulus@switch:~$ netq add notification channel pagerduty pd-netq-events integration-key c6d666e210a8425298ef7abde0d1998
    Successfully added/updated channel pd-netq-events
    
    cumulus@switch:~$ netq show notification channel
    Matching config_notify records:
    Name            Type             Severity         Channel Info
    --------------- ---------------- ---------------- ------------------------
    pd-netq-events  pagerduty        info             integration-key: c6d666e
                                                      210a8425298ef7abde0d1998
    

    For Slack:

    Create an incoming webhook as described in the documentation for your version of Slack. Verify the configuration.

    cumulus@switch:~$ netq add notification channel slack slk-netq-events webhook https://hooks.slack.com/services/text/moretext/evenmoretext
    Successfully added/updated channel slk-netq-events
        
    cumulus@switch:~$ netq show notification channel
    Matching config_notify records:
    Name            Type             Severity Channel Info
    --------------- ---------------- -------- ----------------------
    slk-netq-events slack            info     webhook:https://hooks.s
                                              lack.com/services/text/
                                              moretext/evenmoretext
    

    For Syslog:

    Create the channel using the syslog server hostname (or IP address) and port. Verify the configuration.

    cumulus@switch:~$ netq add notification channel syslog syslog-netq-events hostname syslog-server port 514
    Successfully added/updated channel syslog-netq-events
        
    cumulus@switch:~$ netq show notification channel
    Matching config_notify records:
    Name            Type             Severity Channel Info
    --------------- ---------------- -------- ----------------------
    syslog-netq-eve syslog           info     host:syslog-server
    nts                                       port: 514
    

    Create a Rule

    Create a rule that accepts all interface events. Then verify the configuration.

    cumulus@switch:~$ netq add notification rule all-ifs key ifname value ALL
    Successfully added/updated rule all-ifs
    
    cumulus@switch:~$ netq show notification rule
    Matching config_notify records:
    Name            Rule Key         Rule Value
    --------------- ---------------- --------------------
    all-ifs         ifname           ALL
    

    Create a Filter

    Create a filter to tie the rule to the channel. Verify the configuration.

    For PagerDuty:

    cumulus@switch:~$ netq add notification filter notify-all-ifs rule all-ifs channel pd-netq-events
    Successfully added/updated filter notify-all-ifs
    
    cumulus@switch:~$ netq show notification filter
    Matching config_notify records:
    Name            Order      Severity         Channels         Rules
    --------------- ---------- ---------------- ---------------- ----------
    notify-all-ifs  1          info             pd-netq-events   all-ifs
    

    For Slack:

    cumulus@switch:~$ netq add notification filter notify-all-ifs rule all-ifs channel slk-netq-events
    Successfully added/updated filter notify-all-ifs
    
    cumulus@switch:~$ netq show notification filter
    Matching config_notify records:
    Name            Order      Severity         Channels         Rules
    --------------- ---------- ---------------- ---------------- ----------
    notify-all-ifs  1          info             slk-netq-events  all-ifs
    

    For Syslog:

    cumulus@switch:~$ netq add notification filter notify-all-ifs rule all-ifs channel syslog-netq-events
    Successfully added/updated filter notify-all-ifs
    
    cumulus@switch:~$ netq show notification filter
    Matching config_notify records:
    Name            Order      Severity         Channels         Rules
    --------------- ---------- ---------------- ---------------- ----------
    notify-all-ifs  1          info             syslog-netq-events all-ifs
    

    NetQ is now configured to send all interface events to your selected channel.

    Configure Advanced NetQ Event Notifications

    If you want to create more granular notifications based on such items as selected devices, characteristics of devices, or protocols, or you want to use a proxy server, you need more than the basic notification configuration. Details for creating these more complex notification configurations are included here.

    Configure a Proxy Server

    To send notification messages through a proxy server instead of directly to a notification channel, you configure NetQ with the hostname, and optionally a port, of a proxy server. If no port is specified, NetQ defaults to port 80. Only one proxy server is currently supported. To simplify deployment, configure your proxy server before configuring channels, rules, or filters. To configure the proxy server:

    cumulus@switch:~$ netq add notification proxy <text-proxy-hostname> [port <text-proxy-port>]
    cumulus@switch:~$ netq add notification proxy proxy4
    Successfully configured notifier proxy proxy4:80
    

    You can view the proxy server settings by running the netq show notification proxy command.

    cumulus@switch:~$ netq show notification proxy
    Matching config_notify records:
    Proxy URL          Slack Enabled              PagerDuty Enabled
    ------------------ -------------------------- ----------------------------------
    proxy4:80          yes                        yes
    

    You can remove the proxy server by running the netq del notification proxy command. This changes the NetQ behavior to send events directly to the notification channels.

    cumulus@switch:~$ netq del notification proxy
    Successfully overwrote notifier proxy to null
    

    Create Channels

    Create one or more PagerDuty, Slack, or syslog channels to present the notifications.

    Configure a PagerDuty Channel

    NetQ sends notifications to PagerDuty as PagerDuty events.

    To configure the NetQ notifier to send notifications to PagerDuty:

    1. Configure the following options using the netq add notification channel command:

      Option                                    Description
      CHANNEL_TYPE <text-channel-name>          The third-party notification channel and name; use pagerduty in this case.
      integration-key <text-integration-key>    The integration key, also called the service_key or routing_key. The default is an empty string ("").
      severity                                  (Optional) The log level to set: one of info, warning, error, critical, or debug. The default is info.

      cumulus@switch:~$ netq add notification channel pagerduty pd-netq-events integration-key c6d666e210a8425298ef7abde0d1998
      Successfully added/updated channel pd-netq-events
      
    2. Verify that the channel is configured properly.

      cumulus@switch:~$ netq show notification channel
      Matching config_notify records:
      Name            Type             Severity         Channel Info
      --------------- ---------------- ---------------- ------------------------
      pd-netq-events  pagerduty        info             integration-key: c6d666e
                                                        210a8425298ef7abde0d1998
      

    Configure a Slack Channel

    NetQ Notifier sends notifications to Slack as incoming webhooks for a Slack channel you configure.

    To configure NetQ to send notifications to Slack:

    1. If needed, create one or more Slack channels on which to receive the notifications.

      1. Click + next to Channels.
      2. Enter a name for the channel, and click Create Channel.
      3. Navigate to the new channel.
      4. Click + Add an app link below the channel name to open the application directory.
      5. In the search box, start typing incoming and select Incoming WebHooks when it appears.
      6. Click Add Configuration and enter the name of the channel you created (where you want to post notifications).
      7. Click Add Incoming WebHooks integration.
      8. Save the WebHook URL in a text file for use in the next step.
    2. Configure the following options in the netq add notification channel command:

      Option                             Description
      CHANNEL_TYPE <text-channel-name>   The third-party notification channel name; use slack in this case.
      WEBHOOK                            Copy the WebHook URL from the text file. Alternatively, in the desired channel, locate the initial message indicating the addition of the webhook, click the incoming-webhook link, and then click Settings. Example URL: https://hooks.slack.com/services/text/moretext/evenmoretext
      severity                           The log level to set: one of error, warning, info, or debug. The default is info.
      tag                                Optional tag appended to the Slack notification to highlight particular channels or people. The tag value must be preceded by the @ sign; for example, @netq-info. (A tagged example appears at the end of this section.)

      cumulus@switch:~$ netq add notification channel slack slk-netq-events webhook https://hooks.slack.com/services/text/moretext/evenmoretext
      Successfully added/updated channel slk-netq-events
      
    3. Verify the channel is configured correctly from the CLI:

      cumulus@switch:~$ netq show notification channel
      Matching config_notify records:
      Name            Type             Severity Channel Info
      --------------- ---------------- -------- ----------------------
      slk-netq-events slack            info     webhook:https://hooks.s
                                                lack.com/services/text/
                                                moretext/evenmoretext
      

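    To set the optional severity and tag when creating the Slack channel, append them to the same command. A sketch using the example webhook above and the @netq-info tag from the options table:

    cumulus@switch:~$ netq add notification channel slack slk-netq-events webhook https://hooks.slack.com/services/text/moretext/evenmoretext severity warning tag @netq-info
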
    Create Rules

    Each rule is composed of a single key-value pair. The key-value pair indicates what messages to include or drop from event information sent to a notification channel. You can create more than one rule for a single filter, and multiple rules for a given filter make the filter more specific. For example, you can specify rules around hostnames or interface names, enabling you to filter messages specific to those hosts or interfaces. You should have already defined the PagerDuty, Slack, or syslog channels (as described earlier).

    There is a fixed set of valid rule keys. Values are entered as regular expressions and vary according to your deployment.

    BGP
    • message_type: Network protocol or service identifier. Example value: bgp
    • hostname: User-defined, text-based name for a switch or host. Example values: server02, leaf11, exit01, spine-4
    • peer: User-defined, text-based name for a peer switch or host. Example values: server4, leaf-3, exit02, spine06
    • desc: Text description
    • vrf: Name of VRF interface. Example values: mgmt, default
    • old_state: Previous state of the BGP service. Example values: Established, NotEstd
    • new_state: Current state of the BGP service. Example values: Established, NotEstd
    • old_last_reset_time: Previous time that the BGP service was reset. Example value: Apr3, 2019, 4:17 pm
    • new_last_reset_time: Most recent time that the BGP service was reset. Example value: Apr8, 2019, 11:38 am

    MLAG (CLAG)
    • message_type: Network protocol or service identifier. Example value: clag
    • hostname: User-defined, text-based name for a switch or host. Example values: server02, leaf-9, exit01, spine04
    • old_conflicted_bonds: Previous pair of interfaces in a conflicted bond. Example values: swp7 swp8, swp3 swp4
    • new_conflicted_bonds: Current pair of interfaces in a conflicted bond. Example values: swp11 swp12, swp23 swp24
    • old_state_protodownbond: Previous state of the bond. Example values: protodown, up
    • new_state_protodownbond: Current state of the bond. Example values: protodown, up

    ConfigDiff
    • message_type: Network protocol or service identifier. Example value: configdiff
    • hostname: User-defined, text-based name for a switch or host. Example values: server02, leaf11, exit01, spine-4
    • vni: Virtual Network Instance identifier. Example values: 12, 23
    • old_state: Previous state of the configuration file. Example values: created, modified
    • new_state: Current state of the configuration file. Example values: created, modified

    EVPN
    • message_type: Network protocol or service identifier. Example value: evpn
    • hostname: User-defined, text-based name for a switch or host. Example values: server02, leaf-9, exit01, spine04
    • vni: Virtual Network Instance identifier. Example values: 12, 23
    • old_in_kernel_state: Previous VNI state, in kernel or not. Example values: true, false
    • new_in_kernel_state: Current VNI state, in kernel or not. Example values: true, false
    • old_adv_all_vni_state: Previous VNI advertising state, advertising all or not. Example values: true, false
    • new_adv_all_vni_state: Current VNI advertising state, advertising all or not. Example values: true, false

    Link
    • message_type: Network protocol or service identifier. Example value: link
    • hostname: User-defined, text-based name for a switch or host. Example values: server02, leaf-6, exit01, spine7
    • ifname: Software interface name. Example values: eth0, swp53

    LLDP
    • message_type: Network protocol or service identifier. Example value: lldp
    • hostname: User-defined, text-based name for a switch or host. Example values: server02, leaf41, exit01, spine-5, tor-36
    • ifname: Software interface name. Example values: eth1, swp12
    • old_peer_ifname: Previous software interface name. Example values: eth1, swp12, swp27
    • new_peer_ifname: Current software interface name. Example values: eth1, swp12, swp27
    • old_peer_hostname: Previous user-defined, text-based name for a peer switch or host. Example values: server02, leaf41, exit01, spine-5, tor-36
    • new_peer_hostname: Current user-defined, text-based name for a peer switch or host. Example values: server02, leaf41, exit01, spine-5, tor-36

    Node
    • message_type: Network protocol or service identifier. Example value: node
    • hostname: User-defined, text-based name for a switch or host. Example values: server02, leaf41, exit01, spine-5, tor-36
    • ntp_state: Current state of NTP service. Example values: in sync, not sync
    • db_state: Current state of DB. Example values: Add, Update, Del, Dead

    NTP
    • message_type: Network protocol or service identifier. Example value: ntp
    • hostname: User-defined, text-based name for a switch or host. Example values: server02, leaf-9, exit01, spine04
    • old_state: Previous state of service. Example values: in sync, not sync
    • new_state: Current state of service. Example values: in sync, not sync

    Port
    • message_type: Network protocol or service identifier. Example value: port
    • hostname: User-defined, text-based name for a switch or host. Example values: server02, leaf13, exit01, spine-8, tor-36
    • ifname: Interface name. Example values: eth0, swp14
    • old_speed: Previous speed rating of port. Example values: 10 G, 25 G, 40 G, unknown
    • old_transreceiver: Previous transceiver. Example values: 40G Base-CR4, 25G Base-CR
    • old_vendor_name: Previous vendor name of installed port module. Example values: Amphenol, OEM, Mellanox, Fiberstore, Finisar
    • old_serial_number: Previous serial number of installed port module. Example values: MT1507VS05177, AVE1823402U, PTN1VH2
    • old_supported_fec: Previous forward error correction (FEC) support status. Example values: none, Base R, RS
    • old_advertised_fec: Previous FEC advertising state. Example values: true, false, not reported
    • old_fec: Previous FEC capability. Example value: none
    • old_autoneg: Previous activation state of auto-negotiation. Example values: on, off
    • new_speed: Current speed rating of port. Example values: 10 G, 25 G, 40 G
    • new_transreceiver: Current transceiver. Example values: 40G Base-CR4, 25G Base-CR
    • new_vendor_name: Current vendor name of installed port module. Example values: Amphenol, OEM, Mellanox, Fiberstore, Finisar
    • new_part_number: Current part number of installed port module. Example values: SFP-H10GB-CU1M, MC3309130-001, 603020003
    • new_serial_number: Current serial number of installed port module. Example values: MT1507VS05177, AVE1823402U, PTN1VH2
    • new_supported_fec: Current FEC support status. Example values: none, Base R, RS
    • new_advertised_fec: Current FEC advertising state. Example values: true, false
    • new_fec: Current FEC capability. Example value: none
    • new_autoneg: Current activation state of auto-negotiation. Example values: on, off

    Sensors
    • sensor: Network protocol or service identifier. Example values: Fan: fan1, fan-2; Power Supply Unit: psu1, psu2; Temperature: psu1temp1, temp2
    • hostname: User-defined, text-based name for a switch or host. Example values: server02, leaf-26, exit01, spine2-4
    • old_state: Previous state of a fan, power supply unit, or thermal sensor. Example values: Fan: ok, absent, bad; PSU: ok, absent, bad; Temp: ok, busted, bad, critical
    • new_state: Current state of a fan, power supply unit, or thermal sensor. Example values: Fan: ok, absent, bad; PSU: ok, absent, bad; Temp: ok, busted, bad, critical
    • old_s_state: Previous state of a fan or power supply unit. Example values: Fan: up, down; PSU: up, down
    • new_s_state: Current state of a fan or power supply unit. Example values: Fan: up, down; PSU: up, down
    • new_s_max: Current maximum temperature threshold value. Example value: Temp: 110
    • new_s_crit: Current critical high temperature threshold value. Example value: Temp: 85
    • new_s_lcrit: Current critical low temperature threshold value. Example value: Temp: -25
    • new_s_min: Current minimum temperature threshold value. Example value: Temp: -50

    Services
    • message_type: Network protocol or service identifier. Example value: services
    • hostname: User-defined, text-based name for a switch or host. Example values: server02, leaf03, exit01, spine-8
    • name: Name of service. Example values: clagd, lldpd, ssh, ntp, netqd, net-agent
    • old_pid: Previous process or service identifier. Example values: 12323, 52941
    • new_pid: Current process or service identifier. Example values: 12323, 52941
    • old_status: Previous status of service. Example values: up, down
    • new_status: Current status of service. Example values: up, down

    Rule names are case sensitive, and wildcards are not permitted. Rule names may contain spaces, but if they do, you must enclose them in single quotes in commands. For readability, it is easier to use dashes or mixed case instead of spaces; for example, use bgpSessionChanges, BGP-session-changes, or BGPsessions instead of 'BGP Session Changes'. Use Tab completion to view the command options syntax.

    Example Rules

    Create a BGP Rule Based on Hostname:

    cumulus@switch:~$ netq add notification rule bgpHostname key hostname value spine-01
    Successfully added/updated rule bgpHostname 
    

    Create a Rule Based on a Configuration File State Change:

    cumulus@switch:~$ netq add notification rule sysconf key configdiff value updated
    Successfully added/updated rule sysconf
    

    Create an EVPN Rule Based on a VNI:

    cumulus@switch:~$ netq add notification rule evpnVni key vni value 42
    Successfully added/updated rule evpnVni
    

    Create an Interface Rule Based on FEC Support:

    cumulus@switch:~$ netq add notification rule fecSupport key new_supported_fec value supported
    Successfully added/updated rule fecSupport
    

    Create a Service Rule Based on a Status Change:

    cumulus@switch:~$ netq add notification rule svcStatus key new_status value down
    Successfully added/updated rule svcStatus
    

    Create a Sensor Rule Based on a Threshold:

    cumulus@switch:~$ netq add notification rule overTemp key new_s_crit value 24
    Successfully added/updated rule overTemp
    

    Create an Interface Rule Based on Port:

    cumulus@switch:~$ netq add notification rule swp52 key port value swp52
    Successfully added/updated rule swp52 
    
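
    Create a Rule that Matches Multiple Hosts with a Regular Expression:

    Rule values are interpreted as regular expressions, so one rule can cover a set of devices. This sketch assumes hypothetical spine switches named spine-01, spine-02, and so on:

    cumulus@switch:~$ netq add notification rule allSpines key hostname value spine.*
    Successfully added/updated rule allSpines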

    View the Rule Configurations

    Use the netq show notification rule command to view the rules on your platform.

    cumulus@switch:~$ netq show notification rule
     
    Matching config_notify records:
    Name            Rule Key         Rule Value
    --------------- ---------------- --------------------
    bgpHostname     hostname         spine-01
    evpnVni         vni              42
    fecSupport      new_supported_fe supported
                    c
    overTemp        new_s_crit       24
    svcStatus       new_status       down
    swp52           port             swp52
    sysconf         configdiff       updated
    

    Create Filters

    You can limit or direct event messages using filters. Filters are created based on rules you define, like those in the previous section. Each filter contains one or more rules. When a message matches a rule, it is sent to the indicated destination. Before you can create filters, you must have already defined the rules and configured PagerDuty, Slack, or syslog channels (as described earlier).

    As filters are created, they are added to the bottom of a filter list. By default, filters are processed in the order they appear in this list (from top to bottom) until a match is found. Each event message is evaluated against the first filter in the list; if it matches, the message is processed, all remaining filters are ignored, and the system moves on to the next event message received. If it does not match, the message is tested against the second filter, and so on. Events that do not match any filter are ignored.

    You may need to change the order of filters in the list to ensure you capture the events you want and drop the events you do not want. This is possible using the before or after keywords to ensure one rule is processed before or after another.

    Filter names are case sensitive and may contain spaces, but if they do, you must enclose them in single quotes in commands. For readability, it is easier to use dashes or mixed case instead of spaces; for example, use bgpSessionChanges, BGP-session-changes, or BGPsessions instead of 'BGP Session Changes'.

    Example Filters

    Create a Filter for BGP Events on a Particular Device:

    cumulus@switch:~$ netq add notification filter bgpSpine rule bgpHostname channel pd-netq-events
    Successfully added/updated filter bgpSpine
    

    Create a Filter for a Given VNI in Your EVPN Overlay:

    cumulus@switch:~$ netq add notification filter vni42 severity warning rule evpnVni channel pd-netq-events
    Successfully added/updated filter vni42
    

    Create a Filter for when a Configuration File has been Updated:

    cumulus@switch:~$ netq add notification filter configChange severity info rule sysconf channel slk-netq-events
    Successfully added/updated filter configChange
    

    Create a Filter to Monitor Ports with FEC Support:

    cumulus@switch:~$ netq add notification filter newFEC rule fecSupport channel slk-netq-events
    Successfully added/updated filter newFEC
    

    Create a Filter to Monitor for Services that Change to a Down State:

    cumulus@switch:~$ netq add notification filter svcDown severity error rule svcStatus channel slk-netq-events
    Successfully added/updated filter svcDown
    

    Create a Filter to Monitor Overheating Platforms:

    cumulus@switch:~$ netq add notification filter critTemp severity error rule overTemp channel pd-netq-events
    Successfully added/updated filter critTemp
    

    Create a Filter to Drop Messages from a Given Interface, and Match It Before Any Other Filters:

    To create a drop-style filter, do not specify a channel. To put the filter first in the list, use the before option.

    cumulus@switch:~$ netq add notification filter swp52Drop severity error rule swp52 before bgpSpine
    Successfully added/updated filter swp52Drop
    

    View the Filter Configurations

    Use the netq show notification filter command to view the filters on your platform.

    cumulus@switch:~$ netq show notification filter
    Matching config_notify records:
    Name            Order      Severity         Channels         Rules
    --------------- ---------- ---------------- ---------------- ----------
    swp52Drop       1          error            NetqDefaultChann swp52
                                                el
    bgpSpine        2          info             pd-netq-events   bgpHostnam
                                                                 e
    vni42           3          warning          pd-netq-events   evpnVni
    configChange    4          info             slk-netq-events  sysconf
    newFEC          5          info             slk-netq-events  fecSupport
    svcDown         6          critical         slk-netq-events  svcStatus
    critTemp        7          critical         pd-netq-events   overTemp
    

    Reorder Filters

    Looking at the results of the netq show notification filter command above, you might notice that although the drop-based filter is listed first (good, since there is no point in evaluating an event you are going to drop anyway), the critical severity events are processed last under the current definitions. If you want to process those before lesser severity events, you can reorder the list using the before and after options.

    For example, to put the two critical severity event filters just below the drop filter:

    cumulus@switch:~$ netq add notification filter critTemp after swp52Drop
    Successfully added/updated filter critTemp
    cumulus@switch:~$ netq add notification filter svcDown before bgpSpine
    Successfully added/updated filter svcDown
    

    You do not need to reenter all the severity, channel, and rule information for existing rules if you only want to change their processing order.

    Run the netq show notification command again to verify the changes:

    cumulus@switch:~$ netq show notification filter
    Matching config_notify records:
    Name            Order      Severity         Channels         Rules
    --------------- ---------- ---------------- ---------------- ----------
    swp52Drop       1          error            NetqDefaultChann swp52
                                                el
    critTemp        2          critical         pd-netq-events   overTemp
    svcDown         3          critical         slk-netq-events  svcStatus
    bgpSpine        4          info             pd-netq-events   bgpHostnam
                                                                 e
    vni42           5          warning          pd-netq-events   evpnVni
    configChange    6          info             slk-netq-events  sysconf
    newFEC          7          info             slk-netq-events  fecSupport
    

    Examples of Advanced Notification Configurations

    Putting all of these channel, rule, and filter definitions together, you create a complete notification configuration. The following example notification configurations were created using the three-step process outlined above. Refer to Integrate NetQ with an Event Notification Application for details and instructions for creating channels, rules, and filters.

    Create a Notification for BGP Events from a Selected Switch

    In this example, we created a notification integration with a PagerDuty channel called pd-netq-events. We then created a rule bgpHostname and a filter called bgpSpine for any notifications from spine-01. The result is that any info severity event messages from spine-01 are filtered to the pd-netq-events channel.

    cumulus@switch:~$ netq add notification channel pagerduty pd-netq-events integration-key 1234567890
    Successfully added/updated channel pd-netq-events
    cumulus@switch:~$ netq add notification rule bgpHostname key hostname value spine-01
    Successfully added/updated rule bgpHostname
     
    cumulus@switch:~$ netq add notification filter bgpSpine rule bgpHostname channel pd-netq-events
    Successfully added/updated filter bgpSpine
    cumulus@switch:~$ netq show notification channel
    Matching config_notify records:
    Name            Type             Severity         Channel Info
    --------------- ---------------- ---------------- ------------------------
    pd-netq-events  pagerduty        info             integration-key: 1234567
                                                      890   
    
    cumulus@switch:~$ netq show notification rule
    Matching config_notify records:
    Name            Rule Key         Rule Value
    --------------- ---------------- --------------------
    bgpHostname     hostname         spine-01
     
    cumulus@switch:~$ netq show notification filter
    Matching config_notify records:
    Name            Order      Severity         Channels         Rules
    --------------- ---------- ---------------- ---------------- ----------
    bgpSpine        1          info             pd-netq-events   bgpHostnam
                                                                 e
    

    Create a Notification for Warnings on a Given EVPN VNI

    In this example, we created a notification integration with a PagerDuty channel called pd-netq-events. We then created a rule evpnVni and a filter called vni42 for any warning messages from VNI 42 on the EVPN overlay network. The result is that any warning severity event messages from VNI 42 are filtered to the pd-netq-events channel.

    cumulus@switch:~$ netq add notification channel pagerduty pd-netq-events integration-key 1234567890
    Successfully added/updated channel pd-netq-events
     
    cumulus@switch:~$ netq add notification rule evpnVni key vni value 42
    Successfully added/updated rule evpnVni
     
    cumulus@switch:~$ netq add notification filter vni42 rule evpnVni channel pd-netq-events
    Successfully added/updated filter vni42
     
    cumulus@switch:~$ netq show notification channel
    Matching config_notify records:
    Name            Type             Severity         Channel Info
    --------------- ---------------- ---------------- ------------------------
    pd-netq-events  pagerduty        info             integration-key: 1234567
                                                      890   
    
    cumulus@switch:~$ netq show notification rule
    Matching config_notify records:
    Name            Rule Key         Rule Value
    --------------- ---------------- --------------------
    bgpHostname     hostname         spine-01
    evpnVni         vni              42
     
    cumulus@switch:~$ netq show notification filter
    Matching config_notify records:
    Name            Order      Severity         Channels         Rules
    --------------- ---------- ---------------- ---------------- ----------
    bgpSpine        1          info             pd-netq-events   bgpHostnam
                                                                 e
    vni42           2          warning          pd-netq-events   evpnVni
    

    Create a Notification for Configuration File Changes

    In this example, we created a notification integration with a Slack channel called slk-netq-events. We then created a rule sysconf and a filter called configChange for any configuration file update messages. The result is that any configuration update messages are filtered to the slk-netq-events channel.

    cumulus@switch:~$ netq add notification channel slack slk-netq-events webhook https://hooks.slack.com/services/text/moretext/evenmoretext
    Successfully added/updated channel slk-netq-events
     
    cumulus@switch:~$ netq add notification rule sysconf key configdiff value updated
    Successfully added/updated rule sysconf
     
    cumulus@switch:~$ netq add notification filter configChange severity info rule sysconf channel slk-netq-events
    Successfully added/updated filter configChange
     
    cumulus@switch:~$ netq show notification channel
    Matching config_notify records:
    Name            Type             Severity Channel Info
    --------------- ---------------- -------- ----------------------
    slk-netq-events slack            info     webhook:https://hooks.s
                                              lack.com/services/text/
                                              moretext/evenmoretext     
     
    cumulus@switch:~$ netq show notification rule
    Matching config_notify records:
    Name            Rule Key         Rule Value
    --------------- ---------------- --------------------
    bgpHostname     hostname         spine-01
    evpnVni         vni              42
    sysconf         configdiff       updated
    
    cumulus@switch:~$ netq show notification filter
    Matching config_notify records:
    Name            Order      Severity         Channels         Rules
    --------------- ---------- ---------------- ---------------- ----------
    bgpSpine        1          info             pd-netq-events   bgpHostnam
                                                                 e
    vni42           2          warning          pd-netq-events   evpnVni
    configChange    3          info             slk-netq-events  sysconf
    

    Create a Notification for When a Service Goes Down

    In this example, we created a notification integration with a Slack channel called slk-netq-events. We then created a rule svcStatus and a filter called svcDown for any service state messages indicating a service is no longer operational. The result is that any service down messages are filtered to the slk-netq-events channel.

    cumulus@switch:~$ netq add notification channel slack slk-netq-events webhook https://hooks.slack.com/services/text/moretext/evenmoretext
    Successfully added/updated channel slk-netq-events
     
    cumulus@switch:~$ netq add notification rule svcStatus key new_status value down
    Successfully added/updated rule svcStatus
     
    cumulus@switch:~$ netq add notification filter svcDown severity error rule svcStatus channel slk-netq-events
    Successfully added/updated filter svcDown
     
    cumulus@switch:~$ netq show notification channel
    Matching config_notify records:
    Name            Type             Severity Channel Info
    --------------- ---------------- -------- ----------------------
    slk-netq-events slack            info     webhook:https://hooks.s
                                              lack.com/services/text/
                                              moretext/evenmoretext     
     
    cumulus@switch:~$ netq show notification rule
    Matching config_notify records:
    Name            Rule Key         Rule Value
    --------------- ---------------- --------------------
    bgpHostname     hostname         spine-01
    evpnVni         vni              42
    svcStatus       new_status       down
    sysconf         configdiff       updated
    
    cumulus@switch:~$ netq show notification filter
    Matching config_notify records:
    Name            Order      Severity         Channels         Rules
    --------------- ---------- ---------------- ---------------- ----------
    bgpSpine        1          info             pd-netq-events   bgpHostnam
                                                                 e
    vni42           2          warning          pd-netq-events   evpnVni
    configChange    3          info             slk-netq-events  sysconf
    svcDown         4          critical         slk-netq-events  svcStatus
    

    Create a Filter to Drop Notifications from a Given Interface

    In this example, we created a notification integration with a Slack channel called slk-netq-events. We then created a rule swp52 and a filter called swp52Drop that drops all notifications for events from interface swp52.

    cumulus@switch:~$ netq add notification channel slack slk-netq-events webhook https://hooks.slack.com/services/text/moretext/evenmoretext
    Successfully added/updated channel slk-netq-events
     
    cumulus@switch:~$ netq add notification rule swp52 key port value swp52
    Successfully added/updated rule swp52
     
    cumulus@switch:~$ netq add notification filter swp52Drop severity error rule swp52 before bgpSpine
    Successfully added/updated filter swp52Drop
     
    cumulus@switch:~$ netq show notification channel
    Matching config_notify records:
    Name            Type             Severity Channel Info
    --------------- ---------------- -------- ----------------------
    slk-netq-events slack            info     webhook:https://hooks.s
                                              lack.com/services/text/
                                              moretext/evenmoretext     
     
    cumulus@switch:~$ netq show notification rule
    Matching config_notify records:
    Name            Rule Key         Rule Value
    --------------- ---------------- --------------------
    bgpHostname     hostname         spine-01
    evpnVni         vni              42
    svcStatus       new_status       down
    swp52           port             swp52
    sysconf         configdiff       updated
    
    cumulus@switch:~$ netq show notification filter
    Matching config_notify records:
    Name            Order      Severity         Channels         Rules
    --------------- ---------- ---------------- ---------------- ----------
    swp52Drop       1          error            NetqDefaultChann swp52
                                                el
    bgpSpine        2          info             pd-netq-events   bgpHostnam
                                                                 e
    vni42           3          warning          pd-netq-events   evpnVni
    configChange    4          info             slk-netq-events  sysconf
    svcDown         5          critical         slk-netq-events  svcStatus
    

    Create a Notification for a Given Device that has a Tendency to Overheat (using multiple rules)

    In this example, we created a notification for when switch leaf04 crosses its high temperature threshold. Two rules were needed to create this notification: one to identify the specific device and one to identify the temperature trigger. We sent the messages to the pd-netq-events channel.

    cumulus@switch:~$ netq add notification channel pagerduty pd-netq-events integration-key 1234567890
    Successfully added/updated channel pd-netq-events
     
    cumulus@switch:~$ netq add notification rule switchLeaf04 key hostname value leaf04
    Successfully added/updated rule switchLeaf04
    cumulus@switch:~$ netq add notification rule overTemp key new_s_crit value 24
    Successfully added/updated rule overTemp
     
    cumulus@switch:~$ netq add notification filter critTemp rule switchLeaf04 channel pd-netq-events
    Successfully added/updated filter critTemp
    cumulus@switch:~$ netq add notification filter critTemp severity critical rule overTemp channel pd-netq-events
    Successfully added/updated filter critTemp
     
    cumulus@switch:~$ netq show notification channel
    Matching config_notify records:
    Name            Type             Severity         Channel Info
    --------------- ---------------- ---------------- ------------------------
    pd-netq-events  pagerduty        info             integration-key: 1234567
                                                      890
    
    cumulus@switch:~$ netq show notification rule
    Matching config_notify records:
    Name            Rule Key         Rule Value
    --------------- ---------------- --------------------
    bgpHostname     hostname         spine-01
    evpnVni         vni              42
    overTemp        new_s_crit       24
    svcStatus       new_status       down
    switchLeaf04    hostname         leaf04
    swp52           port             swp52
    sysconf         configdiff       updated
    cumulus@switch:~$ netq show notification filter
    Matching config_notify records:
    Name            Order      Severity         Channels         Rules
    --------------- ---------- ---------------- ---------------- ----------
    swp52Drop       1          error            NetqDefaultChann swp52
                                                el
    bgpSpine        2          info             pd-netq-events   bgpHostnam
                                                                 e
    vni42           3          warning          pd-netq-events   evpnVni
    configChange    4          info             slk-netq-events  sysconf
    svcDown         5          critical         slk-netq-events  svcStatus
    critTemp        6          critical         pd-netq-events   switchLeaf
                                                                 04
                                                                 overTemp                                                
    

    View Notification Configurations in JSON Format

    You can view configured integrations using the netq show notification commands. To view the channels, filters, and rules, run the three flavors of the command. Include the json option to display JSON-formatted output.

    For example:

    cumulus@switch:~$ netq show notification channel json
    {
        "config_notify":[
            {
                "type":"slack",
                "name":"slk-netq-events",
                "channelInfo":"webhook:https://hooks.slack.com/services/text/moretext/evenmoretext",
                "severity":"info"
            },
            {
                "type":"pagerduty",
                "name":"pd-netq-events",
                "channelInfo":"integration-key: 1234567890",
                "severity":"info"
            }
        ],
        "truncatedResult":false
    }
     
    cumulus@switch:~$ netq show notification rule json
    {
        "config_notify":[
            {
                "ruleKey":"hostname",
                "ruleValue":"spine-01",
                "name":"bgpHostname"
            },
            {
                "ruleKey":"vni",
                "ruleValue":42,
                "name":"evpnVni"
            },
            {
                "ruleKey":"new_supported_fec",
                "ruleValue":"supported",
                "name":"fecSupport"
            },
            {
                "ruleKey":"new_s_crit",
                "ruleValue":24,
                "name":"overTemp"
            },
            {
                "ruleKey":"new_status",
                "ruleValue":"down",
                "name":"svcStatus"
            },
            {
                "ruleKey":"configdiff",
                "ruleValue":"updated",
                "name":"sysconf"
            }
        ],
        "truncatedResult":false
    }
     
    cumulus@switch:~$ netq show notification filter json
    {
        "config_notify":[
            {
                "channels":"pd-netq-events",
                "rules":"overTemp",
                "name":"1critTemp",
                "severity":"critical"
            },
            {
                "channels":"pd-netq-events",
                "rules":"evpnVni",
                "name":"3vni42",
                "severity":"warning"
            },
            {
                "channels":"pd-netq-events",
                "rules":"bgpHostname",
                "name":"4bgpSpine",
                "severity":"info"
            },
            {
                "channels":"slk-netq-events",
                "rules":"sysconf",
                "name":"configChange",
                "severity":"info"
            },
            {
                "channels":"slk-netq-events",
                "rules":"fecSupport",
                "name":"newFEC",
                "severity":"info"
            },
            {
                "channels":"slk-netq-events",
                "rules":"svcStatus",
                "name":"svcDown",
                "severity":"critical"
            }
        ],
        "truncatedResult":false
    }
    

    Manage NetQ Event Notification Integrations

    You might need to modify event notification configurations at some point in the lifecycle of your deployment.

    Remove an Event Notification Channel

    You can delete an event notification channel using the netq del notification channel command. You can verify it has been removed using the related show command.

    For example, to remove a Slack integration and verify it is no longer in the configuration:

    cumulus@switch:~$ netq del notification channel slk-netq-events
    cumulus@switch:~$ netq show notification channel
    Matching config_notify records:
    Name            Type             Severity         Channel Info
    --------------- ---------------- ---------------- ------------------------
    pd-netq-events  pagerduty        info             integration-key: 1234567
                                                      890
    

    Delete an Event Notification Rule

    To delete a rule, use the following command, then verify it has been removed:

    cumulus@switch:~$ netq del notification rule swp52
    cumulus@switch:~$ netq show notification rule
    Matching config_notify records:
    Name            Rule Key         Rule Value
    --------------- ---------------- --------------------
    bgpHostname     hostname         spine-01
    evpnVni         vni              42
    overTemp        new_s_crit       24
    svcStatus       new_status       down
    switchLeaf04    hostname         leaf04
    sysconf         configdiff       updated
    

    Delete an Event Notification Filter

    To delete a filter, use the following command, then verify it has been removed:

    cumulus@switch:~$ netq del notification filter bgpSpine
    cumulus@switch:~$ netq show notification filter
    Matching config_notify records:
    Name            Order      Severity         Channels         Rules
    --------------- ---------- ---------------- ---------------- ----------
    swp52Drop       1          error            NetqDefaultChann swp52
                                                el
    vni42           2          warning          pd-netq-events   evpnVni
    configChange    3          info             slk-netq-events  sysconf
    svcDown         4          critical         slk-netq-events  svcStatus
    critTemp        5          critical         pd-netq-events   switchLeaf
                                                                 04
                                                                 overTemp
    

    Configure Threshold-based Event Notifications

    NetQ supports a set of events that are triggered by crossing a user-defined threshold, called TCA events. These events allow detection and prevention of network failures for selected interface, utilization, sensor, forwarding, and ACL events.

    The simplest configuration you can create is one that sends a TCA event generated by all devices and all interfaces to a single notification application. Use the netq add tca command to configure the event. Its syntax is:

    netq add tca [event_id <text-event-id-anchor>]  [scope <text-scope-anchor>] [tca_id <text-tca-id-anchor>]  [severity info | severity critical] [is_active true | is_active false] [suppress_until <text-suppress-ts>] [threshold <text-threshold-value> ] [channel <text-channel-name-anchor> | channel drop <text-drop-channel-name>]
    

    A notification configuration must contain one rule. Each rule must contain a scope and a threshold. Optionally, you can specify an associated channel. Note: If a rule is not associated with a channel, the event information is only reachable from the database. If you want to deliver events to one or more notification channels (syslog, Slack, or PagerDuty), create them by following the instructions in Create Your Channel, and then return here to define your rule.
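
    For example, the following sketch uses the syntax above to raise a critical TCA event when any sensor temperature crosses 85°C and deliver it to the pd-netq-events channel created earlier. The scope value here is purely illustrative; scope syntax is described in the sections below.

    cumulus@switch:~$ netq add tca event_id TCA_SENSOR_TEMPERATURE_UPPER scope '*' severity critical threshold 85 channel pd-netq-events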

    Supported Events

    The following events are supported:

    Interface Statistics
    • TCA_RXBROADCAST_UPPER: rx_broadcast bytes per second on a given switch or host is greater than the maximum threshold
    • TCA_RXBYTES_UPPER: rx_bytes per second on a given switch or host is greater than the maximum threshold
    • TCA_RXMULTICAST_UPPER: rx_multicast per second on a given switch or host is greater than the maximum threshold
    • TCA_TXBROADCAST_UPPER: tx_broadcast bytes per second on a given switch or host is greater than the maximum threshold
    • TCA_TXBYTES_UPPER: tx_bytes per second on a given switch or host is greater than the maximum threshold
    • TCA_TXMULTICAST_UPPER: tx_multicast bytes per second on a given switch or host is greater than the maximum threshold

    Resource Utilization
    • TCA_CPU_UTILIZATION_UPPER: CPU utilization (%) on a given switch or host is greater than the maximum threshold
    • TCA_DISK_UTILIZATION_UPPER: Disk utilization (%) on a given switch or host is greater than the maximum threshold
    • TCA_MEMORY_UTILIZATION_UPPER: Memory utilization (%) on a given switch or host is greater than the maximum threshold

    Sensors
    • TCA_SENSOR_FAN_UPPER: Switch sensor reported fan speed on a given switch or host is greater than the maximum threshold
    • TCA_SENSOR_POWER_UPPER: Switch sensor reported power (Watts) on a given switch or host is greater than the maximum threshold
    • TCA_SENSOR_TEMPERATURE_UPPER: Switch sensor reported temperature (°C) on a given switch or host is greater than the maximum threshold
    • TCA_SENSOR_VOLTAGE_UPPER: Switch sensor reported voltage (Volts) on a given switch or host is greater than the maximum threshold

    Forwarding Resources
    • TCA_TCAM_TOTAL_ROUTE_ENTRIES_UPPER: Number of routes on a given switch or host is greater than the maximum threshold
    • TCA_TCAM_TOTAL_MCAST_ROUTES_UPPER: Number of multicast routes on a given switch or host is greater than the maximum threshold
    • TCA_TCAM_MAC_ENTRIES_UPPER: Number of MAC addresses on a given switch or host is greater than the maximum threshold
    • TCA_TCAM_IPV4_ROUTE_UPPER: Number of IPv4 routes on a given switch or host is greater than the maximum threshold
    • TCA_TCAM_IPV4_HOST_UPPER: Number of IPv4 hosts on a given switch or host is greater than the maximum threshold
    • TCA_TCAM_IPV6_ROUTE_UPPER: Number of IPv6 routes on a given switch or host is greater than the maximum threshold
    • TCA_TCAM_IPV6_HOST_UPPER: Number of IPv6 hosts on a given switch or host is greater than the maximum threshold
    • TCA_TCAM_ECMP_NEXTHOPS_UPPER: Number of equal cost multi-path (ECMP) next hop entries on a given switch or host is greater than the maximum threshold

    ACL Resources
    • TCA_TCAM_IN_ACL_V4_FILTER_UPPER: Number of ingress ACL filters for IPv4 addresses on a given switch or host is greater than the maximum threshold
    • TCA_TCAM_EG_ACL_V4_FILTER_UPPER: Number of egress ACL filters for IPv4 addresses on a given switch or host is greater than the maximum threshold
    • TCA_TCAM_IN_ACL_V4_MANGLE_UPPER: Number of ingress ACL mangles for IPv4 addresses on a given switch or host is greater than the maximum threshold
    • TCA_TCAM_EG_ACL_V4_MANGLE_UPPER: Number of egress ACL mangles for IPv4 addresses on a given switch or host is greater than the maximum threshold
    • TCA_TCAM_IN_ACL_V6_FILTER_UPPER: Number of ingress ACL filters for IPv6 addresses on a given switch or host is greater than the maximum threshold
    • TCA_TCAM_EG_ACL_V6_FILTER_UPPER: Number of egress ACL filters for IPv6 addresses on a given switch or host is greater than the maximum threshold
    • TCA_TCAM_IN_ACL_V6_MANGLE_UPPER: Number of ingress ACL mangles for IPv6 addresses on a given switch or host is greater than the maximum threshold
    • TCA_TCAM_EG_ACL_V6_MANGLE_UPPER: Number of egress ACL mangles for IPv6 addresses on a given switch or host is greater than the maximum threshold
    • TCA_TCAM_IN_ACL_8021x_FILTER_UPPER: Number of ingress ACL 802.1 filters on a given switch or host is greater than the maximum threshold
    • TCA_TCAM_ACL_L4_PORT_CHECKERS_UPPER: Number of ACL port range checkers on a given switch or host is greater than the maximum threshold
    • TCA_TCAM_ACL_REGIONS_UPPER: Number of ACL regions on a given switch or host is greater than the maximum threshold
    • TCA_TCAM_IN_ACL_MIRROR_UPPER: Number of ingress ACL mirrors on a given switch or host is greater than the maximum threshold
    • TCA_TCAM_ACL_18B_RULES_UPPER: Number of ACL 18B rules on a given switch or host is greater than the maximum threshold
    • TCA_TCAM_ACL_32B_RULES_UPPER: Number of ACL 32B rules on a given switch or host is greater than the maximum threshold
    • TCA_TCAM_ACL_54B_RULES_UPPER: Number of ACL 54B rules on a given switch or host is greater than the maximum threshold
    • TCA_TCAM_IN_PBR_V4_FILTER_UPPER: Number of ingress policy-based routing (PBR) filters for IPv4 addresses on a given switch or host is greater than the maximum threshold
    • TCA_TCAM_IN_PBR_V6_FILTER_UPPER: Number of ingress policy-based routing (PBR) filters for IPv6 addresses on a given switch or host is greater than the maximum threshold

    Define a Scope

    A scope is used to filter the events generated by a given rule. Scope values are set on a per-rule basis. All rules can be filtered on Hostname. Some rules can also be filtered by other parameters, as shown in this table. Note: Scope parameters must be entered in the order defined.

    Category | Event ID | Scope Parameters
    Interface Statistics | TCA_RXBROADCAST_UPPER | Hostname, Interface
    Interface Statistics | TCA_RXBYTES_UPPER | Hostname, Interface
    Interface Statistics | TCA_RXMULTICAST_UPPER | Hostname, Interface
    Interface Statistics | TCA_TXBROADCAST_UPPER | Hostname, Interface
    Interface Statistics | TCA_TXBYTES_UPPER | Hostname, Interface
    Interface Statistics | TCA_TXMULTICAST_UPPER | Hostname, Interface
    Resource Utilization | TCA_CPU_UTILIZATION_UPPER | Hostname
    Resource Utilization | TCA_DISK_UTILIZATION_UPPER | Hostname
    Resource Utilization | TCA_MEMORY_UTILIZATION_UPPER | Hostname
    Sensors | TCA_SENSOR_FAN_UPPER | Hostname, Sensor Name
    Sensors | TCA_SENSOR_POWER_UPPER | Hostname, Sensor Name
    Sensors | TCA_SENSOR_TEMPERATURE_UPPER | Hostname, Sensor Name
    Sensors | TCA_SENSOR_VOLTAGE_UPPER | Hostname, Sensor Name
    Forwarding Resources | TCA_TCAM_TOTAL_ROUTE_ENTRIES_UPPER | Hostname
    Forwarding Resources | TCA_TCAM_TOTAL_MCAST_ROUTES_UPPER | Hostname
    Forwarding Resources | TCA_TCAM_MAC_ENTRIES_UPPER | Hostname
    Forwarding Resources | TCA_TCAM_ECMP_NEXTHOPS_UPPER | Hostname
    Forwarding Resources | TCA_TCAM_IPV4_ROUTE_UPPER | Hostname
    Forwarding Resources | TCA_TCAM_IPV4_HOST_UPPER | Hostname
    Forwarding Resources | TCA_TCAM_IPV6_ROUTE_UPPER | Hostname
    Forwarding Resources | TCA_TCAM_IPV6_HOST_UPPER | Hostname
    ACL Resources | TCA_TCAM_IN_ACL_V4_FILTER_UPPER | Hostname
    ACL Resources | TCA_TCAM_EG_ACL_V4_FILTER_UPPER | Hostname
    ACL Resources | TCA_TCAM_IN_ACL_V4_MANGLE_UPPER | Hostname
    ACL Resources | TCA_TCAM_EG_ACL_V4_MANGLE_UPPER | Hostname
    ACL Resources | TCA_TCAM_IN_ACL_V6_FILTER_UPPER | Hostname
    ACL Resources | TCA_TCAM_EG_ACL_V6_FILTER_UPPER | Hostname
    ACL Resources | TCA_TCAM_IN_ACL_V6_MANGLE_UPPER | Hostname
    ACL Resources | TCA_TCAM_EG_ACL_V6_MANGLE_UPPER | Hostname
    ACL Resources | TCA_TCAM_IN_ACL_8021x_FILTER_UPPER | Hostname
    ACL Resources | TCA_TCAM_ACL_L4_PORT_CHECKERS_UPPER | Hostname
    ACL Resources | TCA_TCAM_ACL_REGIONS_UPPER | Hostname
    ACL Resources | TCA_TCAM_IN_ACL_MIRROR_UPPER | Hostname
    ACL Resources | TCA_TCAM_ACL_18B_RULES_UPPER | Hostname
    ACL Resources | TCA_TCAM_ACL_32B_RULES_UPPER | Hostname
    ACL Resources | TCA_TCAM_ACL_54B_RULES_UPPER | Hostname
    ACL Resources | TCA_TCAM_IN_PBR_V4_FILTER_UPPER | Hostname
    ACL Resources | TCA_TCAM_IN_PBR_V6_FILTER_UPPER | Hostname

    Scopes are defined with regular expressions, as follows. When two parameters are used, they are separated by a comma but no space. When an asterisk (*) is used alone, it must be enclosed in either single or double quotes; single quotes are used here.

    Parameters | Scope Value | Example | Result
    Hostname | <hostname> | leaf01 | Deliver events for the specified device
    Hostname | <partial-hostname>* | leaf* | Deliver events for devices with hostnames starting with specified text (leaf)
    Hostname | '*' | '*' | Deliver events for all devices
    Hostname, Interface | <hostname>,<interface> | leaf01,swp9 | Deliver events for the specified interface (swp9) on the specified device (leaf01)
    Hostname, Interface | <hostname>,'*' | leaf01,'*' | Deliver events for all interfaces on the specified device (leaf01)
    Hostname, Interface | '*',<interface> | '*',swp9 | Deliver events for the specified interface (swp9) on all devices
    Hostname, Interface | '*','*' | '*','*' | Deliver events for all devices and all interfaces
    Hostname, Interface | <partial-hostname>*,<interface> | leaf*,swp9 | Deliver events for the specified interface (swp9) on all devices with hostnames starting with the specified text (leaf)
    Hostname, Interface | <hostname>,<partial-interface>* | leaf01,swp* | Deliver events for all interfaces with names starting with the specified text (swp) on the specified device (leaf01)
    Hostname, Sensor Name | <hostname>,<sensorname> | leaf01,fan1 | Deliver events for the specified sensor (fan1) on the specified device (leaf01)
    Hostname, Sensor Name | '*',<sensorname> | '*',fan1 | Deliver events for the specified sensor (fan1) on all devices
    Hostname, Sensor Name | <hostname>,'*' | leaf01,'*' | Deliver events for all sensors on the specified device (leaf01)
    Hostname, Sensor Name | <partial-hostname>*,<sensorname> | leaf*,fan1 | Deliver events for the specified sensor (fan1) on all devices with hostnames starting with the specified text (leaf)
    Hostname, Sensor Name | <hostname>,<partial-sensorname>* | leaf01,fan* | Deliver events for all sensors with names starting with the specified text (fan) on the specified device (leaf01)
    Hostname, Sensor Name | '*','*' | '*','*' | Deliver events for all sensors on all devices
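
    Putting these patterns together, the following sketch (assuming a pre-configured channel named tca_slack_ifstats, as used in the examples below, and an illustrative threshold value) watches receive traffic on interface swp9 of every device whose hostname starts with leaf:

    netq add tca event_id TCA_RXBYTES_UPPER scope leaf*,swp9 channel tca_slack_ifstats threshold 1000000
    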

    Create a TCA Rule

    Now that you know which events are supported and how to set the scope, you can create a basic rule to deliver one of the TCA events to a notification channel using the netq add tca command. Note that the event ID is case sensitive and must be in all caps.

    For example, this rule tells NetQ to deliver an event notification to the tca_slack_ifstats pre-configured Slack channel when the CPU utilization exceeds 95% of its capacity on any monitored switch:

    netq add tca event_id TCA_CPU_UTILIZATION_UPPER scope '*' channel tca_slack_ifstats threshold 95
    

    This rule tells NetQ to deliver an event notification to the tca_pd_ifstats PagerDuty channel when the number of transmit bytes per second (Bps) on the leaf12 switch exceeds 20,000 Bps on any interface:

    netq add tca event_id TCA_TXBYTES_UPPER scope leaf12,'*' channel tca_pd_ifstats threshold 20000
    

    This rule tells NetQ to deliver an event notification to the syslog-netq syslog channel when the temperature on sensor temp1 on the leaf12 switch exceeds 32 degrees Celsius:

    netq add tca event_id TCA_SENSOR_TEMPERATURE_UPPER scope leaf12,temp1 channel syslog-netq threshold 32
    

    For a Slack channel, the event messages should be similar to this:

    Set the Severity of a Threshold-based Event

    In addition to defining a scope for a TCA rule, you can also set a severity of either info or critical. To add a severity to a rule, use the severity option.

    For example, to add a critical severity to the CPU utilization rule you created earlier:

    netq add tca event_id TCA_CPU_UTILIZATION_UPPER scope '*' severity critical channel tca_slack_resources threshold 95
    

    Or, if an event is important but not critical, set the severity to info:

    netq add tca event_id TCA_TXBYTES_UPPER scope leaf12,'*' severity info channel tca_pd_ifstats threshold 20000
    

    Create Multiple Rules for a TCA Event

    You are likely to want more than one rule for a particular event. For example, you might want to monitor the same sensor across different sets of devices, deliver the events to different channels, or use different thresholds. The following commands create three additional temperature rules, varying the scope, channels, and threshold:

    netq add tca event_id TCA_SENSOR_TEMPERATURE_UPPER scope leaf*,temp1 channel syslog-netq threshold 32
    
    netq add tca event_id TCA_SENSOR_TEMPERATURE_UPPER scope '*',temp1 channel tca_sensors,tca_pd_sensors threshold 32
    
    netq add tca event_id TCA_SENSOR_TEMPERATURE_UPPER scope leaf03,temp1 channel syslog-netq threshold 29
    

    Now you have four rules created (the original one, plus these three new ones), all based on the TCA_SENSOR_TEMPERATURE_UPPER event. To identify the various rules, NetQ automatically generates a TCA name for each rule by appending _# to the event name as each rule is created. The TCA name for the first rule created is then TCA_SENSOR_TEMPERATURE_UPPER_1, the second rule created for this event is TCA_SENSOR_TEMPERATURE_UPPER_2, and so forth.

    Suppress a Rule

    During troubleshooting or maintenance of switches, you may want to suppress a rule to prevent erroneous event messages. The suppress_until option prevents the rule from being applied for a designated amount of time (in seconds). When this time has passed, the rule is automatically reenabled.

    For example, to suppress the disk utilization event for an hour:

    cumulus@switch:~$ netq add tca tca_id TCA_DISK_UTILIZATION_UPPER_1 suppress_until 3600
    Successfully added/updated tca TCA_DISK_UTILIZATION_UPPER_1
    

    Remove a Channel from a Rule

    You can stop sending events to a particular channel using the drop option:

    cumulus@switch:~$ netq add tca tca_id TCA_DISK_UTILIZATION_UPPER_1 channel drop tca_slack_resources
    Successfully added/updated tca TCA_DISK_UTILIZATION_UPPER_1
    

    Manage Threshold-based Event Notifications

    Once you have created a number of rules, you might need to manage them: view a list of the rules, disable a rule, delete a rule, and so forth.

    Show Threshold-based Event Rules

    You can view all TCA rules or a particular rule using the netq show tca command:

    Example 1: Display All TCA Rules

    cumulus@switch:~$ netq show tca
    Matching config_tca records:
    TCA Name                     Event Name           Scope                      Severity         Channel/s          Active Threshold          Suppress Until
    ---------------------------- -------------------- -------------------------- ---------------- ------------------ ------ ------------------ ----------------------------
    TCA_CPU_UTILIZATION_UPPER_1  TCA_CPU_UTILIZATION_ {"hostname":"leaf01"}      critical         tca_slack_resource True   1                  Sun Dec  8 14:17:18 2019
                                 UPPER                                                            s
    TCA_DISK_UTILIZATION_UPPER_1 TCA_DISK_UTILIZATION {"hostname":"leaf01"}      info                                False  80                 Mon Dec  9 05:03:46 2019
                                 _UPPER
    TCA_MEMORY_UTILIZATION_UPPER TCA_MEMORY_UTILIZATI {"hostname":"leaf01"}      info             tca_slack_resource True   1                  Sun Dec  8 11:53:15 2019
    _1                           ON_UPPER                                                         s
    TCA_RXBYTES_UPPER_1          TCA_RXBYTES_UPPER    {"ifname":"swp3","hostname info             tca-tx-bytes-slack True   100                Sun Dec  8 17:22:52 2019
                                                      ":"leaf01"}
    TCA_RXMULTICAST_UPPER_1      TCA_RXMULTICAST_UPPE {"ifname":"swp3","hostname info             tca-tx-bytes-slack True   0                  Sun Dec  8 10:43:57 2019
                                 R                    ":"leaf01"}
    TCA_SENSOR_FAN_UPPER_1       TCA_SENSOR_FAN_UPPER {"hostname":"leaf01","s_na info             tca_slack_sensors  True   0                  Sun Dec  8 12:30:14 2019
                                                      me":"*"}
    TCA_SENSOR_TEMPERATURE_UPPER TCA_SENSOR_TEMPERATU {"hostname":"leaf01","s_na critical         tca_slack_sensors  True   10                 Sun Dec  8 14:05:24 2019
    _1                           RE_UPPER             me":"*"}
    TCA_TXBYTES_UPPER_1          TCA_TXBYTES_UPPER    {"ifname":"swp3","hostname critical         tca-tx-bytes-slack True   100                Sun Dec  8 14:19:46 2019
                                                      ":"leaf01"}
    TCA_TXMULTICAST_UPPER_1      TCA_TXMULTICAST_UPPE {"ifname":"swp3","hostname info             tca-tx-bytes-slack True   0                  Sun Dec  8 16:40:14 2269
                                 R                    ":"leaf01"}
    

    Example 2: Display a Specific TCA Rule

    cumulus@switch:~$ netq show tca tca_id TCA_TXMULTICAST_UPPER_1
    Matching config_tca records:
    TCA Name                     Event Name           Scope                      Severity         Channel/s          Active Threshold          Suppress Until
    ---------------------------- -------------------- -------------------------- ---------------- ------------------ ------ ------------------ ----------------------------
    TCA_TXMULTICAST_UPPER_1      TCA_TXMULTICAST_UPPE {"ifname":"swp3","hostname info             tca-tx-bytes-slack True   0                  Sun Dec  8 16:40:14 2269
                                 R                    ":"leaf01"}
    

    Disable a TCA Rule

    Where the suppress_until option temporarily disables a TCA rule, you can use the is_active option to disable a rule indefinitely. To disable a rule, set the option to false. To reenable it, set the option to true.

    cumulus@switch:~$ netq add tca tca_id TCA_DISK_UTILIZATION_UPPER_1 is_active false
    Successfully added/updated tca TCA_DISK_UTILIZATION_UPPER_1
    
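
    To reenable the rule later, set the option back to true:

    cumulus@switch:~$ netq add tca tca_id TCA_DISK_UTILIZATION_UPPER_1 is_active true
    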

    Delete a TCA Rule

    If disabling a rule is not sufficient, and you want to remove a rule altogether, you can do so using the netq del tca command.

    cumulus@switch:~$ netq del tca tca_id TCA_RXBYTES_UPPER_1
    Successfully deleted TCA TCA_RXBYTES_UPPER_1
    

    Resolve Scope Conflicts

    There may be occasions when the scopes defined by multiple rules for a given TCA event overlap. In such cases, the TCA rule with the most specific scope that still matches is used to generate the event.

    To clarify this, consider an example in which three events have occurred: one on interface swp1 of leaf01, one on interface swp3 of leaf01, and one on interface swp1 of spine01. NetQ attempts to match the hostname and interface of each event against three TCA rules with different scopes: Scope 1 is '*','*', Scope 2 is leaf*,'*', and Scope 3 is leaf01,swp1. The most specific scope that still matches each event is applied, as summarized in this table:

    Input Event | Scope Parameters | TCA Scope 1 | TCA Scope 2 | TCA Scope 3 | Scope Applied
    leaf01,swp1 | Hostname, Interface | '*','*' | leaf*,'*' | leaf01,swp1 | Scope 3
    leaf01,swp3 | Hostname, Interface | '*','*' | leaf*,'*' | leaf01,swp1 | Scope 2
    spine01,swp1 | Hostname, Interface | '*','*' | leaf*,'*' | leaf01,swp1 | Scope 1

    Integrate NetQ with Your LDAP Server

    With this release and an administrator role, you can integrate the NetQ role-based access control (RBAC) with your Lightweight Directory Access Protocol (LDAP) server in on-premises deployments. NetQ maintains control over role-based permissions for the NetQ application. Currently there are two roles, admin and user. With the integration, user authentication is handled through LDAP and your directory service, such as Microsoft Active Directory, Kerberos, OpenLDAP, or Red Hat Directory Service. A copy of each user from LDAP is stored in the local NetQ database.

    Integrating with an LDAP server does not prevent you from configuring local users (stored and managed in the NetQ database) as well.

    Read the Overview to become familiar with LDAP configuration parameters, or skip to Create an LDAP Configuration if you are already an LDAP expert.

    Overview

    LDAP integration requires information about how to connect to your LDAP server, the type of authentication you plan to use, bind credentials, and, optionally, search attributes.

    Provide Your LDAP Server Information

    To connect to your LDAP server, you need the URI and bind credentials. The URI identifies the location of the LDAP server. It is composed of a FQDN (fully qualified domain name) or IP address and the port of the LDAP server where the LDAP client can connect. For example: myldap.mycompany.com or 192.168.10.2. Typically, port 389 is used for connections over TCP or UDP. In production environments, a secure connection with SSL can be deployed; in this case, the port used is typically 636. Setting the Enable SSL toggle automatically sets the server port to 636.
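
    As a quick sanity check before configuring NetQ, you can verify that the server is reachable from any Linux host with the OpenLDAP client tools installed; this sketch uses the illustrative hostname and base DN from the scenarios later in this section:

    ldapsearch -H ldap://myldap.mycompany.com:389 -x -b "dc=mycompany,dc=com" -s base
    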

    Specify Your Authentication Method

    Two methods of user authentication are available: anonymous and basic. With anonymous authentication, no bind credentials are required to connect to the directory. With basic authentication, an administrator bind DN and password are required, as shown in Scenarios 2-4 below.

    If you are unfamiliar with the configuration of your LDAP server, contact your administrator to ensure you select the appropriate authentication method and credentials.

    Define User Attributes

    Two attributes are required to define a user entry in a directory: the base DN (distinguished name), which identifies where in the directory tree to find users, and the user ID, which uniquely identifies a user within the directory (for example, a UID or email address).

    Optionally, you can specify the first name, last name, and email address of the user.

    Set Search Attributes

    While optional, specifying search scope indicates where to start and how deep a given user can search within the directory. The data to search for is specified in the search query.

    Search scope options include Base (search only the entry at the base DN), One Level (search the entries one level below the base DN), and Subtree (search the base DN and all entries beneath it).

    A typical search query for users is {userIdAttribute}={userId}. For example, if the user ID attribute is uid and the user logging in is jsmith, the query resolves to uid=jsmith.

    Now that you are familiar with the various LDAP configuration parameters, you can configure the integration of your LDAP server with NetQ using the instructions in the next section.

    Create an LDAP Configuration

    One LDAP server can be configured per bind DN (distinguished name). Once LDAP is configured, you can validate the connectivity (and configuration) and save the configuration.

    To create an LDAP configuration:

    1. Click the Main Menu, then select Management under Admin.

    2. Locate the LDAP Server Info card, and click Configure LDAP.

    3. Fill out the LDAP Server Configuration form according to your particular configuration. Refer to Overview for details about the various parameters.

      Note: Items with an asterisk (*) are required. All others are optional.

    4. Click Save to complete the configuration, or click Cancel to discard the configuration.

    The LDAP configuration cannot be changed once it is saved. If you need to change it, you must delete the current LDAP configuration and create a new one. Note that if you change the LDAP server configuration, all users created against the previous LDAP server remain in the NetQ database and continue to be visible, but they can no longer authenticate. You must manually delete those users if you do not want to see them.

    Example LDAP Configurations

    A variety of example configurations are provided here. Scenarios 1-3 are based on using an OpenLDAP or similar authentication service. Scenario 4 is based on using the Active Directory service for authentication.

    Scenario 1: Base Configuration

    In this scenario, we are configuring the LDAP server with anonymous authentication, a User ID based on an email address, and a search scope of base.

    Parameter | Value
    Host Server URL | ldap1.mycompany.com
    Host Server Port | 389
    Authentication | Anonymous
    Base DN | dc=mycompany,dc=com
    User ID | email
    Search Scope | Base
    Search Query | {userIdAttribute}={userId}

    Scenario 2: Basic Authentication and Subset of Users

    In this scenario, we are configuring the LDAP server with basic authentication, for access only by the persons in the network operators group, and a limited search scope.

    Parameter | Value
    Host Server URL | ldap1.mycompany.com
    Host Server Port | 389
    Authentication | Basic
    Admin Bind DN | uid=admin,ou=netops,dc=mycompany,dc=com
    Admin Bind Password | nqldap!
    Base DN | dc=mycompany,dc=com
    User ID | UID
    Search Scope | One Level
    Search Query | {userIdAttribute}={userId}
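
    For reference, an equivalent lookup of a hypothetical user jsmith with the OpenLDAP client tools would look roughly like this; the -s one flag limits the search to one level below the base DN, matching the One Level search scope above (the password is quoted to keep the shell from interpreting the exclamation mark):

    ldapsearch -H ldap://ldap1.mycompany.com:389 -D "uid=admin,ou=netops,dc=mycompany,dc=com" -w 'nqldap!' -b "dc=mycompany,dc=com" -s one "uid=jsmith"
    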

    Scenario 3: Scenario 2 with Widest Search Capability

    In this scenario, we are configuring the LDAP server with basic authentication, for access only by the persons in the network administrators group, and an unlimited search scope.

    Parameter | Value
    Host Server URL | 192.168.10.2
    Host Server Port | 389
    Authentication | Basic
    Admin Bind DN | uid=admin,ou=netadmin,dc=mycompany,dc=com
    Admin Bind Password | 1dap*netq
    Base DN | dc=mycompany,dc=net
    User ID | UID
    Search Scope | Subtree
    Search Query | {userIdAttribute}={userId}

    Scenario 4: Scenario 3 with Active Directory Service

    In this scenario, we are configuring the LDAP server with basic authentication, for access only by the persons in the given Active Directory group, and an unlimited search scope.

    Parameter | Value
    Host Server URL | 192.168.10.2
    Host Server Port | 389
    Authentication | Basic
    Admin Bind DN | cn=netq,ou=45,dc=mycompany,dc=com
    Admin Bind Password | nq&4mAd!
    Base DN | dc=mycompany,dc=net
    User ID | sAMAccountName
    Search Scope | Subtree
    Search Query | {userIdAttribute}={userId}

    Add LDAP Users to NetQ

    1. Click the Main Menu, then select Management under Admin.

    2. Locate the User Accounts card, and click Manage.

    3. On the User Accounts tab, click Add User.

    4. Select LDAP User.

    5. Enter the user’s ID.

    6. Enter your administrator password.

    7. Click Search.

    8. If the user is found, the email address and first and last name fields are filled in automatically on the Add New User form. If searching is not enabled on the LDAP server, you must enter the information manually.

      If the fields are not automatically filled in, and searching is enabled on the LDAP server, you might require changes to the mapping file.

    9. Select the NetQ user role for this user, admin or user, in the User Type dropdown.

    10. Enter your admin password, and click Save, or click Cancel to discard the user account.

      LDAP user passwords are not stored in the NetQ database and are always authenticated against LDAP.

    11. Repeat these steps to add additional LDAP users.

    Remove LDAP Users from NetQ

    You can remove LDAP users in the same manner as local users.

    1. Click the Main Menu, then select Management under Admin.

    2. Locate the User Accounts card, and click Manage.

    3. Select the user or users you want to remove.

    4. Click the delete icon in the Edit menu.

    If an LDAP user is deleted in LDAP, the user is not automatically deleted from NetQ; however, the login credentials for these LDAP users stop working immediately.

    Integrate NetQ with Grafana

    Switches collect statistics about the performance of their interfaces. The NetQ Agent on each switch collects these statistics every 15 seconds and then sends them to your NetQ Server or Appliance.

    NetQ only collects statistics for physical interfaces; it does not collect statistics for virtual (non-physical) interfaces, such as bonds, bridges, and VXLANs. Specifically, the NetQ Agent collects receive and transmit statistics for each physical interface.

    You can use Grafana version 6.x, an open source analytics and monitoring tool, to view these statistics. The fastest way to achieve this is by installing Grafana on an application server or locally per user, and then installing the NetQ plugin.

    If you do not have Grafana installed already, refer to grafana.com for instructions on installing and configuring the Grafana tool.

    Install NetQ Plugin for Grafana

    Use the Grafana CLI to install the NetQ plugin. For more detail about this command, refer to the Grafana CLI documentation.

    grafana-cli --pluginUrl https://netq-grafana-dsrc.s3-us-west-2.amazonaws.com/dist.zip plugins install netq-dashboard
    installing netq-dashboard @ 
    from: https://netq-grafana-dsrc.s3-us-west-2.amazonaws.com/dist.zip
    into: /usr/local/var/lib/grafana/plugins
    
    ✔ Installed netq-dashboard successfully
    
    Restart grafana after installing plugins . <service grafana-server restart>
    
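
    As the installer output notes, Grafana must be restarted to pick up the new plugin. On a systemd-based server, that is typically:

    sudo systemctl restart grafana-server
    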

    Set Up the NetQ Data Source

    Now that you have the plugin installed, you need to configure access to the NetQ data source.

    1. Open the Grafana user interface.

    2. Log in using your application credentials.

      The Home Dashboard appears.

    3. Click Add data source, or navigate to Configuration > Data Sources.

    4. Enter Net-Q in the search box or scroll down to the Other category, and select Net-Q from there.

    5. Enter Net-Q into the Name field.

    6. Enter the URL used to access the database:

      • Cloud: api.netq.cumulusnetworks.com
      • On-premises: <hostname-or-ipaddr-of-netq-appl-or-vm>/api
      • Cumulus in the Cloud (CITC): air.netq.cumulusnetworks.com
    7. Enter your credentials (the ones used to log in).

    8. For cloud deployments only, if you have more than one premises configured, you can select the premises you want to view, as follows:

      • If you leave the Premises field blank, the first premises name is selected by default

      • If you enter a premises name, that premises is selected for viewing

        Note: If multiple premises are configured with the same name, then the first premises of that name is selected for viewing

    9. Click Save & Test

    Create Your NetQ Dashboard

    With the data source configured, you can create a dashboard with the transmit and receive statistics of interest to you.

    To create your dashboard:

    1. Click + (Create) in the side menu to open a blank dashboard.

    2. Click Dashboard Settings at the top of the dashboard.

    3. Click Variables.

    4. Enter hostname into the Name field.

    5. Enter Hostname into the Label field.

    6. Select Net-Q from the Data source list.

    7. Enter hostname into the Query field.

    8. Click Add.

      You should see a preview of the hostname values at the bottom.

    9. Return to the new dashboard.

    10. Click Add Query.

    11. Select Net-Q from the Query source list.

    12. Select the interface statistic you want to view from the Metric list.

    13. Click the General icon.

    14. Select hostname from the Repeat list.

    15. Set any other parameters around how to display the data.

    16. Return to the dashboard.

    17. Add additional panels with other metrics to complete your dashboard.

    Analyze the Data

    With the dashboard set up, you can begin to analyze the data.

    For reference, this example shows a dashboard with all of the available statistics.

    1. Select the hostname from the variable list at the top left of the charts to see the statistics for that switch or host.

    2. Review the statistics, looking for peaks and valleys, unusual patterns, and so forth.

    3. Explore the data more by modifying the data view in one of several ways using the dashboard tool set:

      • Select a different time period for the data by clicking the forward or back arrows. The default time range is dependent on the width of your browser window.
      • Zoom in on the dashboard by clicking the magnifying glass.
      • Manually refresh the dashboard data, or set an automatic refresh rate for the dashboard from the down arrow.
      • Add a new variable by clicking the cog wheel, then selecting Variables.
      • Add additional panels.
      • Click any chart title to edit or remove it from the dashboard.
      • Rename the dashboard by clicking the cog wheel and entering the new name.

    Cumulus NetQ API User Guide

    The NetQ API provides access to key telemetry and system monitoring data gathered about the performance and operation of your data center network and devices so that you can view that data in your internal or third-party analytic tools. The API gives you access to the health of individual switches, network protocols and services, and views of network-wide inventory and events.

    This guide provides an overview of the API framework and some examples of how to use the API to extract the data you need. Descriptions of each endpoint and model parameter are contained in the API .json files.

    For information regarding new features, improvements, bug fixes, and known issues present in this release, refer to the release notes.

    API Organization

    The Cumulus NetQ API provides endpoints for network objects and services such as addresses, BGP, CLAG, EVPN, interfaces, inventory, and events.

    Each endpoint has its own API. You can make requests for all data and all devices or you can filter the request by a given hostname.

    Each API returns a predetermined set of data as defined in the API models.

    Get Started

    You can access the API gateway and execute requests from a terminal interface against your NetQ Platform or NetQ Appliance through port 32708.

    Log In and Authentication

    Use your login credentials that were provided as part of the installation process. For this release, the default is username admin and password admin.

    To log in and obtain authorization:

    1. Open a terminal window.

    2. Enter the following curl command.

      <computer-name>:~ <username>$ curl --insecure -X POST "https://<netq.domain>:32708/netq/auth/v1/login" -H "Content-Type: application/json" -d '{"username":"admin","password":"admin"}'
      {"premises":[{"opid":0,"name":"OPID0"}],"access_token":"eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJ1c2VyIjoiYWRtaW4iLCJvcGlkIjowLCJyb2xlIjoiYWRtaW4iLCJleHBpcmVzQXQiOjE1NTYxMjUzNzgyODB9.\_D2Ibhmo_BWSfAMnF2FzddjndTn8LP8CAFFGIj5tn0A","customer_id":0,"id":"admin","expires_at":1556125378280,"terms_of_use_accepted":true}
      
    3. Copy the access token for use in making data requests.

    API Requests

    We will use curl to execute our requests. Each request contains an API method (GET, POST, etc.), the address and API object to query, a variety of headers, and sometimes a body. In the login step you used above, the method was POST, the address was https://<netq.domain>:32708/netq/auth/v1/login, the header specified the JSON content type, and the body carried the username and password.

    We have used the insecure option to work around certificate issues with our development configuration. You would likely not use this option in a production environment.
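
    To avoid pasting the token into each request, you can capture it in a shell variable. This sketch assumes the jq JSON processor is installed and reuses the login request from above:

    TOKEN=$(curl --insecure -s -X POST "https://<netq.domain>:32708/netq/auth/v1/login" -H "Content-Type: application/json" -d '{"username":"admin","password":"admin"}' | jq -r .access_token)
    curl --insecure -s "https://<netq.domain>:32708/netq/telemetry/v1/object/inventory" -H "Authorization: $TOKEN" | python -m json.tool
    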

    API Responses

    A NetQ API response is composed of a status code, any relevant error codes (if unsuccessful), and the collected data (if successful).

    The following HTTP status codes might be presented in the API responses:

    Code | Name | Description | Action
    200 | Success | Request was successfully processed. | Review response
    400 | Bad Request | Invalid input was detected in request. | Check the syntax of your request and make sure it matches the schema
    401 | Unauthorized | Authentication has failed or credentials were not provided. | Provide or verify your credentials, or request access from your administrator
    403 | Forbidden | Request was valid, but user may not have needed permissions. | Verify your credentials or request an account from your administrator
    404 | Not Found | Requested resource could not be found. | Try the request again after a period of time or verify status of resource
    409 | Conflict | Request cannot be processed due to conflict in current state of the resource. | Verify status of resource and remove conflict
    500 | Internal Server Error | Unexpected condition has occurred. | Perform general troubleshooting and try the request again
    503 | Service Unavailable | The service being requested is currently unavailable. | Verify the status of the NetQ Platform or Appliance, and the associated service
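
    When scripting against the API, it can help to print the status code alongside the body. curl's standard -w option supports this; the endpoint and token below are placeholders:

    curl --insecure -s -w "\nHTTP status: %{http_code}\n" "https://<netq.domain>:32708/netq/telemetry/v1/object/bgp" -H "Authorization: <auth-token>"
    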

    Example Requests and Responses

    Some example requests and their responses are shown here, but feel free to run your own requests. To run a request, you will need your authorization token. We have piped our responses through a Python tool (python -m json.tool) to make them more readable; you may choose to do the same.

    To view all of the endpoints and their associated requests and responses, refer to the API .json file in View the API below.

    Get Network-wide Status of the BGP Service

    Make your request to the bgp endpoint to obtain status information from all nodes running the BGP service, as follows:

    curl --insecure -X GET "<https://<netq.domain>:32708/netq/telemetry/v1/object/bgp " -H "Content-Type: application/json " -H "Authorization: <auth-token> " | python -m json.tool
     
    [
      {
        "ipv6_pfx_rcvd": 0,
        "peer_router_id": "0.0.0.0",
        "objid": "",
        "upd8_tx": 0,
        "hostname": "exit-1",
        "timestamp": 1556037420723,
        "peer_asn": 0,
        "state": "NotEstd",
        "vrf": "DataVrf1082",
        "rx_families": [],
        "ipv4_pfx_rcvd": 0,
        "conn_dropped": 0,
        "db_state": "Update",
        "up_time": 0,
        "last_reset_time": 0,
        "tx_families": [],
        "reason": "N/A",
        "vrfid": 13,
        "asn": 655536,
        "opid": 0,
        "peer_hostname": "",
        "upd8_rx": 0,
        "peer_name": "swp7.4",
        "evpn_pfx_rcvd": 0,
        "conn_estd": 0
      },
      {
        "ipv6_pfx_rcvd": 0,
        "peer_router_id": "0.0.0.0",
        "objid": "",
        "upd8_tx": 0,
        "hostname": "exit-1",
        "timestamp": 1556037420674,
        "peer_asn": 0,
        "state": "NotEstd",
        "vrf": "default",
        "rx_families": [],
        "ipv4_pfx_rcvd": 0,
        "conn_dropped": 0,
        "db_state": "Update",
        "up_time": 0,
        "last_reset_time": 0,
        "tx_families": [],
        "reason": "N/A",
        "vrfid": 0,
        "asn": 655536,
        "opid": 0,
        "peer_hostname": "",
        "upd8_rx": 0,
        "peer_name": "swp7",
        "evpn_pfx_rcvd": 0,
        "conn_estd": 0
      },
      {
        "ipv6_pfx_rcvd": 24,
        "peer_router_id": "27.0.0.19",
        "objid": "",
        "upd8_tx": 314,
        "hostname": "exit-1",
        "timestamp": 1556037420665,
        "peer_asn": 655435,
        "state": "Established",
        "vrf": "default",
        "rx_families": [
          "ipv4",
          "ipv6",
          "evpn"
        ],
        "ipv4_pfx_rcvd": 26,
        "conn_dropped": 0,
        "db_state": "Update",
        "up_time": 1556036850000,
        "last_reset_time": 0,
        "tx_families": [
          "ipv4",
          "ipv6",
          "evpn"
        ],
        "reason": "N/A",
        "vrfid": 0,
        "asn": 655536,
        "opid": 0,
        "peer_hostname": "spine-1",
        "upd8_rx": 321,
        "peer_name": "swp3",
        "evpn_pfx_rcvd": 354,
        "conn_estd": 1
      },
    ...
    

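    Because the responses are plain JSON, you can post-process them with standard tools. For example, this sketch (assuming the jq JSON processor is installed) lists only the BGP sessions from the previous request that are not established:

    curl --insecure -s "https://<netq.domain>:32708/netq/telemetry/v1/object/bgp" -H "Authorization: <auth-token>" | jq '.[] | select(.state != "Established") | {hostname, peer_name, vrf, state}'
    
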
    Get Status of EVPN on a Specific Switch

    Make your request to the evpn/hostname endpoint to view the status of all EVPN sessions running on that node. This example uses the server01 node.

    curl -X GET "https://<netq.domain>:32708/netq/telemetry/v1/object/evpn/hostname/server01" -H "Content-Type: application/json" -H "Authorization: <auth-token>" | python -m json.tool
     
    [
      {
        "import_rt": "[\"197:42\"]",
        "vni": 42,
        "rd": "27.0.0.22:2",
        "hostname": "server01",
        "timestamp": 1556037403853,
        "adv_all_vni": true,
        "export_rt": "[\"197:42\"]",
        "db_state": "Update",
        "in_kernel": true,
        "adv_gw_ip": "Disabled",
        "origin_ip": "27.0.0.22",
        "opid": 0,
        "is_l3": false
      },
      {
        "import_rt": "[\"197:37\"]",
        "vni": 37,
        "rd": "27.0.0.22:8",
        "hostname": "server01",
        "timestamp": 1556037403811,
        "adv_all_vni": true,
        "export_rt": "[\"197:37\"]",
        "db_state": "Update",
        "in_kernel": true,
        "adv_gw_ip": "Disabled",
        "origin_ip": "27.0.0.22",
        "opid": 0,
        "is_l3": false
      },
      {
        "import_rt": "[\"197:4001\"]",
        "vni": 4001,
        "rd": "6.0.0.194:5",
        "hostname": "server01",
        "timestamp": 1556036360169,
        "adv_all_vni": true,
        "export_rt": "[\"197:4001\"]",
        "db_state": "Refresh",
        "in_kernel": true,
        "adv_gw_ip": "Disabled",
        "origin_ip": "27.0.0.22",
        "opid": 0,
        "is_l3": true
      },
    ...
    

    Get Status on All Interfaces at a Given Time

    Make your request to the interfaces endpoint to view the status of all interfaces. By specifying the eq_timestamp option and entering a date and time in Epoch format, you indicate the data for that time (versus the last hour, which is the default), as follows:

    curl -X GET "https://<netq.domain>:32708/netq/telemetry/v1/object/interface?eq_timestamp=1556046250" -H "Content-Type: application/json" -H "Authorization: <auth-token>" | python -m json.tool
     
    [
      {
        "hostname": "exit-1",
        "timestamp": 1556046270494,
        "state": "up",
        "vrf": "DataVrf1082",
        "last_changed": 1556037405259,
        "ifname": "swp3.4",
        "opid": 0,
        "details": "MTU: 9202",
        "type": "vlan"
      },
      {
        "hostname": "exit-1",
        "timestamp": 1556046270496,
        "state": "up",
        "vrf": "DataVrf1081",
        "last_changed": 1556037405320,
        "ifname": "swp7.3",
        "opid": 0,
        "details": "MTU: 9202",
        "type": "vlan"
      },
      {
        "hostname": "exit-1",
        "timestamp": 1556046270497,
        "state": "up",
        "vrf": "DataVrf1080",
        "last_changed": 1556037405310,
        "ifname": "swp7.2",
        "opid": 0,
        "details": "MTU: 9202",
        "type": "vlan"
      },
      {
        "hostname": "exit-1",
        "timestamp": 1556046270499,
        "state": "up",
        "vrf": "",
        "last_changed": 1556037405315,
        "ifname": "DataVrf1081",
        "opid": 0,
        "details": "table: 1081, MTU: 65536, Members:  swp7.3,  DataVrf1081,  swp4.3,  swp6.3,  swp5.3,  swp3.3, ",
        "type": "vrf"
      },
    ...
    
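
    Epoch timestamps count seconds since January 1, 1970 (UTC). On a Linux host with GNU date installed, you can convert between human-readable times and Epoch format, for example:

    date -d "2019-04-23 13:04:10" +%s    # human-readable time to Epoch
    date -d @1556046250                  # Epoch to human-readable time
    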

    Get a List of All Devices Being Monitored

    Make your request to the inventory endpoint to get a listing of all monitored nodes and their configuration information, as follows:

    curl -X GET "https://<netq.domain>:32708/netq/telemetry/v1/object/inventory" -H "Content-Type: application/json" -H "Authorization: <auth-token>" | python -m json.tool
     
    [
      {
        "hostname": "exit-1",
        "timestamp": 1556037425658,
        "asic_model": "A-Z",
        "agent_version": "2.1.1-cl3u16~1556035513.afedb69",
        "os_version": "A.2.0",
        "license_state": "ok",
        "disk_total_size": "10 GB",
        "os_version_id": "A.2.0",
        "platform_model": "A_VX",
        "memory_size": "2048.00 MB",
        "asic_vendor": "AA Inc",
        "cpu_model": "A-SUBLEQ",
        "asic_model_id": "N/A",
        "platform_vendor": "A Systems",
        "asic_ports": "N/A",
        "cpu_arch": "x86_64",
        "cpu_nos": "2",
        "platform_mfg_date": "N/A",
        "platform_label_revision": "N/A",
        "agent_state": "fresh",
        "cpu_max_freq": "N/A",
        "platform_part_number": "3.7.6",
        "asic_core_bw": "N/A",
        "os_vendor": "CL",
        "platform_base_mac": "00:01:00:00:01:00",
        "platform_serial_number": "00:01:00:00:01:00"
      },
      {
        "hostname": "exit-2",
        "timestamp": 1556037432361,
        "asic_model": "C-Z",
        "agent_version": "2.1.1-cl3u16~1556035513.afedb69",
        "os_version": "C.2.0",
        "license_state": "N/A",
        "disk_total_size": "30 GB",
        "os_version_id": "C.2.0",
        "platform_model": "C_VX",
        "memory_size": "2048.00 MB",
        "asic_vendor": "CC Inc",
        "cpu_model": "C-CRAY",
        "asic_model_id": "N/A",
        "platform_vendor": "C Systems",
        "asic_ports": "N/A",
        "cpu_arch": "x86_64",
        "cpu_nos": "2",
        "platform_mfg_date": "N/A",
        "platform_label_revision": "N/A",
        "agent_state": "fresh",
        "cpu_max_freq": "N/A",
        "platform_part_number": "3.7.6",
        "asic_core_bw": "N/A",
        "os_vendor": "CL",
        "platform_base_mac": "00:01:00:00:02:00",
        "platform_serial_number": "00:01:00:00:02:00"
      },
      {
        "hostname": "firewall-1",
        "timestamp": 1556037438002,
        "asic_model": "N/A",
        "agent_version": "2.1.0-ub16.04u15~1555608012.1d98892",
        "os_version": "16.04.1 LTS (Xenial Xerus)",
        "license_state": "N/A",
        "disk_total_size": "3.20 GB",
        "os_version_id": "(hydra-poc-01 /tmp/purna/Kleen-Gui1/)\"16.04",
        "platform_model": "N/A",
        "memory_size": "4096.00 MB",
        "asic_vendor": "N/A",
        "cpu_model": "QEMU Virtual  version 2.2.0",
        "asic_model_id": "N/A",
        "platform_vendor": "N/A",
        "asic_ports": "N/A",
        "cpu_arch": "x86_64",
        "cpu_nos": "2",
        "platform_mfg_date": "N/A",
        "platform_label_revision": "N/A",
        "agent_state": "fresh",
        "cpu_max_freq": "N/A",
        "platform_part_number": "N/A",
        "asic_core_bw": "N/A",
        "os_vendor": "Ubuntu",
        "platform_base_mac": "N/A",
        "platform_serial_number": "N/A"
      },
    ...
    

    View the API

    For simplicity, all of the endpoint APIs are combined into a single json-formatted file. There have been no changes to the file in the NetQ 3.0.0 release.

    netq-300.json
    {
      "swagger": "2.0",
      "info": {
        "description": "This API is used to gain access to data collected by the Cumulus NetQ Platform and Agents for integration with third-party monitoring and analytics  software. Integrators can pull data for daily monitoring of network protocols and services performance, inventory status, and system-wide events.",
        "version": "1.1",
        "title": "Cumulus NetQ 3.0.0 API",
        "termsOfService": "https://cumulusnetworks.com/legal/"
      },
      "host": "<netq-platform-or-appliance-ipaddress>:32708",
      "basePath": "/netq/telemetry/v1",
      "externalDocs": {
        "description": "API Documentation",
        "url": "https://docs.nvidia.com/networking-ethernet-software/cumulus-netq/Cumulus-NetQ-Integration-Guide/API-User-Guide/"
      },
      "schemes": [
        "https"
      ],
      "paths": {
        "/object/address": {
          "get": {
            "tags": [
              "address"
            ],
            "summary": "Get all addresses for all network devices",
            "description": "Retrieves all IPv4, IPv6 and MAC addresses deployed on switches and hosts in your network running NetQ Agents.",
            "produces": [
              "application/json"
            ],
            "parameters": [
              {
                "name": "eq_timestamp",
                "in": "query",
                "description": "Display results for a given time. Time must be entered in Epoch format. For example, to display the results for Monday February 13, 2019 at 1:25 pm, use a time converter and enter 1550082300.",
                "required": false,
                "type": "integer"
              },
              {
                "name": "count",
                "in": "query",
                "description": "Number of entries to display starting from the offset value. For example, a count of 100 displays 100 entries at a time.",
                "required": false,
                "type": "integer"
              },
              {
                "name": "offset",
                "in": "query",
                "description": "Used in combination with count, offset specifies the starting location within the set of entries returned. For example, an offset of 100 would display results beginning with entry 101.",
                "required": false,
                "type": "integer"
              }
            ],
            "responses": {
              "200": {
                "description": "successful operation",
                "schema": {
                  "type": "array",
                  "items": {
                    "$ref": "#/definitions/Address"
                  }
                }
              }
            },
            "security": [
              {
                "jwt": []
              }
            ]
          }
        },
        "/object/address/hostname/{hostname}": {
          "get": {
            "tags": [
              "address"
            ],
            "summary": "Get all addresses for a given network device by hostname",
            "description": "Retrieves IPv4, IPv6, and MAC addresses of a network device (switch or host) specified by its hostname.",
            "produces": [
              "application/json"
            ],
            "parameters": [
              {
                "name": "hostname",
                "in": "path",
                "description": "User-specified name for a network switch or host. For example, leaf01, spine04, host-6, engr-1, tor-22.",
                "required": true,
                "type": "string"
              },
              {
                "name": "eq_timestamp",
                "in": "query",
                "description": "Display results for a given time. Time must be entered in Epoch format. For example, to display the results for Monday February 13, 2019 at 1:25 pm, use a time converter and enter 1550082300.",
                "required": false,
                "type": "integer"
              },
              {
                "name": "count",
                "in": "query",
                "description": "Number of entries to display starting from the offset value. For example, a count of 100 displays 100 entries at a time.",
                "required": false,
                "type": "integer"
              },
              {
                "name": "offset",
                "in": "query",
                "description": "Used in combination with count, offset specifies the starting location within the set of entries returned. For example, an offset of 100 would display results beginning with entry 101.",
                "required": false,
                "type": "integer"
              }
            ],
            "responses": {
              "200": {
                "description": "successful operation",
                "schema": {
                  "type": "array",
                  "items": {
                    "$ref": "#/definitions/Address"
                  }
                }
              }
            },
            "security": [
              {
                "jwt": []
              }
            ]
          }
        },
        "/object/login": {
          "post": {
            "tags": [
              "auth"
            ],
            "summary": "Perform authenticated user login to NetQ",
            "description": "Sends user-provided login credentials (username and password) to the NetQ Authorization service for validation. Grants access to the NetQ platform and software if user credentials are valid.",
            "operationId": "login",
            "produces": [
              "application/json"
            ],
            "parameters": [
              {
                "in": "body",
                "name": "body",
                "description": "User credentials provided for login request; username and password.",
                "required": true,
                "schema": {
                  "$ref": "#/definitions/LoginRequest"
                }
              }
            ],
            "responses": {
              "200": {
                "description": "successful operation",
                "schema": {
                  "$ref": "#/definitions/LoginResponse"
                }
              },
              "401": {
                "description": "Invalid credentials",
                "schema": {
                  "$ref": "#/definitions/ErrorResponse"
                }
              }
            }
          }
        },
        "/object/bgp": {
          "get": {
            "tags": [
              "bgp"
            ],
            "summary": "Get all BGP session information for all network devices",
            "description": "For every Border Gateway Protocol (BGP) session running on the network, retrieves local node hostname, remote peer hostname, interface, router ID, and ASN, timestamp, VRF, connection state, IP and EVPN prefixes, and so forth. Refer to the BGPSession model for all data collected.",
            "produces": [
              "application/json"
            ],
            "parameters": [
              {
                "name": "eq_timestamp",
                "in": "query",
                "description": "Display results for a given time. Time must be entered in Epoch format. For example, to display the results for Monday February 13, 2019 at 1:25 pm, use a time converter and enter 1550082300.",
                "required": false,
                "type": "integer"
              },
              {
                "name": "count",
                "in": "query",
                "description": "Number of entries to display starting from the offset value. For example, a count of 100 displays 100 entries at a time.",
                "required": false,
                "type": "integer"
              },
              {
                "name": "offset",
                "in": "query",
                "description": "Used in combination with count, offset specifies the starting location within the set of entries returned. For example, an offset of 100 would display results beginning with entry 101.",
                "required": false,
                "type": "integer"
              }
            ],
            "responses": {
              "200": {
                "description": "successful operation",
                "schema": {
                  "type": "array",
                  "items": {
                    "$ref": "#/definitions/BgpSession"
                  }
                }
              }
            },
            "security": [
              {
                "jwt": []
              }
            ]
          }
        },
        "/object/bgp/hostname/{hostname}": {
          "get": {
            "tags": [
              "bgp"
            ],
            "summary": "Get all BGP session information for a given network device by hostname",
            "description": "For every BGP session running on the network device, retrieves local node hostname, remote peer hostname, interface, router ID, and ASN, timestamp, VRF, connection state, IP and EVPN prefixes, and so forth. Refer to the BGPSession model for all data collected.",
            "produces": [
              "application/json"
            ],
            "parameters": [
              {
                "name": "hostname",
                "in": "path",
                "description": "User-specified name for a network switch or host. For example, leaf01, spine04, host-6, engr-1, tor-22.",
                "required": true,
                "type": "string"
              },
              {
                "name": "eq_timestamp",
                "in": "query",
                "description": "Display results for a given time. Time must be entered in Epoch format. For example, to display the results for Monday February 13, 2019 at 1:25 pm, use a time converter and enter 1550082300.",
                "required": false,
                "type": "integer"
              },
              {
                "name": "count",
                "in": "query",
                "description": "Number of entries to display starting from the offset value. For example, a count of 100 displays 100 entries at a time.",
                "required": false,
                "type": "integer"
              },
              {
                "name": "offset",
                "in": "query",
                "description": "Used in combination with count, offset specifies the starting location within the set of entries returned. For example, an offset of 100 would display results beginning with entry 101.",
                "required": false,
                "type": "integer"
              }
            ],
            "responses": {
              "200": {
                "description": "successful operation",
                "schema": {
                  "type": "array",
                  "items": {
                    "$ref": "#/definitions/BgpSession"
                  }
                }
              }
            },
            "security": [
              {
                "jwt": []
              }
            ]
          }
        },
        "/object/clag": {
          "get": {
            "tags": [
              "clag"
            ],
            "summary": "Get all CLAG session information for all network devices",
            "description": "For every Cumulus multiple Link Aggregation (CLAG) session running on the network, retrieves local node hostname, CLAG sysmac, remote peer role, state, and interface, backup IP address, bond status, and so forth. Refer to the ClagSessionInfo model for all data collected.",
            "produces": [
              "application/json"
            ],
            "parameters": [
              {
                "name": "eq_timestamp",
                "in": "query",
                "description": "Display results for a given time. Time must be entered in Epoch format. For example, to display the results for Monday February 13, 2019 at 1:25 pm, use a time converter and enter 1550082300.",
                "required": false,
                "type": "integer"
              },
              {
                "name": "count",
                "in": "query",
                "description": "Number of entries to display starting from the offset value. For example, a count of 100 displays 100 entries at a time.",
                "required": false,
                "type": "integer"
              },
              {
                "name": "offset",
                "in": "query",
                "description": "Used in combination with count, offset specifies the starting location within the set of entries returned. For example, an offset of 100 would display results beginning with entry 101.",
                "required": false,
                "type": "integer"
              }
            ],
            "responses": {
              "200": {
                "description": "successful operation",
                "schema": {
                  "type": "array",
                  "items": {
                    "$ref": "#/definitions/ClagSessionInfo"
                  }
                }
              }
            },
            "security": [
              {
                "jwt": []
              }
            ]
          }
        },
        "/object/clag/hostname/{hostname}": {
          "get": {
            "tags": [
              "clag"
            ],
            "summary": "Get all CLAG session information for a given network device by hostname",
            "description": "For every CLAG session running on the network device, retrieves local node hostname, CLAG sysmac, remote peer role, state, and interface, backup IP address, bond status, and so forth. Refer to the ClagSessionInfo model for all data collected.",
            "produces": [
              "application/json"
            ],
            "parameters": [
              {
                "name": "hostname",
                "in": "path",
                "description": "User-specified name for a network switch or host. For example, leaf01, spine04, host-6, engr-1, tor-22.",
                "required": true,
                "type": "string"
              },
              {
                "name": "eq_timestamp",
                "in": "query",
                "description": "Display results for a given time. Time must be entered in Epoch format. For example, to display the results for Monday February 13, 2019 at 1:25 pm, use a time converter and enter 1550082300.",
                "required": false,
                "type": "integer"
              },
              {
                "name": "count",
                "in": "query",
                "description": "Number of entries to display starting from the offset value. For example, a count of 100 displays 100 entries at a time.",
                "required": false,
                "type": "integer"
              },
              {
                "name": "offset",
                "in": "query",
                "description": "Used in combination with count, offset specifies the starting location within the set of entries returned. For example, an offset of 100 would display results beginning with entry 101.",
                "required": false,
                "type": "integer"
              }
            ],
            "responses": {
              "200": {
                "description": "successful operation",
                "schema": {
                  "type": "array",
                  "items": {
                    "$ref": "#/definitions/ClagSessionInfo"
                  }
                }
              }
            },
            "security": [
              {
                "jwt": []
              }
            ]
          }
        },
        "/object/events": {
          "get": {
            "tags": [
              "events"
            ],
            "summary": "Get all events from across the entire network",
            "description": "Retrieves all alarm (critical severity) and informational (warning, info and debug severity) events from all network devices and services.",
            "produces": [
              "application/json"
            ],
            "parameters": [
              {
                "name": "gt_timestamp",
                "in": "query",
                "description": "Used in combination with lt_timestamp, sets the lower limit of the time range to display. Uses Epoch format. Cannot be used with eq_timestamp. For example, to display events between Monday February 11, 2019 at 1:00am and Tuesday February 12, 2019 at 1:00am, lt_timestamp would be entered as 1549864800 and gt_timestamp would be entered as 1549951200.",
                "required": false,
                "type": "integer"
              },
              {
                "name": "lt_timestamp",
                "in": "query",
                "description": "Used in combination with gt_timestamp, sets the upper limit of the time range to display. Uses Epoch format. Cannot be used with eq_timestamp. For example, to display events between Monday February 11, 2019 at 1:00am and Tuesday February 12, 2019 at 1:00am, lt_timestamp would be entered as 1549864800 and gt_timestamp would be entered as 1549951200.",
                "required": false,
                "type": "integer"
              },
              {
                "name": "eq_timestamp",
                "in": "query",
                "description": "Display results for a given time. Time must be entered in Epoch format. For example, to display the results for Monday February 13, 2019 at 1:25 pm, use a time converter and enter 1550082300.",
                "required": false,
                "type": "integer"
              },
              {
                "name": "count",
                "in": "query",
                "description": "Number of entries to display starting from the offset value. For example, a count of 100 displays 100 entries at a time.",
                "required": false,
                "type": "integer"
              },
              {
                "name": "offset",
                "in": "query",
                "description": "Used in combination with count, offset specifies the starting location within the set of entries returned. For example, an offset of 100 would display results beginning with entry 101.",
                "required": false,
                "type": "integer"
              }
            ],
            "responses": {
              "200": {
                "description": "successful operation",
                "schema": {
                  "type": "array",
                  "items": {
                    "type": "string"
                  }
                }
              }
            },
            "security": [
              {
                "jwt": []
              }
            ]
          }
        },
        "/object/evpn": {
          "get": {
            "tags": [
              "evpn"
            ],
            "summary": "Get all EVPN session information from across the entire network",
            "description": "For every Ethernet Virtual Private Network (EVPN) session running on the network, retrieves hostname, VNI status, origin IP address, timestamp, export and import routes, and so forth. Refer to the Evpn model for all data collected.",
            "produces": [
              "application/json"
            ],
            "parameters": [
              {
                "name": "eq_timestamp",
                "in": "query",
                "description": "Display results for a given time. Time must be entered in Epoch format. For example, to display the results for Monday February 13, 2019 at 1:25 pm, use a time converter and enter 1550082300.",
                "required": false,
                "type": "integer"
              },
              {
                "name": "count",
                "in": "query",
                "description": "Number of entries to display starting from the offset value. For example, a count of 100 displays 100 entries at a time.",
                "required": false,
                "type": "integer"
              },
              {
                "name": "offset",
                "in": "query",
                "description": "Used in combination with count, offset specifies the starting location within the set of entries returned. For example, an offset of 100 would display results beginning with entry 101.",
                "required": false,
                "type": "integer"
              }
            ],
            "responses": {
              "200": {
                "description": "successful operation",
                "schema": {
                  "type": "array",
                  "items": {
                    "$ref": "#/definitions/Evpn"
                  }
                }
              }
            },
            "security": [
              {
                "jwt": []
              }
            ]
          }
        },
        "/object/evpn/hostname/{hostname}": {
          "get": {
            "tags": [
              "evpn"
            ],
            "summary": "Get all EVPN session information from a given network device by hostname",
            "description": "For every EVPN session running on the network device, retrieves hostname, VNI status, origin IP address, timestamp, export and import routes, and so forth. Refer to the Evpn model for all data collected.",
            "produces": [
              "application/json"
            ],
            "parameters": [
              {
                "name": "hostname",
                "in": "path",
                "description": "User-specified name for a network switch or host. For example, leaf01, spine04, host-6, engr-1, tor-22.",
                "required": true,
                "type": "string"
              },
              {
                "name": "eq_timestamp",
                "in": "query",
                "description": "Display results for a given time. Time must be entered in Epoch format. For example, to display the results for Monday February 13, 2019 at 1:25 pm, use a time converter and enter 1550082300.",
                "required": false,
                "type": "integer"
              },
              {
                "name": "count",
                "in": "query",
                "description": "Number of entries to display starting from the offset value. For example, a count of 100 displays 100 entries at a time.",
                "required": false,
                "type": "integer"
              },
              {
                "name": "offset",
                "in": "query",
                "description": "Used in combination with count, offset specifies the starting location within the set of entries returned. For example, an offset of 100 would display results beginning with entry 101.",
                "required": false,
                "type": "integer"
              }
            ],
            "responses": {
              "200": {
                "description": "successful operation",
                "schema": {
                  "type": "array",
                  "items": {
                    "$ref": "#/definitions/Evpn"
                  }
                }
              }
            },
            "security": [
              {
                "jwt": []
              }
            ]
          }
        },
        "/object/interface": {
          "get": {
            "tags": [
              "interface"
            ],
            "summary": "Get software interface information for all network devices",
            "description": "Retrieves information about all software interfaces, including type and name of the interfaces, the hostnames of the device where they reside, state, VRF, and so forth. Refer to the Interface model for all data collected.",
            "produces": [
              "application/json"
            ],
            "parameters": [
              {
                "name": "eq_timestamp",
                "in": "query",
                "description": "Display results for a given time. Time must be entered in Epoch format. For example, to display the results for Monday February 13, 2019 at 1:25 pm, use a time converter and enter 1550082300.",
                "required": false,
                "type": "integer"
              },
              {
                "name": "count",
                "in": "query",
                "description": "Number of entries to display starting from the offset value. For example, a count of 100 displays 100 entries at a time.",
                "required": false,
                "type": "integer"
              },
              {
                "name": "offset",
                "in": "query",
                "description": "Used in combination with count, offset specifies the starting location within the set of entries returned. For example, an offset of 100 would display results beginning with entry 101.",
                "required": false,
                "type": "integer"
              }
            ],
            "responses": {
              "200": {
                "description": "successful operation",
                "schema": {
                  "type": "array",
                  "items": {
                    "$ref": "#/definitions/Interface"
                  }
                }
              }
            },
            "security": [
              {
                "jwt": []
              }
            ]
          }
        },
        "/object/interface/hostname/{hostname}": {
          "get": {
            "tags": [
              "interface"
            ],
            "summary": "Get software interface information for a given network device by hostname",
            "description": "Retrieves information about all software interfaces on a network device, including type and name of the interfaces, state, VRF, and so forth. Refer to the Interface model for all data collected.",
            "produces": [
              "application/json"
            ],
            "parameters": [
              {
                "name": "hostname",
                "in": "path",
                "description": "User-specified name for a network switch or host. For example, leaf01, spine04, host-6, engr-1, tor-22.",
                "required": true,
                "type": "string"
              },
              {
                "name": "eq_timestamp",
                "in": "query",
                "description": "Display results for a given time. Time must be entered in Epoch format. For example, to display the results for Monday February 13, 2019 at 1:25 pm, use a time converter and enter 1550082300.",
                "required": false,
                "type": "integer"
              },
              {
                "name": "count",
                "in": "query",
                "description": "Number of entries to display starting from the offset value. For example, a count of 100 displays 100 entries at a time.",
                "required": false,
                "type": "integer"
              },
              {
                "name": "offset",
                "in": "query",
                "description": "Used in combination with count, offset specifies the starting location within the set of entries returned. For example, an offset of 100 would display results beginning with entry 101.",
                "required": false,
                "type": "integer"
              }
            ],
            "responses": {
              "200": {
                "description": "successful operation",
                "schema": {
                  "type": "array",
                  "items": {
                    "$ref": "#/definitions/Interface"
                  }
                }
              }
            },
            "security": [
              {
                "jwt": []
              }
            ]
          }
        },
        "/object/inventory": {
          "get": {
            "tags": [
              "inventory"
            ],
            "summary": "Get component inventory information from all network devices",
            "description": "Retrieves the hardware and software component information, such as ASIC, platform, and OS vendor and version information, for all switches and hosts in your network. Refer to the InventoryOutput model for all data collected.",
            "produces": [
              "application/json"
            ],
            "parameters": [
              {
                "name": "eq_timestamp",
                "in": "query",
                "description": "Display results for a given time. Time must be entered in Epoch format. For example, to display the results for Monday February 13, 2019 at 1:25 pm, use a time converter and enter 1550082300.",
                "required": false,
                "type": "integer"
              },
              {
                "name": "count",
                "in": "query",
                "description": "Number of entries to display starting from the offset value. For example, a count of 100 displays 100 entries at a time.",
                "required": false,
                "type": "integer"
              },
              {
                "name": "offset",
                "in": "query",
                "description": "Used in combination with count, offset specifies the starting location within the set of entries returned. For example, an offset of 100 would display results beginning with entry 101.",
                "required": false,
                "type": "integer"
              }
            ],
            "responses": {
              "200": {
                "description": "successful operation",
                "schema": {
                  "$ref": "#/definitions/InventoryOutput"
                }
              },
              "400": {
                "description": "Invalid Input"
              }
            },
            "security": [
              {
                "jwt": []
              }
            ]
          }
        },
        "/object/inventory/hostname/{hostname}": {
          "get": {
            "tags": [
              "inventory"
            ],
            "summary": "Get component inventory information from a given network device by hostname",
            "description": "Retrieves the hardware and software component information, such as ASIC, platform, and OS vendor and version information, for the given switch or host in your network. Refer to the InventoryOutput model for all data collected.",
            "produces": [
              "application/json"
            ],
            "parameters": [
              {
                "name": "hostname",
                "in": "path",
                "description": "User-specified name for a network switch or host. For example, leaf01, spine04, host-6, engr-1, tor-22.",
                "required": true,
                "type": "string"
              },
              {
                "name": "eq_timestamp",
                "in": "query",
                "description": "Display results for a given time. Time must be entered in Epoch format. For example, to display the results for Monday February 13, 2019 at 1:25 pm, use a time converter and enter 1550082300.",
                "required": false,
                "type": "integer"
              },
              {
                "name": "count",
                "in": "query",
                "description": "Number of entries to display starting from the offset value. For example, a count of 100 displays 100 entries at a time.",
                "required": false,
                "type": "integer"
              },
              {
                "name": "offset",
                "in": "query",
                "description": "Used in combination with count, offset specifies the starting location within the set of entries returned. For example, an offset of 100 would display results beginning with entry 101.",
                "required": false,
                "type": "integer"
              }
            ],
            "responses": {
              "200": {
                "description": "successful operation",
                "schema": {
                  "$ref": "#/definitions/InventoryOutput"
                }
              },
              "400": {
                "description": "Invalid Input"
              }
            },
            "security": [
              {
                "jwt": []
              }
            ]
          }
        },
        "/object/lldp": {
          "get": {
            "tags": [
              "lldp"
            ],
            "summary": "Get LLDP information for all network devices",
            "description": "Retrieves Link Layer Discovery Protocol (LLDP) information, such as hostname, interface name, peer hostname, interface name, bridge, router, OS, timestamp, for all switches and hosts in the network. Refer to the LLDP model for all data collected.",
            "produces": [
              "application/json"
            ],
            "parameters": [
              {
                "name": "eq_timestamp",
                "in": "query",
                "description": "Display results for a given time. Time must be entered in Epoch format. For example, to display the results for Monday February 13, 2019 at 1:25 pm, use a time converter and enter 1550082300.",
                "required": false,
                "type": "integer"
              },
              {
                "name": "count",
                "in": "query",
                "description": "Number of entries to display starting from the offset value. For example, a count of 100 displays 100 entries at a time.",
                "required": false,
                "type": "integer"
              },
              {
                "name": "offset",
                "in": "query",
                "description": "Used in combination with count, offset specifies the starting location within the set of entries returned. For example, an offset of 100 would display results beginning with entry 101.",
                "required": false,
                "type": "integer"
              }
            ],
            "responses": {
              "200": {
                "description": "successful operation",
                "schema": {
                  "type": "array",
                  "items": {
                    "$ref": "#/definitions/LLDP"
                  }
                }
              }
            },
            "security": [
              {
                "jwt": []
              }
            ]
          }
        },
        "/object/lldp/hostname/{hostname}": {
          "get": {
            "tags": [
              "lldp"
            ],
            "summary": "Get LLDP information for a given network device by hostname",
            "description": "Retrieves Link Layer Discovery Protocol (LLDP) information, such as hostname, interface name, peer hostname, interface name, bridge, router, OS, timestamp, for the given switch or host. Refer to the LLDP model for all data collected.",
            "produces": [
              "application/json"
            ],
            "parameters": [
              {
                "name": "hostname",
                "in": "path",
                "description": "User-specified name for a network switch or host. For example, leaf01, spine04, host-6, engr-1, tor-22.",
                "required": true,
                "type": "string"
              },
              {
                "name": "eq_timestamp",
                "in": "query",
                "description": "Display results for a given time. Time must be entered in Epoch format. For example, to display the results for Monday February 13, 2019 at 1:25 pm, use a time converter and enter 1550082300.",
                "required": false,
                "type": "integer"
              },
              {
                "name": "count",
                "in": "query",
                "description": "Number of entries to display starting from the offset value. For example, a count of 100 displays 100 entries at a time.",
                "required": false,
                "type": "integer"
              },
              {
                "name": "offset",
                "in": "query",
                "description": "Used in combination with count, offset specifies the starting location within the set of entries returned. For example, an offset of 100 would display results beginning with entry 101.",
                "required": false,
                "type": "integer"
              }
            ],
            "responses": {
              "200": {
                "description": "successful operation",
                "schema": {
                  "type": "array",
                  "items": {
                    "$ref": "#/definitions/LLDP"
                  }
                }
              }
            },
            "security": [
              {
                "jwt": []
              }
            ]
          }
        },
        "/object/macfdb": {
          "get": {
            "tags": [
              "macfdb"
            ],
            "summary": "Get all MAC FDB information for all network devices",
            "description": "Retrieves all MAC address forwarding database (MACFDB) information for all switches and hosts in the network, such as MAC address, timestamp, next hop, destination, port, and VLAN. Refer to MacFdb model for all data collected.",
            "produces": [
              "application/json"
            ],
            "parameters": [
              {
                "name": "eq_timestamp",
                "in": "query",
                "description": "Display results for a given time. Time must be entered in Epoch format. For example, to display the results for Monday February 13, 2019 at 1:25 pm, use a time converter and enter 1550082300.",
                "required": false,
                "type": "integer"
              },
              {
                "name": "count",
                "in": "query",
                "description": "Number of entries to display starting from the offset value. For example, a count of 100 displays 100 entries at a time.",
                "required": false,
                "type": "integer"
              },
              {
                "name": "offset",
                "in": "query",
                "description": "Used in combination with count, offset specifies the starting location within the set of entries returned. For example, an offset of 100 would display results beginning with entry 101.",
                "required": false,
                "type": "integer"
              }
            ],
            "responses": {
              "200": {
                "description": "successful operation",
                "schema": {
                  "type": "array",
                  "items": {
                    "$ref": "#/definitions/MacFdb"
                  }
                }
              }
            },
            "security": [
              {
                "jwt": []
              }
            ]
          }
        },
        "/object/macfdb/hostname/{hostname}": {
          "get": {
            "tags": [
              "macfdb"
            ],
            "summary": "Get all MAC FDB information for a given network device by hostname",
            "description": "Retrieves all MAC address forwarding database (MACFDB) information for a given switch or host in the network, such as MAC address, timestamp, next hop, destination, port, and VLAN. Refer to MacFdb model for all data collected.",
            "produces": [
              "application/json"
            ],
            "parameters": [
              {
                "name": "hostname",
                "in": "path",
                "description": "User-specified name for a network switch or host. For example, leaf01, spine04, host-6, engr-1, tor-22.",
                "required": true,
                "type": "string"
              },
              {
                "name": "eq_timestamp",
                "in": "query",
                "description": "Display results for a given time. Time must be entered in Epoch format. For example, to display the results for Monday February 13, 2019 at 1:25 pm, use a time converter and enter 1550082300.",
                "required": false,
                "type": "integer"
              },
              {
                "name": "count",
                "in": "query",
                "description": "Number of entries to display starting from the offset value. For example, a count of 100 displays 100 entries at a time.",
                "required": false,
                "type": "integer"
              },
              {
                "name": "offset",
                "in": "query",
                "description": "Used in combination with count, offset specifies the starting location within the set of entries returned. For example, an offset of 100 would display results beginning with entry 101.",
                "required": false,
                "type": "integer"
              }
            ],
            "responses": {
              "200": {
                "description": "successful operation",
                "schema": {
                  "type": "array",
                  "items": {
                    "$ref": "#/definitions/MacFdb"
                  }
                }
              }
            },
            "security": [
              {
                "jwt": []
              }
            ]
          }
        },
        "/object/mstp": {
          "get": {
            "tags": [
              "mstp"
            ],
            "summary": "Get all MSTP information from all network devices",
            "description": "Retrieves all Multiple Spanning Tree Protocol (MSTP) information, including bridge and port information, changes made to topology, and so forth for all switches and hosts in the network. Refer to MstpInfo model for all data collected.",
            "produces": [
              "application/json"
            ],
            "parameters": [
              {
                "name": "eq_timestamp",
                "in": "query",
                "description": "Display results for a given time. Time must be entered in Epoch format. For example, to display the results for Monday February 13, 2019 at 1:25 pm, use a time converter and enter 1550082300.",
                "required": false,
                "type": "integer"
              },
              {
                "name": "count",
                "in": "query",
                "description": "Number of entries to display starting from the offset value. For example, a count of 100 displays 100 entries at a time.",
                "required": false,
                "type": "integer"
              },
              {
                "name": "offset",
                "in": "query",
                "description": "Used in combination with count, offset specifies the starting location within the set of entries returned. For example, an offset of 100 would display results beginning with entry 101.",
                "required": false,
                "type": "integer"
              }
            ],
            "responses": {
              "200": {
                "description": "successful operation",
                "schema": {
                  "type": "array",
                  "items": {
                    "$ref": "#/definitions/MstpInfo"
                  }
                }
              }
            },
            "security": [
              {
                "jwt": []
              }
            ]
          }
        },
        "/object/mstp/hostname/{hostname}": {
          "get": {
            "tags": [
              "mstp"
            ],
            "summary": "Get all MSTP information from a given network device by hostname",
            "description": "Retrieves all MSTP information, including bridge and port information, changes made to topology, and so forth for a given switch or host in the network.  Refer to MstpInfo model for all data collected.",
            "produces": [
              "application/json"
            ],
            "parameters": [
              {
                "name": "hostname",
                "in": "path",
                "description": "User-specified name for a network switch or host. For example, leaf01, spine04, host-6, engr-1, tor-22.",
                "required": true,
                "type": "string"
              },
              {
                "name": "eq_timestamp",
                "in": "query",
                "description": "Display results for a given time. Time must be entered in Epoch format. For example, to display the results for Monday February 13, 2019 at 1:25 pm, use a time converter and enter 1550082300.",
                "required": false,
                "type": "integer"
              },
              {
                "name": "count",
                "in": "query",
                "description": "Number of entries to display starting from the offset value. For example, a count of 100 displays 100 entries at a time.",
                "required": false,
                "type": "integer"
              },
              {
                "name": "offset",
                "in": "query",
                "description": "Used in combination with count, offset specifies the starting location within the set of entries returned. For example, an offset of 100 would display results beginning with entry 101.",
                "required": false,
                "type": "integer"
              }
            ],
            "responses": {
              "200": {
                "description": "successful operation",
                "schema": {
                  "type": "array",
                  "items": {
                    "$ref": "#/definitions/MstpInfo"
                  }
                }
              }
            },
            "security": [
              {
                "jwt": []
              }
            ]
          }
        },
        "/object/neighbor": {
          "get": {
            "tags": [
              "neighbor"
            ],
            "summary": "Get neighbor information for all network devices",
            "description": "Retrieves neighbor information, such as hostname, addresses, VRF, interface name and index, for all switches and hosts in the network.  Refer to Neighbor model for all data collected.",
            "produces": [
              "application/json"
            ],
            "parameters": [
              {
                "name": "eq_timestamp",
                "in": "query",
                "description": "Display results for a given time. Time must be entered in Epoch format. For example, to display the results for Monday February 13, 2019 at 1:25 pm, use a time converter and enter 1550082300.",
                "required": false,
                "type": "integer"
              },
              {
                "name": "count",
                "in": "query",
                "description": "Number of entries to display starting from the offset value. For example, a count of 100 displays 100 entries at a time.",
                "required": false,
                "type": "integer"
              },
              {
                "name": "offset",
                "in": "query",
                "description": "Used in combination with count, offset specifies the starting location within the set of entries returned. For example, an offset of 100 would display results beginning with entry 101.",
                "required": false,
                "type": "integer"
              }
            ],
            "responses": {
              "200": {
                "description": "successful operation",
                "schema": {
                  "type": "array",
                  "items": {
                    "$ref": "#/definitions/Neighbor"
                  }
                }
              }
            },
            "security": [
              {
                "jwt": []
              }
            ]
          }
        },
        "/object/neighbor/hostname/{hostname}": {
          "get": {
            "tags": [
              "neighbor"
            ],
            "summary": "Get neighbor information for a given network device by hostname",
            "description": "Retrieves neighbor information, such as hostname, addresses, VRF, interface name and index, for a given switch or host in the network.  Refer to Neighbor model for all data collected.",
            "produces": [
              "application/json"
            ],
            "parameters": [
              {
                "name": "hostname",
                "in": "path",
                "description": "User-specified name for a network switch or host. For example, leaf01, spine04, host-6, engr-1, tor-22.",
                "required": true,
                "type": "string"
              },
              {
                "name": "eq_timestamp",
                "in": "query",
                "description": "Display results for a given time. Time must be entered in Epoch format. For example, to display the results for Monday February 13, 2019 at 1:25 pm, use a time converter and enter 1550082300.",
                "required": false,
                "type": "integer"
              },
              {
                "name": "count",
                "in": "query",
                "description": "Number of entries to display starting from the offset value. For example, a count of 100 displays 100 entries at a time.",
                "required": false,
                "type": "integer"
              },
              {
                "name": "offset",
                "in": "query",
                "description": "Used in combination with count, offset specifies the starting location within the set of entries returned. For example, an offset of 100 would display results beginning with entry 101.",
                "required": false,
                "type": "integer"
              }
            ],
            "responses": {
              "200": {
                "description": "successful operation",
                "schema": {
                  "type": "array",
                  "items": {
                    "$ref": "#/definitions/Neighbor"
                  }
                }
              }
            },
            "security": [
              {
                "jwt": []
              }
            ]
          }
        },
        "/object/node": {
          "get": {
            "tags": [
              "node"
            ],
            "summary": "Get device status for all network devices",
            "description": "Retrieves hostname, uptime, last update, boot and re-initialization time, version, NTP and DB state, timestamp, and its current state (active or not) for all switches and hosts in the network.",
            "produces": [
              "application/json"
            ],
            "parameters": [
              {
                "name": "eq_timestamp",
                "in": "query",
                "description": "Display results for a given time. Time must be entered in Epoch format. For example, to display the results for Monday February 13, 2019 at 1:25 pm, use a time converter and enter 1550082300.",
                "required": false,
                "type": "integer"
              },
              {
                "name": "count",
                "in": "query",
                "description": "Number of entries to display starting from the offset value. For example, a count of 100 displays 100 entries at a time.",
                "required": false,
                "type": "integer"
              },
              {
                "name": "offset",
                "in": "query",
                "description": "Used in combination with count, offset specifies the starting location within the set of entries returned. For example, an offset of 100 would display results beginning with entry 101.",
                "required": false,
                "type": "integer"
              }
            ],
            "responses": {
              "200": {
                "description": "successful operation",
                "schema": {
                  "type": "array",
                  "items": {
                    "$ref": "#/definitions/NODE"
                  }
                }
              }
            },
            "security": [
              {
                "jwt": []
              }
            ]
          }
        },
        "/object/node/hostname/{hostname}": {
          "get": {
            "tags": [
              "node"
            ],
            "summary": "Get device status for a given network device by hostname",
            "description": "Retrieves hostname, uptime, last update, boot and re-initialization time, version, NTP and DB state, timestamp, and its current state (active or not) for a given switch or host in the network.",
            "produces": [
              "application/json"
            ],
            "parameters": [
              {
                "name": "hostname",
                "in": "path",
                "description": "User-specified name for a network switch or host. For example, leaf01, spine04, host-6, engr-1, tor-22.",
                "required": true,
                "type": "string"
              },
              {
                "name": "eq_timestamp",
                "in": "query",
                "description": "Display results for a given time. Time must be entered in Epoch format. For example, to display the results for Monday February 13, 2019 at 1:25 pm, use a time converter and enter 1550082300.",
                "required": false,
                "type": "integer"
              },
              {
                "name": "count",
                "in": "query",
                "description": "Number of entries to display starting from the offset value. For example, a count of 100 displays 100 entries at a time.",
                "required": false,
                "type": "integer"
              },
              {
                "name": "offset",
                "in": "query",
                "description": "Used in combination with count, offset specifies the starting location within the set of entries returned. For example, an offset of 100 would display results beginning with entry 101.",
                "required": false,
                "type": "integer"
              }
            ],
            "responses": {
              "200": {
                "description": "successful operation",
                "schema": {
                  "type": "array",
                  "items": {
                    "$ref": "#/definitions/NODE"
                  }
                }
              }
            },
            "security": [
              {
                "jwt": []
              }
            ]
          }
        },
        "/object/ntp": {
          "get": {
            "tags": [
              "ntp"
            ],
            "summary": "Get all NTP information for all network devices",
            "description": "Retrieves all Network Time Protocol (NTP) configuration and status information, such as whether the service is running and if it is in time synchronization, for all switches and hosts in the network. Refer to the NTP model for all data collected.",
            "produces": [
              "application/json"
            ],
            "parameters": [
              {
                "name": "eq_timestamp",
                "in": "query",
                "description": "Display results for a given time. Time must be entered in Epoch format. For example, to display the results for Monday February 13, 2019 at 1:25 pm, use a time converter and enter 1550082300.",
                "required": false,
                "type": "integer"
              },
              {
                "name": "count",
                "in": "query",
                "description": "Number of entries to display starting from the offset value. For example, a count of 100 displays 100 entries at a time.",
                "required": false,
                "type": "integer"
              },
              {
                "name": "offset",
                "in": "query",
                "description": "Used in combination with count, offset specifies the starting location within the set of entries returned. For example, an offset of 100 would display results beginning with entry 101.",
                "required": false,
                "type": "integer"
              }
            ],
            "responses": {
              "200": {
                "description": "successful operation",
                "schema": {
                  "type": "array",
                  "items": {
                    "$ref": "#/definitions/NTP"
                  }
                }
              }
            },
            "security": [
              {
                "jwt": []
              }
            ]
          }
        },
        "/object/ntp/hostname/{hostname}": {
          "get": {
            "tags": [
              "ntp"
            ],
            "summary": "Get all NTP information for a given network device by hostname",
            "description": "Retrieves all Network Time Protocol (NTP) configuration and status information, such as whether the service is running and if it is in time synchronization, for a given switch or host in the network. Refer to the NTP model for all data collected.",
            "produces": [
              "application/json"
            ],
            "parameters": [
              {
                "name": "hostname",
                "in": "path",
                "description": "User-specified name for a network switch or host. For example, leaf01, spine04, host-6, engr-1, tor-22.",
                "required": true,
                "type": "string"
              },
              {
                "name": "eq_timestamp",
                "in": "query",
                "description": "Display results for a given time. Time must be entered in Epoch format. For example, to display the results for Monday February 13, 2019 at 1:25 pm, use a time converter and enter 1550082300.",
                "required": false,
                "type": "integer"
              },
              {
                "name": "count",
                "in": "query",
                "description": "Number of entries to display starting from the offset value. For example, a count of 100 displays 100 entries at a time.",
                "required": false,
                "type": "integer"
              },
              {
                "name": "offset",
                "in": "query",
                "description": "Used in combination with count, offset specifies the starting location within the set of entries returned. For example, an offset of 100 would display results beginning with entry 101.",
                "required": false,
                "type": "integer"
              }
            ],
            "responses": {
              "200": {
                "description": "successful operation",
                "schema": {
                  "type": "array",
                  "items": {
                    "$ref": "#/definitions/NTP"
                  }
                }
              }
            },
            "security": [
              {
                "jwt": []
              }
            ]
          }
        },
        "/object/port": {
          "get": {
            "tags": [
              "port"
            ],
            "summary": "Get all information for all physical ports on all network devices",
            "description": "Retrieves all physical port information, such as speed, connector, vendor, part and serial number, and FEC support, for all network devices. Refer to Port model for all data collected.",
            "produces": [
              "application/json"
            ],
            "parameters": [
              {
                "name": "eq_timestamp",
                "in": "query",
                "description": "Display results for a given time. Time must be entered in Epoch format. For example, to display the results for Monday February 13, 2019 at 1:25 pm, use a time converter and enter 1550082300.",
                "required": false,
                "type": "integer"
              },
              {
                "name": "count",
                "in": "query",
                "description": "Number of entries to display starting from the offset value. For example, a count of 100 displays 100 entries at a time.",
                "required": false,
                "type": "integer"
              },
              {
                "name": "offset",
                "in": "query",
                "description": "Used in combination with count, offset specifies the starting location within the set of entries returned. For example, an offset of 100 would display results beginning with entry 101.",
                "required": false,
                "type": "integer"
              }
            ],
            "responses": {
              "200": {
                "description": "successful operation",
                "schema": {
                  "type": "array",
                  "items": {
                    "$ref": "#/definitions/Port"
                  }
                }
              }
            },
            "security": [
              {
                "jwt": []
              }
            ]
          }
        },
        "/object/port/hostname/{hostname}": {
          "get": {
            "tags": [
              "port"
            ],
            "summary": "Get all information for all physical ports on a given network device by hostname",
            "description": "Retrieves all physical port information, such as speed, connector, vendor, part and serial number, and FEC support, for a given switch or host in the network. Refer to Port model for all data collected.",
            "produces": [
              "application/json"
            ],
            "parameters": [
              {
                "name": "hostname",
                "in": "path",
                "description": "User-specified name for a network switch or host. For example, leaf01, spine04, host-6, engr-1, tor-22.",
                "required": true,
                "type": "string"
              },
              {
                "name": "eq_timestamp",
                "in": "query",
                "description": "Display results for a given time. Time must be entered in Epoch format. For example, to display the results for Monday February 13, 2019 at 1:25 pm, use a time converter and enter 1550082300.",
                "required": false,
                "type": "integer"
              },
              {
                "name": "count",
                "in": "query",
                "description": "Number of entries to display starting from the offset value. For example, a count of 100 displays 100 entries at a time.",
                "required": false,
                "type": "integer"
              },
              {
                "name": "offset",
                "in": "query",
                "description": "Used in combination with count, offset specifies the starting location within the set of entries returned. For example, an offset of 100 would display results beginning with entry 101.",
                "required": false,
                "type": "integer"
              }
            ],
            "responses": {
              "200": {
                "description": "successful operation",
                "schema": {
                  "type": "array",
                  "items": {
                    "$ref": "#/definitions/Port"
                  }
                }
              }
            },
            "security": [
              {
                "jwt": []
              }
            ]
          }
        },
        "/object/route": {
          "get": {
            "tags": [
              "route"
            ],
            "summary": "Get all route information for all network devices",
            "description": "Retrieves route information, such as VRF, source, next hops, origin, protocol, and prefix, for all switches and hosts in the network. Refer to Route model for all data collected.",
            "produces": [
              "application/json"
            ],
            "parameters": [
              {
                "name": "eq_timestamp",
                "in": "query",
                "description": "Display results for a given time. Time must be entered in Epoch format. For example, to display the results for Monday February 13, 2019 at 1:25 pm, use a time converter and enter 1550082300.",
                "required": false,
                "type": "integer"
              },
              {
                "name": "count",
                "in": "query",
                "description": "Number of entries to display starting from the offset value. For example, a count of 100 displays 100 entries at a time.",
                "required": false,
                "type": "integer"
              },
              {
                "name": "offset",
                "in": "query",
                "description": "Used in combination with count, offset specifies the starting location within the set of entries returned. For example, an offset of 100 would display results beginning with entry 101.",
                "required": false,
                "type": "integer"
              }
            ],
            "responses": {
              "200": {
                "description": "successful operation",
                "schema": {
                  "type": "array",
                  "items": {
                    "$ref": "#/definitions/Route"
                  }
                }
              }
            },
            "security": [
              {
                "jwt": []
              }
            ]
          }
        },
        "/object/route/hostname/{hostname}": {
          "get": {
            "tags": [
              "route"
            ],
            "summary": "Get all route information for a given network device by hostname",
            "description": "Retrieves route information, such as VRF, source, next hops, origin, protocol, and prefix, for a given switch or host in the network. Refer to Route model for all data collected.",
            "produces": [
              "application/json"
            ],
            "parameters": [
              {
                "name": "hostname",
                "in": "path",
                "description": "User-specified name for a network switch or host. For example, leaf01, spine04, host-6, engr-1, tor-22.",
                "required": true,
                "type": "string"
              },
              {
                "name": "eq_timestamp",
                "in": "query",
                "description": "Display results for a given time. Time must be entered in Epoch format. For example, to display the results for Monday February 13, 2019 at 1:25 pm, use a time converter and enter 1550082300.",
                "required": false,
                "type": "integer"
              },
              {
                "name": "count",
                "in": "query",
                "description": "Number of entries to display starting from the offset value. For example, a count of 100 displays 100 entries at a time.",
                "required": false,
                "type": "integer"
              },
              {
                "name": "offset",
                "in": "query",
                "description": "Used in combination with count, offset specifies the starting location within the set of entries returned. For example, an offset of 100 would display results beginning with entry 101.",
                "required": false,
                "type": "integer"
              }
            ],
            "responses": {
              "200": {
                "description": "successful operation",
                "schema": {
                  "type": "array",
                  "items": {
                    "$ref": "#/definitions/Route"
                  }
                }
              }
            },
            "security": [
              {
                "jwt": []
              }
            ]
          }
        },
        "/object/sensor": {
          "get": {
            "tags": [
              "sensor"
            ],
            "summary": "Get all sensor information for all network devices",
            "description": "Retrieves data from fan, temperature, and power supply unit sensors, such as their name, state, and threshold status, for all switches and hosts in the network. Refer to Sensor model for all data collected.",
            "produces": [
              "application/json"
            ],
            "parameters": [
              {
                "name": "eq_timestamp",
                "in": "query",
                "description": "Display results for a given time. Time must be entered in Epoch format. For example, to display the results for Monday February 13, 2019 at 1:25 pm, use a time converter and enter 1550082300.",
                "required": false,
                "type": "integer"
              },
              {
                "name": "count",
                "in": "query",
                "description": "Number of entries to display starting from the offset value. For example, a count of 100 displays 100 entries at a time.",
                "required": false,
                "type": "integer"
              },
              {
                "name": "offset",
                "in": "query",
                "description": "Used in combination with count, offset specifies the starting location within the set of entries returned. For example, an offset of 100 would display results beginning with entry 101.",
                "required": false,
                "type": "integer"
              }
            ],
            "responses": {
              "200": {
                "description": "successful operation",
                "schema": {
                  "type": "array",
                  "items": {
                    "$ref": "#/definitions/Sensor"
                  }
                }
              }
            },
            "security": [
              {
                "jwt": []
              }
            ]
          }
        },
        "/object/sensor/hostname/{hostname}": {
          "get": {
            "tags": [
              "sensor"
            ],
            "summary": "Get all sensor information for a given network device by hostname",
            "description": "Retrieves data from fan, temperature, and power supply unit sensors, such as their name, state, and threshold status, for a given switch or host in the network. Refer to Sensor model for all data collected.",
            "produces": [
              "application/json"
            ],
            "parameters": [
              {
                "name": "hostname",
                "in": "path",
                "description": "User-specified name for a network switch or host. For example, leaf01, spine04, host-6, engr-1, tor-22.",
                "required": true,
                "type": "string"
              },
              {
                "name": "eq_timestamp",
                "in": "query",
                "description": "Display results for a given time. Time must be entered in Epoch format. For example, to display the results for Monday February 13, 2019 at 1:25 pm, use a time converter and enter 1550082300.",
                "required": false,
                "type": "integer"
              },
              {
                "name": "count",
                "in": "query",
                "description": "Number of entries to display starting from the offset value. For example, a count of 100 displays 100 entries at a time.",
                "required": false,
                "type": "integer"
              },
              {
                "name": "offset",
                "in": "query",
                "description": "Used in combination with count, offset specifies the starting location within the set of entries returned. For example, an offset of 100 would display results beginning with entry 101.",
                "required": false,
                "type": "integer"
              }
            ],
            "responses": {
              "200": {
                "description": "successful operation",
                "schema": {
                  "type": "array",
                  "items": {
                    "$ref": "#/definitions/Sensor"
                  }
                }
              }
            },
            "security": [
              {
                "jwt": []
              }
            ]
          }
        },
        "/object/services": {
          "get": {
            "tags": [
              "services"
            ],
            "summary": "Get all services information for all network devices",
            "description": "Retrieves services information, such as XXX, for all switches and hosts in the network. Refer to Services for all data collected.",
            "produces": [
              "application/json"
            ],
            "parameters": [
              {
                "name": "eq_timestamp",
                "in": "query",
                "description": "Display results for a given time. Time must be entered in Epoch format. For example, to display the results for Monday February 13, 2019 at 1:25 pm, use a time converter and enter 1550082300.",
                "required": false,
                "type": "integer"
              },
              {
                "name": "count",
                "in": "query",
                "description": "Number of entries to display starting from the offset value. For example, a count of 100 displays 100 entries at a time.",
                "required": false,
                "type": "integer"
              },
              {
                "name": "offset",
                "in": "query",
                "description": "Used in combination with count, offset specifies the starting location within the set of entries returned. For example, an offset of 100 would display results beginning with entry 101.",
                "required": false,
                "type": "integer"
              }
            ],
            "responses": {
              "200": {
                "description": "successful operation",
                "schema": {
                  "type": "array",
                  "items": {
                    "$ref": "#/definitions/Services"
                  }
                }
              }
            },
            "security": [
              {
                "jwt": []
              }
            ]
          }
        },
        "/object/services/hostname/{hostname}": {
          "get": {
            "tags": [
              "services"
            ],
            "summary": "Get all services information for a given network device by hostname",
            "description": "Retrieves services information, such as XXX, for a given switch or host in the network. Refer to Services for all data collected.",
            "produces": [
              "application/json"
            ],
            "parameters": [
              {
                "name": "hostname",
                "in": "path",
                "description": "User-specified name for a network switch or host. For example, leaf01, spine04, host-6, engr-1, tor-22.",
                "required": true,
                "type": "string"
              },
              {
                "name": "eq_timestamp",
                "in": "query",
                "description": "Display results for a given time. Time must be entered in Epoch format. For example, to display the results for Monday February 13, 2019 at 1:25 pm, use a time converter and enter 1550082300.",
                "required": false,
                "type": "integer"
              },
              {
                "name": "count",
                "in": "query",
                "description": "Number of entries to display starting from the offset value. For example, a count of 100 displays 100 entries at a time.",
                "required": false,
                "type": "integer"
              },
              {
                "name": "offset",
                "in": "query",
                "description": "Used in combination with count, offset specifies the starting location within the set of entries returned. For example, an offset of 100 would display results beginning with entry 101.",
                "required": false,
                "type": "integer"
              }
            ],
            "responses": {
              "200": {
                "description": "successful operation",
                "schema": {
                  "type": "array",
                  "items": {
                    "$ref": "#/definitions/Services"
                  }
                }
              }
            },
            "security": [
              {
                "jwt": []
              }
            ]
          }
        },
        "/object/vlan": {
          "get": {
            "tags": [
              "vlan"
            ],
            "summary": "Get all VLAN information for all network devices",
            "description": "Retrieves VLAN information, such as hostname, interface name, associated VLANs, ports, and time of last change, for all switches and hosts in the network. Refer to Vlan model for all data collected.",
            "produces": [
              "application/json"
            ],
            "parameters": [
              {
                "name": "eq_timestamp",
                "in": "query",
                "description": "Display results for a given time. Time must be entered in Epoch format. For example, to display the results for Monday February 13, 2019 at 1:25 pm, use a time converter and enter 1550082300.",
                "required": false,
                "type": "integer"
              },
              {
                "name": "count",
                "in": "query",
                "description": "Number of entries to display starting from the offset value. For example, a count of 100 displays 100 entries at a time.",
                "required": false,
                "type": "integer"
              },
              {
                "name": "offset",
                "in": "query",
                "description": "Used in combination with count, offset specifies the starting location within the set of entries returned. For example, an offset of 100 would display results beginning with entry 101.",
                "required": false,
                "type": "integer"
              }
            ],
            "responses": {
              "200": {
                "description": "successful operation",
                "schema": {
                  "type": "array",
                  "items": {
                    "$ref": "#/definitions/Vlan"
                  }
                }
              }
            },
            "security": [
              {
                "jwt": []
              }
            ]
          }
        },
        "/object/vlan/hostname/{hostname}": {
          "get": {
            "tags": [
              "vlan"
            ],
            "summary": "Get all VLAN information for a given network device by hostname",
            "description": "Retrieves VLAN information, such as hostname, interface name, associated VLANs, ports, and time of last change, for a given switch or  host in the network. Refer to Vlan model for all data collected.",
            "produces": [
              "application/json"
            ],
            "parameters": [
              {
                "name": "hostname",
                "in": "path",
                "description": "User-specified name for a network switch or host. For example, leaf01, spine04, host-6, engr-1, tor-22.",
                "required": true,
                "type": "string"
              },
              {
                "name": "eq_timestamp",
                "in": "query",
                "description": "Display results for a given time. Time must be entered in Epoch format. For example, to display the results for Monday February 13, 2019 at 1:25 pm, use a time converter and enter 1550082300.",
                "required": false,
                "type": "integer"
              },
              {
                "name": "count",
                "in": "query",
                "description": "Number of entries to display starting from the offset value. For example, a count of 100 displays 100 entries at a time.",
                "required": false,
                "type": "integer"
              },
              {
                "name": "offset",
                "in": "query",
                "description": "Used in combination with count, offset specifies the starting location within the set of entries returned. For example, an offset of 100 would display results beginning with entry 101.",
                "required": false,
                "type": "integer"
              }
            ],
            "responses": {
              "200": {
                "description": "successful operation",
                "schema": {
                  "type": "array",
                  "items": {
                    "$ref": "#/definitions/Vlan"
                  }
                }
              }
            },
            "security": [
              {
                "jwt": []
              }
            ]
          }
        }
      },
      "securityDefinitions": {
        "jwt": {
          "type": "apiKey",
          "name": "Authorization",
          "in": "header"
        }
      },
      "definitions": {
        "Address": {
          "description": "This model contains descriptions of the data collected and returned by the Address endpoint.",
          "type": "object",
          "required": [
            "schema"
          ],
          "properties": {
            "schema": {
              "$ref": "#/definitions/Schema"
            },
            "opid": {
              "type": "integer",
              "format": "int32",
              "description": "Internal use only"
            },
            "hostname": {
              "type": "string",
              "description": "User-defined name for a device"
            },
            "ifname": {
              "type": "string",
              "description": "Name of a software (versus physical) interface"
            },
            "timestamp": {
              "type": "integer",
              "format": "int64",
              "description": "Date and time data was collected"
            },
            "prefix": {
              "type": "string",
              "description": "Address prefix for IPv4, IPv6, or EVPN traffic"
            },
            "mask": {
              "type": "integer",
              "format": "int32",
              "description": "Address mask for IPv4, IPv6, or EVPN traffic"
            },
            "is_ipv6": {
              "type": "boolean",
              "description": "Indicates whether address is an IPv6 address (true) or not (false)"
            },
            "vrf": {
              "type": "string",
              "description": "Virtual Route Forwarding interface name"
            },
            "db_state": {
              "type": "string",
              "description": "Internal use only"
            }
          }
        },
        "BgpSession": {
          "description": "This model contains descriptions of the data collected and returned by the BGP endpoint.",
          "type": "object",
          "required": [
            "schema"
          ],
          "properties": {
            "schema": {
              "$ref": "#/definitions/Schema"
            },
            "opid": {
              "type": "integer",
              "format": "int32",
              "description": "Internal use only"
            },
            "hostname": {
              "type": "string",
              "description": "User-defined name for a device"
            },
            "peer_name": {
              "type": "string",
              "description": "Interface name or hostname for a peer device"
            },
            "timestamp": {
              "type": "integer",
              "format": "int64",
              "description": "Date and time data was collected"
            },
            "db_state": {
              "type": "string",
              "description": "Internal use only"
            },
            "state": {
              "type": "string",
              "description": "Current state of the BGP session. Values include established and not established."
            },
            "peer_router_id": {
              "type": "string",
              "description": "If peer is a router, IP address of router"
            },
            "peer_asn": {
              "type": "integer",
              "format": "int64",
              "description": "Peer autonomous system number (ASN), identifier for a collection of IP networks and routers"
            },
            "peer_hostname": {
              "type": "string",
              "description": "User-defined name for the peer device"
            },
            "asn": {
              "type": "integer",
              "format": "int64",
              "description": "Host autonomous system number (ASN), identifier for a collection of IP networks and routers"
            },
            "reason": {
              "type": "string",
              "description": "Text describing the cause of, or trigger for, an event"
            },
            "ipv4_pfx_rcvd": {
              "type": "integer",
              "format": "int32",
              "description": "Address prefix received for an IPv4 address"
            },
            "ipv6_pfx_rcvd": {
              "type": "integer",
              "format": "int32",
              "description": "Address prefix received for an IPv6 address"
            },
            "evpn_pfx_rcvd": {
              "type": "integer",
              "format": "int32",
              "description": "Address prefix received for an EVPN address"
            },
            "last_reset_time": {
              "type": "number",
              "format": "float",
              "description": "Date and time at which the session was last established or reset"
            },
            "up_time": {
              "type": "number",
              "format": "float",
              "description": "Number of seconds the session has been established, in EPOCH notation"
            },
            "conn_estd": {
              "type": "integer",
              "format": "int32",
              "description": "Number of connections established for a given session"
            },
            "conn_dropped": {
              "type": "integer",
              "format": "int32",
              "description": "Number of dropped connections for a given session"
            },
            "upd8_rx": {
              "type": "integer",
              "format": "int32",
              "description": "Count of protocol messages received"
            },
            "upd8_tx": {
              "type": "integer",
              "format": "int32",
              "description": "Count of protocol messages transmitted"
            },
            "vrfid": {
              "type": "integer",
              "format": "int32",
              "description": "Integer identifier of the VRF interface when used"
            },
            "vrf": {
              "type": "string",
              "description": "Name of the Virtual Route Forwarding interface"
            },
            "tx_families": {
              "type": "string",
              "description": "Address families supported for the transmit session channel. Values include ipv4, ipv6, and evpn."
            },
            "rx_families": {
              "type": "string",
              "description": "Address families supported for the receive session channel. Values include ipv4, ipv6, and evpn."
            }
          }
        },
        "ClagSessionInfo": {
          "description": "This model contains descriptions of the data collected and returned by the CLAG endpoint.",
          "type": "object",
          "required": [
            "schema"
          ],
          "properties": {
            "schema": {
              "$ref": "#/definitions/Schema"
            },
            "opid": {
              "type": "integer",
              "format": "int32",
              "description": "Internal use only"
            },
            "hostname": {
              "type": "string",
              "description": "User-defined name for a device"
            },
            "clag_sysmac": {
              "type": "string",
              "description": "Unique MAC address for each bond interface pair. This must be a value between 44:38:39:ff:00:00 and 44:38:39:ff:ff:ff."
            },
            "timestamp": {
              "type": "integer",
              "format": "int64",
              "description": "Date and time the CLAG session was started, deleted, updated, or marked dead (device went down)"
            },
            "db_state": {
              "type": "string",
              "description": "Internal use only"
            },
            "peer_role": {
              "type": "string",
              "description": "Role of the peer device. Values include primary and secondary."
            },
            "peer_state": {
              "type": "boolean",
              "description": "Indicates if peer device is up (true) or down (false)"
            },
            "peer_if": {
              "type": "string",
              "description": "Name of the peer interface used for the session"
            },
            "backup_ip_active": {
              "type": "boolean",
              "description": "Indicates whether the backup IP address has been specified and is active (true) or not (false)"
            },
            "backup_ip": {
              "type": "string",
              "description": "IP address of the interface to use if the peerlink (or bond) goes down"
            },
            "single_bonds": {
              "type": "string",
              "description": "Identifies a set of interfaces connecting to only one of the two switches in the bond"
            },
            "dual_bonds": {
              "type": "string",
              "description": "Identifies a set of interfaces connecting to both switches in the bond"
            },
            "conflicted_bonds": {
              "type": "string",
              "description": "Identifies the set of interfaces in a bond that do not match on each end of the bond"
            },
            "proto_down_bonds": {
              "type": "string",
              "description": "Interface on the switch brought down by the clagd service. Value is blank if no interfaces are down due to the clagd service."
            },
            "vxlan_anycast": {
              "type": "string",
              "description": "Anycast IP address used for VXLAN termination"
            },
            "role": {
              "type": "string",
              "description": "Role of the host device. Values include primary and secondary."
            }
          }
        },
        "ErrorResponse": {
          "description": "Standard error response",
          "type": "object",
          "properties": {
            "message": {
              "type": "string",
              "description": "One or more errors have been encountered during the processing of the associated request"
            }
          }
        },
        "Evpn": {
          "description": "This model contains descriptions of the data collected and returned by the EVPN endpoint.",
          "type": "object",
          "required": [
            "schema"
          ],
          "properties": {
            "schema": {
              "$ref": "#/definitions/Schema"
            },
            "hostname": {
              "type": "string",
              "description": "User-defined name for a device"
            },
            "opid": {
              "type": "integer",
              "format": "int32",
              "description": "Internal use only"
            },
            "vni": {
              "type": "integer",
              "format": "int32",
              "description": "Name of the virtual network instance (VNI) where session is running"
            },
            "origin_ip": {
              "type": "string",
              "description": "Host device's local VXLAN tunnel IP address for the EVPN instance"
            },
            "timestamp": {
              "type": "integer",
              "format": "int64",
              "description": "Date and time the session was started, deleted, updated or marked as dead (device is down)"
            },
            "rd": {
              "type": "string",
              "description": "Route distinguisher used in the filtering mechanism for BGP route exchange"
            },
            "export_rt": {
              "type": "string",
              "description": "IP address and port of the export route target used in the filtering mechanism for BGP route exchange"
            },
            "import_rt": {
              "type": "string",
              "description": "IP address and port of the import route target used in the filtering mechanism for BGP route exchange"
            },
            "in_kernel": {
              "type": "boolean",
              "description": "Indicates whether the associated VNI is in the kernel (in kernel) or not (not in kernel)"
            },
            "adv_all_vni": {
              "type": "boolean",
              "description": "Indicates whether the VNI state is advertising all VNIs (true) or not (false)"
            },
            "adv_gw_ip": {
              "type": "string",
              "description": "Indicates whether the host device is advertising the gateway IP address (true) or not (false)"
            },
            "is_l3": {
              "type": "boolean",
              "description": "Indicates whether the session is part of a layer 3 configuration (true) or not (false)"
            },
            "db_state": {
              "type": "string",
              "description": "Internal use only"
            }
          }
        },
        "Field": {
          "type": "object",
          "required": [
            "aliases",
            "defaultValue",
            "doc",
            "jsonProps",
            "name",
            "objectProps",
            "order",
            "props",
            "schema"
          ],
          "properties": {
            "props": {
              "type": "object",
              "additionalProperties": {
                "type": "string"
              }
            },
            "name": {
              "type": "string"
            },
            "schema": {
              "$ref": "#/definitions/Schema"
            },
            "doc": {
              "type": "string"
            },
            "defaultValue": {
              "$ref": "#/definitions/JsonNode"
            },
            "order": {
              "type": "string",
              "enum": [
                "ASCENDING",
                "DESCENDING",
                "IGNORE"
              ]
            },
            "aliases": {
              "type": "array",
              "uniqueItems": true,
              "items": {
                "type": "string"
              }
            },
            "jsonProps": {
              "type": "object",
              "additionalProperties": {
                "$ref": "#/definitions/JsonNode"
              }
            },
            "objectProps": {
              "type": "object",
              "additionalProperties": {
                "type": "object",
                "properties": {}
              }
            }
          }
        },
        "Interface": {
          "description": "This model contains descriptions of the data collected and returned by the Interface endpoint.",
          "type": "object",
          "required": [
            "schema"
          ],
          "properties": {
            "schema": {
              "$ref": "#/definitions/Schema"
            },
            "opid": {
              "type": "integer",
              "format": "int32",
              "description": "Internal use only"
            },
            "hostname": {
              "type": "string",
              "description": "User-defined name for a device"
            },
            "type": {
              "type": "string",
              "description": "Identifier of the kind of interface. Values include bond, bridge, eth, loopback, macvlan, swp, vlan, vrf, and vxlan."
            },
            "timestamp": {
              "type": "integer",
              "format": "int64",
              "description": "Date and time the data was collected"
            },
            "last_changed": {
              "type": "integer",
              "format": "int64",
              "description": "Date and time the interface was started, deleted, updated or marked as dead (device is down)"
            },
            "ifname": {
              "type": "string",
              "description": "Name of the interface"
            },
            "state": {
              "type": "string",
              "description": "Indicates whether the interface is up or down"
            },
            "vrf": {
              "type": "string",
              "description": "Name of the virtual route forwarding (VRF) interface, if present"
            },
            "details": {
              "type": "string",
              "description": ""
            }
          }
        },
        "InventoryModel": {
          "type": "object",
          "required": [
            "label",
            "value"
          ],
          "properties": {
            "label": {
              "type": "string"
            },
            "value": {
              "type": "integer",
              "format": "int32"
            }
          }
        },
        "InventoryOutput": {
          "type": "object",
          "properties": {
            "data": {
              "$ref": "#/definitions/InventorySampleClass"
            }
          }
        },
        "InventorySampleClass": {
          "type": "object",
          "properties": {
            "total": {
              "type": "integer",
              "format": "int32",
              "example": 100,
              "description": "total number of devices"
            },
            "os_version": {
              "$ref": "#/definitions/InventorySuperModel"
            },
            "os_vendor": {
              "$ref": "#/definitions/InventorySuperModel"
            },
            "asic": {
              "$ref": "#/definitions/InventorySuperModel"
            },
            "asic_vendor": {
              "$ref": "#/definitions/InventorySuperModel"
            },
            "asic_model": {
              "$ref": "#/definitions/InventorySuperModel"
            },
            "cl_license": {
              "$ref": "#/definitions/InventorySuperModel"
            },
            "agent_version": {
              "$ref": "#/definitions/InventorySuperModel"
            },
            "agent_state": {
              "$ref": "#/definitions/InventorySuperModel"
            },
            "platform": {
              "$ref": "#/definitions/InventorySuperModel"
            },
            "platform_vendor": {
              "$ref": "#/definitions/InventorySuperModel"
            },
            "disk_size": {
              "$ref": "#/definitions/InventorySuperModel"
            },
            "memory_size": {
              "$ref": "#/definitions/InventorySuperModel"
            },
            "platform_model": {
              "$ref": "#/definitions/InventorySuperModel"
            },
            "interface_speeds": {
              "$ref": "#/definitions/InventorySuperModel"
            }
          }
        },
        "InventorySuperModel": {
          "type": "object",
          "required": [
            "data",
            "label"
          ],
          "properties": {
            "label": {
              "type": "string"
            },
            "data": {
              "type": "array",
              "items": {
                "$ref": "#/definitions/InventoryModel"
              }
            }
          }
        },
        "IteratorEntryStringJsonNode": {
          "type": "object"
        },
        "IteratorJsonNode": {
          "type": "object"
        },
        "IteratorString": {
          "type": "object"
        },
        "JsonNode": {
          "type": "object",
          "required": [
            "array",
            "bigDecimal",
            "bigInteger",
            "bigIntegerValue",
            "binary",
            "binaryValue",
            "boolean",
            "booleanValue",
            "containerNode",
            "decimalValue",
            "double",
            "doubleValue",
            "elements",
            "fieldNames",
            "fields",
            "floatingPointNumber",
            "int",
            "intValue",
            "integralNumber",
            "long",
            "longValue",
            "missingNode",
            "null",
            "number",
            "numberType",
            "numberValue",
            "object",
            "pojo",
            "textValue",
            "textual",
            "valueAsBoolean",
            "valueAsDouble",
            "valueAsInt",
            "valueAsLong",
            "valueAsText",
            "valueNode"
          ],
          "properties": {
            "elements": {
              "$ref": "#/definitions/IteratorJsonNode"
            },
            "fieldNames": {
              "$ref": "#/definitions/IteratorString"
            },
            "binary": {
              "type": "boolean"
            },
            "intValue": {
              "type": "integer",
              "format": "int32"
            },
            "object": {
              "type": "boolean"
            },
            "int": {
              "type": "boolean"
            },
            "long": {
              "type": "boolean"
            },
            "double": {
              "type": "boolean"
            },
            "bigDecimal": {
              "type": "boolean"
            },
            "bigInteger": {
              "type": "boolean"
            },
            "textual": {
              "type": "boolean"
            },
            "boolean": {
              "type": "boolean"
            },
            "valueNode": {
              "type": "boolean"
            },
            "containerNode": {
              "type": "boolean"
            },
            "missingNode": {
              "type": "boolean"
            },
            "pojo": {
              "type": "boolean"
            },
            "number": {
              "type": "boolean"
            },
            "integralNumber": {
              "type": "boolean"
            },
            "floatingPointNumber": {
              "type": "boolean"
            },
            "numberValue": {
              "$ref": "#/definitions/Number"
            },
            "numberType": {
              "type": "string",
              "enum": [
                "INT",
                "LONG",
                "BIG_INTEGER",
                "FLOAT",
                "DOUBLE",
                "BIG_DECIMAL"
              ]
            },
            "longValue": {
              "type": "integer",
              "format": "int64"
            },
            "bigIntegerValue": {
              "type": "integer"
            },
            "doubleValue": {
              "type": "number",
              "format": "double"
            },
            "decimalValue": {
              "type": "number"
            },
            "booleanValue": {
              "type": "boolean"
            },
            "binaryValue": {
              "type": "array",
              "items": {
                "type": "string",
                "format": "byte",
                "pattern": "^(?:[A-Za-z0-9+/]{4})*(?:[A-Za-z0-9+/]{2}==|[A-Za-z0-9+/]{3}=)?$"
              }
            },
            "valueAsInt": {
              "type": "integer",
              "format": "int32"
            },
            "valueAsLong": {
              "type": "integer",
              "format": "int64"
            },
            "valueAsDouble": {
              "type": "number",
              "format": "double"
            },
            "valueAsBoolean": {
              "type": "boolean"
            },
            "textValue": {
              "type": "string"
            },
            "valueAsText": {
              "type": "string"
            },
            "array": {
              "type": "boolean"
            },
            "fields": {
              "$ref": "#/definitions/IteratorEntryStringJsonNode"
            },
            "null": {
              "type": "boolean"
            }
          }
        },
        "LLDP": {
          "description": "This model contains descriptions of the data collected and returned by the LLDP endpoint.",
          "type": "object",
          "required": [
            "schema"
          ],
          "properties": {
            "schema": {
              "$ref": "#/definitions/Schema"
            },
            "opid": {
              "type": "integer",
              "format": "int32",
              "description": "Internal use only"
            },
            "hostname": {
              "type": "string",
              "description": "User-defined name for the host device"
            },
            "ifname": {
              "type": "string",
              "description": "Name of the host interface where the LLDP service is running"
            },
            "timestamp": {
              "type": "integer",
              "format": "int64",
              "description": "Date and time that the session was started, deleted, updated, or marked dead (device is down)"
            },
            "peer_hostname": {
              "type": "string",
              "description": "User-defined name for the peer device"
            },
            "peer_ifname": {
              "type": "string",
              "description": "Name of the peer interface where the session is running"
            },
            "lldp_peer_bridge": {
              "type": "boolean",
              "description": "Indicates whether the peer device is a bridge (true) or not (false)"
            },
            "lldp_peer_router": {
              "type": "boolean",
              "description": "Indicates whether the peer device is a router (true) or not (false)"
            },
            "lldp_peer_station": {
              "type": "boolean",
              "description": "Indicates whether the peer device is a station (true) or not (false)"
            },
            "lldp_peer_os": {
              "type": "string",
              "description": "Operating system (OS) used by peer device. Values include Cumulus Linux, RedHat, Ubuntu, and CentOS."
            },
            "lldp_peer_osv": {
              "type": "string",
              "description": "Version of the OS used by peer device. Example values include 3.7.3, 2.5.x, 16.04, 7.1."
            },
            "db_state": {
              "type": "string",
              "description": "Internal use only"
            }
          }
        },
        "LogicalType": {
          "type": "object",
          "required": [
            "name"
          ],
          "properties": {
            "name": {
              "type": "string"
            }
          }
        },
        "LoginRequest": {
          "description": "User-entered credentials used to validate if user is allowed to access NetQ",
          "type": "object",
          "required": [
            "password",
            "username"
          ],
          "properties": {
            "username": {
              "type": "string"
            },
            "password": {
              "type": "string"
            }
          }
        },
        "LoginResponse": {
          "description": "Response to user login request",
          "type": "object",
          "required": [
            "id"
          ],
          "properties": {
            "terms_of_use_accepted": {
              "type": "boolean",
              "description": "Indicates whether user has accepted the terms of use"
            },
            "access_token": {
              "type": "string",
              "description": "Grants jason web token (jwt) access token. The access token also contains the NetQ Platform or Appliance (opid) which the user is permitted to access. By default, it is the primary opid given by the user."
            },
            "expires_at": {
              "type": "integer",
              "format": "int64",
              "description": "Number of hours the access token is valid before it automatically expires, epoch miliseconds. By default, tokens are valid for 24 hours."
            },
            "id": {
              "type": "string"
            },
            "premises": {
              "type": "array",
              "description": "List of premises that this user is authorized to view",
              "items": {
                "$ref": "#/definitions/Premises"
              }
            },
            "customer_id": {
              "type": "integer",
              "format": "int32",
              "description": "customer id of this user"
            }
          }
        },
        "MacFdb": {
          "description": "This model contains descriptions of the data collected and returned by the MacFdb endpoint.",
          "type": "object",
          "required": [
            "schema"
          ],
          "properties": {
            "schema": {
              "$ref": "#/definitions/Schema"
            },
            "opid": {
              "type": "integer",
              "format": "int32",
              "description": "Internal use only"
            },
            "hostname": {
              "type": "string",
              "description": "User-defined name for a device"
            },
            "mac_address": {
              "type": "string",
              "description": "Media access control address for a device reachable via the local bridge member port 'nexthop' or via remote VTEP with IP address of 'dst'"
            },
            "timestamp": {
              "type": "integer",
              "format": "int64",
              "description": "Date and time data was collected"
            },
            "dst": {
              "type": "string",
              "description": "IP address of a remote VTEP from which this MAC address is reachable"
            },
            "nexthop": {
              "type": "string",
              "description": "Interface where the MAC address can be reached"
            },
            "is_remote": {
              "type": "boolean",
              "description": "Indicates if the MAC address is reachable locally on 'nexthop' (false) or remotely via a VTEP with address 'dst' (true)"
            },
            "port": {
              "type": "string",
              "description": "Currently unused"
            },
            "vlan": {
              "type": "integer",
              "format": "int32",
              "description": "Name of associated VLAN"
            },
            "is_static": {
              "type": "boolean",
              "description": "Indicates if the MAC address is a static address (true) or dynamic address (false)"
            },
            "origin": {
              "type": "boolean",
              "description": "Indicates whether the MAC address is one of the host's interface addresses (true) or not (false)"
            },
            "active": {
              "type": "boolean",
              "description": "Currently unused"
            },
            "db_state": {
              "type": "string",
              "description": "Internal use only"
            }
          }
        },
        "MstpInfo": {
          "description": "This model contains descriptions of the data collected and returned by the MSTP endpoint.",
          "type": "object",
          "required": [
            "schema"
          ],
          "properties": {
            "schema": {
              "$ref": "#/definitions/Schema"
            },
            "opid": {
              "type": "integer",
              "format": "int32",
              "description": "Internal use only"
            },
            "hostname": {
              "type": "string",
              "description": "User-defined name for a device"
            },
            "bridge_name": {
              "type": "string",
              "description": "User-defined name for a bridge"
            },
            "timestamp": {
              "type": "integer",
              "format": "int64",
              "description": "Date and time data was collected"
            },
            "db_state": {
              "type": "string",
              "description": "Internal use only"
            },
            "state": {
              "type": "boolean",
              "description": "Indicates whether MSTP is enabled (true) or not (false)"
            },
            "root_port_name": {
              "type": "string",
              "description": "Name of the physical interface (port) that provides the minimum cost path from the Bridge to the MSTI Regional Root"
            },
            "root_bridge": {
              "type": "string",
              "description": "Name of the CIST root for the bridged LAN"
            },
            "topo_chg_ports": {
              "type": "string",
              "description": "Names of ports that were part of the last topology change event"
            },
            "time_since_tcn": {
              "type": "integer",
              "format": "int64",
              "description": "Amount of time, in seconds, since the last topology change notification"
            },
            "topo_chg_cntr": {
              "type": "integer",
              "format": "int64",
              "description": "Number of times topology change notifications have been sent"
            },
            "bridge_id": {
              "type": "string",
              "description": "Spanning Tree bridge identifier for current host"
            },
            "edge_ports": {
              "type": "string",
              "description": "List of port names that are Spanning Tree edge ports"
            },
            "network_ports": {
              "type": "string",
              "description": "List of port names that are Spanning Tree network ports"
            },
            "disputed_ports": {
              "type": "string",
              "description": "List of port names that are in Spanning Tree dispute state"
            },
            "bpduguard_ports": {
              "type": "string",
              "description": "List of port names where BPDU Guard is enabled"
            },
            "bpduguard_err_ports": {
              "type": "string",
              "description": "List of port names where BPDU Guard violation occurred"
            },
            "ba_inconsistent_ports": {
              "type": "string",
              "description": "List of port names where Spanning Tree Bridge Assurance is failing"
            },
            "bpdufilter_ports": {
              "type": "string",
              "description": "List of port names where Spanning Tree BPDU Filter is enabled"
            },
            "ports": {
              "type": "string",
              "description": "List of port names in the Spanning Tree instance"
            },
            "is_vlan_filtering": {
              "type": "boolean",
              "description": "Indicates whether the bridge is enabled with VLAN filtering (is VLAN-aware) (true) or not (false)"
            }
          }
        },
        "Neighbor": {
          "description": "This model contains descriptions of the data collected and returned by the Neighbor endpoint.",
          "type": "object",
          "required": [
            "schema"
          ],
          "properties": {
            "schema": {
              "$ref": "#/definitions/Schema"
            },
            "opid": {
              "type": "integer",
              "format": "int32",
              "description": "Internal use only"
            },
            "hostname": {
              "type": "string",
              "description": "User-defined name of a device"
            },
            "ifname": {
              "type": "string",
              "description": "User-defined name of an software interface on a device"
            },
            "timestamp": {
              "type": "integer",
              "format": "int64",
              "description": "Date and time when data was collected"
            },
            "vrf": {
              "type": "string",
              "description": "Name of virtual route forwarding (VRF) interface, when applicable"
            },
            "is_remote": {
              "type": "boolean",
              "description": "Indicates if the neighbor is reachable through a local interface (false) or remotely (true)"
            },
            "ifindex": {
              "type": "integer",
              "format": "int32",
              "description": "IP address index for the neighbor device"
            },
            "mac_address": {
              "type": "string",
              "description": "MAC address for the neighbor device"
            },
            "is_ipv6": {
              "type": "boolean",
              "description": "Indicates whether the neighbor's IP address is version six (IPv6) (true) or version four (IPv4) (false)"
            },
            "message_type": {
              "type": "string",
              "description": "Network protocol or service identifier used in neighbor-related events. Value is neighbor."
            },
            "ip_address": {
              "type": "string",
              "description": "IPv4 or IPv6 address for the neighbor device"
            },
            "db_state": {
              "type": "string",
              "description": "Internal use only"
            }
          }
        },
        "NODE": {
          "description": "This model contains descriptions of the data collected and returned by the Node endpoint.",
          "type": "object",
          "required": [
            "schema"
          ],
          "properties": {
            "schema": {
              "$ref": "#/definitions/Schema"
            },
            "opid": {
              "type": "integer",
              "format": "int32",
              "description": "Internal use only"
            },
            "hostname": {
              "type": "string",
              "description": "User-defined name of the device"
            },
            "sys_uptime": {
              "type": "integer",
              "format": "int64",
              "description": "Amount of time this device has been powered up"
            },
            "lastboot": {
              "type": "integer",
              "format": "int64",
              "description": "Date and time this device was last booted"
            },
            "last_reinit": {
              "type": "integer",
              "format": "int64",
              "description": "Date and time this device was last initialized"
            },
            "active": {
              "type": "boolean",
              "description": "Indicates whether this device is active (true) or not (false)"
            },
            "version": {
              "type": "string",
              "description": ""
            },
            "ntp_state": {
              "type": "string",
              "description": "Status of the NTP service running on this device; in sync, not in sync, or unknown"
            },
            "timestamp": {
              "type": "integer",
              "format": "int64",
              "description": "Date and time data was collected"
            },
            "last_update_time": {
              "type": "integer",
              "format": "int64",
              "description": "Date and time the device was last updated"
            },
            "db_state": {
              "type": "string",
              "description": "Internal use only"
            }
          }
        },
        "NTP": {
          "description": "This model contains descriptions of the data collected and returned by the NTP endpoint.",
          "type": "object",
          "required": [
            "schema"
          ],
          "properties": {
            "schema": {
              "$ref": "#/definitions/Schema"
            },
            "hostname": {
              "type": "string",
              "description": "User-defined name of device running NTP service"
            },
            "opid": {
              "type": "integer",
              "format": "int32",
              "description": "Internal use only"
            },
            "timestamp": {
              "type": "integer",
              "format": "int64",
              "description": "Date and time data was collected"
            },
            "ntp_sync": {
              "type": "string",
              "description": "Status of the NTP service running on this device; in sync, not in sync, or unknown"
            },
            "stratum": {
              "type": "integer",
              "format": "int32",
              "description": ""
            },
            "ntp_app": {
              "type": "string",
              "description": "Name of the NTP service
            },
            "message_type": {
              "type": "string",
              "description": "Network protocol or service identifier used in NTP-related events. Value is ntp."
            },
            "current_server": {
              "type": "string",
              "description": "Name or address of server providing time synchronization"
            },
            "active": {
              "type": "boolean",
              "description": "Indicates whether NTP service is running (true) or not (false)"
            },
            "db_state": {
              "type": "string",
              "description": "Internal use only"
            }
          }
        },
        "Number": {
          "type": "object",
          "description": " "
        },
        "Port": {
          "description": "This model contains descriptions of the data collected and returned by the Port endpoint.",
          "type": "object",
          "required": [
            "schema"
          ],
          "properties": {
            "schema": {
              "$ref": "#/definitions/Schema"
            },
            "opid": {
              "type": "integer",
              "format": "int32",
              "description": "Internal use only"
            },
            "hostname": {
              "type": "string",
              "description": "User-defined name for the device with this port"
            },
            "ifname": {
              "type": "string",
              "description": "User-defined name for the software interface on this port"
            },
            "timestamp": {
              "type": "integer",
              "format": "int64",
              "description": "Date and time data was collected"
            },
            "speed": {
              "type": "string",
              "description": "Maximum rating for port. Examples include 10G, 25G, 40G, unknown."
            },
            "identifier": {
              "type": "string",
              "description": "Identifies type of port module if installed. Example values include empty, QSFP+, SFP, RJ45"
            },
            "autoneg": {
              "type": "string",
              "description": "Indicates status of the auto-negotiation feature. Values include on and off."
            },
            "db_state": {
              "type": "string",
              "description": "Internal use only"
            },
            "transreceiver": {
              "type": "string",
              "description": "Name of installed transceiver. Example values include 40G Base-CR4, 10Gtek."
            },
            "connector": {
              "type": "string",
              "description": "Name of installed connector. Example values include LC, copper pigtail, RJ-45, n/a."
            },
            "vendor_name": {
              "type": "string",
              "description": "Name of the port vendor. Example values include OEM, Mellanox, Amphenol, Finisar, Fiberstore, n/a."
            },
            "part_number": {
              "type": "string",
              "description": "Manufacturer part number"
            },
            "serial_number": {
              "type": "string",
              "description": "Manufacturer serial number"
            },
            "length": {
              "type": "string",
              "description": "Length of cable connected. Example values include 1m, 2m, n/a."
            },
            "supported_fec": {
              "type": "string",
              "description": "List of forward error correction (FEC) algorithms supported on this port. Example values include BaseR, RS, Not reported, None."
            },
            "advertised_fec": {
              "type": "string",
              "description": "Type of FEC advertised by this port"
            },
            "fec": {
              "type": "string",
              "description": "Forward error correction"
            },
            "message_type": {
              "type": "string",
              "description": "Network protocol or service identifier used in port-related events. Value is port."
            },
            "state": {
              "type": "string",
              "description": "Status of the port, either up or down."
            }
          }
        },
        "Premises": {
          "type": "object",
          "required": [
            "name",
            "opid"
          ],
          "properties": {
            "opid": {
              "type": "integer",
              "format": "int32"
            },
            "name": {
              "type": "string"
            }
          },
          "description": "Premises"
        },
        "Route": {
          "description": "This module contains descirptions of the data collected and returned by the Route endpoint.",
          "type": "object",
          "required": [
            "schema"
          ],
          "properties": {
            "schema": {
              "$ref": "#/definitions/Schema"
            },
            "opid": {
              "type": "integer",
              "format": "int32",
              "description": "Internal use only"
            },
            "hostname": {
              "type": "string",
              "description": "User-defined name for a device"
            },
            "timestamp": {
              "type": "integer",
              "format": "int64",
              "description": "Date and time data was collected"
            },
            "vrf": {
              "type": "string",
              "description": "Name of associated virtual route forwarding (VRF) interface, if applicable"
            },
            "message_type": {
              "type": "string",
              "description": "Network protocol or service identifier used in route-related events. Value is route."
            },
            "db_state": {
              "type": "string",
              "description": "Internal use only"
            },
            "is_ipv6": {
              "type": "boolean",
              "description": "Indicates whether the IP address for this route is an IPv6 address (true) or an IPv4 address (false)"
            },
            "rt_table_id": {
              "type": "integer",
              "format": "int32",
              "description": "Routing table identifier for this route"
            },
            "src": {
              "type": "string",
              "description": "Hostname of device where this route originated"
            },
            "nexthops": {
              "type": "string",
              "description": "List of hops remaining to reach destination"
            },
            "route_type": {
              "type": "integer",
              "format": "int32",
              "description": ""
            },
            "origin": {
              "type": "boolean",
              "description": "Indicates whether the source of this route is on the  device indicated by 'hostname'"
            },
            "protocol": {
              "type": "string",
              "description": "Protocol used for routing. Example values include BGP, OSPF."
            },
            "prefix": {
              "type": "string",
              "description": "Address prefix for this route"
            }
          }
        },
        "Schema": {
          "type": "object",
          "required": [
            "aliases",
            "doc",
            "elementType",
            "enumSymbols",
            "error",
            "fields",
            "fixedSize",
            "fullName",
            "hashCode",
            "jsonProps",
            "logicalType",
            "name",
            "namespace",
            "objectProps",
            "props",
            "type",
            "types",
            "valueType"
          ],
          "properties": {
            "props": {
              "type": "object",
              "additionalProperties": {
                "type": "string"
              }
            },
            "type": {
              "type": "string",
              "enum": [
                "RECORD",
                "ENUM",
                "ARRAY",
                "MAP",
                "UNION",
                "FIXED",
                "STRING",
                "BYTES",
                "INT",
                "LONG",
                "FLOAT",
                "DOUBLE",
                "BOOLEAN",
                "NULL"
              ]
            },
            "logicalType": {
              "$ref": "#/definitions/LogicalType"
            },
            "hashCode": {
              "type": "integer",
              "format": "int32"
            },
            "elementType": {
              "$ref": "#/definitions/Schema"
            },
            "aliases": {
              "type": "array",
              "uniqueItems": true,
              "items": {
                "type": "string"
              }
            },
            "namespace": {
              "type": "string"
            },
            "fields": {
              "type": "array",
              "items": {
                "$ref": "#/definitions/Field"
              }
            },
            "types": {
              "type": "array",
              "items": {
                "$ref": "#/definitions/Schema"
              }
            },
            "fullName": {
              "type": "string"
            },
            "enumSymbols": {
              "type": "array",
              "items": {
                "type": "string"
              }
            },
            "doc": {
              "type": "string"
            },
            "valueType": {
              "$ref": "#/definitions/Schema"
            },
            "fixedSize": {
              "type": "integer",
              "format": "int32"
            },
            "name": {
              "type": "string"
            },
            "error": {
              "type": "boolean"
            },
            "jsonProps": {
              "type": "object",
              "additionalProperties": {
                "$ref": "#/definitions/JsonNode"
              }
            },
            "objectProps": {
              "type": "object",
              "additionalProperties": {
                "type": "object",
                "properties": {}
              }
            }
          }
        },
        "Sensor": {
          "description": "This model contains descriptions of the data collected and returned from the Sensor endpoint.",
          "type": "object",
          "required": [
            "schema"
          ],
          "properties": {
            "schema": {
              "$ref": "#/definitions/Schema"
            },
            "opid": {
              "type": "integer",
              "format": "int32",
              "description": "Internal use only"
            },
            "hostname": {
              "type": "string",
              "description": "User-defined name of the device where the sensor resides"
            },
            "timestamp": {
              "type": "integer",
              "format": "int64",
              "description": "Date and time data was collected"
            },
            "s_prev_state": {
              "type": "string",
              "description": "Previous state of a fan or power supply unit (PSU) sensor. Values include OK, absent, and bad."
            },
            "s_name": {
              "type": "string",
              "description": "Type of sensor. Values include fan, psu, temp."
            },
            "s_state": {
              "type": "string",
              "description": "Current state of a fan or power supply unit (PSU) sensor. Values include OK, absent, and bad."
            },
            "s_input": {
              "type": "number",
              "format": "float",
              "description": "Sensor input"
            },
            "message_type": {
              "type": "string",
              "description": "Network protocol or service identifier used in sensor-related events. Value is sensor."
            },
            "s_msg": {
              "type": "string",
              "description": "Sensor message"
            },
            "s_desc": {
              "type": "string",
              "description": "User-defined name of sensor. Example values include fan1, fan-2, psu1, psu02, psu1temp1, temp2."
            },
            "s_max": {
              "type": "integer",
              "format": "int32",
              "description": "Current maximum temperature threshold value"
            },
            "s_min": {
              "type": "integer",
              "format": "int32",
              "description": "Current minimum temperature threshold value"
            },
            "s_crit": {
              "type": "integer",
              "format": "int32",
              "description": "Current critical high temperature threshold value"
            },
            "s_lcrit": {
              "type": "integer",
              "format": "int32",
              "description": "Current critical low temperature threshold value"
            },
            "db_state": {
              "type": "string",
              "description": "Internal use only"
            },
            "active": {
              "type": "boolean",
              "description": "Indicates whether the identified sensor is operating (true) or not (false)"
            },
            "deleted": {
              "type": "boolean",
              "description": "Indicates whether the sensor has been deleted (true) or not (false)"
            }
          }
        },
        "Services": {
          "description": "This model contains descriptions of the data collected and returned from the Sensor endpoint.",
          "type": "object",
          "required": [
            "schema"
          ],
          "properties": {
            "schema": {
              "$ref": "#/definitions/Schema"
            },
            "opid": {
              "type": "integer",
              "format": "int32",
              "description": "Internal use only"
            },
            "hostname": {
              "type": "string",
              "description": "User-defined name of the device where the network services are running."
            },
            "name": {
              "type": "string",
              "description": "Name of the service; for example, BGP, OSPF, LLDP, NTP, and so forth."
            },
            "vrf": {
              "type": "string",
              "description": "Name of the Virtual Route Forwarding (VRF) interface if employed."
            },
            "timestamp": {
              "type": "integer",
              "format": "int64",
              "description": "Date and time data was collected"
            },
            "is_enabled": {
              "type": "boolean",
              "description": "Indicates whether the network service is enabled."
            },
            "is_active": {
              "type": "boolean",
              "description": "Indicates whether the network service is currently active."
            },
            "is_monitored": {
              "type": "boolean",
              "description": "Indicates whether the network service is currently being monitored."
            },
            "status": {
              "type": "integer",
              "format": "int32",
              "description": "Status of the network service connection; up or down."
            },
            "start_time": {
              "type": "integer",
              "format": "int64",
              "description": "Date and time that the network service was most recently started."
            },
            "db_state": {
              "type": "string",
              "description": "Internal use only"
            }
          }
        },
        "Vlan": {
          "description": "This model contains descriptions of the data collected and returned by the VLAN endpoint.",
          "type": "object",
          "required": [
            "schema"
          ],
          "properties": {
            "schema": {
              "$ref": "#/definitions/Schema"
            },
            "opid": {
              "type": "integer",
              "format": "int32",
              "description": "Internal use only"
            },
            "hostname": {
              "type": "string",
              "description": "User-defined name for a device"
            },
            "ifname": {
              "type": "string",
              "description": "User-defined name for a software interface"
            },
            "timestamp": {
              "type": "integer",
              "format": "int64",
              "description": "Date and time data was collected"
            },
            "last_changed": {
              "type": "integer",
              "format": "int64",
              "description": "Date and time the VLAN configuration was changed (updated, deleted)"
            },
            "vlans": {
              "type": "string",
              "description": "List of other VLANs known to this VLAN or on this device"
            },
            "svi": {
              "type": "string",
              "description": "Switch virtual interface (SVI) associated with this VLAN"
            },
            "db_state": {
              "type": "string",
              "description": "Internal use only"
            },
            "ports": {
              "type": "string",
              "description": "Names of ports on the device associated with this VLAN"
            }
          }
        }
      }
    }
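
    The definitions above describe the records returned by the NetQ API endpoints. As a brief illustration of how they might be consumed, the following minimal Python sketch requests Route records and prints a few of the fields defined in the Route model. The base URL, port, endpoint path, and authorization header are illustrative assumptions; consult your deployment's API reference for the exact values.

        # Minimal sketch: fetch Route records from a NetQ API endpoint and print
        # selected fields from the Route model. The base URL, token, and endpoint
        # path below are illustrative assumptions, not confirmed values.
        import requests

        BASE_URL = "https://netq.example.com:32708/netq/telemetry/v1"  # assumed
        TOKEN = "<access-token>"  # obtained through your deployment's auth workflow

        resp = requests.get(
            f"{BASE_URL}/object/route",              # assumed endpoint path
            headers={"Authorization": TOKEN},
            timeout=10,
        )
        resp.raise_for_status()

        for route in resp.json():
            # Field names follow the Route model: hostname, prefix, vrf,
            # protocol, is_ipv6.
            family = "IPv6" if route.get("is_ipv6") else "IPv4"
            print(f"{route.get('hostname')}: {route.get('prefix')} "
                  f"({family}, vrf={route.get('vrf')}, protocol={route.get('protocol')})")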
    
    

    Cumulus NetQ UI User Guide

    This guide is intended for network administrators and operators who are responsible for monitoring and troubleshooting the network in their data center environment. NetQ 3.x offers the ability to easily monitor and manage your data center network infrastructure and operational health. This guide provides instructions and information about monitoring individual components of the network, the network as a whole, and the NetQ software itself using the NetQ graphical user interface (GUI). If you prefer to use a command line interface, refer to the Cumulus NetQ CLI User Guide.

    NetQ User Interface Overview

    The NetQ 3.x graphical user interface (UI) enables you to access NetQ capabilities through a web browser as opposed to through a terminal window using the Command Line Interface (CLI). Visual representations of the health of the network, inventory, and system events make it easy to both find faults and misconfigurations, and to fix them.

    The UI is accessible in both on-site and in-cloud deployments. It is supported on Google Chrome. Other popular browsers may be used, but have not been tested and may have some presentation issues.

    Before you get started, you should refer to the release notes for this version.

    Access the NetQ UI

    Logging in to the NetQ UI is as easy as opening any web page.

    To log in to the UI:

    1. Open a new Internet browser window or tab.

    2. Enter the following URL into the Address bar for the NetQ On-premises Appliance or VM, or the NetQ Cloud Appliance or VM:

    3. Log in.

      Default usernames and passwords for UI access:

      • NetQ On-premises: admin, admin
      • NetQ Cloud: Use credentials provided by Cumulus via email titled Welcome to Cumulus NetQ!

      For a first-time login:

      1. Enter your username.

      2. Enter your password.

      3. Enter a new password.

      4. Enter the new password again to confirm it.

      5. Click Update and Accept after reading the Terms of Use.

        The default Cumulus Workbench opens, with your username shown in the upper right corner of the application.

      For subsequent logins:

      1. Enter your username.

      2. Enter your password.

        The user-specified home workbench is displayed. If a home workbench is not specified, then the Cumulus Default workbench is displayed.

    Any workbench can be set as the home workbench. Click the User Settings icon, click Profile & Preferences, then on the Workbenches card click the icon to the left of the workbench name you want to be your home workbench.

    To log out of the UI:

    1. Click the User Settings icon at the top right of the application.

    2. Select Log Out.

    Application Layout

    The NetQ UI contains two main areas:

    The Main Menu icon, found in the application header, opens the main menu, which provides navigation to:

    Recent Actions

    Found in the header, Recent Actions keeps track of every action you take on your workbench and then saves each action with a timestamp. This enables you to go back to a previous state or repeat an action.

    To open Recent Actions, click the Recent Actions icon in the header. Click any of the actions to perform that action again.

    The Global Search field in the UI header enables you to search for devices. It behaves like most searches and can help you quickly find device information. For more detail on creating and running searches, refer to Create and Run Searches.

    Clicking on the Cumulus logo takes you to your favorite workbench. For details about specifying your favorite workbench, refer to Set User Preferences.

    Quick Network Health View

    Found in the header, the graph and performance rating provide a view into the health of your network at a glance.

    On initial startup of the application, it can take up to an hour to reach an accurate health indication, as some processes run only every 30 minutes.

    Workbenches

    A workbench is composed of a set of cards. A pre-configured default workbench, the Cumulus Workbench, is available to get you started. It contains Device Inventory, Switch Inventory, Alarm and Info Events, and Network Health cards. On initial login, this workbench is opened. You can create your own workbenches and add or remove cards to meet your particular needs. For more detail about managing your data using workbenches, refer to Focus Your Monitoring Using Workbenches.

    Cards

    Cards present information about your network for monitoring and troubleshooting. This is where you can expect to spend most of your time. Each card describes a particular aspect of the network. Cards are available in multiple sizes, from small to full screen. The level of content on a card varies with its size, from a high-level summary on the smallest card to the most detailed information in the full-screen view. Cards are collected onto a workbench where you see all of the data relevant to a task or set of tasks. You can add and remove cards from a workbench, move between cards and card sizes, and make copies of cards to show different levels of data at the same time. For details about working with cards, refer to Access Data with Cards.

    User Settings

    Each user can customize the NetQ application display, change their account password, and manage their workbenches. This is all performed from User Settings > Profile & Preferences. For details, refer to Set User Preferences.

    Embedded Application Help

    The NetQ UI provides guided walk-throughs for selected tasks and links to additional resources.

    You must have connection to the Internet to access this feature.

    Click Need Help? to open the menu of tasks and resources currently available.

    Is the help button covering content that you want to see? You can drag and drop the button to various locations around the edge of the UI; if you do not like it at the bottom left (the default), you can move it to the bottom center, bottom right, right side bottom, and so on. A green dashed border appears in the locations where it can be placed. Alternatively, enlarge the NetQ UI application window or scroll within the window to view any hidden content.

    Within the help menu, topics are grouped by categories:

    You can find help items by collapsing and expanding categories or by searching. Click a category title to toggle between viewing and hiding its content. To search, begin entering text into the Search field to see suggested content.

    Format Cues

    Color is used to indicate links, options, and status within the UI.

    Item | Color
    Hover on item | Blue
    Clickable item | Black
    Selected item | Green
    Highlighted item | Blue
    Link | Blue
    Good/Successful results | Green
    Result with critical severity event | Pink
    Result with high severity event | Red
    Result with medium severity event | Orange
    Result with low severity event | Yellow

    Create and Run Searches

    The Global Search field in the UI header enables you to search for devices or cards. You can create new searches or run existing searches.

    As with most search fields, simply begin entering the criteria in the search field. As you type, items that match the criteria are shown in the search history dropdown along with the last time each search was viewed. Wildcards are not allowed, but this predictive matching eliminates the need for them. By default, the most recent searches are shown; if more searches have been performed, they can be accessed from the same list. Selecting a suggested search from the list, which can be quicker than typing the full criteria, provides a preview of the search results to the right.

    To create a new search:

    1. Click in the Global Search field.

    2. Enter your search criteria.

    3. Click the device hostname or card workflow in the search list to open the associated information.

      If you have more matches than fit in the window, click the See All # Results link to view all found matches. The count represents the number of devices found. It does not include cards found.

    You can re-run a recent search, saving time if you are comparing data from two or more devices.

    To re-run a recent search:

    1. Click in the Global Search field.

    2. When the desired search appears in the suggested searches list, select it.

      You may need to click See All # Results to find the desired search. If you do not find it in the list, you may still be able to find it in the Recent Actions list.

    Focus Your Monitoring Using Workbenches

    Workbenches are an integral structure of the Cumulus NetQ application. They are where you collect and view the data that is important to you.

    There are two types of workbenches:

    Both types of workbenches display a set of cards. Default workbenches are public (available for viewing by all users), whereas custom workbenches are private (viewable only by the user who created them).

    Default Workbenches

    In this release, only one default workbench is available, the Cumulus Workbench, to get you started. It contains Device Inventory, Switch Inventory, Alarm and Info Events, and Network Health cards, giving you a high-level view of how your network is operating.

    On initial login, the Cumulus Workbench is opened. On subsequent logins, the last workbench you had displayed is opened.

    Custom Workbenches

    Users with either administrative or user roles can create and save as many custom workbenches as suits their needs. For example, a user might create a workbench that:

    Create a Workbench

    To create a workbench:

    1. Click the Add icon in the workbench header.

    2. Enter a name for the workbench.

    3. Click Create to open a blank new workbench, or Cancel to discard the workbench.

    4. Add cards to the workbench using the Add Cards or Add Switch Card options.

    Refer to Access Data with Cards for information about interacting with cards on your workbenches.

    Remove a Workbench

    Once you have created a number of custom workbenches, you might find that you no longer need some of them. As an administrative user, you can remove any workbench, except for the default Cumulus Workbench. Users with a user role can only remove workbenches they have created.

    To remove a workbench:

    1. Click the User Settings icon in the application header to open the User Settings options.

    2. Click Profile & Preferences.

    3. Locate the Workbenches card.

    4. Hover over the workbench you want to remove, and click Delete.

    Open an Existing Workbench

    There are several options for opening workbenches:

    Manage Auto-Refresh for Your Workbenches

    With NetQ 2.3.1 and later, you can specify how often to update the data displayed on your workbenches. Three refresh rates are available:

    By default, auto-refresh is enabled and configured to update every 30 seconds.

    Disable/Enable Auto-Refresh

    To disable or pause auto-refresh of your workbenches, simply click the Refresh icon. This toggles between the two states, Running and Paused; the icon shown indicates which state is currently active.

    While having the workbenches update regularly is good most of the time, you may find that you want to pause the auto-refresh feature when you are troubleshooting and you do not want the data to change on a given set of cards temporarily. In this case, you can disable the auto-refresh and then enable it again when you are finished.

    View Current Settings

    To view the current auto-refresh rate and operational status, hover over the Refresh icon on a workbench header to open the tooltip:

    Change Settings

    To modify the auto-refresh setting:

    1. Click on the Refresh icon.

    2. Select the refresh rate you want. The refresh rate is applied immediately. A check mark is shown next to the current selection.

    Manage Workbenches

    To manage your workbenches as a group, either:

    Both of these open the Profiles & Preferences page. Look for the Workbenches card and refer to Manage Your Workbenches for more information.

    Access Data with Cards

    Cards present information about your network for monitoring and troubleshooting. This is where you can expect to spend most of your time. Each card describes a particular aspect of the network. Cards are available in multiple sizes, from small to full screen. The level of the content on a card varies in accordance with the size of the card, with the highest level of information on the smallest card to the most detailed information on the full-screen card. Cards are collected onto a workbench where you see all of the data relevant to a task or set of tasks. You can add and remove cards from a workbench, move between cards and card sizes, change the time period of the data shown on a card, and make copies of cards to show different levels of data at the same time.

    Card Sizes

    The various sizes of cards enable you to view your content at just the right level. For each aspect that you are monitoring there is typically a single card that presents increasing amounts of data over its four sizes. For example, a snapshot of your total inventory may be sufficient, but monitoring the distribution of hardware vendors may require a bit more space.

    Small Cards

    Small cards are most effective at providing a quick view of the performance or statistical value of a given aspect of your network. They are commonly composed of an icon that identifies the aspect being monitored, summary performance or statistics in the form of a graph and/or counts, and often an indication of any related events. Other content items may be present. Some examples include a Devices Inventory card, a Switch Inventory card, an Alarm Events card, an Info Events card, and a Network Health card, as shown here:

    Medium Cards

    Medium cards are most effective at providing the key measurements for a given aspect of your network. They are commonly composed of an icon that identifies the aspect being monitored and one or more key measurements that make up the overall performance. Additional information is often included as well, such as related events or components. Some examples include a Devices Inventory card, a Switch Inventory card, an Alarm Events card, an Info Events card, and a Network Health card, as shown here. Compare these with their related small- and large-sized cards.

    Large Cards

    Large cards are most effective at providing detailed information for monitoring specific components or functions of a given aspect of your network. These can aid in isolating and resolving existing issues or preventing potential issues. They are commonly composed of detailed statistics and graphics. Some large cards also have tabs for additional detail about a given statistic or other related information. Some examples include a Devices Inventory card, an Alarm Events card, and a Network Health card, as shown here. Compare these with their related small- and medium-sized cards.

    Full-Screen Cards

    Full-screen cards are most effective for viewing all available data about an aspect of your network all in one place. When you cannot find what you need in the small, medium, or large cards, it is likely on the full-screen card. Most full-screen cards display data in a grid, or table; however, some contain visualizations. Some examples include All Events card and All Switches card, as shown here.

    Card Size Summary

    Small:

    • Quick view of status, typically at the level of good or bad
    • Enable quick actions, run a validation or trace for example

    Medium:

    • View key performance parameters or statistics
    • Perform an action
    • Look for potential issues

    Large:

    • View detailed performance and statistics
    • Perform actions
    • Compare and review related information

    Full Screen:

    • View all attributes for a given network aspect
    • Free-form data analysis and visualization
    • Export data to third-party tools

    Card Workflows

    The UI provides a number of card workflows. Card workflows focus on a particular aspect of your network and are a linked set of cards of each size: a small card, a medium card, one or more large cards, and one or more full-screen cards. The following card workflows are available:

    Access a Card Workflow

    You can access a card workflow in multiple ways:

    If you have multiple cards open on your workbench already, you might need to scroll down to see the card you have just added.

    To open the card workflow through an existing workbench:

    1. Click the workbench menu in the workbench task bar.

    2. Select the relevant workbench.

      The workbench opens, hiding your previous workbench.

    To open the card workflow from Recent Actions:

    1. Click the Recent Actions icon in the application header.

    2. Look for an “Add: <card name>” item.

    3. If it is still available, click the item.

      The card appears on the current workbench, at the bottom.

    To access the card workflow by adding the card:

    1. Click the Add Cards icon in the workbench task bar.

    2. Follow the instructions in Add Cards to Your Workbench or Add Switch Cards to Your Workbench.

      The card appears on the current workbench, at the bottom.

    To access the card workflow by searching for the card:

    1. Click in the Global Search field.

    2. Begin typing the name of the card.

    3. Select it from the list.

      The card appears on the current workbench, at the bottom.

    Card Interactions

    Every card contains a standard set of interactions, including the ability to switch between card sizes, and change the time period of the presented data. Most cards also have additional actions that can be taken, in the form of links to other cards, scrolling, and so forth. The four sizes of cards for a particular aspect of the network are connected into a flow; however, you can have duplicate cards displayed at the different sizes. Cards with tabular data provide filtering, sorting, and export of data. The medium and large cards have descriptive text on the back of the cards.

    To access the time period, card size, and additional actions, hover over the card. These options appear, covering the card header, enabling you to select the desired option.

    Add Cards to Your Workbench

    You can add one or more cards to a workbench at any time. To add Devices|Switches cards, refer to Add Switch Cards to Your Workbench. For all other cards, follow the steps in this section.

    To add one or more cards:

    1. Click the Add Cards icon to open the Cards modal.

    2. Scroll down until you find the card you want to add, select the category of cards, or use Search to find the card you want to add.

      This example uses the category tab to narrow the search for a card.

    3. Click on each card you want to add.

      As you select each card, it is grayed out and a check mark appears on top of it. If you have selected one or more cards using the category option, you can select another category without losing your current selection. The total number of cards selected for addition to your workbench is noted at the bottom.

      If you change your mind and do not want to add a particular card you have selected, simply click it again to remove it from the cards to be added. The total count of selected cards decreases with each card you remove.

    4. When you have selected all of the cards you want to add to your workbench, you can confirm which cards have been selected by clicking the Cards Selected link. Modify your selection as needed.

    5. Click Open Cards to add the selected cards, or Cancel to return to your workbench without adding any cards.

    The cards are placed at the end of the set of cards currently on the workbench. You might need to scroll down to see them. By default, the medium size of the card is added to your workbench for all except the Validation and Trace cards. These are added in the large size by default. You can rearrange the cards as described in Reposition a Card on Your Workbench.

    Add Switch Cards to Your Workbench

    You can add switch cards to a workbench at any time. For all other cards, follow the steps in Add Cards to Your Workbench. You can either add the card through the Switches icon on a workbench header or by searching for it through Global Search.

    To add a switch card using the icon:

    1. Click the Switches icon to open the Add Switch Card modal.

    2. Begin entering the hostname of the switch you want to monitor.

    3. Select the device from the suggestions that appear.

      If you attempt to enter a hostname that is unknown to NetQ, a pink border appears around the entry field and you are unable to select Add. Try checking for spelling errors. If you feel your entry is valid, but not an available choice, consult with your network administrator.

    4. Optionally select the small or large size to display instead of the medium size.

    5. Click Add to add the switch card to your workbench, or Cancel to return to your workbench without adding the switch card.

    To open the switch card by searching:

    1. Click in Global Search.

    2. Begin typing the name of a switch.

    3. Select it from the options that appear.

    Remove Cards from Your Workbench

    Removing cards is handled one card at a time.

    To remove a card:

    1. Hover over the card you want to remove.

    2. Click the More Actions menu.

    3. Click Remove.

    The card is removed from the workbench, but not from the application.

    Change the Time Period for the Card Data

    All cards have a default time period for the data shown on the card, typically the last 24 hours. You can change the time period to view the data during a different time range to aid analysis of previous or existing issues.

    To change the time period for a card:

    1. Hover over any card.

    2. Click the time period shown in the header.

    3. Select a time period from the dropdown list.

    Changing the time period in this manner only changes the time period for the given card.

    Switch to a Different Card Size

    You can switch between the different card sizes at any time. Only one size is visible at a time. To view the same card in different sizes, open a second copy of the card.

    To change the card size:

    1. Hover over the card.

    2. Hover over the Card Size Picker and move the cursor to the right or left until the desired size option is highlighted.

      Single width opens a small card. Double width opens a medium card. Triple width opens large cards. Full width opens full-screen cards.

    3. Click the Picker.
      The card changes to the selected size, and may move its location on the workbench.

    View a Description of the Card Content

    When you hover over a medium or large card, the bottom right corner turns up and is highlighted. Clicking the corner turns the card over, revealing a description of the card and of any relevant tabs. Hover and click again to turn it back to the front side.

    Reposition a Card on Your Workbench

    You can also move cards around on the workbench, using a simple drag and drop method.

    To move a card:

    1. Click and drag the card to the left or right of another card, next to where you want to place it.

    2. Release your hold on the card when the other card becomes highlighted with a dotted line. In this example, we are moving the medium Network Health card to the left of the medium Devices Inventory card.

    Table Settings

    You can manipulate the data in a data grid in a full-screen card in several ways. The available options are displayed above each table. The options vary depending on the card and what is selected in the table.

    Action | Description
    Select All | Selects all items in the list
    Clear All | Clears all existing selections in the list
    Add | Adds an item to the list
    Edit | Edits the selected item
    Delete | Removes the selected items
    Filter | Filters the list using available parameters. Refer to Filter Table Data for more detail.
    Generate/Delete AuthKeys | Creates or removes NetQ CLI authorization keys
    Open Cards | Opens the corresponding validation or trace card(s)
    Assign role | Opens role assignment options for switches
    Export | Exports selected data into either a .csv or JSON-formatted file. Refer to Export Data for more detail.

    When there are numerous items in a table, NetQ loads the first 25 by default and provides the rest in additional table pages. In this case, pagination is shown under the table.

    From there, you can:

    Change Order of Columns

    You can rearrange the columns within a table. Click and hold on a column header, then drag it to the location where you want it.

    Filter Table Data

    The filter option associated with tables on full-screen cards can be used to filter the data by any parameter (column name). The parameters available vary according to the table you are viewing. Some tables offer the ability to filter on more than one parameter.

    Tables that Support a Single Filter

    Tables that allow a single filter to be applied let you select the parameter and set the value. You can use partial values.

    For example, to set the filter to show only BGP sessions using a particular VRF:

    1. Open the full-screen Network Services | All BGP Sessions card.

    2. Click the All Sessions tab.

    3. Click the Filter icon above the table.

    4. Select VRF from the Field dropdown.

    5. Enter the name of the VRF of interest. In our example, we chose vrf1.

    6. Click Apply.

      The filter icon displays a red dot to indicate filters are applied.

    7. To remove the filter, click the Filter icon (with the red dot).

    8. Click Clear.

    9. Close the Filters dialog by clicking the X.

    Tables that Support Multiple Filters

    For tables that offer filtering by multiple parameters, the Filter dialog is slightly different. For example, to filter the list of IP Addresses in your system by hostname and interface:

    1. Click the Main Menu icon.

    2. Select IP Addresses under Network.

    3. Click the Filter icon above the table.

    4. Enter a hostname and interface name in the respective fields.

    5. Click Apply.

      The filter icon displays a red dot to indicate filters are applied, and each filter is presented above the table.

    6. To remove a single filter, click the X on that filter; to remove all filters at once, click Clear All Filters.

    Export Data

    You can export tabular data from a full-screen card to a CSV- or JSON-formatted file.

    To export all data:

    1. Click the Export icon above the table.

    2. Select the export format.

    3. Click Export to save the file to your downloads directory.

    To export selected data:

    1. Select the individual items from the list by clicking the checkbox next to each item.

    2. Click the Export icon above the table.

    3. Select the export format.

    4. Click Export to save the file to your downloads directory.
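
    Exported files can then be loaded into third-party tools for further analysis. As a minimal sketch (the file name and field names are illustrative assumptions based on the event fields described elsewhere in this guide), a JSON-formatted export of events could be summarized with a few lines of Python:

        import json
        from collections import Counter

        # Load a JSON-formatted export saved from a full-screen card.
        # The file name is illustrative; use the file saved to your downloads directory.
        with open("events_export.json") as f:
            events = json.load(f)

        # Count events per source hostname, mirroring the "Devices by Event Count" view.
        by_source = Counter(event.get("Source", "unknown") for event in events)
        for hostname, count in by_source.most_common(10):
            print(f"{hostname}: {count}")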

    Set User Preferences

    Each user can customize the NetQ application display, change their account password, and manage their workbenches.

    Configure Display Settings

    The Display card contains the options for setting the application theme, language, time zone, and date formats. There are two themes available: a Light theme and a Dark theme (the default). The screen captures in this document all use the Dark theme. English is the only language available for this release. You can choose to view data in the time zone where you or your data center resides. You can also select the date and time format, choosing a word-based or numeric date and a 12- or 24-hour clock. All changes take effect immediately.
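
    Because stored time values are GMT-based epoch timestamps, rendering them in a chosen time zone and clock format is a simple conversion. The following minimal Python sketch shows the idea; the function name and the millisecond heuristic are illustrative assumptions, not NetQ internals:

        from datetime import datetime, timezone, timedelta

        # Illustrative only: render a GMT-based epoch timestamp in a GMT-offset
        # time zone with either a 12- or 24-hour clock, mirroring the Display options.
        def render(epoch, gmt_offset_hours=0, use_24h=True):
            if epoch > 10**12:        # heuristic: value is in milliseconds
                epoch /= 1000
            tz = timezone(timedelta(hours=gmt_offset_hours))
            fmt = "%Y-%m-%d %H:%M:%S" if use_24h else "%Y-%m-%d %I:%M:%S %p"
            return datetime.fromtimestamp(epoch, tz).strftime(fmt)

        # Epoch value taken from the Events Reference examples later in this guide.
        print(render(1573166417, gmt_offset_hours=-8, use_24h=False))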

    To configure the display settings:

    1. Click the User Settings icon in the application header to open the User Settings options.

    2. Click Profile & Preferences.

    3. Locate the Display card.

    4. In the Theme field, click your choice of theme. This figure shows the Light theme. Switch back and forth as desired.

    5. In the Time Zone field, click to change the time zone from the default.
      By default, the time zone is set to the user's local time zone. If a time zone has not been selected, NetQ defaults to the local time zone where NetQ is installed. All time values are based on this setting, which is displayed in the application header and expressed relative to Greenwich Mean Time (GMT).

      Tip: You can also change the time zone from the header display.

      If your deployment is not local to you (for example, you want to view the data from the perspective of a data center in another time zone) you can change the display to another time zone. The following table presents a sample of time zones:

      Time Zone | Description | Abbreviation
      GMT +12 | New Zealand Standard Time | NST
      GMT +11 | Solomon Standard Time | SST
      GMT +10 | Australian Eastern Time | AET
      GMT +9:30 | Australia Central Time | ACT
      GMT +9 | Japan Standard Time | JST
      GMT +8 | China Taiwan Time | CTT
      GMT +7 | Vietnam Standard Time | VST
      GMT +6 | Bangladesh Standard Time | BST
      GMT +5:30 | India Standard Time | IST
      GMT +5 | Pakistan Lahore Time | PLT
      GMT +4 | Near East Time | NET
      GMT +3:30 | Middle East Time | MET
      GMT +3 | Eastern African Time/Arab Standard Time | EAT/AST
      GMT +2 | Eastern European Time | EET
      GMT +1 | European Central Time | ECT
      GMT | Greenwich Mean Time | GMT
      GMT -1 | Central African Time | CAT
      GMT -2 | Uruguay Summer Time | UYST
      GMT -3 | Argentina Standard/Brazil Eastern Time | AGT/BET
      GMT -4 | Atlantic Standard Time/Puerto Rico Time | AST/PRT
      GMT -5 | Eastern Standard Time | EST
      GMT -6 | Central Standard Time | CST
      GMT -7 | Mountain Standard Time | MST
      GMT -8 | Pacific Standard Time | PST
      GMT -9 | Alaskan Standard Time | AST
      GMT -10 | Hawaiian Standard Time | HST
      GMT -11 | Samoa Standard Time | SST
      GMT -12 | New Zealand Standard Time | NST
    6. In the Date Format field, select the date and time format you want displayed on the cards.

      The four options include the date displayed in words or abbreviated with numbers, and either a 12- or 24-hour time representation. The default is the third option.

    7. Return to your workbench by clicking the Main Menu icon and selecting a workbench from the NetQ list.

    Change Your Password

    You can change your account password at any time, such as when you suspect someone has compromised your account or your administrator asks you to change it.

    To change your password:

    1. Click the User Settings icon in the application header to open the User Settings options.

    2. Click Profile & Preferences.

    3. Locate the Basic Account Info card.

    4. Click Change Password.

    5. Enter your current password.

    6. Enter and confirm a new password.

    7. Click Save to change to the new password, or click Cancel to discard your changes.

    8. Return to your workbench by clicking the Main Menu icon and selecting a workbench from the NetQ list.

    Manage Your Workbenches

    You can view all of your workbenches in list form, making it possible to manage various aspects of them. There are public and private workbenches. Public workbenches are visible to all users. Private workbenches are visible only to the user who created them. From the Workbenches card, you can:

    To manage your workbenches:

    1. Click the User Settings icon in the application header to open the User Settings options.

    2. Click Profile & Preferences.

    3. Locate the Workbenches card.

    4. To specify a home workbench, click to the left of the desired workbench name. An icon is placed there to indicate its status as your favorite workbench.

    5. To search the workbench list by name, access type, and cards present on the workbench, click the relevant header and begin typing your search criteria.

    6. To sort the workbench list, click the relevant header, then click the sort icon.

    7. To delete a workbench, hover over the workbench name to view the Delete button. As an administrator, you can delete both private and public workbenches.

    8. Return to your workbench by clicking the Main Menu icon and selecting a workbench from the NetQ list.

    Monitor Events

    Two event workflows, the Alarms card workflow and the Info card workflow, provide a view into the events occurring in the network. The Alarms card workflow tracks critical severity events, whereas the Info card workflow tracks all warning, info, and debug severity events.

    To focus on events from a single device perspective, refer to Monitor Switches.

    Monitor Critical Events

    You can easily monitor critical events occurring across your network using the Alarms card. You can determine the number of events for the various system, interface, and network protocol and service components in the network. The content of the cards in the workflow is described first, followed by common tasks you can perform using this card workflow.

    Alarms Card Workflow Summary

    The small Alarms card displays:

    Item | Description
    Icon | Indicates data is for all critical severity events in the network
    Alarm trend | Trend of the alarm count, represented by an arrow:
    • Pointing upward and bright pink: alarm count is higher than during the last two time periods, an increasing trend
    • Pointing downward and green: alarm count is lower than during the last two time periods, a decreasing trend
    • No arrow: alarm count is unchanged over the last two time periods, a steady trend
    Alarm score | Current count of alarms during the designated time period
    Alarm rating | Count of alarms relative to the average count of alarms during the designated time period:
    • Low: Count of alarms is below the average count; a nominal count
    • Med: Count of alarms is in range of the average count; some room for improvement
    • High: Count of alarms is above the average count; user intervention recommended
    Chart | Distribution of alarms received during the designated time period and a total count of all alarms present in the system
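
    For concreteness, here is a minimal Python sketch of one way the trend arrow and rating could be derived from alarm counts, following the definitions above. The tolerance value and function names are illustrative assumptions; NetQ's actual computation is internal to the application.

        # Illustrative only: derive the small Alarms card's trend and rating
        # from alarm counts, per the definitions in the table above.
        def alarm_trend(current, prev1, prev2):
            if current > prev1 and current > prev2:
                return "increasing"   # upward, bright pink arrow
            if current < prev1 and current < prev2:
                return "decreasing"   # downward, green arrow
            return "steady"           # no arrow

        def alarm_rating(current, average, tolerance=0.2):  # tolerance is assumed
            if current < average * (1 - tolerance):
                return "Low"    # below average; nominal
            if current > average * (1 + tolerance):
                return "High"   # above average; intervention recommended
            return "Med"        # in range of average

        print(alarm_trend(2, 5, 4), alarm_rating(2, 6))  # -> decreasing Low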

    The medium Alarms card displays:

    Item | Description
    Time period | Range of time in which the displayed data was collected; applies to all card sizes
    Icon | Indicates data is for all critical events in the network
    Count | Total number of alarms received during the designated time period
    Alarm score | Current count of alarms received from each category (overall, system, interface, and network services) during the designated time period
    Chart | Distribution of all alarms received from each category during the designated time period

    The large Alarms card has one tab.

    The Alarm Summary tab displays:

    Item | Description
    Time period | Range of time in which the displayed data was collected; applies to all card sizes
    Icon | Indicates data is for all system, trace, and interface critical events in the network
    Alarm Distribution | Chart: Distribution of all alarms received from each category during the designated time period:

    • NetQ Agent
    • BTRFS Information
    • CL Support
    • Config Diff
    • CL License
    • Installed Packages
    • Link
    • LLDP
    • MTU
    • Node
    • Port
    • Resource
    • Running Config Diff
    • Sensor
    • Services
    • SSD Utilization
    • TCA Interface Stats
    • TCA Resource Utilization
    • TCA Sensors

    The category with the largest number of alarms is shown at the top, followed by the next largest, down to the category with the fewest alarms.

    Count: Total number of alarms received from each category during the designated time period

    Table | Listing of items that match the filter selection for the selected alarm categories:
    • Events by Most Recent: Most recent events are listed at the top
    • Devices by Event Count: Devices with the most events are listed at the top
    Show All Events | Opens the full-screen Events|Alarms card with a listing of all events

    The full screen Alarms card provides tabs for all events.

    Item | Description
    Title | Events|Alarms
    Close | Closes the full screen card and returns to the workbench
    Default Time | Range of time in which the displayed data was collected
    Refresh | Displays data refresh status. Click to pause data refresh; click again to resume. The current refresh rate is visible by hovering over the icon.
    Results | Number of results found for the selected tab
    All Events | Displays all events (both alarms and info) received in the time period. By default, the list is sorted by the date and time that the event occurred (Time). This tab provides the following additional data about each event:
    • Source: Hostname of the device that generated the event
    • Message: Text describing the alarm or info event that occurred
    • Type: Name of the network protocol and/or service that triggered the event
    • Severity: Importance of the event: critical, warning, info, or debug
    Table Actions | Select, export, or filter the list. Refer to Table Settings.

    View Alarm Status Summary

    A summary of the critical alarms in the network includes the number of alarms, a trend indicator, a performance indicator, and a distribution of those alarms.

    To view the summary, open the small Alarms card.

    In this example, there are a small number of alarms (2), the number of alarms is decreasing (down arrow), and there are fewer alarms right now than the average number of alarms during this time period. This would indicate no further investigation is needed. Note that with such a small number of alarms, the rating may be a bit skewed.

    View the Distribution of Alarms

    It is helpful to know where and when alarms are occurring in your network. The Alarms card workflow enables you to see the distribution of alarms based on their source: network services, interfaces, system services, and threshold-based events.

    To view the alarm distribution, open the medium Alarms card. Scroll down to view all of the charts.

    Monitor Alarm Details

    The Alarms card workflow enables users to easily view and track critical severity alarms occurring anywhere in your network. You can sort alarms based on their occurrence or view devices with the most network services alarms.

    To view critical alarms, open the large Alarms card.

    From this card, you can view the distribution of alarms for each of the categories over time. The charts are sorted by total alarm count, with the category with the highest number of alarms listed at the top. Scroll down to view any hidden charts. A list of the associated alarms is also displayed. By default, the list of the most recent alarms is displayed when viewing the large card.

    View Devices with the Most Alarms

    You can filter instead for the devices that have the most alarms.

    To view devices with the most alarms, open the large Alarms card, and then select Devices by event count from the dropdown.

    You can open the switch card for any of the listed devices by clicking on the device name.

    Filter Alarms by Category

    You can focus your view to include alarms for one or more selected alarm categories.

    To filter for selected categories:

    1. Click the checkbox to the left of one or more charts to remove that set of alarms from the table on the right.

    2. Select Devices by event count to view the devices with the most alarms for the selected categories.

    3. Switch back to most recent events by selecting Events by most recent.

    4. Click the checkbox again to return a category’s data to the table.

    In this example, we removed the Services alarms from the event listing.

    Compare Alarms with a Prior Time

    You can change the time period for the data to compare with a prior time. If the same devices are consistently indicating the most alarms, you might want to look more carefully at those devices using the Switches card workflow.

    To compare two time periods:

    1. Open a second Alarm Events card. Remember that it is added at the bottom of the workbench.

    2. Switch to the large size view.

    3. Move the card to be next to the original Alarm Events card. Note that moving large cards can take a few extra seconds since they contain a large amount of data.

    4. Hover over the card and click the time period shown in the header.

    5. Select a different time period.

    6. Compare the two cards with the Devices by event count filter applied.

      In this example, the total alarm count and the devices with the most alarms in each time period have changed for the better overall. You could go back further in time or investigate the current status of the largest offenders.

    View All Events

    You can view all events in the network either by clicking the Show All Events link under the table on the large Alarm Events card, or by opening the full screen Alarm Events card.


    To return to your workbench, click the Close icon in the top right corner of the card.

    Monitor Informational Events

    You can easily monitor warning, info, and debug severity events occurring across your network using the Info card. You can determine the number of events for the various system, interface, and network protocol and service components in the network. The content of the cards in the workflow is described first, followed by common tasks you can perform using this card workflow.

    Info Card Workflow Summary

    The Info card workflow enables you to easily view and track informational events occurring anywhere in your network.

    The small Info card displays:

    Item | Description
    Icon | Indicates data is for all warning, info, and debug severity events in the network
    Info count | Number of info events received during the designated time period
    Alarm count | Number of alarm events received during the designated time period
    Chart | Distribution of all info events and alarms received during the designated time period

    The medium Info card displays:

    Item | Description
    Time period | Range of time in which the displayed data was collected; applies to all card sizes
    Icon | Indicates data is for all warning, info, and debug severity events in the network
    Types of Info | Chart which displays the services that have triggered events during the designated time period. Hover over the chart to view a count for each type.
    Distribution of Info | Info Status:
    • Count: Number of info events received during the designated time period
    • Chart: Distribution of all info events received during the designated time period
    Alarms Status:
    • Count: Number of alarm events received during the designated time period
    • Chart: Distribution of all alarm events received during the designated time period

    The large Info card displays:

    Item | Description
    Time period | Range of time in which the displayed data was collected; applies to all card sizes
    Icon | Indicates data is for all warning, info, and debug severity events in the network
    Types of Info | Chart which displays the services that have triggered events during the designated time period. Hover over the chart to view a count for each type.
    Distribution of Info | Info Status:
    • Count: Current number of info events received during the designated time period
    • Chart: Distribution of all info events received during the designated time period
    Alarms Status:
    • Count: Current number of alarm events received during the designated time period
    • Chart: Distribution of all alarm events received during the designated time period
    Table | Listing of items that match the filter selection:
    • Events by Most Recent: Most recent events are listed at the top
    • Devices by Event Count: Devices with the most events are listed at the top
    Show All Events | Opens the full-screen Events|Info card with a listing of all events

    The full screen Info card provides tabs for all events.

    Item | Description
    Title | Events|Info
    Close | Closes the full screen card and returns to the workbench
    Default Time | Range of time in which the displayed data was collected
    Refresh | Displays data refresh status. Click to pause data refresh; click again to resume. The current refresh rate is visible by hovering over the icon.
    Results | Number of results found for the selected tab
    All Events | Displays all events (both alarms and info) received in the time period. By default, the list is sorted by the date and time that the event occurred (Time). This tab provides the following additional data about each event:
    • Source: Hostname of the device that generated the event
    • Message: Text describing the alarm or info event that occurred
    • Type: Name of the network protocol and/or service that triggered the event
    • Severity: Importance of the event: critical, warning, info, or debug
    Table Actions | Select, export, or filter the list. Refer to Table Settings.

    View Info Status Summary

    A summary of the informational events occurring in the network can be found on the small, medium, and large Info cards. Additional details are available as you increase the size of the card.

    To view the summary with the small Info card, simply open the card. This card gives you a high-level view in a condensed visual, including the number and distribution of the info events along with the alarms that have occurred during the same time period.

    To view the summary with the medium Info card, simply open the card. This card gives you the same count and distribution of info and alarm events, but it also provides information about the sources of the info events and enables you to view a small slice of time using the distribution charts.

    Use the chart at the top of the card to view the various sources of info events. The four or so types with the most info events are called out separately, with all others collected together into an Other category. Hover over a segment of the chart to view the count for each type.

    To view the summary with the large Info card, open the card. The left side of the card provides the same capabilities as the medium Info card.

    Compare Timing of Info and Alarm Events

    While you can see the relative relationship between info and alarm events on the small Info card, the medium and large cards provide considerably more information. Open either of these to view individual line charts for the events. Generally, alarms have some corollary info events. For example, when a network service becomes unavailable, a critical alarm is often issued, and when the service becomes available again, an info event of severity warning is generated. For this reason, you might see some level of tracking between the info and alarm counts and distributions. Some other possible scenarios:

    View All Info Events Sorted by Time of Occurrence

    You can view all info events using the large Info card. Open the large card and confirm the Events By Most Recent option is selected in the filter above the table on the right. When this option is selected, all of the info events are listed with the most recently occurring event at the top. Scrolling down shows you the info events that have occurred at an earlier time within the selected time period for the card.

    View Devices with the Most Info Events

    You can filter instead for the devices that have the most info events by selecting the Devices by Event Count option from the filter above the table.

    You can open the switch card for any of the listed devices by clicking on the device name.

    View All Events

    You can view all events in the network either by clicking the Show All Events link under the table on the large Info Events card, or by opening the full screen Info Events card.

    To return to your workbench, click the close icon in the top right corner of the card.

    Events Reference

    The following table lists all event messages organized by type.

    The messages can be viewed through third-party notification applications. For details about configuring notifications using the NetQ CLI, refer to Integrate NetQ with Notification Applications.
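
    As a quick sketch of that CLI flow, a Slack channel, a rule, and a filter tying them together might look like the following (the channel name, rule name, key/value pair, and webhook URL are placeholders for your own values):

        cumulus@switch:~$ netq add notification channel slack netq-events webhook https://hooks.slack.com/services/<your-token>
        cumulus@switch:~$ netq add notification rule bgpHostname key hostname value spine01
        cumulus@switch:~$ netq add notification filter bgpToSlack rule bgpHostname channel netq-events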

    For information about configuring threshold-based events (TCAs), refer to Application Management.
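
    For reference, TCA rules are created with the netq add tca command; a minimal sketch follows, where the event ID, scope, severity, and threshold values are illustrative only and the exact syntax is covered in the TCA documentation:

        cumulus@switch:~$ netq add tca event_id TCA_CPU_UTILIZATION_UPPER scope '*' severity critical threshold 85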

    Type | Trigger | Severity | Message Format | Example
    agent | NetQ Agent state changed to Rotten (not heard from in over 15 seconds) | Critical | Agent state changed to rotten | Agent state changed to rotten
    agent | NetQ Agent rebooted | Critical | Netq-agent rebooted at (@last_boot) | Netq-agent rebooted at 1573166417
    agent | Node running NetQ Agent rebooted | Critical | Switch rebooted at (@sys_uptime) | Switch rebooted at 1573166131
    agent | NetQ Agent state changed to Fresh | Info | Agent state changed to fresh | Agent state changed to fresh
    agent | NetQ Agent state was reset | Info | Agent state was paused and resumed at (@last_reinit) | Agent state was paused and resumed at 1573166125
    agent | Version of NetQ Agent has changed | Info | Agent version has been changed old_version:@old_version and new_version:@new_version. Agent reset at @sys_uptime | Agent version has been changed old_version:2.1.2 and new_version:2.3.1. Agent reset at 1573079725
    bgp | BGP Session state changed | Critical | BGP session with peer @peer @neighbor vrf @vrf state changed from @old_state to @new_state | BGP session with peer leaf03 leaf04 vrf mgmt state changed from Established to NotEstd
    bgp | BGP Session state changed from Failed to Established | Info | BGP session with peer @peer @peerhost @neighbor vrf @vrf session state changed from failed to Established | BGP session with peer swp5 spine02 spine03 vrf default session state changed from failed to Established
    bgp | BGP Session state changed from Established to Failed | Info | BGP session with peer @peer @neighbor vrf @vrf state changed from established to failed | BGP session with peer leaf03 leaf04 vrf mgmt state changed from down to up
    bgp | The reset time for a BGP session changed | Info | BGP session with peer @peer @neighbor vrf @vrf reset time changed from @old_last_reset_time to @new_last_reset_time | BGP session with peer spine03 swp9 vrf vrf2 reset time changed from 1559427694 to 1559837484
    btrfsinfo | Disk space available after BTRFS allocation is less than 80% of partition size or only 2 GB remain | Critical | @info : @details | high btrfs allocation space : greater than 80% of partition size, 61708420
    btrfsinfo | Indicates if space would be freed by a rebalance operation on the disk | Critical | @info : @details | data storage efficiency : space left after allocation greater than chunk size 6170849.2
    cable | Link speed is not the same on both ends of the link | Critical | @ifname speed @speed, mismatched with peer @peer @peer_if speed @peer_speed | swp2 speed 10, mismatched with peer server02 swp8 speed 40
    cable | The speed setting for a given port changed | Info | @ifname speed changed from @old_speed to @new_speed | swp9 speed changed from 10 to 40
    cable | The transceiver status for a given port changed | Info | @ifname transceiver changed from @old_transceiver to @new_transceiver | swp4 transceiver changed from disabled to enabled
    cable | The vendor of a given transceiver changed | Info | @ifname vendor name changed from @old_vendor_name to @new_vendor_name | swp23 vendor name changed from Broadcom to Mellanox
    cable | The part number of a given transceiver changed | Info | @ifname part number changed from @old_part_number to @new_part_number | swp7 part number changed from FP1ZZ5654002A to MSN2700-CS2F0
    cable | The serial number of a given transceiver changed | Info | @ifname serial number changed from @old_serial_number to @new_serial_number | swp4 serial number changed from 571254X1507020 to MT1552X12041
    cable | The status of forward error correction (FEC) support for a given port changed | Info | @ifname supported fec changed from @old_supported_fec to @new_supported_fec | swp12 supported fec changed from supported to unsupported; swp12 supported fec changed from unsupported to supported
    cable | The advertised support for FEC for a given port changed | Info | @ifname supported fec changed from @old_advertised_fec to @new_advertised_fec | swp24 supported FEC changed from advertised to not advertised
    cable | The FEC status for a given port changed | Info | @ifname fec changed from @old_fec to @new_fec | swp15 fec changed from disabled to enabled
    clag | CLAG remote peer state changed from up to down | Critical | Peer state changed to down | Peer state changed to down
    clag | Local CLAG host MTU does not match its remote peer MTU | Critical | SVI @svi1 on vlan @vlan mtu @mtu1 mismatched with peer mtu @mtu2 | SVI svi7 on vlan 4 mtu 1592 mismatched with peer mtu 1680
    clag | CLAG SVI on VLAN is missing from remote peer state | Warning | SVI on vlan @vlan is missing from peer | SVI on vlan vlan4 is missing from peer
    clag | CLAG peerlink is not operating at full capacity. At least one link is down. | Warning | Clag peerlink not at full redundancy, member link @slave is down | Clag peerlink not at full redundancy, member link swp40 is down
    clag | CLAG remote peer state changed from down to up | Info | Peer state changed to up | Peer state changed to up
    clag | Local CLAG host state changed from down to up | Info | Clag state changed from down to up | Clag state changed from down to up
    clag | CLAG bond in Conflicted state was updated with new bonds | Info | Clag conflicted bond changed from @old_conflicted_bonds to @new_conflicted_bonds | Clag conflicted bond changed from swp7 swp8 to swp9 swp10
    clag | CLAG bond changed state from protodown to up state | Info | Clag conflicted bond changed from @old_state_protodownbond to @new_state_protodownbond | Clag conflicted bond changed from protodown to up
    clsupport | A new CL Support file has been created for the given node | Critical | HostName @hostname has new CL SUPPORT file | HostName leaf01 has new CL SUPPORT file
    configdiff | Configuration file deleted on a device | Critical | @hostname config file @type was deleted | spine03 config file /etc/frr/frr.conf was deleted
    configdiff | Configuration file has been created | Info | @hostname config file @type was created | leaf12 config file /etc/lldp.d/README.conf was created
    configdiff | Configuration file has been modified | Info | @hostname config file @type was modified | spine03 config file /etc/frr/frr.conf was modified
    evpn | A VNI was configured and moved from the up state to the down state | Critical | VNI @vni state changed from up to down | VNI 36 state changed from up to down
    evpn | A VNI was configured and moved from the down state to the up state | Info | VNI @vni state changed from down to up | VNI 36 state changed from down to up
    evpn | The kernel state changed on a VNI | Info | VNI @vni kernel state changed from @old_in_kernel_state to @new_in_kernel_state | VNI 3 kernel state changed from down to up
    evpn | A VNI state changed from not advertising all VNIs to advertising all VNIs | Info | VNI @vni vni state changed from @old_adv_all_vni_state to @new_adv_all_vni_state | VNI 11 vni state changed from false to true
    license | License state is missing or invalid | Critical | License check failed, name @lic_name state @state | License check failed, name agent.lic state invalid
    license | License state is missing or invalid on a particular device | Critical | License check failed on @hostname | License check failed on leaf03
    link | Link operational state changed from up to down | Critical | HostName @hostname changed state from @old_state to @new_state Interface:@ifname | HostName leaf01 changed state from up to down Interface:swp34
    link | Link operational state changed from down to up | Info | HostName @hostname changed state from @old_state to @new_state Interface:@ifname | HostName leaf04 changed state from down to up Interface:swp11
    lldp | Local LLDP host has new neighbor information | Info | LLDP Session with host @hostname and @ifname modified fields @changed_fields | LLDP Session with host leaf02 swp6 modified fields leaf06 swp21
    lldp | Local LLDP host has new peer interface name | Info | LLDP Session with host @hostname and @ifname @old_peer_ifname changed to @new_peer_ifname | LLDP Session with host spine01 and swp5 swp12 changed to port12
    lldp | Local LLDP host has new peer hostname | Info | LLDP Session with host @hostname and @ifname @old_peer_hostname changed to @new_peer_hostname | LLDP Session with host leaf03 and swp2 leaf07 changed to exit01
    mtu | VLAN interface link MTU is smaller than that of its parent MTU | Warning | vlan interface @link mtu @mtu is smaller than parent @parent mtu @parent_mtu | vlan interface swp3 mtu 1500 is smaller than parent peerlink-1 mtu 1690
    mtu | Bridge interface MTU is smaller than the member interface with the smallest MTU | Warning | bridge @link mtu @mtu is smaller than least of member interface mtu @min | bridge swp0 mtu 1280 is smaller than least of member interface mtu 1500
    ntp | NTP sync state changed from in sync to not in sync | Critical | Sync state changed from @old_state to @new_state for @hostname | Sync state changed from in sync to not sync for leaf06
    ntp | NTP sync state changed from not in sync to in sync | Info | Sync state changed from @old_state to @new_state for @hostname | Sync state changed from not sync to in sync for leaf06
    ospf | OSPF session state on a given interface changed from Full to a down state | Critical | OSPF session @ifname with @peer_address changed from Full to @down_state | OSPF session swp7 with 27.0.0.18 state changed from Full to Fail; OSPF session swp7 with 27.0.0.18 state changed from Full to ExStart
    ospf | OSPF session state on a given interface changed from a down state to Full | Info | OSPF session @ifname with @peer_address changed from @down_state to Full | OSPF session swp7 with 27.0.0.18 state changed from Down to Full; OSPF session swp7 with 27.0.0.18 state changed from Init to Full; OSPF session swp7 with 27.0.0.18 state changed from Fail to Full
    packageinfo | Package version on device does not match the version identified in the existing manifest | Critical | @package_name manifest version mismatch | netq-apps manifest version mismatch
    ptm | Physical interface cabling does not match configuration specified in topology.dot file | Critical | PTM cable status failed | PTM cable status failed
    ptm | Physical interface cabling matches configuration specified in topology.dot file | Critical | PTM cable status passed | PTM cable status passed
    resource | A physical resource has been deleted from a device | Critical | Resource Utils deleted for @hostname | Resource Utils deleted for spine02
    resource | Root file system access on a device has changed from Read/Write to Read Only | Critical | @hostname root file system access mode set to Read Only | server03 root file system access mode set to Read Only
    resource | Root file system access on a device has changed from Read Only to Read/Write | Info | @hostname root file system access mode set to Read/Write | leaf11 root file system access mode set to Read/Write
    resource | A physical resource has been added to a device | Info | Resource Utils added for @hostname | Resource Utils added for spine04
    runningconfigdiff | Running configuration file has been modified | Info | @commandname config result was modified | @commandname config result was modified
    sensor | A fan or power supply unit sensor has changed state | Critical | Sensor @sensor state changed from @old_s_state to @new_s_state | Sensor fan state changed from up to down
    sensor | A temperature sensor has crossed the maximum threshold for that sensor | Critical | Sensor @sensor max value @new_s_max exceeds threshold @new_s_crit | Sensor temp max value 110 exceeds the threshold 95
    sensor | A temperature sensor has crossed the minimum threshold for that sensor | Critical | Sensor @sensor min value @new_s_lcrit fall behind threshold @new_s_min | Sensor psu min value 10 fell below threshold 25
    sensor | A temperature, fan, or power supply sensor state changed | Info | Sensor @sensor state changed from @old_state to @new_state | Sensor temperature state changed from critical to ok; Sensor fan state changed from absent to ok; Sensor psu state changed from bad to ok
    sensor | A fan or power supply sensor state changed | Info | Sensor @sensor state changed from @old_s_state to @new_s_state | Sensor fan state changed from down to up; Sensor psu state changed from down to up
    services | A service status changed from down to up | Critical | Service @name status changed from @old_status to @new_status | Service bgp status changed from down to up
    services | A service status changed from up to down | Critical | Service @name status changed from @old_status to @new_status | Service lldp status changed from up to down
    services | A service changed state from inactive to active | Info | Service @name changed state from inactive to active | Service bgp changed state from inactive to active; Service lldp changed state from inactive to active
    ssdutil | 3ME3 disk health has dropped below 10% | Critical | @info: @details | low health : 5.0%
    ssdutil | A dip in 3ME3 disk health of more than 2% has occurred within the last 24 hours | Critical | @info: @details | significant health drop : 3.0%
    tca | Percentage of CPU utilization exceeded user-defined maximum threshold on a switch | Critical | CPU Utilization for host @hostname exceed configured mark @cpu_utilization | CPU Utilization for host leaf11 exceed configured mark 85
    tca | Percentage of disk utilization exceeded user-defined maximum threshold on a switch | Critical | Disk Utilization for host @hostname exceed configured mark @disk_utilization | Disk Utilization for host leaf11 exceed configured mark 90
    tca | Percentage of memory utilization exceeded user-defined maximum threshold on a switch | Critical | Memory Utilization for host @hostname exceed configured mark @mem_utilization | Memory Utilization for host leaf11 exceed configured mark 95
    tca | Number of transmit bytes exceeded user-defined maximum threshold on a switch interface | Critical | TX bytes upper threshold breached for host @hostname ifname:@ifname value: @tx_bytes | TX bytes upper threshold breached for host spine02 ifname:swp4 value: 20000
    tca | Number of broadcast transmit bytes exceeded user-defined maximum threshold on a switch interface | Critical | TX broadcast upper threshold breached for host @hostname ifname:@ifname value: @rx_broadcast | TX broadcast upper threshold breached for host leaf04 ifname:swp45 value: 40200
    tca | Number of multicast transmit bytes exceeded user-defined maximum threshold on a switch interface | Critical | TX multicast upper threshold breached for host @hostname ifname:@ifname value: @rx_broadcast | TX multicast upper threshold breached for host leaf04 ifname:swp45 value: 30000
    tca | Number of receive bytes exceeded user-defined maximum threshold on a switch interface | Critical | RX bytes upper threshold breached for host @hostname ifname:@ifname value: @tx_bytes | RX bytes upper threshold breached for host spine02 ifname:swp4 value: 20000
    tca | Number of broadcast receive bytes exceeded user-defined maximum threshold on a switch interface | Critical | RX broadcast upper threshold breached for host @hostname ifname:@ifname value: @rx_broadcast | RX broadcast upper threshold breached for host leaf04 ifname:swp45 value: 40200
    tca | Number of multicast receive bytes exceeded user-defined maximum threshold on a switch interface | Critical | RX multicast upper threshold breached for host @hostname ifname:@ifname value: @rx_broadcast | RX multicast upper threshold breached for host leaf04 ifname:swp45 value: 30000
    tca | Fan speed exceeded user-defined maximum threshold on a switch | Critical | Sensor for @hostname exceeded threshold fan speed @s_input for sensor @s_name | Sensor for spine03 exceeded threshold fan speed 700 for sensor fan2
    tca | Power supply output exceeded user-defined maximum threshold on a switch | Critical | Sensor for @hostname exceeded threshold power @s_input watts for sensor @s_name | Sensor for leaf14 exceeded threshold power 120 watts for sensor psu1
    tca | Temperature (°C) exceeded user-defined maximum threshold on a switch | Critical | Sensor for @hostname exceeded threshold temperature @s_input for sensor @s_name | Sensor for leaf14 exceeded threshold temperature 90 for sensor temp1
    tca | Power supply voltage exceeded user-defined maximum threshold on a switch | Critical | Sensor for @hostname exceeded threshold voltage @s_input volts for sensor @s_name | Sensor for leaf14 exceeded threshold voltage 12 volts for sensor psu2
    version | An unknown version of the operating system was detected | Critical | unexpected os version @my_ver | unexpected os version cl3.2
    version | Desired version of the operating system is not available | Critical | os version @ver | os version cl3.7.9
    version | An unknown version of a software package was detected | Critical | expected release version @ver | expected release version cl3.6.2
    version | Desired version of a software package is not available | Critical | different from version @ver | different from version cl4.0
    vxlan | Replication list contains an inconsistent set of nodes | Critical | VNI @vni replication list inconsistent with @conflicts diff:@diff | VNI 14 replication list inconsistent with ["leaf03","leaf04"] diff:+:["leaf03","leaf04"] -:["leaf07","leaf08"]
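
    These events are also available from the NetQ CLI with the netq show events command, which can filter by type, severity level, and time window; for example (the type and time values here are illustrative):

        cumulus@switch:~$ netq show events type bgp between now and 24h
        cumulus@switch:~$ netq show events level critical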

    Monitor Network Performance

    The core capabilities of Cumulus NetQ enable you to monitor your network by viewing performance and configuration data about your individual network devices and the entire fabric network-wide. The topics contained in this section describe monitoring tasks that apply across the entire network. For device-specific monitoring refer to Monitor Devices.

    Monitor Network Health

    As with any network, one of the challenges is keeping track of all of the moving parts. With the NetQ GUI, you can view the overall health of your network at a glance and then delve deeper for periodic checks or as conditions arise that require attention. For a general understanding of how well your network is operating, the Network Health card workflow is the best place to start because it provides the highest-level view and performance roll-ups.

    Network Health Card Workflow Summary

    The small Network Health card displays:

    Item: Description
    Indicates data is for overall Network Health
    Health trend: Trend of overall network health, represented by an arrow:
    • Pointing upward and green: Health score in the most recent window is higher than in the last two data collection windows, an increasing trend
    • Pointing downward and bright pink: Health score in the most recent window is lower than in the last two data collection windows, a decreasing trend
    • No arrow: Health score is unchanged over the last two data collection windows, trend is steady

    The data collection window varies based on the time period of the card. For a 24-hour time period (default), the window is one hour. This gives you current, hourly updates about your network health.

    Health score: Average of the health scores for system health, network services health, and interface health during the last data collection window. The health score for each category is calculated as the percentage of items that passed validations versus the number of items checked (see the worked example after this table).

    The data collection window varies based on the time period of the card. For a 24-hour time period (default), the window is one hour. This gives you current, hourly updates about your network health.

    Health rating: Performance rating based on the health score during the time window:
    • Low: Health score is less than 40%
    • Med: Health score is between 40% and 70%
    • High: Health score is greater than 70%
    Chart: Distribution of overall health status during the designated time period
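
    As a worked example with hypothetical numbers: if 57 of 60 devices pass system health validations (95%), 27 of 30 devices pass network services validations (90%), and 40 of 50 interfaces pass interface validations (80%), the overall health score is the average of the three categories, (95 + 90 + 80) / 3 ≈ 88%, which corresponds to a High health rating.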

    The medium Network Health card displays the distribution, score, and trend of the:

    Item: Description
    Time period: Range of time in which the displayed data was collected; applies to all card sizes
    Indicates data is for overall Network Health
    Health trend: Trend of system, network service, and interface health, represented by an arrow:
    • Pointing upward and green: Health score in the most recent window is higher than in the last two data collection windows, an increasing trend
    • Pointing downward and bright pink: Health score in the most recent window is lower than in the last two data collection windows, a decreasing trend
    • No arrow: Health score is unchanged over the last two data collection windows, trend is steady

    The data collection window varies based on the time period of the card. For a 24-hour time period (default), the window is one hour. This gives you current, hourly updates about your network health.

    Health score: Percentage of devices that passed validation versus the number of devices checked during the time window for:
    • System health: NetQ Agent health, Cumulus Linux license status, and sensors
    • Network services health: BGP, CLAG, EVPN, NTP, OSPF, and VXLAN health
    • Interface health: interface MTU and VLAN health

    The data collection window varies based on the time period of the card. For a 24-hour time period (default), the window is one hour. This gives you current, hourly updates about your network health.

    Chart: Distribution of overall health status during the designated time period

    The large Network Health card contains three tabs.

    The System Health tab displays:

    Item: Description
    Time period: Range of time in which the displayed data was collected; applies to all card sizes
    Indicates data is for System Health
    Health trend: Trend of NetQ Agents, Cumulus Linux licenses, and sensor health, represented by an arrow:
    • Pointing upward and green: Health score in the most recent window is higher than in the last two data collection windows, an increasing trend
    • Pointing downward and bright pink: Health score in the most recent window is lower than in the last two data collection windows, a decreasing trend
    • No arrow: Health score is unchanged over the last two data collection windows, trend is steady

    The data collection window varies based on the time period of the card. For a 24-hour time period (default), the window is one hour. This gives you current, hourly updates about your network health.

    Health score: Percentage of devices that passed validation versus the number of devices checked during the time window for NetQ Agents, Cumulus Linux license status, and platform sensors.

    The data collection window varies based on the time period of the card. For a 24-hour time period (default), the window is one hour. This gives you current, hourly updates about your network health.

    Charts: Distribution of health score for NetQ Agents, Cumulus Linux license status, and platform sensors during the designated time period
    Table: Listing of items that match the filter selection:
    • Most Failures: Devices with the most validation failures are listed at the top
    • Recent Failures: Most recent validation failures are listed at the top
    Show All Validations: Opens full screen Network Health card with a listing of validations performed by network service and protocol

    The Network Service Health tab displays:

    Item: Description
    Time period: Range of time in which the displayed data was collected; applies to all card sizes
    Indicates data is for Network Protocols and Services Health
    Health trend: Trend of BGP, CLAG, EVPN, NTP, OSPF, and VXLAN services health, represented by an arrow:
    • Pointing upward and green: Health score in the most recent window is higher than in the last two data collection windows, an increasing trend
    • Pointing downward and bright pink: Health score in the most recent window is lower than in the last two data collection windows, a decreasing trend
    • No arrow: Health score is unchanged over the last two data collection windows, trend is steady

    The data collection window varies based on the time period of the card. For a 24-hour time period (default), the window is one hour. This gives you current, hourly updates about your network health.

    Health score: Percentage of devices that passed validation versus the number of devices checked during the time window for BGP, CLAG, EVPN, NTP, and VXLAN protocols and services.

    The data collection window varies based on the time period of the card. For a 24-hour time period (default), the window is one hour. This gives you current, hourly updates about your network health.

    Charts: Distribution of passing validations for BGP, CLAG, EVPN, NTP, and VXLAN services during the designated time period
    Table: Listing of devices that match the filter selection:
    • Most Failures: Devices with the most validation failures are listed at the top
    • Recent Failures: Most recent validation failures are listed at the top
    Show All Validations: Opens full screen Network Health card with a listing of validations performed by network service and protocol

    The Interface Health tab displays:

    Item: Description
    Time period: Range of time in which the displayed data was collected; applies to all card sizes
    Indicates data is for Interface Health
    Health trend: Trend of interface, VLAN, and MTU health, represented by an arrow:
    • Pointing upward and green: Health score in the most recent window is higher than in the last two data collection windows, an increasing trend
    • Pointing downward and bright pink: Health score in the most recent window is lower than in the last two data collection windows, a decreasing trend
    • No arrow: Health score is unchanged over the last two data collection windows, trend is steady

    The data collection window varies based on the time period of the card. For a 24-hour time period (default), the window is one hour. This gives you current, hourly updates about your network health.

    Health score: Percentage of devices that passed validation versus the number of devices checked during the time window for interface, VLAN, and MTU protocols and ports.

    The data collection window varies based on the time period of the card. For a 24-hour time period (default), the window is one hour. This gives you current, hourly updates about your network health.

    Charts: Distribution of passing validations for interface, VLAN, and MTU protocols and ports during the designated time period
    Table: Listing of devices that match the filter selection:
    • Most Failures: Devices with the most validation failures are listed at the top
    • Recent Failures: Most recent validation failures are listed at the top
    Show All Validations: Opens full screen Network Health card with a listing of validations performed by network service and protocol

    The full screen Network Health card displays all events in the network.

    Item: Description
    Title: Network Health
    Closes full screen card and returns to workbench
    Default Time: Range of time in which the displayed data was collected
    Displays data refresh status. Click to pause data refresh. Click to resume data refresh. Current refresh rate is visible by hovering over icon.
    Results: Number of results found for the selected tab
    Network protocol or service tab: Displays results of the validations for that network protocol or service that occurred during the designated time period. By default, the list is sorted by the date and time that the validation was completed (Time). This tab provides the following additional data about all protocols and services:
    • Validation Label: User-defined name of a validation or Default validation
    • Total Node Count: Number of nodes running the protocol or service
    • Checked Node Count: Number of nodes running the protocol or service included in the validation
    • Failed Node Count: Number of nodes that failed the validation
    • Rotten Node Count: Number of nodes that were unreachable during the validation run
    • Warning Node Count: Number of nodes that had errors during the validation run

    The following protocols and services have additional data:

    • BGP
      • Total Session Count: Number of sessions running BGP included in the validation
      • Failed Session Count: Number of BGP sessions that failed the validation
    • EVPN
      • Total Session Count: Number of sessions running BGP included in the validation
      • Checked VNIs Count: Number of VNIs included in the validation
      • Failed BGP Session Count: Number of BGP sessions that failed the validation
    • Interfaces
      • Checked Port Count: Number of ports included in the validation
      • Failed Port Count: Number of ports that failed the validation.
      • Unverified Port Count: Number of ports where a peer could not be identified
    • Licenses
      • Checked License Count: Number of licenses included in the validation
      • Failed License Count: Number of licenses that failed the validation
    • MTU
      • Total Link Count: Number of links included in the validation
      • Failed Link Count: Number of links that failed the validation
    • NTP
      • Unknown Node Count: Number of nodes that NetQ sees but that are not in its inventory and thus not included in the validation
    • OSPF
      • Total Adjacent Count: Number of adjacencies included in the validation
      • Failed Adjacent Count: Number of adjacencies that failed the validation
    • Sensors
      • Checked Sensor Count: Number of sensors included in the validation
      • Failed Sensor Count: Number of sensors that failed the validation
    • VLAN
      • Total Link Count: Number of links included in the validation
      • Failed Link Count: Number of links that failed the validation

    Table Actions: Select, export, or filter the list. Refer to Table Settings.

    View Network Health Summary

    Overall network health is based on successful validation results. The summary includes the percentage of successful results, a trend indicator, and a distribution of the validation results.

    To view a summary of your network health, open the small Network Health card.

    In this example, the overall health is relatively good and improving compared to recent status. Refer to the next section to view the key health metrics.

    View Key Metrics of Network Health

    Overall network health is a calculated average of several key health metrics: System, Network Services, and Interface health.

    To view these key metrics, open the medium Network Health card. Each metric is shown with percentage of successful validations, a trend indicator, and a distribution of the validation results.

    In this example, the health of the system and network services is good, but interface health is on the lower side. While it is improving, you might choose to dig further if it does not continue to improve. Refer to the following section for additional details.

    View System Health

    The system health is a calculated average of the NetQ Agent, Cumulus Linux license, and sensor health metrics. In all cases, validation is performed on the agents and licenses. If you are monitoring platform sensors, the calculation includes these as well. You can view the overall health of the system from the medium Network Health card and information about each component from the System Health tab on the large Network Health card.

    To view information about each system component:

    1. Open the large Network Health card.

    2. Hover over the card and click the System Health tab.

      The health of each system protocol or service is represented on the left side of the card by a distribution of the health score, a trend indicator, and a percentage of successful results. The right side of the card provides a listing of devices running the services.
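
    If you also use the NetQ CLI, the same system components can be spot-checked on demand with the netq check commands (a sketch; run from any switch or server where the NetQ CLI is configured):

        cumulus@switch:~$ netq check agents
        cumulus@switch:~$ netq check license
        cumulus@switch:~$ netq check sensors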

    View Devices with the Most Issues

    It is useful to know which devices are experiencing the most issues with their system services in general, as this can help focus troubleshooting efforts toward selected devices versus the service itself. To view devices with the most issues, select Most Failures from the filter above the table on the right.

    Devices with the highest number of issues are listed at the top. Scroll down to view those with fewer issues. To further investigate the critical devices, open the Event cards and filter on the indicated switches.

    View Devices with Recent Issues

    It is useful to know which devices are experiencing the most issues with their system services right now, as this can help focus troubleshooting efforts toward selected devices versus the service itself. To view devices with recent issues, select Recent Failures from the filter above the table on the right. Devices with the highest number of issues are listed at the top. Scroll down to view those with fewer issues. To further investigate the critical devices, open the Switch card or the Event cards and filter on the indicated switches.

    Filter Results by System Service

    You can focus the data in the table on the right by unselecting one or more services. Click the checkbox next to the service you want to remove from the data. In this example, we have unchecked Licenses.

    This removes the checkbox next to the associated chart and grays out the title of the chart, temporarily removing the data related to that service from the table. Add it back by hovering over the chart and clicking the checkbox that appears.

    View Details of a Particular System Service

    From the System Health tab on the large Network Health card you can click on a chart to take you to the full-screen card pre-focused on that service data.

    View Network Services Health

    The network services health is a calculated average of the individual network protocol and services health metrics. In all cases, validation is performed on NTP. If you are running BGP, CLAG, EVPN, OSPF, or VXLAN protocols the calculation includes these as well. You can view the overall health of network services from the medium Network Health card and information about individual services from the Network Service Health tab on the large Network Health card.

    To view information about each network protocol or service:

    1. Open the large Network Health card.

    2. Hover over the card and click the Network Service Health tab.

    The health of each network protocol or service is represented on the left side of the card by a distribution of the health score, a trend indicator, and a percentage of successful results. The right side of the card provides a listing of devices running the services.

    If you have more services running than fit naturally into the chart area, a scroll bar appears for you to access their data. Use the scroll bars on the table to view more columns and rows.
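
    The CLI spot checks corresponding to these protocols and services are the netq check commands; include only the protocols you actually run (illustrative):

        cumulus@switch:~$ netq check bgp
        cumulus@switch:~$ netq check clag
        cumulus@switch:~$ netq check evpn
        cumulus@switch:~$ netq check ntp
        cumulus@switch:~$ netq check ospf
        cumulus@switch:~$ netq check vxlan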

    View Devices with the Most Issues

    It is useful to know which devices are experiencing the most issues with their system services in general, as this can help focus troubleshooting efforts toward selected devices versus the protocol or service. To view devices with the most issues, open the large Network Health card, then click the Network Services tab. Select Most Failures from the dropdown above the table on the right.

    Devices with the highest number of issues are listed at the top. Scroll down to view those with fewer issues. To further investigate the critical devices, open the Event cards and filter on the indicated switches.

    View Devices with Recent Issues

    It is useful to know which devices are experiencing the most issues with their network services right now, as this can help focus troubleshooting efforts toward selected devices versus the protocol or service. To view devices with recent issues, open the large Network Health card. Select Recent Failures from the dropdown above the table on the right. Devices with the highest number of issues are listed at the top. Scroll down to view those with fewer issues. To further investigate the critical devices, open the Switch card or the Event cards and filter on the indicated switches.

    Filter Results by Network Service

    You can focus the data in the table on the right by unselecting one or more services. Click the checkbox next to the service you want to remove. In this example, we removed NTP and are in the process of removing OSPF.

    This grays out the chart title and removes the associated checkbox, temporarily removing the data related to that service from the table.

    View Details of a Particular Network Service

    From the Network Service Health tab on the large Network Health card you can click on a chart to take you to the full-screen card pre-focused on that service data.

    View Interfaces Health

    The interface health is a calculated average of the interface, VLAN, and MTU health metrics. You can view the overall health of interfaces from the medium Network Health card and information about each component from the Interface Health tab on the large Network Health card.

    To view information about each system component:

    1. Open the large Network Health card.

    2. Hover over the card and click the Interface Health tab.

      The health of each interface protocol or service is represented on the left side of the card by a distribution of the health score, a trend indicator, and a percentage of successful results. The right side of the card provides a listing of devices running the services.
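
    The matching CLI spot checks for interface health are (a sketch):

        cumulus@switch:~$ netq check interfaces
        cumulus@switch:~$ netq check mtu
        cumulus@switch:~$ netq check vlan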

    View Devices with the Most Issues

    It is useful to know which devices are experiencing the most issues with their interfaces in general, as this can help focus troubleshooting efforts toward selected devices versus the service itself. To view devices with the most issues, select Most Failures from the filter above the table on the right.

    Devices with the highest number of issues are listed at the top. Scroll down to view those with fewer issues. To further investigate the critical devices, open the Event cards and filter on the indicated switches.

    View Devices with Recent Issues

    It is useful to know which devices are experiencing the most issues with their interfaces right now, as this can help focus troubleshooting efforts toward selected devices versus the service itself. To view devices with recent issues, select Recent Failures from the filter above the table on the right. Devices with the highest number of issues are listed at the top. Scroll down to view those with fewer issues. To further investigate the critical devices, open the Switch card or the Event cards and filter on the indicated switches.

    Filter Results by Interface Service

    You can focus the data in the table on the right by unselecting one or more services. Click the checkbox next to the interface item you want to remove from the data. In this example, we have unchecked MTU.

    This removes the checkbox next to the associated chart and grays out the title of the chart, temporarily removing the data related to that service from the table. Add it back by hovering over the chart and clicking the checkbox that appears.

    View Details of a Particular Interface Service

    From the Interface Health tab on the large Network Health card you can click on a chart to take you to the full-screen card pre-focused on that service data.

    View All Network Protocol and Service Validation Results

    The Network Health card workflow enables you to view all of the results of all validations run on the network protocols and services during the designated time period.

    To view all the validation results:

    1. Open the full screen Network Health card.

    2. Click the <network protocol or service name> tab in the navigation panel.

    3. Look for patterns in the data. For example, when did nodes, sessions, links, ports, or devices start failing validation? Was it at a specific time? Was it when you started running the service on more nodes? Did sessions fail, but nodes were fine?

    Where to go next depends on what data you see; for example, you might run a new validation against the failing protocol (refer to Validate Network Protocol and Service Operations) or investigate the affected devices.

    Validate Network Protocol and Service Operations

    With the NetQ UI, you can validate the operation of the network protocols and services running in your network either on demand or on a scheduled basis. There are three card workflows to perform this validation: one for creating the validation request (either on-demand or scheduled) and two for viewing validation results (one for on-demand requests and one for scheduled requests).

    This release supports validation of the following network protocols and services: Agents, BGP, CLAG, EVPN, Interfaces, License, MTU, NTP, OSPF, Sensors, VLAN, and VXLAN.
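
    If you work in the CLI as well, each of these has an on-demand counterpart in the netq check command family, where the keyword generally matches the protocol or service name (a sketch; the exact keyword set depends on your NetQ version):

        cumulus@switch:~$ netq check agents
        cumulus@switch:~$ netq check vlan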

    For a more general understanding of how well your network is operating, refer to the Monitor Network Health topic.

    Create Validation Requests

    The Validation Request card workflow is used to create on-demand validation requests to evaluate the health of your network protocols and services.

    Validation Request Card Workflow

    The small Validation Request card displays:

    Item: Description
    Indicates a validation request
    Validation: Select a scheduled request to run that request on demand. A default validation is provided for each supported network protocol and service, which runs a network-wide validation check. These validations run every 60 minutes, but you can run them on demand at any time.

    Note: No new requests can be configured from this size card.

    GO: Start the validation request. The corresponding On-demand Validation Result cards are opened on your workbench, one per protocol and service.

    The medium Validation Request card displays:

    Item: Description
    Indicates a validation request
    Title: Validation Request
    Validation: Select a scheduled request to run that request on demand. A default validation is provided for each supported network protocol and service, which runs a network-wide validation check. These validations run every 60 minutes, but you can run them on demand at any time.

    Note: No new requests can be configured from this size card.

    Protocols: The protocols included in a selected validation request are listed here.
    Schedule: For a selected scheduled validation, the schedule and the time of the last run are displayed.
    Run Now: Start the validation request.

    The large Validation Request card displays:

    Item: Description
    Indicates a validation request
    Title: Validation Request
    Validation: Depending on user intent, this field is used to:
    • Select a scheduled request to run that request on demand. A default validation is provided for each supported network protocol and service, which runs a network-wide validation check. These validations run every 60 minutes, but you can run them on demand at any time.
    • Leave as is to create a new scheduled validation request
    • Select a scheduled request to modify
    Protocols: For a selected scheduled validation, the protocols included in the validation request are listed here. For new on-demand or scheduled validations, click these to include them in the validation.
    Schedule: For a selected scheduled validation, the schedule and the time of the last run are displayed. For new scheduled validations, select the frequency and starting date and time.
    • Run Every: Select how often to run the request. Choose from 30 minutes, 1, 3, 6, or 12 hours, or 1 day.
    • Starting: Select the date and time to start the first request in the series
    • Last Run: Timestamp of when the selected validation was started
    Scheduled Validations: Count of validations that are currently scheduled compared to the maximum of 15 allowed
    Run Now: Start the validation request
    Update: When changes are made to a selected validation request, Update becomes available so that you can save your changes.

    Be aware that if you update a previously saved validation request, the historical data collected will no longer match the data results of future runs of the request. If your intention is to leave this request unchanged and create a new request, click Save As New instead.

    Save As New: When changes are made to a previously saved validation request, Save As New becomes available so that you can save the modified request as a new request.

    The full screen Validation Request card displays all scheduled validation requests.

    Item: Description
    Title: Validation Request
    Closes full screen card and returns to workbench
    Default Time: No time period is displayed for this card because each validation request has its own time relationship.
    Displays data refresh status. Click to pause data refresh. Click to resume data refresh. Current refresh rate is visible by hovering over icon.
    Results: Number of results found for the selected tab
    Validation Requests: Displays all scheduled validation requests. By default, the requests list is sorted by the date and time that the request was originally created (Created At). This tab provides the following additional data about each request:
    • Name: Text identifier of the validation
    • Type: Name of network protocols and/or services included in the validation
    • Start Time: Date and time that the validation request was run
    • Last Modified: Date and time of the most recent change made to the validation request
    • Cadence (Min): How often, in minutes, the validation is scheduled to run. This is empty for new on-demand requests.
    • Is Active: Indicates whether the request is currently running according to its schedule (true) or it is not running (false)
    Table Actions: Select, export, or filter the list. Refer to Table Settings.

    Create On-demand and Scheduled Validation Requests

    There are several types of validation requests that a user can make. Each has a slightly different flow through the Validation Request card, and is therefore described separately. The types are based on the intent of the request:

    Run an Existing Scheduled Validation Request On Demand

    You may find that although you have a validation scheduled to run at a later time, you would like to run it now.

    To run a scheduled validation now:

    1. Open either the small, medium, or large Validation Request card.

    2. Select the validation from the Validation dropdown list.

    3. Click Go or Run Now.
      The associated Validation Result card is opened on your workbench. Refer to View On-demand Validation Results.

    Create a New On-demand Validation Request

    When you want to validate the operation of one or more network protocols and services right now, you can create and run an on-demand validation request using the large Validation Request card.

    To create and run a request for a single protocol or service:

    1. Open the small, medium or large Validation Request card.

    2. Select the validation from the Validation dropdown list.

    3. Click Go or Run Now.
      The associated Validation Result card is opened on your workbench. Refer to View On-demand Validation Results.

    To create and run a request for more than one protocol and/or service:

    1. Open the large Validation Request card.

    2. Click the names of the protocols and services you want to validate. We selected BGP and EVPN in this example.

    3. Click Run Now to start the validation.
      The associated on-demand validation result cards (one per protocol or service selected) are opened on your current workbench. Refer to View On-demand Validation Results.
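
    The CLI analog of this example, assuming the same two protocols, is to run the corresponding checks back to back:

        cumulus@switch:~$ netq check bgp
        cumulus@switch:~$ netq check evpn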

    Create a New Scheduled Validation Request

    When you want to see validation results on a regular basis, it is useful to configure a scheduled validation request to avoid re-creating the request each time.

    To create and run a new scheduled validation:

    1. Open the large Validation Request card.

    2. Select the protocols and/or services you want to include in the validation. In this example we have chosen the Agents and NTP services.

    3. Enter the schedule frequency (30 min, 1 hour, 3 hours, 6 hours, 12 hours, or 1 day) by selecting it from the Run every list. Default is hourly.

    4. Select the time to start the validation runs by clicking in the Starting field. Select a day and click Next, then select the starting time and click OK.

    5. Verify the selections were made correctly.

    6. Click Save As New.

    7. Enter a name for the validation.

      Spaces and special characters are not allowed in validation request names.

    8. Click Save.

    The validation can now be selected from the Validation listing (on the small, medium or large size card) and run immediately using Run Now, or you can wait for it to run the first time according to the schedule you specified. Refer to View Scheduled Validation Results. Note that the number of scheduled validations is now two (2).

    Modify an Existing Scheduled Validation Request

    At some point you might want to change the schedule or validation types that are specified in a scheduled validation request.

    When you update a scheduled request, the results of all future runs of the validation will be different from the results of previous runs of the validation.

    To modify a scheduled validation:

    1. Open the large Validation Request card.
    2. Select the validation from the Validation dropdown list.
    3. Edit the schedule or validation types.
    4. Click Update.

    The validation can now be selected from the Validation listing (on the small, medium or large size card) and run immediately using Run Now, or you can wait for it to run the first time according to the schedule you specified. Refer to View Scheduled Validation Results.

    View On-demand Validation Results

    The On-demand Validation Result card workflow enables you to view the results of on-demand validation requests. When a request has started processing, the associated medium Validation Result card is displayed on your workbench. When multiple network protocols or services are included in a validation, a validation result card is opened for each protocol and service.

    On-Demand Validation Result Card Workflow

    The small Validation Result card displays:

    Item: Description
    Indicates an on-demand validation result
    Title: On-demand Result <Network Protocol or Service Name> Validation
    Timestamp: Date and time the validation was completed
    Status: Status of the validation job, where:
    • Good: Job ran successfully. One or more warnings may have occurred during the run.
    • Failed: Job encountered errors which prevented the job from completing, or job ran successfully, but errors occurred during the run.

    The medium Validation Result card displays:

    Item: Description
    Indicates an on-demand validation result
    Title: On-demand Validation Result | <Network Protocol or Service Name>
    Timestamp: Date and time the validation was completed
    Status: Status of the validation job, where:
    • Good: Job ran successfully.
    • Warning: Job encountered issues, but it did complete its run.
    • Failed: Job encountered errors which prevented the job from completing.
    Devices Tested: Chart with the total number of devices included in the validation and the distribution of the results.
    • Pass: Number of devices tested that had successful results
    • Warn: Number of devices tested that had successful results, but also had at least one warning event
    • Fail: Number of devices tested that had one or more protocol or service failures

    Hover over chart to view the number of devices and the percentage of all tested devices for each result category.

    Sessions Tested

    For BGP, chart with total number of BGP sessions included in the validation and the distribution of the overall results.

    For EVPN, chart with total number of BGP sessions included in the validation and the distribution of the overall results.

    For Interfaces, chart with total number of ports included in the validation and the distribution of the overall results.

    In each of these charts:

    • Pass: Number of sessions or ports tested that had successful results
    • Warn: Number of sessions or ports tested that had successful results, but also had at least one warning event
    • Fail: Number of sessions or ports tested that had one or more failure events

    Hover over chart to view the number of devices, sessions, or ports and the percentage of all tested devices, sessions, or ports for each result category.

    This chart does not apply to other Network Protocols and Services, and thus is not displayed for those cards.

    Open <Service> Card: Click to open the corresponding medium Network Services card, where available. Refer to Monitor Network Performance for details about these cards and workflows.

    The large Validation Result card contains two tabs.

    The Summary tab displays:

    Item: Description
    Indicates an on-demand validation result
    Title: On-demand Validation Result | Summary | <Network Protocol or Service Name>
    Date: Day and time when the validation completed
    Status: Status of the validation job, where:
    • Good: Job ran successfully.
    • Warning: Job encountered issues, but it did complete its run.
    • Failed: Job encountered errors which prevented the job from completing.
    Devices Tested: Chart with the total number of devices included in the validation and the distribution of the results.
    • Pass: Number of devices tested that had successful results
    • Warn: Number of devices tested that had successful results, but also had at least one warning event
    • Fail: Number of devices tested that had one or more protocol or service failures

    Hover over chart to view the number of devices and the percentage of all tested devices for each result category.

    Sessions Tested

    For BGP, chart with total number of BGP sessions included in the validation and the distribution of the overall results.

    For EVPN, chart with total number of BGP sessions included in the validation and the distribution of the overall results.

    For Interfaces, chart with total number of ports included in the validation and the distribution of the overall results.

    For OSPF, chart with total number of OSPF sessions included in the validation and the distribution of the overall results.

    In each of these charts:

    • Pass: Number of sessions or ports tested that had successful results
    • Warn: Number of sessions or ports tested that had successful results, but also had at least one warning event
    • Fail: Number of sessions or ports tested that had one or more failure events

    Hover over chart to view the number of devices, sessions, or ports and the percentage of all tested devices, sessions, or ports for each result category.

    This chart does not apply to other Network Protocols and Services, and thus is not displayed for those cards.

    Open <Service> Card: Click to open the corresponding medium Network Services card, when available. Refer to Monitor Network Performance for details about these cards and workflows.
    Table/Filter options:

    When the Most Active filter option is selected, the table displays switches and hosts running the given service or protocol in decreasing order of alarm counts. Devices with the largest number of warnings and failures are listed first. You can click on the device name to open its switch card on your workbench.

    When the Most Recent filter option is selected, the table displays switches and hosts running the given service or protocol sorted by timestamp, with the device with the most recent warning or failure listed first. The table provides the following additional information:

    • Hostname: User-defined name for switch or host
    • Message Type: Network protocol or service which triggered the event
    • Message: Short description of the event
    • Severity: Indication of importance of event; values in decreasing severity include critical, error, warning, info, debug
    Show All Results: Click to open the full screen card with all on-demand validation results sorted by timestamp.

    The Configuration tab displays:

    Item: Description
    Indicates an on-demand validation request configuration
    Title: On-demand Validation Result | Configuration | <Network Protocol or Service Name>
    Validations: List of network protocols or services included in the request that produced these results
    Schedule: Not relevant to on-demand validation results. Value is always N/A.

    The full screen Validation Result card provides a tab for all on-demand validation results.

    Item: Description
    Title: Validation Results | On-demand
    Closes full screen card and returns to workbench
    Time period: Range of time in which the displayed data was collected; applies to all card sizes; select an alternate time period by clicking
    Displays data refresh status. Click to pause data refresh. Click to resume data refresh. Current refresh rate is visible by hovering over icon.
    Results: Number of results found for the selected tab
    On-demand Validation Result | <network protocol or service>: Displays all unscheduled validation results. By default, the results list is sorted by Timestamp. This tab provides the following additional data about each result:
    • Job ID: Internal identifier of the validation job that produced the given results
    • Timestamp: Date and time the validation completed
    • Type: Network protocol or service type
    • Total Node Count: Total number of nodes running the given network protocol or service
    • Checked Node Count: Number of nodes on which the validation ran
    • Failed Node Count: Number of checked nodes that had protocol or service failures
    • Rotten Node Count: Number of nodes that could not be reached during the validation
    • Unknown Node Count: Applies only to the Interfaces service. Number of nodes with unknown port states.
    • Failed Adjacent Count: Number of adjacent nodes that had protocol or service failures
    • Total Session Count: Total number of sessions running for the given network protocol or service
    • Failed Session Count: Number of sessions that had session failures
    Table Actions: Select, export, or filter the list. Refer to Table Settings.
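
    Once exported, the count fields in this table lend themselves to quick programmatic summaries. A minimal sketch follows, assuming a result record with keys that mirror the fields above; the record shape is illustrative, not a documented NetQ export format.

    ```python
    # Illustrative record; keys mirror the result fields listed above.
    result = {
        "type": "bgp",
        "total_node_count": 14,
        "checked_node_count": 13,
        "failed_node_count": 1,
        "rotten_node_count": 1,
        "total_session_count": 24,
        "failed_session_count": 2,
    }

    def summarize(r):
        """One-line health summary of a validation result."""
        nodes = f"{r['failed_node_count']}/{r['checked_node_count']} nodes failed"
        rotten = f"{r['rotten_node_count']} unreachable"
        sessions = f"{r['failed_session_count']}/{r['total_session_count']} sessions failed"
        return f"{r['type'].upper()}: {nodes}, {rotten}, {sessions}"

    print(summarize(result))
    # BGP: 1/13 nodes failed, 1 unreachable, 2/24 sessions failed
    ```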

    View On-demand Validation Results

    Once an on-demand validation request has completed, the results are available in the corresponding Validation Result card.

    It may take a few minutes for all results to be presented if the load on the NetQ Platform is heavy at the time of the run.

    To view the results:

    1. Locate the medium on-demand Validation Result card on your workbench for the protocol or service that was run.

      You can identify it by the on-demand result icon, the protocol or service name, and the date and time that it was run.

      Note: You may have more than one card open for a given protocol or service, so be sure to use the date and time on the card to ensure you are viewing the correct card.

    2. Note the total number and distribution of results for the tested devices and sessions (when appropriate). Are there many failures?

    3. Hover over the charts to view the total number of warnings or failures and what percentage of the total results that represents for both devices and sessions.

    4. Switch to the large on-demand Validation Result card.

    5. If there are a large number of device warnings or failures, view the devices with the most issues in the table on the right. By default, this table displays the Most Active devices. Click on a device name to open its switch card on your workbench.

    6. To view the most recent issues, select Most Recent from the filter above the table.

    7. If there are a large number of devices or sessions with warnings or failures, the protocol or service may be experiencing issues. View the health of the protocol or service as a whole by clicking Open <network service> Card when available.

    8. To view all data available for all on-demand validation results for a given protocol, switch to the full screen card.

    9. Double-click in a given result row to open details about the validation.

      From this view you can:

      • See a summary of the validation results by clicking in the banner under the title. Toggle the arrow to close the summary.

      • See detailed results of each test run to validate the protocol or service. When errors or warnings are present, the affected nodes and relevant details are provided.

      • Export the data by clicking Export.

      • Return to the validation jobs list by clicking .

      You may find that comparing various results gives you a clue as to why certain devices are experiencing more warnings or failures. For example, more failures occurred between certain times or on a particular device.

    View Scheduled Validation Results

    The Scheduled Validation Result card workflow enables you to view the results of scheduled validation requests. When a request has completed processing, you can access the Validation Result card from the full screen Validation Request card. Each protocol and service has its own validation result card, but the content is similar on each.

    Scheduled Validation Result Card Workflow Summary

    The small Scheduled Validation Result card displays:

    Item: Description
    Indicates a scheduled validation result
    Title: Scheduled Result <Network Protocol or Service Name> Validation
    Results: Summary of validation results:
    • Number of validation runs completed in the designated time period
    • Number of runs with warnings
    • Number of runs with errors
    Status of the validation job, where:
    • Pass: Job ran successfully. One or more warnings may have occurred during the run.
    • Failed: Job encountered errors which prevented the job from completing, or job ran successfully, but errors occurred during the run.

    The medium Scheduled Validation Result card displays:

    Item: Description
    Time period: Range of time in which the displayed data was collected; applies to all card sizes
    Indicates a scheduled validation result
    Title: Scheduled Validation Result | <Network Protocol or Service Name>
    Summary: Summary of validation results:
    • Name of scheduled validation
    • Status of the validation job, where:
      • Pass: Job ran successfully. One or more warnings may have occurred during the run.
      • Failed: Job encountered errors which prevented the job from completing, or job ran successfully, but errors occurred during the run.
    Chart: Validation results, where:
    • Time period: Range of time in which the data on the heat map was collected
    • Heat map: A time-segmented view of the results. For each time segment, the color represents the percentage of warning, passing, and failed results. Refer to Validate Network Protocol and Service Operations for details on how to interpret the results.
    Open <Service> Card: Click to open the corresponding medium Network Services card, when available. Refer to Monitor Network Performance for details about these cards and workflows.

    The large Scheduled Validation Result card contains two tabs.

    The Summary tab displays:

    Item: Description
    Indicates a scheduled validation result
    Title: Validation Summary (Scheduled Validation Result | <Network Protocol or Service Name>)
    Summary: Summary of validation results:
    • Name of scheduled validation
    • Status of the validation job, where:
      • Pass: Job ran successfully. One or more warnings may have occurred during the run.
      • Failed: Job encountered errors which prevented the job from completing, or job ran successfully, but errors occurred during the run.
    • Expand/Collapse: Expand the heat map to the full width of the card, or collapse the heat map to the left
    Chart: Validation results, where:
    • Time period: Range of time in which the data on the heat map was collected
    • Heat map: A time-segmented view of the results. For each time segment, the color represents the percentage of warning, passing, and failed results. Refer to Validate Network Protocol and Service Operations for details on how to interpret the results.
    Open <Service> Card: Click to open the corresponding medium Network Services card, when available. Refer to Monitor Network Performance for details about these cards and workflows.
    Table/Filter options

    When the Most Active filter option is selected, the table displays switches and hosts running the given service or protocol in decreasing order of alarm counts; devices with the largest number of warnings and failures are listed first.

    When the Most Recent filter option is selected, the table displays switches and hosts running the given service or protocol sorted by timestamp, with the device with the most recent warning or failure listed first. The table provides the following additional information:

    • Hostname: User-defined name for switch or host
    • Message Type: Network protocol or service which triggered the event
    • Message: Short description of the event
    • Severity: Indication of importance of event; values in decreasing severity include critical, warning, error, info, debug
    Show All Results: Click to open the full screen card with all scheduled validation results sorted by timestamp.

    The Configuration tab displays:

    Item: Description
    Indicates a scheduled validation configuration
    Title: Configuration (Scheduled Validation Result | <Network Protocol or Service Name>)
    Name: User-defined name for this scheduled validation
    Validations: List of validations included in the validation request that created this result
    Schedule: User-defined schedule for the validation request that created this result
    Open Schedule Card: Opens the large Validation Request card for editing this configuration

    The full screen Scheduled Validation Result card provides tabs for all scheduled validation results for the service.

    Item: Description
    Title: Scheduled Validation Results | <Network Protocol or Service>
    Closes full screen card and returns to workbench
    Time period: Range of time in which the displayed data was collected; applies to all card sizes; select an alternate time period by clicking
    Displays data refresh status. Click to pause data refresh. Click to resume data refresh. Current refresh rate is visible by hovering over icon.
    Results: Number of results found for the selected tab
    Scheduled Validation Result | <network protocol or service>: Displays all scheduled validation results. By default, the results list is sorted by timestamp. This tab provides the following additional data about each result:
    • Job ID: Internal identifier of the validation job that produced the given results
    • Timestamp: Date and time the validation completed
    • Type: Network protocol or service type
    • Total Node Count: Total number of nodes running the given network protocol or service
    • Checked Node Count: Number of nodes on which the validation ran
    • Failed Node Count: Number of checked nodes that had protocol or service failures
    • Rotten Node Count: Number of nodes that could not be reached during the validation
    • Unknown Node Count: Applies only to the Interfaces service. Number of nodes with unknown port states.
    • Failed Adjacent Count: Number of adjacent nodes that had protocol or service failures
    • Total Session Count: Total number of sessions running for the given network protocol or service
    • Failed Session Count: Number of sessions that had session failures
    Table Actions: Select, export, or filter the list. Refer to Table Settings.

    Granularity of Data Shown Based on Time Period

    On the medium and large Validation Result cards, the status of the runs is represented in heat maps stacked vertically: one for passing runs, one for runs with warnings, and one for runs with failures. Depending on the time period of data on the card, the number of smaller time blocks used to indicate the status varies. A vertical stack of time blocks, one from each map, includes the results from all checks during that time. The results are shown by how saturated the color is for each block. If all validations during that time period pass, then the middle block is 100% saturated (white) and the warning and failure blocks are zero percent saturated (gray). As warnings and errors increase in saturation, the passing block is proportionally reduced in saturation. The table below lists the most common time periods and the resulting number and duration of the time blocks.

    Time Period | Number of Runs | Number of Time Blocks | Amount of Time in Each Block
    6 hours     | 18             | 6                     | 1 hour
    12 hours    | 36             | 12                    | 1 hour
    24 hours    | 72             | 24                    | 1 hour
    1 week      | 504            | 7                     | 1 day
    1 month     | 2,086          | 30                    | 1 day
    1 quarter   | 7,000          | 13                    | 1 week
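
    A minimal sketch of the proportional-saturation math described above; the function is illustrative, not NetQ's implementation.

    ```python
    def block_saturation(passed, warned, failed):
        """Saturation (0.0-1.0) of one vertical stack of heat-map blocks.

        Each block's saturation is the fraction of results in that time
        segment falling into its category, so the three values sum to 1.0.
        """
        total = passed + warned + failed
        if total == 0:
            # No runs in this segment; all three blocks stay unsaturated.
            return {"pass": 0.0, "warn": 0.0, "fail": 0.0}
        return {"pass": passed / total, "warn": warned / total, "fail": failed / total}

    # Example: a 1-hour block on a 24-hour card holds 3 runs (72 runs / 24 blocks).
    print(block_saturation(3, 0, 0))  # {'pass': 1.0, ...}: all passed, fully saturated
    print(block_saturation(2, 1, 0))  # pass block drops to ~0.67 as a warning appears
    ```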

    View Scheduled Validation Results

    Once a scheduled validation request has completed, the results are available in the corresponding Validation Result card.

    To view the results:

    1. Open the full screen Validation Request card to view all scheduled validations.

    2. Select the validation results you want to view by clicking the check box in the first column of each result row.

    3. On the Edit Menu that appears at the bottom of the window, click (Open Cards). This opens the medium Scheduled Validation Results card(s) for the selected items.

    4. Note the distribution of results. Are there many failures? Are they concentrated together in time? Has the protocol or service recovered after the failures?

    5. Hover over the heat maps to view the status numbers and what percentage of the total results that represents for a given region. The tooltip also shows the number of devices included in the validation and the number with warnings and/or failures. This is useful when you see the failures occurring on a small set of devices, as it might point to an issue with the devices rather than the network service.

    6. Optionally, click the Open <network service> Card link to open the medium individual Network Services card. Your current card is not closed.

    7. Switch to the large Scheduled Validation card.

    8. Click to expand the chart.

    9. Collapse the heat map by clicking .

    10. If there are a large number of warnings or failures, view the devices with the most issues by clicking Most Active in the filter above the table. This might help narrow the failures down to a particular device or small set of devices that you can investigate further.

    11. Select the Most Recent filter above the table to see the events that have occurred in the near past at the top of the list.

    12. Optionally, view the health of the protocol or service as a whole by clicking Open <network service> Card (when available).

    13. You can view the configuration of the request that produced the results shown on this card workflow, by hovering over the card and clicking . If you want to change the configuration, click Edit Config to open the large Validation Request card, pre-populated with the current configuration. Follow the instructions in Modify an Existing Scheduled Validation Request to make your changes.

    14. To view all data available for all scheduled validation results for the given protocol or service, click Show All Results or switch to the full screen card.

    15. Look for changes and patterns in the results. Scroll to the right. Are there more failed sessions or nodes during one or more validations?

    16. Double-click in a given result row to open details about the validation.

      From this view you can:

      • See a summary of the validation results by clicking in the banner under the title. Toggle the arrow to close the summary.

      • See detailed results of each test run to validate the protocol or service. When errors or warnings are present, the affected nodes and relevant details are provided.

      • Export the data by clicking Export.

      • Return to the validation jobs list by clicking .

      You may find that comparing various results gives you a clue as to why certain devices are experiencing more warnings or failures. For example, more failures occurred between certain times or on a particular device.

    Monitor Network Inventory

    With NetQ, a network administrator can monitor both the switch hardware and its operating system for misconfigurations or misbehaving services. The Devices Inventory card workflow provides a view into the switches and hosts installed in your network and their various hardware and software components. The workflow contains a small card with a count of each device type in your network, a medium card displaying the operating systems running on each set of devices, large cards with component information statistics, and full-screen cards displaying tables with attributes of all switches and all hosts in your network.

    The Devices Inventory card workflow helps answer questions about which devices are deployed in your network and which hardware and software components they are running.

    For monitoring inventory and performance on a switch-by-switch basis, refer to Monitor Switches.

    Devices Inventory Card Workflow Summary

    The small Devices Inventory card displays:

    Item: Description
    Indicates data is for device inventory
    Total number of switches in inventory during the designated time period
    Total number of hosts in inventory during the designated time period

    The medium Devices Inventory card displays:

    Item: Description
    Indicates data is for device inventory
    Title: Inventory | Devices
    Total number of switches in inventory during the designated time period
    Total number of hosts in inventory during the designated time period
    Charts: Distribution of operating systems deployed on switches and hosts, respectively

    The large Devices Inventory card has one tab.

    The Switches tab displays:

    Item: Description
    Time period: Always Now for inventory by default
    Indicates data is for device inventory
    Title: Inventory | Devices
    Total number of switches in inventory during the designated time period
    Link to full screen listing of all switches
    Component: Switch components monitored: ASIC, Operating System (OS), Cumulus Linux license, NetQ Agent version, and Platform
    Distribution charts: Distribution of switch components across the network
    Unique: Number of unique items of each component type. For example, for License, you might have CL 2.7.2 and CL 2.7.4, giving you a unique count of two.
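
    The Unique value is simply the number of distinct entries seen for a component across all switches. A short sketch with hypothetical inventory data:

    ```python
    from collections import Counter

    # Hypothetical license versions reported by five switches.
    licenses = ["CL 2.7.2", "CL 2.7.4", "CL 2.7.2", "CL 2.7.2", "CL 2.7.4"]

    distribution = Counter(licenses)   # data behind the distribution chart
    unique = len(distribution)         # the "Unique" value shown on the card

    print(distribution)  # Counter({'CL 2.7.2': 3, 'CL 2.7.4': 2})
    print(unique)        # 2, matching the CL 2.7.2 / CL 2.7.4 example above
    ```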

    The full screen Devices Inventory card provides tabs for all switches and all hosts.

    Item: Description
    Title: Inventory | Devices | Switches
    Closes full screen card and returns to workbench
    Time period: Time period does not apply to the Inventory cards. This is always Default Time.
    Displays data refresh status. Click to pause data refresh. Click to resume data refresh. Current refresh rate is visible by hovering over icon.
    Results: Number of results found for the selected tab
    All Switches and All Hosts tabs: Displays all monitored switches and hosts in your network. By default, the device list is sorted by hostname. These tabs provide the following additional data about each device:
    • Agent
      • State: Indicates communication state of the NetQ Agent on a given device. Values include Fresh (heard from recently) and Rotten (not heard from recently).
      • Version: Software version number of the NetQ Agent on a given device. This should match the version number of the NetQ software loaded on your server or appliance; for example, 2.1.0.
    • ASIC
      • Core BW: Maximum sustained/rated bandwidth. Example values include 2.0 T and 720 G.
      • Model: Chip family. Example values include Tomahawk, Trident, and Spectrum.
      • Model Id: Identifier of networking ASIC model. Example values include BCM56960 and BCM56854.
      • Ports: Indicates port configuration of the switch. Example values include 32 x 100G-QSFP28, 48 x 10G-SFP+, and 6 x 40G-QSFP+.
      • Vendor: Manufacturer of the chip. Example values include Broadcom and Mellanox.
    • CPU
      • Arch: Microprocessor architecture type. Values include x86_64 (Intel), ARMv7 (ARM), and PowerPC.
      • Max Freq: Highest rated frequency for CPU. Example values include 2.40 GHz and 1.74 GHz.
      • Model: Chip family. Example values include Intel Atom C2538 and Intel Atom C2338.
      • Nos: Number of cores. Example values include 2, 4, and 8.
    • Disk Total Size: Total amount of storage space in physical disks (not total available). Example values: 10 GB, 20 GB, 30 GB.
    • License State: Indicator of validity. Values include ok and bad.
    • Memory Size: Total amount of local RAM. Example values include 8192 MB and 2048 MB.
    • OS
      • Vendor: Operating System manufacturer. Values include Cumulus Networks, RedHat, Ubuntu, and CentOS.
      • Version: Software version number of the OS. Example values include 3.7.3, 2.5.x, 16.04, 7.1.
      • Version Id: Identifier of the OS version. For Cumulus, this is the same as the Version (3.7.x).
    • Platform
      • Date: Date and time the platform was manufactured. Example values include 7/12/18 and 10/29/2015.
      • MAC: System MAC address. Example value: 17:01:AB:EE:C3:F5.
      • Model: Manufacturer's model name. Example values include AS7712-32X and S4048-ON.
      • Number: Manufacturer part number. Example values include FP3ZZ7632014A, 0J09D3.
      • Revision: Release version of the platform
      • Series: Manufacturer serial number. Example values include D2060B2F044919GD000060, CN046MRJCES0085E0004.
      • Vendor: Manufacturer of the platform. Example values include Cumulus Express, Dell, EdgeCore, Lenovo, Mellanox.
    • Time: Date and time the data was collected from device.

    View the Number of Each Device Type in Your Network

    You can view the number of switches and hosts deployed in your network. As you grow your network this can be useful for validating that devices have been added as scheduled.

    To view the quantity of devices in your network, open the small Devices Inventory card.

    Chassis are not monitored in this release, so an N/A (not applicable) value is displayed for these devices, even if you have chassis in your network.

    View Which Operating Systems Are Running on Your Network Devices

    You can view the distribution of operating systems running on your switches and hosts. This is useful for verifying which versions of the OS are deployed and for upgrade planning. It also provides a view into the relative dependence on a given OS in your network.

    To view the OS distribution, open the medium Devices Inventory card if it is not already on your workbench.

    View Switch Components

    To view switch components, open the large Devices Inventory card. By default, the Switches tab is shown, displaying the total number of switches, ASIC vendor, OS versions, license status, NetQ Agent versions, and specific platforms deployed on all of your switches.

    Highlight a Selected Component Type

    You can hover over any of the segments in a component distribution chart to highlight a specific type of the given component. When you hover, a tooltip appears displaying details for that component type.

    Additionally, sympathetic highlighting is used to show the related component types relevant to the highlighted segment and the number of unique component types associated with this type.

    Focus on a Selected Component Type

    To dig deeper on a particular component type, you can filter the card data by that type. In this procedure, the result of filtering on the OS is shown.

    To view component type data:

    1. Click a segment of the component distribution charts.

    2. Select the first option from the popup, Filter <component name>. The card data is filtered to show only the components associated with the selected component type. A filter tag appears next to the total number of switches indicating the filter criteria.

    3. Hover over the segments to view the related components.

    4. To return to the full complement of components, click the close icon in the filter tag.

    While the Device Inventory cards provide a network-wide view, you may want to see more detail about your switch inventory. This can be found in the Switches Inventory card workflow. To open that workflow, click the Switch Inventory button at the top right of the Switches card.

    View All Switches

    You can view all stored attributes for all switches in your network. To view all switch details, open the full screen Devices Inventory card and click the All Switches tab in the navigation panel.

    To return to your workbench, click in the top right corner of the card.

    View All Hosts

    You can view all stored attributes for all hosts in your network. To view all hosts details, open the full screen Devices Inventory card and click the All Hosts tab in the navigation panel.

    To return to your workbench, click in the top right corner of the card.

    Monitor the BGP Service

    The Cumulus NetQ UI enables operators to view the health of the BGP service on a network-wide and a per session basis, giving greater insight into all aspects of the service. This is accomplished through two card workflows, one for the service and one for the session. They are described separately here.

    Monitor the BGP Service (All Sessions)

    With NetQ, you can monitor the number of nodes running the BGP service, view switches with the most established and unestablished BGP sessions, and view alarms triggered by the BGP service. For an overview and how to configure BGP to run in your data center network, refer to Border Gateway Protocol - BGP.

    BGP Service Card Workflow

    The small BGP Service card displays:

    Item: Description
    Indicates data is for all sessions of a Network Service or Protocol
    Title: BGP: All BGP Sessions, or the BGP Service
    Total number of switches and hosts with the BGP service enabled during the designated time period
    Total number of BGP-related alarms received during the designated time period
    Chart: Distribution of new BGP-related alarms received during the designated time period

    The medium BGP Service card displays:

    Item: Description
    Time period: Range of time in which the displayed data was collected; applies to all card sizes
    Indicates data is for all sessions of a Network Service or Protocol
    Title: Network Services | All BGP Sessions
    Total number of switches and hosts with the BGP service enabled during the designated time period
    Total number of BGP-related alarms received during the designated time period
    Total Nodes Running chart

    Distribution of switches and hosts with the BGP service enabled during the designated time period, and a total number of nodes running the service currently.

    Note: The node count here may be different than the count in the summary bar. For example, the number of nodes running BGP last week or last month might be more or less than the number of nodes running BGP currently.

    Total Open Alarms chart

    Distribution of BGP-related alarms received during the designated time period, and the total number of current BGP-related alarms in the network.

    Note: The alarm count here may be different than the count in the summary bar. For example, the number of new alarms received in this time period does not take into account alarms that have already been received and are still active. You might have no new alarms, but still have a total number of alarms present on the network of 10.

    Total Nodes Not Est. chart

    Distribution of switches and hosts with unestablished BGP sessions during the designated time period, and the total number of unestablished sessions in the network currently.

    Note: The node count here may be different than the count in the summary bar. For example, the number of unestablished sessions last week or last month might be more or less than the number of nodes with unestablished sessions currently.

    The large BGP Service card contains two tabs.

    The Sessions Summary tab displays:

    Item: Description
    Time period: Range of time in which the displayed data was collected; applies to all card sizes
    Indicates data is for all sessions of a Network Service or Protocol
    Title: Sessions Summary (visible when you hover over card)
    Total number of switches and hosts with the BGP service enabled during the designated time period
    Total number of BGP-related alarms received during the designated time period
    Total Nodes Running chart

    Distribution of switches and hosts with the BGP service enabled during the designated time period, and a total number of nodes running the service currently.

    Note: The node count here may be different than the count in the summary bar. For example, the number of nodes running BGP last week or last month might be more or less than the number of nodes running BGP currently.

    Total Nodes Not Est. chart

    Distribution of switches and hosts with unestablished BGP sessions during the designated time period, and the total number of unestablished sessions in the network currently.

    Note: The node count here may be different than the count in the summary bar. For example, the number of unestablished sessions last week or last month might be more or less than the number of nodes with unestablished sessions currently.

    Table/Filter options

    When the Switches with Most Sessions filter option is selected, the table displays the switches and hosts running BGP sessions in decreasing order of session count; devices with the largest number of sessions are listed first.

    When the Switches with Most Unestablished Sessions filter option is selected, the table displays switches and hosts running BGP sessions in decreasing order of unestablished session count; devices with the largest number of unestablished sessions are listed first.

    Show All Sessions: Link to view data for all BGP sessions in the full screen card

    The Alarms tab displays:

    Item: Description
    Time period: Range of time in which the displayed data was collected; applies to all card sizes
    (in header): Indicates data is for all alarms for all BGP sessions
    Title: Alarms (visible when you hover over card)
    Total number of switches and hosts with the BGP service enabled during the designated time period
    (in summary bar): Total number of BGP-related alarms received during the designated time period
    Total Alarms chart

    Distribution of BGP-related alarms received during the designated time period, and the total number of current BGP-related alarms in the network.

    Note: The alarm count here may be different than the count in the summary bar. For example, the number of new alarms received in this time period does not take into account alarms that have already been received and are still active. You might have no new alarms, but still have a total number of alarms present on the network of 10.

    Table/Filter options: When the Switches with Most Alarms filter option is selected, the table displays switches and hosts running BGP in decreasing order of alarm count; devices with the largest number of BGP alarms are listed first
    Show All Sessions: Link to view data for all BGP sessions in the full screen card

    The full screen BGP Service card provides tabs for all switches, all sessions, and all alarms.

    Item: Description
    Title: Network Services | BGP
    Closes full screen card and returns to workbench
    Time period: Range of time in which the displayed data was collected; applies to all card sizes; select an alternate time period by clicking
    Displays data refresh status. Click to pause data refresh. Click to resume data refresh. Current refresh rate is visible by hovering over icon.
    Results: Number of results found for the selected tab
    All Switches tab: Displays all switches and hosts running the BGP service. By default, the device list is sorted by hostname. This tab provides the following additional data about each device:
    • Agent
      • State: Indicates communication state of the NetQ Agent on a given device. Values include Fresh (heard from recently) and Rotten (not heard from recently).
      • Version: Software version number of the NetQ Agent on a given device. This should match the version number of the NetQ software loaded on your server or appliance; for example, 2.2.0.
    • ASIC
      • Core BW: Maximum sustained/rated bandwidth. Example values include 2.0 T and 720 G.
      • Model: Chip family. Example values include Tomahawk, Trident, and Spectrum.
      • Model Id: Identifier of networking ASIC model. Example values include BCM56960 and BCM56854.
      • Ports: Indicates port configuration of the switch. Example values include 32 x 100G-QSFP28, 48 x 10G-SFP+, and 6 x 40G-QSFP+.
      • Vendor: Manufacturer of the chip. Example values include Broadcom and Mellanox.
    • CPU
      • Arch: Microprocessor architecture type. Values include x86_64 (Intel), ARMv7 (ARM), and PowerPC.
      • Max Freq: Highest rated frequency for CPU. Example values include 2.40 GHz and 1.74 GHz.
      • Model: Chip family. Example values include Intel Atom C2538 and Intel Atom C2338.
      • Nos: Number of cores. Example values include 2, 4, and 8.
    • Disk Total Size: Total amount of storage space in physical disks (not total available). Example values: 10 GB, 20 GB, 30 GB.
    • License State: Indicator of validity. Values include ok and bad.
    • Memory Size: Total amount of local RAM. Example values include 8192 MB and 2048 MB.
    • OS
      • Vendor: Operating System manufacturer. Values include Cumulus Networks, RedHat, Ubuntu, and CentOS.
      • Version: Software version number of the OS. Example values include 3.7.3, 2.5.x, 16.04, 7.1.
      • Version Id: Identifier of the OS version. For Cumulus, this is the same as the Version (3.7.x).
    • Platform
      • Date: Date and time the platform was manufactured. Example values include 7/12/18 and 10/29/2015.
      • MAC: System MAC address. Example value: 17:01:AB:EE:C3:F5.
      • Model: Manufacturer's model name. Example values include AS7712-32X and S4048-ON.
      • Number: Manufacturer part number. Example values include FP3ZZ7632014A, 0J09D3.
      • Revision: Release version of the platform
      • Series: Manufacturer serial number. Example values include D2060B2F044919GD000060, CN046MRJCES0085E0004.
      • Vendor: Manufacturer of the platform. Example values include Cumulus Express, Dell, EdgeCore, Lenovo, Mellanox.
    • Time: Date and time the data was collected from device.
    All Sessions tab: Displays all BGP sessions network-wide. By default, the session list is sorted by hostname. This tab provides the following additional data about each session:
    • ASN: Autonomous System Number, identifier for a collection of IP networks and routers. Example values include 633284, 655435.
    • Conn Dropped: Number of dropped connections for a given session
    • Conn Estd: Number of connections established for a given session
    • DB State: Session state of DB
    • Evpn Pfx Rcvd: Address prefix received for EVPN traffic. Examples include 115, 35.
    • Ipv4, and Ipv6 Pfx Rcvd: Address prefix received for IPv4 or IPv6 traffic. Examples include 31, 14, 12.
    • Last Reset Time: Date and time at which the session was last established or reset
    • Objid: Object identifier for service
    • OPID: Customer identifier. This is always zero.
    • Peer
      • ASN: Autonomous System Number for peer device
      • Hostname: User-defined name for peer device
      • Name: Interface name or hostname of peer device
      • Router Id: IP address of router with access to the peer device
    • Reason: Text describing the cause of, or trigger for, an event
    • Rx and Tx Families: Address families supported for the receive and transmit session channels. Values include ipv4, ipv6, and evpn.
    • State: Current state of the session. Values include Established and NotEstd (not established).
    • Timestamp: Date and time session was started, deleted, updated or marked dead (device is down)
    • Upd8 Rx: Count of protocol messages received
    • Upd8 Tx: Count of protocol messages transmitted
    • Up Time: Number of seconds the session has been established, in EPOCH notation. Example: 1550147910000 (see the conversion sketch after this table)
    • Vrf: Name of the Virtual Route Forwarding interface. Examples: default, mgmt, DataVrf1081
    • Vrfid: Integer identifier of the VRF interface when used. Examples: 14, 25, 37
    All Alarms tab: Displays all BGP events network-wide. By default, the event list is sorted by time, with the most recent events listed first. The tab provides the following additional data about each event:
    • Source: Hostname of network device that generated the event
    • Message: Text description of a BGP-related event. Example: BGP session with peer tor-1 swp7 vrf default state changed from failed to Established
    • Type: Network protocol or service generating the event. This always has a value of bgp in this card workflow.
    • Severity: Importance of the event. Values include critical, warning, info, and debug.
    Table Actions: Select, export, or filter the list. Refer to Table Settings.
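
    The Up Time example above (1550147910000) is a 13-digit value, which suggests an epoch timestamp in milliseconds rather than a plain seconds count. The sketch below shows one way to decode such a value; this is an illustration, not a documented NetQ conversion.

    ```python
    from datetime import datetime, timezone

    def epoch_ms_to_utc(value):
        """Interpret a 13-digit epoch value as milliseconds since 1970-01-01 UTC."""
        return datetime.fromtimestamp(value / 1000, tz=timezone.utc)

    # The example value from the Up Time field above.
    print(epoch_ms_to_utc(1550147910000))  # 2019-02-14 12:38:30+00:00
    ```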

    View Service Status Summary

    A summary of the BGP service is available from the Network Services card workflow, including the number of nodes running the service, the number of BGP-related alarms, and a distribution of those alarms.

    To view the summary, open the small BGP Service card.

    For more detail, select a different size BGP Service card.

    View the Distribution of Sessions and Alarms

    It is useful to know the number of network nodes running the BGP protocol over a period of time, as it gives you insight into the breadth of use of the protocol and the amount of traffic associated with it. It is also useful to compare the number of nodes with unestablished BGP sessions against the alarms present at the same time, to determine if there is any correlation between the issues and the ability to establish a BGP session.

    To view these distributions, open the medium BGP Service card.

    If a visual correlation is apparent, you can dig a little deeper with the large BGP Service card tabs.

    View Devices with the Most BGP Sessions

    You can view the load from BGP on your switches and hosts using the large Network Services card. This data enables you to see which switches are handling the most BGP traffic currently, validate that this is what you expect based on your network design, and compare it with data from an earlier time to look for any differences.

    To view switches and hosts with the most BGP sessions:

    1. Open the large BGP Service card.

    2. Select Switches With Most Sessions from the filter above the table.

      The table content is sorted by this characteristic, listing nodes running the most BGP sessions at the top. Scroll down to view those with the fewest sessions.

    To compare this data with the same data at a previous time:

    1. Open another large BGP Service card.

    2. Move the new card next to the original card if needed.

    3. Change the time period for the data on the new card by hovering over the card and clicking .

    4. Select the time period that you want to compare with the original time. We chose Past Week for this example.

      You can now see whether there are significant differences between this time and the original time. If the changes are unexpected, you can investigate further by looking at another timeframe, determining if more nodes are now running BGP than previously, looking for changes in the topology, and so forth.

    View Devices with the Most Unestablished BGP Sessions

    You can identify switches and hosts that are experiencing difficulties establishing BGP sessions, both currently and in the past.

    To view switches with the most unestablished BGP sessions:

    1. Open the large BGP Service card.

    2. Select Switches with Most Unestablished Sessions from the filter above the table.

      The table content is sorted by this characteristic, listing nodes with the most unestablished BGP sessions at the top. Scroll down to view those with the fewest unestablished sessions.

    Where to go next depends on what data you see, but a couple of options include:

    View Devices with the Most BGP Alarms

    Switches or hosts experiencing a large number of BGP alarms may indicate a configuration or performance issue that needs further investigation. You can view the devices sorted by the number of BGP alarms and then use the Switches card workflow or the Alarms card workflow to gather more information about possible causes for the alarms.

    To view switches with the most BGP alarms:

    1. Open the large BGP Service card.

    2. Hover over the header and click .

    3. Select Switches with Most Alarms from the filter above the table.

      The table content is sorted by this characteristic, listing nodes with the most BGP alarms at the top. Scroll down to view those with the fewest alarms.

    Where to go next depends on what data you see, but a few options include:

    View All BGP Events

    The BGP Network Services card workflow enables you to view all of the BGP events in the designated time period.

    To view all BGP events:

    1. Open the full screen BGP Service card.

    2. Click the All Alarms tab in the navigation panel.

      By default, events are listed from most recent to least recent.

    Where to go next depends on what data you see, but a couple of options include:

    To return to your workbench, click in the top right corner.

    View Details for All Devices Running BGP

    You can view all stored attributes of all switches and hosts running BGP in your network in the full screen card.

    To view all device details, open the full screen BGP Service card and click the All Switches tab.

    To return to your workbench, click in the top right corner.

    View Details for All BGP Sessions

    You can view all stored attributes of all BGP sessions in your network in the full-screen card.

    To view all session details, open the full screen BGP Service card and click the All Sessions tab.

    To return to your workbench, click in the top right corner.

    Use the icons above the table to select/deselect, filter, and export items in the list. Refer to Table Settings for more detail.

    To return to original display of results, click the associated tab.

    Monitor a Single BGP Session

    With NetQ, you can monitor a single session of the BGP service, view session state changes, and compare with alarms occurring at the same time, as well as monitor the running BGP configuration and changes to the configuration file. For an overview and how to configure BGP to run in your data center network, refer to Border Gateway Protocol - BGP.

    To access the single session cards, you must open the full screen BGP Service card, click the All Sessions tab, select the desired session, then click (Open Cards).

    Granularity of Data Shown Based on Time Period

    On the medium and large single BGP session cards, the status of the sessions is represented in heat maps stacked vertically: one for established sessions, and one for unestablished sessions. Depending on the time period of data on the card, the number of smaller time blocks used to indicate the status varies. A vertical stack of time blocks, one from each map, includes the results from all checks during that time. The results are shown by how saturated the color is for each block. If all sessions during that time period were established for the entire time block, then the top block is 100% saturated (white) and the not established block is zero percent saturated (gray). As sessions that are not established increase in saturation, the established block is proportionally reduced in saturation. The table below lists the most common time periods and the resulting number and duration of the time blocks.

    Time Period | Number of Runs | Number of Time Blocks | Amount of Time in Each Block
    6 hours     | 18             | 6                     | 1 hour
    12 hours    | 36             | 12                    | 1 hour
    24 hours    | 72             | 24                    | 1 hour
    1 week      | 504            | 7                     | 1 day
    1 month     | 2,086          | 30                    | 1 day
    1 quarter   | 7,000          | 13                    | 1 week

    BGP Session Card Workflow Summary

    The small BGP Session card displays:

    Item: Description
    Indicates data is for a single session of a Network Service or Protocol
    Title: BGP Session

    Hostnames of the two devices in a session. Arrow points from the host to the peer.
    Current status of the session, either established or not established

    The medium BGP Session card displays:

    Item: Description
    Time period: Range of time in which the displayed data was collected; applies to all card sizes
    Indicates data is for a single session of a Network Service or Protocol
    Title: Network Services | BGP Session

    Hostnames of the two devices in a session. Arrow points in the direction of the session.
    Current status of the session, either established or not established
    Time period for chart: Time period for the chart data
    Session State Changes Chart: Heat map of the state of the given session over the given time period. The status is sampled at a rate consistent with the time period. For example, for a 24 hour period, a status is collected every hour. Refer to Granularity of Data Shown Based on Time Period.
    Peer Name: Interface name or hostname of peer device
    Peer ASN: Autonomous System Number for peer device
    Peer Router ID: IP address of router with access to the peer device
    Peer Hostname: User-defined name for peer device

    The large BGP Session card contains two tabs.

    The Session Summary tab displays:

    Item: Description
    Time period: Range of time in which the displayed data was collected; applies to all card sizes
    Indicates data is for a single session of a Network Service or Protocol
    Title: Session Summary (Network Services | BGP Session)
    Summary bar

    Hostnames of the two devices in a session.

    Current status of the session, either established or not established

    Session State Changes Chart: Heat map of the state of the given session over the given time period. The status is sampled at a rate consistent with the time period. For example, for a 24 hour period, a status is collected every hour. Refer to Granularity of Data Shown Based on Time Period.
    Alarm Count Chart: Distribution and count of BGP alarm events over the given time period.
    Info Count Chart: Distribution and count of BGP info events over the given time period.
    Connection Drop Count: Number of times the session entered the not established state during the time period
    ASN: Autonomous System Number for host device
    RX/TX Families: Receive and Transmit address types supported. Values include IPv4, IPv6, and EVPN.
    Peer Hostname: User-defined name for peer device
    Peer Interface: Interface on which the session is connected
    Peer ASN: Autonomous System Number for peer device
    Peer Router ID: IP address of router with access to the peer device

    The Configuration File Evolution tab displays:

    Item: Description
    Time period: Range of time in which the displayed data was collected; applies to all card sizes
    Indicates configuration file information for a single session of a Network Service or Protocol
    Title: (Network Services | BGP Session) Configuration File Evolution
    Device identifiers (hostname, IP address, or MAC address) for host and peer in session. Click an identifier to open the associated device card.
    Indication of host role, primary or secondary
    Timestamps: When changes to the configuration file have occurred, the date and time are indicated. Click the time to see the changed file.
    Configuration File

    When File is selected, the configuration file as it was at the selected time is shown.

    When Diff is selected, the configuration file at the selected time is shown on the left and the configuration file at the previous timestamp is shown on the right. Differences are highlighted.

    Note: If no configuration file changes have been made, only the original file date is shown.
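
    The Diff view is essentially a side-by-side comparison of two captured versions of the configuration file. As a rough analogue, the sketch below produces a unified diff with Python's standard difflib; the configuration snippet is hypothetical, and this is not how NetQ renders its Diff view.

    ```python
    import difflib

    # Hypothetical before/after versions of a BGP stanza in a config file.
    previous = """router bgp 65101
     neighbor swp51 interface remote-as external
    """.splitlines()

    selected = """router bgp 65101
     neighbor swp51 interface remote-as external
     neighbor swp52 interface remote-as external
    """.splitlines()

    # Lines prefixed with + or - correspond to the highlighted changes
    # you would see between the two timestamps in the Diff view.
    for line in difflib.unified_diff(previous, selected,
                                     fromfile="previous", tofile="selected",
                                     lineterm=""):
        print(line)
    ```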

    The full screen BGP Session card provides tabs for all BGP sessions and all events.

    Item: Description
    Title: Network Services | BGP
    Closes full screen card and returns to workbench
    Time period: Range of time in which the displayed data was collected; applies to all card sizes
    Displays data refresh status. Click to pause data refresh. Click to resume data refresh. Current refresh rate is visible by hovering over icon.
    Results: Number of results found for the selected tab
    All BGP Sessions tab: Displays all BGP sessions running on the host device. This tab provides the following additional data about each session:
    • ASN: Autonomous System Number, identifier for a collection of IP networks and routers. Example values include 633284, 655435.
    • Conn Dropped: Number of dropped connections for a given session
    • Conn Estd: Number of connections established for a given session
    • DB State: Session state of DB
    • Evpn Pfx Rcvd: Address prefix for EVPN traffic. Examples include 115, 35.
    • Ipv4, and Ipv6 Pfx Rcvd: Address prefix for IPv4 or IPv6 traffic. Examples include 31, 14, 12.
    • Last Reset Time: Time at which the session was last established or reset
    • Objid: Object identifier for service
    • OPID: Customer identifier. This is always zero.
    • Peer
      • ASN: Autonomous System Number for peer device
      • Hostname: User-defined name for peer device
      • Name: Interface name or hostname of peer device
      • Router Id: IP address of router with access to the peer device
    • Reason: Event or cause of failure
    • Rx and Tx Families: Address families supported for the receive and transmit session channels. Values include ipv4, ipv6, and evpn.
    • State: Current state of the session. Values include Established and NotEstd (not established).
    • Timestamp: Date and time session was started, deleted, updated or marked dead (device is down)
    • Upd8 Rx: Count of protocol messages received
    • Upd8 Tx: Count of protocol messages transmitted
    • Up Time: Number of seconds the session has been established, in EPOCH notation. Example: 1550147910000
    • Vrf: Name of the Virtual Route Forwarding interface. Examples: default, mgmt, DataVrf1081
    • Vrfid: Integer identifier of the VRF interface when used. Examples: 14, 25, 37
    All Events tab: Displays all events network-wide. By default, the event list is sorted by time, with the most recent events listed first. The tab provides the following additional data about each event:
    • Message: Text description of a BGP-related event. Example: BGP session with peer tor-1 swp7 vrf default state changed from failed to Established
    • Source: Hostname of network device that generated the event
    • Severity: Importance of the event. Values include critical, warning, info, and debug.
    • Type: Network protocol or service generating the event. This always has a value of bgp in this card workflow.
    Table Actions: Select, export, or filter the list. Refer to Table Settings.

    View Session Status Summary

    A summary of the BGP session is available from the BGP Session card workflow, showing the node and its peer and current status.

    To view the summary:

    1. Add the Network Services | All BGP Sessions card.

    2. Switch to the full screen card.

    3. Click the All Sessions tab.

    4. Double-click the session of interest. The full screen card closes automatically.

    5. Optionally, switch to the small BGP Session card.

    View BGP Session State Changes

    You can view the state of a given BGP session from the medium and large BGP Session Network Service cards. For a given time period, you can determine the stability of the BGP session between two devices. If you experienced connectivity issues at a particular time, you can use these cards to help verify the state of the session. If the session was not established more often than it was established, you can then investigate further into possible causes.

    To view the state transitions for a given BGP session, on the medium BGP Session card:

    1. Add the Network Services | All BGP Sessions card.

    2. Switch to the full screen card.

    3. Open the large BGP Service card.

    4. Click the All Sessions tab.

    5. Double-click the session of interest. The full screen card closes automatically.

    The heat map indicates the status of the session over the designated time period. In this example, the session has been established for the entire time period.

    From this card, you can also view the peer ASN, name, hostname, and router ID, which identify the session in more detail.

    To view the state transitions for a given BGP session on the large BGP Session card, follow the same steps to open the medium BGP Session card and then switch to the large card.

    From this card, you can view the alarm and info event counts, the peer ASN, hostname, and router ID, the VRF, and the Tx/Rx families, which identify the session in more detail. The Connection Drop Count gives you a sense of the session performance.

    View Changes to the BGP Service Configuration File

    Each time a change is made to the configuration file for the BGP service, NetQ logs the change and enables you to compare it with the last version. This can be useful when you are troubleshooting potential causes for alarms or sessions losing their connections.

    To view the configuration file changes:

    1. Open the large BGP Session card.

    2. Hover over the card and click to open the BGP Configuration File Evolution tab.

    3. Select the time of interest on the left, when a change may have impacted performance. Scroll down if needed.

    4. Choose between the File view and the Diff view (selected option is dark; File by default).

      The File view displays the content of the file for you to review.

      The Diff view displays the changes between this version (on left) and the most recent version (on right) side by side. The changes are highlighted, as seen in this example.

    View All BGP Session Details

    You can view all stored attributes of all of the BGP sessions associated with the two devices on this card.

    To view all session details, open the full screen BGP Session card, and click the All BGP Sessions tab.

    To return to your workbench, click in the top right corner.

    View All Events

    You can view all of the alarm and info events for the two devices on this card.

    To view all events, open the full screen BGP Session card, and click the All Events tab.

    To return to your workbench, click in the top right corner.

    Monitor the EVPN Service

    The Cumulus NetQ UI enables operators to view the health of the EVPN service on a network-wide and a per session basis, giving greater insight into all aspects of the service. This is accomplished through two card workflows, one for the service and one for the session. They are described separately here.

    Monitor the EVPN Service (All Sessions)

    With NetQ, you can monitor the number of nodes running the EVPN service, view the switches with the most sessions, the total number of VNIs, and the alarms triggered by the EVPN service. For an overview and how to configure EVPN in your data center network, refer to Ethernet Virtual Private Network - EVPN.

    EVPN Service Card Workflow Summary

    The small EVPN Service card displays:

    Item: Description
    Indicates data is for all sessions of a Network Service or Protocol
    Title: EVPN: All EVPN Sessions, or the EVPN Service
    Total number of switches and hosts with the EVPN service enabled during the designated time period
    Total number of EVPN-related alarms received during the designated time period
    Chart: Distribution of EVPN-related alarms received during the designated time period

    The medium EVPN Service card displays:

    Item: Description
    Time period: Range of time in which the displayed data was collected; applies to all card sizes
    Indicates data is for all sessions of a Network Service or Protocol
    Title: Network Services | All EVPN Sessions
    Total number of switches and hosts with the EVPN service enabled during the designated time period
    Total number of EVPN-related alarms received during the designated time period
    Total Nodes Running chart

    Distribution of switches and hosts with the EVPN service enabled during the designated time period, and a total number of nodes running the service currently.

    Note: The node count here may be different than the count in the summary bar. For example, the number of nodes running EVPN last week or last month might be more or less than the number of nodes running EVPN currently.

    Total Open Alarms chart

    Distribution of EVPN-related alarms received during the designated time period, and the total number of current EVPN-related alarms in the network.

    Note: The alarm count here may be different than the count in the summary bar. For example, the number of new alarms received in this time period does not take into account alarms that have already been received and are still active. You might have no new alarms, but still have a total number of alarms present on the network of 10.

    Total Sessions chartDistribution of EVPN sessions during the designated time period, and the total number of sessions running on the network currently.

    The large EVPN service card contains two tabs.

    The Sessions Summary tab displays:

    Time period: Range of time in which the displayed data was collected; applies to all card sizes
    Icon: Indicates data is for all sessions of a Network Service or Protocol
    Title: Sessions Summary (visible when you hover over card)
    Node count: Total number of switches and hosts with the EVPN service enabled during the designated time period
    Alarm count: Total number of EVPN-related alarms received during the designated time period
    Total Nodes Running chart: Distribution of switches and hosts with the EVPN service enabled during the designated time period, and a total number of nodes running the service currently. Note: The node count here may be different than the count in the summary bar; for example, the number of nodes running EVPN last week or last month might be more or less than the number of nodes running EVPN currently.
    Total Sessions chart: Distribution of EVPN sessions during the designated time period, and the total number of sessions running on the network currently.
    Total L3 VNIs chart: Distribution of layer 3 VXLAN Network Identifiers during this time period, and the total number of VNIs in the network currently.
    Table/Filter options: When the Top Switches with Most Sessions filter is selected, the table displays devices running EVPN sessions in decreasing order of session count; devices with the largest number of sessions are listed first. When the Switches with Most L2 EVPN filter is selected, the table displays devices running layer 2 EVPN sessions in decreasing order of session count. When the Switches with Most L3 EVPN filter is selected, the table displays devices running layer 3 EVPN sessions in decreasing order of session count.
    Show All Sessions: Link to view data for all EVPN sessions network-wide in the full screen card

    The Alarms tab displays:

    Time period: Range of time in which the displayed data was collected; applies to all card sizes
    Icon (in header): Indicates data is for all alarms for all sessions of a Network Service or Protocol
    Title: Alarms (visible when you hover over card)
    Node count: Total number of switches and hosts with the EVPN service enabled during the designated time period
    Alarm count (in summary bar): Total number of EVPN-related alarms received during the designated time period
    Total Alarms chart: Distribution of EVPN-related alarms received during the designated time period, and the total number of current EVPN-related alarms in the network. Note: The alarm count here may be different than the count in the summary bar; for example, the number of new alarms received in this time period does not take into account alarms that have already been received and are still active. You might have no new alarms, but still have a total of 10 alarms present on the network.
    Table/Filter options: When the Events by Most Active Device filter is selected, the table displays devices running EVPN sessions in decreasing order of alarm count; devices with the largest number of alarms are listed first.
    Show All Sessions: Link to view data for all EVPN sessions in the full screen card

    The full screen EVPN Service card provides tabs for all switches, all sessions, and all alarms.

    Title: Network Services | EVPN
    Icon: Closes full screen card and returns to workbench
    Time period: Range of time in which the displayed data was collected; applies to all card sizes; select an alternate time period by clicking the icon
    Icon: Displays data refresh status. Click to pause data refresh; click to resume data refresh. The current refresh rate is visible when you hover over the icon.
    Results: Number of results found for the selected tab
    All Switches tab: Displays all switches and hosts running the EVPN service. By default, the device list is sorted by hostname. This tab provides the following additional data about each device:
    • Agent
      • State: Indicates communication state of the NetQ Agent on a given device. Values include Fresh (heard from recently) and Rotten (not heard from recently).
      • Version: Software version number of the NetQ Agent on a given device. This should match the version number of the NetQ software loaded on your server or appliance; for example, 2.1.0.
    • ASIC
      • Core BW: Maximum sustained/rated bandwidth. Example values include 2.0 T and 720 G.
      • Model: Chip family. Example values include Tomahawk, Trident, and Spectrum.
      • Model Id: Identifier of networking ASIC model. Example values include BCM56960 and BCM56854.
      • Ports: Indicates port configuration of the switch. Example values include 32 x 100G-QSFP28, 48 x 10G-SFP+, and 6 x 40G-QSFP+.
      • Vendor: Manufacturer of the chip. Example values include Broadcom and Mellanox.
    • CPU
      • Arch: Microprocessor architecture type. Values include x86_64 (Intel or AMD), ARMv7 (ARM), and PowerPC.
      • Max Freq: Highest rated frequency for CPU. Example values include 2.40 GHz and 1.74 GHz.
      • Model: Chip family. Example values include Intel Atom C2538 and Intel Atom C2338.
      • Nos: Number of cores. Example values include 2, 4, and 8.
    • Disk Total Size: Total amount of storage space in physical disks (not total available). Example values: 10 GB, 20 GB, 30 GB.
    • License State: Indicator of validity. Values include ok and bad.
    • Memory Size: Total amount of local RAM. Example values include 8192 MB and 2048 MB.
    • OS
      • Vendor: Operating System manufacturer. Values include Cumulus Networks, RedHat, Ubuntu, and CentOS.
      • Version: Software version number of the OS. Example values include 3.7.3, 2.5.x, 16.04, 7.1.
      • Version Id: Identifier of the OS version. For Cumulus, this is the same as the Version (3.7.x).
    • Platform
      • Date: Date and time the platform was manufactured. Example values include 7/12/18 and 10/29/2015.
      • MAC: System MAC address. Example value: 17:01:AB:EE:C3:F5.
      • Model: Manufacturer's model name. Examples include AS7712-32X and S4048-ON.
      • Number: Manufacturer part number. Example values include FP3ZZ7632014A, 0J09D3.
      • Revision: Release version of the platform.
      • Series: Manufacturer serial number. Example values include D2060B2F044919GD000060, CN046MRJCES0085E0004.
      • Vendor: Manufacturer of the platform. Example values include Cumulus Express, Dell, EdgeCore, Lenovo, Mellanox.
    • Time: Date and time the data was collected from device.
    All Sessions tab: Displays all EVPN sessions network-wide. By default, the session list is sorted by hostname. This tab provides the following additional data about each session:
    • Adv All Vni: Indicates whether the VNI state is advertising all VNIs (true) or not (false)
    • Adv Gw Ip: Indicates whether the host device is advertising the gateway IP address (true) or not (false)
    • DB State: Session state of the DB
    • Export RT: IP address and port of the export route target used in the filtering mechanism for BGP route exchange
    • Import RT: IP address and port of the import route target used in the filtering mechanism for BGP route exchange
    • In Kernel: Indicates whether the associated VNI is in the kernel (in kernel) or not (not in kernel)
    • Is L3: Indicates whether the session is part of a layer 3 configuration (true) or not (false)
    • Origin Ip: Host device's local VXLAN tunnel IP address for the EVPN instance
    • OPID: Service identifier
    • Rd: Route distinguisher used in the filtering mechanism for BGP route exchange
    • Timestamp: Date and time the session was started, deleted, updated or marked as dead (device is down)
    • Vni: Name of the VNI where session is running
    All Alarms tab: Displays all EVPN events network-wide. By default, the event list is sorted by time, with the most recent events listed first. This tab provides the following additional data about each event:
    • Message: Text description of an EVPN-related event. Example: VNI 3 kernel state changed from down to up
    • Source: Hostname of network device that generated the event
    • Severity: Importance of the event. Values include critical, warning, info, and debug.
    • Type: Network protocol or service generating the event. This always has a value of evpn in this card workflow.
    Table Actions: Select, export, or filter the list. Refer to Table Settings.
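    If you want this full screen data outside the UI, most NetQ CLI show commands accept a json output modifier that is convenient for scripting. A minimal sketch, assuming the NetQ CLI and the jq utility are installed (JSON field names vary by release):

      netq show evpn json | jq '.' | head -n 40    # pretty-print the beginning of the EVPN session data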

    View Service Status Summary

    A summary of the EVPN service is available from the Network Services card workflow, including the number of nodes running the service, the number of EVPN-related alarms, and a distribution of those alarms.

    To view the summary, open the small EVPN Network Service card.

    For more detail, select a different size EVPN Network Service card.

    View the Distribution of Sessions and Alarms

    It is useful to know the number of network nodes running the EVPN protocol over a period of time, as it gives you insight into the amount of traffic associated with and breadth of use of the protocol. It is also useful to compare the number of nodes running EVPN with the alarms present at the same time to determine if there is any correlation between the issues and the ability to establish an EVPN session.

    To view these distributions, open the medium EVPN Service card.

    If a visual correlation is apparent, you can dig a little deeper with the large EVPN Service card tabs.
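    Alongside the cards, the NetQ CLI validation command offers another way to dig deeper when sessions and alarms appear correlated (netq check evpn exists in the CLI; its output summarizes per-host EVPN checks):

      netq check evpn    # run an on-demand EVPN validation across the network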

    View the Distribution of Layer 3 VNIs

    It is useful to know the number of layer 3 VNIs, as it gives you insight into the complexity of the VXLAN.

    To view this distribution, open the large EVPN Service card and view the bottom chart on the left.

    View Devices with the Most EVPN Sessions

    You can view the load from EVPN on your switches and hosts using the large EVPN Service card. This data enables you to see which switches are handling the most EVPN traffic currently, validate that this is what you expect based on your network design, and compare it with data from an earlier time to look for any differences.

    To view switches and hosts with the most EVPN sessions:

    1. Open the large EVPN Service card.

    2. Select Top Switches with Most Sessions from the filter above the table.

      The table content is sorted by this characteristic, listing nodes running the most EVPN sessions at the top. Scroll down to view those with the fewest sessions.

    To compare this data with the same data at a previous time:

    1. Open another large EVPN Service card.

    2. Move the new card next to the original card if needed.

    3. Change the time period for the data on the new card by hovering over the card and clicking .

    4. Select the time period that you want to compare with the current time.

      You can now see whether there are significant differences between this time period and the previous time period.

    If the changes are unexpected, you can investigate further by looking at another timeframe, determining if more nodes are now running EVPN than previously, looking for changes in the topology, and so forth.
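    The same back-in-time comparison can be made from the NetQ CLI, which accepts an around time modifier on its show commands (supported in NetQ 2.x and 3.x; the time value is a number plus a unit such as h or d):

      netq show evpn               # sessions as of now
      netq show evpn around 7d     # the same view as of seven days ago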

    View Devices with the Most Layer 2 EVPN Sessions

    You can view the number of layer 2 EVPN sessions on your switches and hosts using the large EVPN Service card. This data enables you to see which switches are handling the most EVPN traffic currently, validate that this is what you expect based on your network design, and compare it with data from an earlier time to look for any differences.

    To view switches and hosts with the most layer 2 EVPN sessions:

    1. Open the large EVPN Service card.

    2. Select Switches with Most L2 EVPN from the filter above the table.

      The table content is sorted by this characteristic, listing nodes running the most layer 2 EVPN sessions at the top. Scroll down to view those with the fewest sessions.

    To compare this data with the same data at a previous time:

    1. Open another large EVPN Service card.

    2. Move the new card next to the original card if needed.

    3. Change the time period for the data on the new card by hovering over the card and clicking .

    4. Select the time period that you want to compare with the current time.

      You can now see whether there are significant differences between this time period and the previous time period.

    If the changes are unexpected, you can investigate further by looking at another timeframe, determining if more nodes are now running EVPN than previously, looking for changes in the topology, and so forth.

    View Devices with the Most Layer 3 EVPN Sessions

    You can view the number of layer 3 EVPN sessions on your switches and hosts using the large EVPN Service card. This data enables you to see which switches are handling the most EVPN traffic currently, validate that this is what you expect based on your network design, and compare it with data from an earlier time to look for any differences.

    To view switches and hosts with the most layer 3 EVPN sessions:

    1. Open the large EVPN Service card.

    2. Select Switches with Most L3 EVPN from the filter above the table.

      The table content is sorted by this characteristic, listing nodes running the most layer 3 EVPN sessions at the top. Scroll down to view those with the fewest sessions.

    To compare this data with the same data at a previous time:

    1. Open another large EVPN Service card.

    2. Move the new card next to the original card if needed.

    3. Change the time period for the data on the new card by hovering over the card and clicking .

    4. Select the time period that you want to compare with the current time.

      You can now see whether there are significant differences between this time period and the previous time period.

    If the changes are unexpected, you can investigate further by looking at another timeframe, determining if more nodes are now running EVPN than previously, looking for changes in the topology, and so forth.

    View Devices with the Most EVPN Alarms

    A large number of EVPN alarms on a switch may indicate a configuration or performance issue that needs further investigation. You can view the switches sorted by the number of EVPN alarms and then use the Switches card workflow or the Alarms card workflow to gather more information about possible causes for the alarms.

    To view switches with the most EVPN alarms:

    1. Open the large EVPN Service card.

    2. Hover over the header and click to open the Alarms tab.

    3. Select Events by Most Active Device from the filter above the table.

      The table content is sorted by this characteristic, listing nodes with the most EVPN alarms at the top. Scroll down to view those with the fewest alarms.
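    A comparable ranking can be produced outside the UI by grouping events per device. This is a sketch only; it assumes the NetQ CLI and jq are installed and that each JSON event record carries a hostname field (field names vary by NetQ release):

      netq show events type evpn between now and 24h json \
        | jq -r '.[].hostname' | sort | uniq -c | sort -rn    # alarm count per device, highest first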

    Where to go next depends on what data you see.

    View All EVPN Events

    The EVPN Service card workflow enables you to view all of the EVPN events in the designated time period.

    To view all EVPN events:

    1. Open the full screen EVPN Service card.

    2. Click All Alarms tab in the navigation panel. By default, events are sorted by Time, with most recent events listed first.

    Where to go next depends on what data you see.

    View Details for All Devices Running EVPN

    You can view all stored attributes of all switches running EVPN in your network in the full screen card.

    To view all switch and host details, open the full screen EVPN Service card, and click the All Switches tab.

    To return to your workbench, click at the top right.

    View Details for All EVPN Sessions

    You can view all stored attributes of all EVPN sessions in your network in the full screen card.

    To view all session details, open the full screen EVPN Service card, and click the All Sessions tab.

    To return to your workbench, click at the top right.

    Use the icons above the table to select/deselect, filter, and export items in the list. Refer to Table Settings for more detail.

    To return to the original display of results, click the associated tab.

    Monitor a Single EVPN Session

    With NetQ, you can monitor the performance of a single EVPN session, including the associated VNI, the number of VTEPs, and the session type. For an overview and how to configure EVPN in your data center network, refer to Ethernet Virtual Private Network - EVPN.

    To access the single session cards, you must open the full screen EVPN Service card, click the All Sessions tab, select the desired session, then click (Open Cards).
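    From the NetQ CLI, a single session can be examined by filtering on its VNI (the vni option exists on netq show evpn; the value 42 below is only an example):

      netq show evpn vni 42    # details for the EVPN session carrying VNI 42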

    EVPN Session Card Workflow Summary

    The small EVPN Session card displays:

    Icon: Indicates data is for an EVPN session
    Title: EVPN Session
    VNI Name: Name of the VNI (virtual network instance) used for this EVPN session
    Current VNI Nodes: Total number of VNI nodes participating in the EVPN session currently

    The medium EVPN Session card displays:

    Time period: Range of time in which the displayed data was collected; applies to all card sizes
    Icon: Indicates data is for an EVPN session
    Title: Network Services | EVPN Session
    Summary bar: VTEP (VXLAN Tunnel EndPoint) Count, the total number of VNI nodes participating in the EVPN session currently
    VTEP Count Over Time chart: Distribution of VTEP counts during the designated time period
    VNI Name: Name of the VNI used for this EVPN session
    Type: Indicates whether the session is established as part of a layer 2 or layer 3 overlay network

    The large EVPN Session card contains two tabs.

    The Session Summary tab displays:

    Time period: Range of time in which the displayed data was collected; applies to all card sizes
    Icon: Indicates data is for an EVPN session
    Title: Session Summary (Network Services | EVPN Session)
    Summary bar: VTEP (VXLAN Tunnel EndPoint) Count, the total number of VNI devices participating in the EVPN session currently
    VTEP Count Over Time chart: Distribution of VTEPs during the designated time period
    Alarm Count chart: Distribution of alarms during the designated time period
    Info Count chart: Distribution of info events during the designated time period
    Table: VRF (for layer 3) or VLAN (for layer 2) identifiers by device

    The Configuration File Evolution tab displays:

    Time period: Range of time in which the displayed data was collected; applies to all card sizes
    Icon: Indicates configuration file information for a single session of a Network Service or Protocol
    Title: (Network Services | EVPN Session) Configuration File Evolution
    VTEP count: Number of VTEPs participating in the session currently
    Timestamps: When changes to the configuration file have occurred, the date and time are indicated. Click the time to see the changed file.
    Configuration File: When File is selected, the configuration file as it was at the selected time is shown. When Diff is selected, the configuration file at the selected time is shown on the left and the configuration file at the previous timestamp is shown on the right, with differences highlighted. Note: If no configuration file changes have been made, only the original file date is shown.

    The full screen EVPN Session card provides tabs for all EVPN sessions and all events.

    Title: Network Services | EVPN
    Icon: Closes full screen card and returns to workbench
    Time period: Range of time in which the displayed data was collected; applies to all card sizes; select an alternate time period by clicking the icon
    Icon: Displays data refresh status. Click to pause data refresh; click to resume data refresh. The current refresh rate is visible when you hover over the icon.
    Results: Number of results found for the selected tab
    All EVPN Sessions tab: Displays all EVPN sessions network-wide. By default, the session list is sorted by hostname. This tab provides the following additional data about each session:
    • Adv All Vni: Indicates whether the VNI state is advertising all VNIs (true) or not (false)
    • Adv Gw Ip: Indicates whether the host device is advertising the gateway IP address (true) or not (false)
    • DB State: Session state of the DB
    • Export RT: IP address and port of the export route target used in the filtering mechanism for BGP route exchange
    • Import RT: IP address and port of the import route target used in the filtering mechanism for BGP route exchange
    • In Kernel: Indicates whether the associated VNI is in the kernel (in kernel) or not (not in kernel)
    • Is L3: Indicates whether the session is part of a layer 3 configuration (true) or not (false)
    • Origin Ip: Host device's local VXLAN tunnel IP address for the EVPN instance
    • OPID: Service identifier
    • Rd: Route distinguisher used in the filtering mechanism for BGP route exchange
    • Timestamp: Date and time the session was started, deleted, updated or marked as dead (device is down)
    • Vni: Name of the VNI where session is running
    All Events tab: Displays all events network-wide. By default, the event list is sorted by time, with the most recent events listed first. This tab provides the following additional data about each event:
    • Message: Text description of an EVPN-related event. Example: VNI 3 kernel state changed from down to up
    • Source: Hostname of network device that generated the event
    • Severity: Importance of the event. Values include critical, warning, info, and debug.
    • Type: Network protocol or service generating the event. This always has a value of evpn in this card workflow.
    Table Actions: Select, export, or filter the list. Refer to Table Settings.

    View Session Status Summary

    A summary of the EVPN session is available from the EVPN Session card workflow, showing the node, its peer, and the current status.

    To view the summary:

    1. Add the Network Services | All EVPN Sessions card.

    2. Switch to the full screen card.

    3. Click the All Sessions tab.

    4. Double-click the session of interest. The full screen card closes automatically.

    5. Optionally, switch to the small EVPN Session card.

    For more detail, select a different size EVPN Session card.

    View VTEP Count

    You can view the count of VTEPs for a given EVPN session from the medium and large EVPN Session cards.

    To view the count for a given EVPN session, on the medium EVPN Session card:

    1. Add the Network Services | All EVPN Sessions card.

    2. Switch to the full screen card.

    3. Click the All Sessions tab.

    4. Double-click the session of interest. The full screen card closes automatically.

    To view the count for a given EVPN session on the large EVPN Session card, follow the same steps as for the medium card and then switch to the large card.

    View All EVPN Session Details

    You can view all stored attributes of all of the EVPN sessions running network-wide.

    To view all session details, open the full screen EVPN Session card and click the All EVPN Sessions tab.

    To return to your workbench, click in the top right of the card.

    View All Events

    You can view all of the alarm and info events occurring network-wide.

    To view all events, open the full screen EVPN Session card and click the All Events tab.

    Where to go next depends on what data you see.

    Monitor the LLDP Service

    The Cumulus NetQ UI enables operators to view the health of the LLDP service on a network-wide and a per-session basis, giving greater insight into all aspects of the service. This is accomplished through two card workflows, one for the service and one for the session. They are described separately here.

    Monitor the LLDP Service (All Sessions)

    With NetQ, you can monitor the number of nodes running the LLDP service, view the nodes with the most and the fewest LLDP neighbors, and view alarms triggered by the LLDP service. For an overview and how to configure LLDP in your data center network, refer to Link Layer Discovery Protocol.
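    The underlying neighbor data can also be retrieved with the NetQ CLI, for example (both commands exist; output columns vary by release):

      netq show lldp                                    # all LLDP sessions and their peers
      netq show events type lldp between now and 24h    # LLDP events from the last 24 hours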

    LLDP Service Card Workflow Summary

    The small LLDP Service card displays:

    Icon: Indicates data is for all sessions of a Network Service or Protocol
    Title: LLDP: All LLDP Sessions, or the LLDP Service
    Node count: Total number of switches with the LLDP service enabled during the designated time period
    Alarm count: Total number of LLDP-related alarms received during the designated time period
    Chart: Distribution of LLDP-related alarms received during the designated time period

    The medium LLDP Service card displays:

    Time period: Range of time in which the displayed data was collected; applies to all card sizes
    Icon: Indicates data is for all sessions of a Network Service or Protocol
    Title: LLDP: All LLDP Sessions, or the LLDP Service
    Node count: Total number of switches with the LLDP service enabled during the designated time period
    Alarm count: Total number of LLDP-related alarms received during the designated time period
    Total Nodes Running chart: Distribution of switches and hosts with the LLDP service enabled during the designated time period, and a total number of nodes running the service currently. Note: The node count here may be different than the count in the summary bar; for example, the number of nodes running LLDP last week or last month might be more or less than the number of nodes running LLDP currently.
    Total Open Alarms chart: Distribution of LLDP-related alarms received during the designated time period, and the total number of current LLDP-related alarms in the network. Note: The alarm count here may be different than the count in the summary bar; for example, the number of new alarms received in this time period does not take into account alarms that have already been received and are still active. You might have no new alarms, but still have a total of 10 alarms present on the network.
    Total Sessions chart: Distribution of LLDP sessions running during the designated time period, and the total number of sessions running on the network currently.

    The large LLDP service card contains two tabs.

    The Sessions Summary tab displays:

    Time period: Range of time in which the displayed data was collected; applies to all card sizes
    Icon: Indicates data is for all sessions of a Network Service or Protocol
    Title: Sessions Summary (Network Services | All LLDP Sessions)
    Node count: Total number of switches with the LLDP service enabled during the designated time period
    Alarm count: Total number of LLDP-related alarms received during the designated time period
    Total Nodes Running chart: Distribution of switches and hosts with the LLDP service enabled during the designated time period, and a total number of nodes running the service currently. Note: The node count here may be different than the count in the summary bar; for example, the number of nodes running LLDP last week or last month might be more or less than the number of nodes running LLDP currently.
    Total Sessions chart: Distribution of LLDP sessions running during the designated time period, and the total number of sessions running on the network currently
    Total Sessions with No Nbr chart: Distribution of LLDP sessions missing neighbor information during the designated time period, and the total number of sessions missing neighbors in the network currently
    Table/Filter options: When the Switches with Most Sessions filter is selected, the table displays switches running LLDP sessions in decreasing order of session count; devices with the largest number of sessions are listed first. When the Switches with Most Unestablished Sessions filter is selected, the table displays switches running LLDP sessions in decreasing order of unestablished session count; devices with the largest number of unestablished sessions are listed first.
    Show All Sessions: Link to view all LLDP sessions in the full screen card

    The Alarms tab displays:

    Time period: Range of time in which the displayed data was collected; applies to all card sizes
    Icon (in header): Indicates data is for all alarms for all LLDP sessions
    Title: Alarms (visible when you hover over card)
    Node count: Total number of switches with the LLDP service enabled during the designated time period
    Alarm count (in summary bar): Total number of LLDP-related alarms received during the designated time period
    Total Alarms chart: Distribution of LLDP-related alarms received during the designated time period, and the total number of current LLDP-related alarms in the network. Note: The alarm count here may be different than the count in the summary bar; for example, the number of new alarms received in this time period does not take into account alarms that have already been received and are still active. You might have no new alarms, but still have a total of 10 alarms present on the network.
    Table/Filter options: When the Events by Most Active Device filter is selected, the table displays switches running LLDP sessions in decreasing order of alarm count; devices with the largest number of alarms are listed first.
    Show All Sessions: Link to view all LLDP sessions in the full screen card

    The full screen LLDP Service card provides tabs for all switches, all sessions, and all alarms.

    Title: Network Services | LLDP
    Icon: Closes full screen card and returns to workbench
    Time period: Range of time in which the displayed data was collected; applies to all card sizes; select an alternate time period by clicking the icon
    Icon: Displays data refresh status. Click to pause data refresh; click to resume data refresh. The current refresh rate is visible when you hover over the icon.
    Results: Number of results found for the selected tab
    All Switches tab: Displays all switches and hosts running the LLDP service. By default, the device list is sorted by hostname. This tab provides the following additional data about each device:
    • Agent
      • State: Indicates communication state of the NetQ Agent on a given device. Values include Fresh (heard from recently) and Rotten (not heard from recently).
      • Version: Software version number of the NetQ Agent on a given device. This should match the version number of the NetQ software loaded on your server or appliance; for example, 2.1.0.
    • ASIC
      • Core BW: Maximum sustained/rated bandwidth. Example values include 2.0 T and 720 G.
      • Model: Chip family. Example values include Tomahawk, Trident, and Spectrum.
      • Model Id: Identifier of networking ASIC model. Example values include BCM56960 and BCM56854.
      • Ports: Indicates port configuration of the switch. Example values include 32 x 100G-QSFP28, 48 x 10G-SFP+, and 6 x 40G-QSFP+.
      • Vendor: Manufacturer of the chip. Example values include Broadcom and Mellanox.
    • CPU
      • Arch: Microprocessor architecture type. Values include x86_64 (Intel or AMD), ARMv7 (ARM), and PowerPC.
      • Max Freq: Highest rated frequency for CPU. Example values include 2.40 GHz and 1.74 GHz.
      • Model: Chip family. Example values include Intel Atom C2538 and Intel Atom C2338.
      • Nos: Number of cores. Example values include 2, 4, and 8.
    • Disk Total Size: Total amount of storage space in physical disks (not total available). Example values: 10 GB, 20 GB, 30 GB.
    • License State: Indicator of validity. Values include ok and bad.
    • Memory Size: Total amount of local RAM. Example values include 8192 MB and 2048 MB.
    • OS
      • Vendor: Operating System manufacturer. Values include Cumulus Networks, RedHat, Ubuntu, and CentOS.
      • Version: Software version number of the OS. Example values include 3.7.3, 2.5.x, 16.04, 7.1.
      • Version Id: Identifier of the OS version. For Cumulus, this is the same as the Version (3.7.x).
    • Platform
      • Date: Date and time the platform was manufactured. Example values include 7/12/18 and 10/29/2015.
      • MAC: System MAC address. Example value: 17:01:AB:EE:C3:F5.
      • Model: Manufacturer's model name. Examples include AS7712-32X and S4048-ON.
      • Number: Manufacturer part number. Example values include FP3ZZ7632014A, 0J09D3.
      • Revision: Release version of the platform.
      • Series: Manufacturer serial number. Example values include D2060B2F044919GD000060, CN046MRJCES0085E0004.
      • Vendor: Manufacturer of the platform. Example values include Cumulus Express, Dell, EdgeCore, Lenovo, Mellanox.
    • Time: Date and time the data was collected from device.
    All Sessions tab: Displays all LLDP sessions network-wide. By default, the session list is sorted by hostname. This tab provides the following additional data about each session:
    • Ifname: Name of the host interface where LLDP session is running
    • LLDP Peer:
      • Os: Operating system (OS) used by peer device. Values include Cumulus Linux, RedHat, Ubuntu, and CentOS.
      • Osv: Version of the OS used by peer device. Example values include 3.7.3, 2.5.x, 16.04, 7.1.
      • Bridge: Indicates whether the peer device is a bridge (true) or not (false)
      • Router: Indicates whether the peer device is a router (true) or not (false)
      • Station: Indicates whether the peer device is a station (true) or not (false)
    • Peer:
      • Hostname: User-defined name for the peer device
      • Ifname: Name of the peer interface where the session is running
    • Timestamp: Date and time that the session was started, deleted, updated, or marked dead (device is down)
    All Alarms tab: Displays all LLDP events network-wide. By default, the event list is sorted by time, with the most recent events listed first. This tab provides the following additional data about each event:
    • Message: Text description of an LLDP-related event. Example: LLDP Session with host leaf02 swp6 modified fields leaf06 swp21
    • Source: Hostname of network device that generated the event
    • Severity: Importance of the event. Values include critical, warning, info, and debug.
    • Type: Network protocol or service generating the event. This always has a value of lldp in this card workflow.
    Table Actions: Select, export, or filter the list. Refer to Table Settings.

    View Service Status Summary

    A summary of the LLDP service is available from the Network Services card workflow, including the number of nodes running the service, the number of LLDP-related alarms, and a distribution of those alarms.

    To view the summary, open the small LLDP Service card.

    In this example, there are no LLDP alarms present on the network of 14 devices.

    For more detail, select a different size LLDP Network Services card.

    View the Distribution of Nodes, Alarms, and Sessions

    It is useful to know the number of network nodes running the LLDP protocol over a period of time, as it gives you insight into nodes that might be misconfigured or experiencing communication issues. Additionally, if there are a large number of alarms, it is worth investigating either the service or particular devices.

    To view the distribution, open the medium LLDP Service card.

    In this example, we see that 13 nodes are running the LLDP protocol, that there are 52 sessions established, and that no LLDP-related alarms have occurred in the last 24 hours.

    View the Distribution of Missing Neighbors

    You can view the number of missing neighbors in any given time period and how that number has changed over time. This is a good indicator of link communication issues.

    To view the distribution, open the large LLDP Service card and view the bottom chart on the left, Total Sessions with No Nbr.

    In this example, we see that 16 of the 52 sessions are missing the neighbor (peer) device.

    View Devices with the Most LLDP Sessions

    You can view the load from LLDP on your switches using the large LLDP Service card. This data enables you to see which switches are handling the most LLDP traffic currently, validate that this is what you expect based on your network design, and compare it with data from an earlier time to look for any differences.

    To view switches and hosts with the most LLDP sessions:

    1. Open the large LLDP Service card.

    2. Select Switches with Most Sessions from the filter above the table.

      The table content is sorted by this characteristic, listing nodes running the most LLDP sessions at the top. Scroll down to view those with the fewest sessions.

    To compare this data with the same data at a previous time:

    1. Open another large LLDP Service card.

    2. Move the new card next to the original card if needed.

    3. Change the time period for the data on the new card by hovering over the card and clicking .

    4. Select the time period that you want to compare with the current time. You can now see whether there are significant differences between this time period and the previous time period.

      In this case, notice that there are fewer nodes running the protocol, but the total number of sessions running has nearly doubled. If the changes are unexpected, you can investigate further by looking at another timeframe, determining if more nodes are now running LLDP than previously, looking for changes in the topology, and so forth.

    View Devices with the Most Unestablished LLDP Sessions

    You can identify switches that are experiencing difficulties establishing LLDP sessions, both currently and in the past.

    To view switches with the most unestablished LLDP sessions:

    1. Open the large LLDP Service card.

    2. Select Switches with Most Unestablished Sessions from the filter above the table.

      The table content is sorted by this characteristic, listing nodes with the most unestablished LLDP sessions at the top. Scroll down to view those with the fewest unestablished sessions.

    Where to go next depends on what data you see.

    View Devices with the Most LLDP Alarms

    A large number of LLDP alarms on a switch may indicate a configuration or performance issue that needs further investigation. You can view the switches sorted by the number of LLDP alarms and then use the Switches card workflow or the Alarms card workflow to gather more information about possible causes for the alarms.

    To view switches with the most LLDP alarms:

    1. Open the large LLDP Service card.

    2. Hover over the header and click to open the Alarms tab.

    3. Select Events by Most Active Device from the filter above the table.

      The table content is sorted by this characteristic, listing nodes with the most LLDP alarms at the top. Scroll down to view those with the fewest alarms.

    Where to go next depends on what data you see.

    View All LLDP Events

    The LLDP Network Services card workflow enables you to view all of the LLDP events in the designated time period.

    To view all LLDP events:

    1. Open the full screen LLDP Service card.

    2. Click the All Alarms tab.

    Where to go next depends on what data you see.

    View Details About All Switches Running LLDP

    You can view all stored attributes of all switches running LLDP in your network in the full screen card.

    To view all switch details, open the full screen LLDP Service card, and click the All Switches tab.

    Return to your workbench by clicking in the top right corner.

    View Detailed Information About All LLDP Sessions

    You can view all stored attributes of all LLDP sessions in your network in the full screen card.

    To view all session details, open the full screen LLDP Service card, and click the All Sessions tab.

    Return to your workbench by clicking in the top right corner.

    Use the icons above the table to select/deselect, filter, and export items in the list. Refer to Table Settings for more detail. To return to the original display of results, click the associated tab.

    Monitor a Single LLDP Session

    With NetQ, you can monitor a single LLDP session, view neighbor state changes and compare them with events occurring at the same time, and monitor the running LLDP configuration and changes to the configuration file. For an overview and how to configure LLDP in your data center network, refer to Link Layer Discovery Protocol.

    To access the single session cards, you must open the full screen LLDP Service card, click the All Sessions tab, select the desired session, then click (Open Cards).
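    If you prefer a shell while investigating one session, the NetQ CLI can scope LLDP data to a single device (the hostname leaf01 is only an example):

      netq leaf01 show lldp    # LLDP sessions as seen by leaf01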

    Granularity of Data Shown Based on Time Period

    On the medium and large single LLDP session cards, the status of the neighboring peers is represented in two heat maps stacked vertically: one for peers that are reachable (neighbor detected), and one for peers that are unreachable (neighbor not detected). Depending on the time period of data on the card, the number of smaller time blocks used to indicate the status varies.

    A vertical stack of time blocks, one from each map, includes the results from all checks during that time, and the results are shown by how saturated the color is for each block. If all peers were detected for the entire time block, the neighbor detected block is 100% saturated (white) and the neighbor not detected block is zero percent saturated (gray). As peers become reachable, the neighbor detected block increases in saturation and the neighbor not detected block is proportionally reduced in saturation. The table below shows the resulting time blocks for the most common time periods.

    Time Period | Number of Runs | Number of Time Blocks | Amount of Time in Each Block
    6 hours     | 18             | 6                     | 1 hour
    12 hours    | 36             | 12                    | 1 hour
    24 hours    | 72             | 24                    | 1 hour
    1 week      | 504            | 7                     | 1 day
    1 month     | 2,086          | 30                    | 1 day
    1 quarter   | 7,000          | 13                    | 1 week
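    To make the arithmetic concrete: in the 24-hour row, 72 runs spread across 24 one-hour blocks works out to 3 runs per block (one check roughly every 20 minutes), so a block in the neighbor detected map is fully saturated only when the neighbor was seen in all 3 runs for that hour.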

    LLDP Session Card Workflow Summary

    The small LLDP Session card displays:

    Icon: Indicates data is for a single session of a Network Service or Protocol
    Title: LLDP Session
    Devices: Host and peer devices in session. Host is shown on top, with peer below.
    Icons: Indicate whether the host sees the peer (has a peer) or not (no peer)

    The medium LLDP Session card displays:

    Time period: Range of time in which the displayed data was collected
    Icon: Indicates data is for a single session of a Network Service or Protocol
    Title: LLDP Session
    Devices: Host and peer devices in session. Arrow points from host to peer.
    Icons: Indicate whether the host sees the peer (has a peer) or not (no peer)
    Time period: Range of time for the distribution chart
    Heat map: Distribution of neighbor availability (detected or undetected) during this given time period
    Hostname: User-defined name of the host device
    Interface Name: Software interface on the host device where the session is running
    Peer Hostname: User-defined name of the peer device
    Peer Interface Name: Software interface on the peer where the session is running

    The large LLDP Session card contains two tabs.

    The Session Summary tab displays:

    Time period: Range of time in which the displayed data was collected
    Icon: Indicates data is for a single session of a Network Service or Protocol
    Title: Session Summary (Network Services | LLDP Session)
    Devices: Host and peer devices in session. Arrow points from host to peer.
    Icons: Indicate whether the host sees the peer (has a peer) or not (no peer)
    Heat map: Distribution of neighbor state (detected or undetected) during this given time period
    Alarm Count chart: Distribution and count of LLDP alarm events during the given time period
    Info Count chart: Distribution and count of LLDP info events during the given time period
    Host Interface Name: Software interface on the host where the session is running
    Peer Hostname: User-defined name of the peer device
    Peer Interface Name: Software interface on the peer where the session is running

    The Configuration File Evolution tab displays:

    Time period: Range of time in which the displayed data was collected; applies to all card sizes
    Icon: Indicates configuration file information for a single session of a Network Service or Protocol
    Title: (Network Services | LLDP Session) Configuration File Evolution
    Devices: Device identifiers (hostname, IP address, or MAC address) for host and peer in session. Click to open associated device card.
    Icons: Indicate whether the host sees the peer (has a peer) or not (no peer)
    Timestamps: When changes to the configuration file have occurred, the date and time are indicated. Click the time to see the changed file.
    Configuration File: When File is selected, the configuration file as it was at the selected time is shown. When Diff is selected, the configuration file at the selected time is shown on the left and the configuration file at the previous timestamp is shown on the right, with differences highlighted. Note: If no configuration file changes have been made, the card shows no results.

    The full screen LLDP Session card provides tabs for all LLDP sessions and all events.

    Title: Network Services | LLDP
    Icon: Closes full screen card and returns to workbench
    Time period: Range of time in which the displayed data was collected; applies to all card sizes; select an alternate time period by clicking the icon
    Icon: Displays data refresh status. Click to pause data refresh; click to resume data refresh. The current refresh rate is visible when you hover over the icon.
    Results: Number of results found for the selected tab
    All LLDP Sessions tab: Displays all LLDP sessions on the host device. By default, the session list is sorted by hostname. This tab provides the following additional data about each session:
    • Ifname: Name of the host interface where LLDP session is running
    • LLDP Peer:
      • Os: Operating system (OS) used by peer device. Values include Cumulus Linux, RedHat, Ubuntu, and CentOS.
      • Osv: Version of the OS used by peer device. Example values include 3.7.3, 2.5.x, 16.04, 7.1.
      • Bridge: Indicates whether the peer device is a bridge (true) or not (false)
      • Router: Indicates whether the peer device is a router (true) or not (false)
      • Station: Indicates whether the peer device is a station (true) or not (false)
    • Peer:
      • Hostname: User-defined name for the peer device
      • Ifname: Name of the peer interface where the session is running
    • Timestamp: Date and time that the session was started, deleted, updated, or marked dead (device is down)
    All Events tab: Displays all events network-wide. By default, the event list is sorted by time, with the most recent events listed first. This tab provides the following additional data about each event:
    • Message: Text description of an event. Example: LLDP Session with host leaf02 swp6 modified fields leaf06 swp21
    • Source: Hostname of network device that generated the event
    • Severity: Importance of the event. Values include critical, warning, info, and debug.
    • Type: Network protocol or service generating the event. This always has a value of lldp in this card workflow.
    Table Actions: Select, export, or filter the list. Refer to Table Settings.

    View Session Status Summary

    A summary of the LLDP session is available from the LLDP Session card workflow, showing the node, its peer, and the current status.

    To view the summary:

    1. Open the full screen LLDP Service card.

    2. Double-click on a session. The full screen card closes automatically.

    3. Locate the medium LLDP Session card.

    4. Optionally, open the small LLDP Session card.

    View LLDP Session Neighbor State Changes

    You can view the neighbor state for a given LLDP session from the medium and large LLDP Session cards. For a given time period, you can determine the stability of the LLDP session between two devices. If you experienced connectivity issues at a particular time, you can use these cards to help verify the state of the neighbor. If the neighbor was unreachable more often than it was reachable, you can then investigate further into possible causes.

    To view the neighbor availability for a given LLDP session on the medium card:

    1. Open the full screen LLDP Service card.

    2. Double-click on a session. The full screen card closes automatically.

    3. Locate the medium LLDP Session card.

    In this example, the heat map tells us that this LLDP session has been able to detect a neighbor for the entire time period.

    From this card, you can also view the host name and interface name, and the peer name and interface name.

    To view the neighbor availability for a given LLDP session on the large LLDP Session card, open that card.

    From this card, you can also view the alarm and info event counts, host interface name, peer hostname, and peer interface identifying the session in more detail.

    View Changes to the LLDP Service Configuration File

    Each time a change is made to the configuration file for the LLDP service, NetQ logs the change and enables you to compare it with the last version. This can be useful when you are troubleshooting potential causes for alarms or sessions losing their connections.

    To view the configuration file changes:

    1. Open the large LLDP Session card.

    2. Hover over the card and click to open the LLDP Configuration File Evolution tab.

    3. Select the time of interest on the left; that is, the time when a change might have impacted performance. Scroll down if needed.

    4. Choose between the File view and the Diff view (selected option is dark; File by default).

      The File view displays the content of the file for you to review.

      The Diff view displays this version (on the left) and the most recent version (on the right) side by side, with the changes highlighted in red and green. In this example, there are no changes to the file, so the same file is shown on both sides with no highlighted lines.

    View All LLDP Session Details

    You can view all stored attributes of all of the LLDP sessions associated with the two devices on this card.

    To view all session details, open the full screen LLDP Session card, and click the All LLDP Sessions tab.

    To return to your workbench, click in the top right of the card.

    View All Events

    You can view all of the alarm and info events in the network.

    To view all events, open the full screen LLDP Session card, and click the All Events tab.

    Where to go next depends on what data you see.

    Monitor the MLAG Service

    The Cumulus NetQ UI enables operators to view the health of the MLAG service on a network-wide and a per-session basis, giving greater insight into all aspects of the service. This is accomplished through two card workflows, one for the service and one for the session. They are described separately here.

    MLAG or CLAG? The Cumulus Linux implementation of MLAG is referred to by other vendors as MLAG, MC-LAG or VPC. The Cumulus NetQ UI uses the MLAG terminology predominantly.

    Monitor the MLAG Service (All Sessions)

    With NetQ, you can monitor the number of nodes running the MLAG service, view sessions running, and view alarms triggered by the MLAG service. For an overview and how to configure MLAG in your data center network, refer to Multi-Chassis Link Aggregation - MLAG.
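    Because the Cumulus Linux implementation is named CLAG, the corresponding NetQ CLI commands use the clag keyword. For example (both commands exist; output layout varies by release):

      netq show clag                                    # MLAG/CLAG session state network-wide
      netq show events type clag between now and 24h    # MLAG events from the last 24 hours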

    MLAG Service Card Workflow Summary

    The small MLAG Service card displays:

    Icon: Indicates data is for all sessions of a Network Service or Protocol
    Title: MLAG: All MLAG Sessions, or the MLAG Service
    Node count: Total number of switches with the MLAG service enabled during the designated time period
    Alarm count: Total number of MLAG-related alarms received during the designated time period
    Chart: Distribution of MLAG-related alarms received during the designated time period

    The medium MLAG Service card displays:

    Time period: Range of time in which the displayed data was collected; applies to all card sizes
    Icon: Indicates data is for all sessions of a Network Service or Protocol
    Title: Network Services | All MLAG Sessions
    Node count: Total number of switches with the MLAG service enabled during the designated time period
    Alarm count: Total number of MLAG-related alarms received during the designated time period
    Inactive backup IP count: Total number of sessions with an inactive backup IP address during the designated time period
    Single-connection bond count: Total number of bonds with only a single connection during the designated time period
    Total Nodes Running chart: Distribution of switches and hosts with the MLAG service enabled during the designated time period, and a total number of nodes running the service currently. Note: The node count here may be different than the count in the summary bar; for example, the number of nodes running MLAG last week or last month might be more or less than the number of nodes running MLAG currently.
    Total Open Alarms chart: Distribution of MLAG-related alarms received during the designated time period, and the total number of current MLAG-related alarms in the network. Note: The alarm count here may be different than the count in the summary bar; for example, the number of new alarms received in this time period does not take into account alarms that have already been received and are still active. You might have no new alarms, but still have a total of 10 alarms present on the network.
    Total Sessions chart: Distribution of MLAG sessions running during the designated time period, and the total number of sessions running on the network currently

    The large MLAG service card contains two tabs.

    The All MLAG Sessions summary tab displays:

    Time period: Range of time in which the displayed data was collected; applies to all card sizes
    Icon: Indicates data is for all sessions of a Network Service or Protocol
    Title: All MLAG Sessions Summary
    Node count: Total number of switches with the MLAG service enabled during the designated time period
    Alarm count: Total number of MLAG-related alarms received during the designated time period
    Total Nodes Running chart: Distribution of switches and hosts with the MLAG service enabled during the designated time period, and a total number of nodes running the service currently. Note: The node count here may be different than the count in the summary bar; for example, the number of nodes running MLAG last week or last month might be more or less than the number of nodes running MLAG currently.
    Total Sessions chart: Distribution of MLAG sessions running during the designated time period, and the total number of sessions running on the network currently
    Total Sessions with Inactive-backup-ip chart: Distribution of sessions without an active backup IP defined during the designated time period, and the total number of these sessions running on the network currently
    Table/Filter options: When the Switches with Most Sessions filter is selected, the table displays switches running MLAG sessions in decreasing order of session count; devices with the largest number of sessions are listed first. When the Switches with Most Unestablished Sessions filter is selected, the table displays switches running MLAG sessions in decreasing order of unestablished session count; devices with the largest number of unestablished sessions are listed first.
    Show All Sessions: Link to view all MLAG sessions in the full screen card

    The All MLAG Alarms tab which displays:

    Time period: Range of time in which the displayed data was collected; applies to all card sizes
    (in header): Indicates alarm data for all MLAG sessions
    Title: Network Services | All MLAG Alarms (visible when you hover over card)
    Total number of switches with the MLAG service enabled during the designated time period
    (in summary bar): Total number of MLAG-related alarms received during the designated time period
    Total Alarms chart

    Distribution of MLAG-related alarms received during the designated time period, and the total number of current MLAG-related alarms in the network.

    Note: The alarm count here may be different than the count in the summary bar. For example, the number of new alarms received in this time period does not take into account alarms that have already been received and are still active. You might have no new alarms, but still have a total number of alarms present on the network of 10.

    Table/Filter options: When the Events by Most Active Device filter is selected, the table displays switches running MLAG sessions in decreasing order of alarm count; devices with the largest number of alarms are listed first
    Show All Sessions: Link to view all MLAG sessions in the full screen card

    The full screen MLAG Service card provides tabs for all switches, all sessions, and all alarms.

    Title: Network Services | MLAG
    Closes full screen card and returns to workbench
    Time period: Range of time in which the displayed data was collected; applies to all card sizes; select an alternate time period by clicking
    Displays data refresh status. Click to pause data refresh. Click to resume data refresh. Current refresh rate is visible by hovering over icon.
    Results: Number of results found for the selected tab
    All Switches tab: Displays all switches and hosts running the MLAG service. By default, the device list is sorted by hostname. This tab provides the following additional data about each device:
    • Agent
      • State: Indicates communication state of the NetQ Agent on a given device. Values include Fresh (heard from recently) and Rotten (not heard from recently).
      • Version: Software version number of the NetQ Agent on a given device. This should match the version number of the NetQ software loaded on your server or appliance; for example, 2.1.0.
    • ASIC
      • Core BW: Maximum sustained/rated bandwidth. Example values include 2.0 T and 720 G.
      • Model: Chip family. Example values include Tomahawk, Trident, and Spectrum.
      • Model Id: Identifier of networking ASIC model. Example values include BCM56960 and BCM56854.
      • Ports: Indicates port configuration of the switch. Example values include 32 x 100G-QSFP28, 48 x 10G-SFP+, and 6 x 40G-QSFP+.
      • Vendor: Manufacturer of the chip. Example values include Broadcom and Mellanox.
    • CPU
      • Arch: Microprocessor architecture type. Values include x86_64 (Intel), ARMv7 (AMD), and PowerPC.
      • Max Freq: Highest rated frequency for CPU. Example values include 2.40 GHz and 1.74 GHz.
      • Model: Chip family. Example values include Intel Atom C2538 and Intel Atom C2338.
      • Nos: Number of cores. Example values include 2, 4, and 8.
    • Disk Total Size: Total amount of storage space in physical disks (not total available). Example values: 10 GB, 20 GB, 30 GB.
    • License State: Indicator of validity. Values include ok and bad.
    • Memory Size: Total amount of local RAM. Example values include 8192 MB and 2048 MB.
    • OS
      • Vendor: Operating System manufacturer. Values include Cumulus Networks, RedHat, Ubuntu, and CentOS.
      • Version: Software version number of the OS. Example values include 3.7.3, 2.5.x, 16.04, 7.1.
      • Version Id: Identifier of the OS version. For Cumulus, this is the same as the Version (3.7.x).
    • Platform
      • Date: Date and time the platform was manufactured. Example values include 7/12/18 and 10/29/2015.
      • MAC: System MAC address. Example value: 17:01:AB:EE:C3:F5.
      • Model: Manufacturer's model name. Example values include AS7712-32X and S4048-ON.
      • Number: Manufacturer part number. Example values include FP3ZZ7632014A, 0J09D3.
      • Revision: Release version of the platform
      • Series: Manufacturer serial number. Example values include D2060B2F044919GD000060, CN046MRJCES0085E0004.
      • Vendor: Manufacturer of the platform. Example values include Cumulus Express, Dell, EdgeCore, Lenovo, Mellanox.
    • Time: Date and time the data was collected from device.
    All Sessions tab: Displays all MLAG sessions network-wide. By default, the session list is sorted by hostname. This tab provides the following additional data about each session:
    • Backup Ip: IP address of the interface to use if the peerlink (or bond) goes down
    • Backup Ip Active: Indicates whether the backup IP address has been specified and is active (true) or not (false)
    • Bonds
      • Conflicted: Identifies the set of interfaces in a bond that do not match on each end of the bond
      • Single: Identifies a set of interfaces connecting to only one of the two switches
      • Dual: Identifies a set of interfaces connecting to both switches
      • Proto Down: Interface on the switch brought down by the clagd service. Value is blank if no interfaces are down due to clagd service.
    • Clag Sysmac: Unique MAC address for each bond interface pair. Note: Must be a value between 44:38:39:ff:00:00 and 44:38:39:ff:ff:ff.
    • Peer:
      • If: Name of the peer interface
      • Role: Role of the peer device. Values include primary and secondary.
      • State: Indicates if peer device is up (true) or down (false)
    • Role: Role of the host device. Values include primary and secondary.
    • Timestamp: Date and time the MLAG session was started, deleted, updated, or marked dead (device went down)
    • Vxlan Anycast: Anycast IP address used for VXLAN termination
    All Alarms tab: Displays all MLAG events network-wide. By default, the event list is sorted by time, with the most recent events listed first. The tab provides the following additional data about each event:
    • Message: Text description of an MLAG-related event. Example: Clag conflicted bond changed from swp7 swp8 to swp9 swp10
    • Source: Hostname of network device that generated the event
    • Severity: Importance of the event. Values include critical, warning, info, and debug.
    • Type: Network protocol or service generating the event. This always has a value of clag in this card workflow.
    Table Actions: Select, export, or filter the list. Refer to Table Settings.

    View Service Status Summary

    A summary of the MLAG service is available from the MLAG Service card workflow, including the number of nodes running the service, the number of MLAG-related alarms, and a distribution of those alarms.

    To view the summary, open the small MLAG Service card.

    For more detail, select a different size MLAG Service card.
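
    If you prefer the CLI, a similar service summary is available there. This is a minimal sketch, assuming the netq show clag command available in this NetQ release; verify the exact options with netq help:

      cumulus@switch:~$ netq show clag

    The output lists each MLAG session along with its peer, system MAC, state, and bond information.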

    View the Distribution of Sessions and Alarms

    It is useful to know the number of network nodes running the MLAG protocol over a period of time, as it gives you insight into the amount of traffic associated with and breadth of use of the protocol. It is also useful to compare the number of nodes running MLAG with the alarms present at the same time to determine if there is any correlation between the issues and the ability to establish an MLAG session.

    To view these distributions, open the medium MLAG Service card.

    If a visual correlation is apparent, you can dig a little deeper with the large MLAG Service card tabs.

    View Devices with the Most MLAG Sessions

    You can view the load from MLAG on your switches using the large MLAG Service card. This data enables you to see which switches are handling the most MLAG traffic currently, validate that this is what is expected based on your network design, and compare it with data from an earlier time to look for any differences.

    To view switches and hosts with the most MLAG sessions:

    1. Open the large MLAG Service card.

    2. Select Switches with Most Sessions from the filter above the table.

      The table content is sorted by this characteristic, listing nodes running the most MLAG sessions at the top. Scroll down to view those with the fewest sessions.

    To compare this data with the same data at a previous time:

    1. Open another large MLAG Service card.

    2. Move the new card next to the original card if needed.

    3. Change the time period for the data on the new card by hovering over the card and clicking .

    4. Select the time period that you want to compare with the current time. You can now see whether there are significant differences between this time period and the previous time period.

      If the changes are unexpected, you can investigate further by looking at another timeframe, determining if more nodes are now running MLAG than previously, looking for changes in the topology, and so forth.

    View Devices with the Most Unestablished MLAG Sessions

    You can identify switches that are experiencing difficulties establishing MLAG sessions, both currently and in the past.

    To view switches with the most unestablished MLAG sessions:

    1. Open the large MLAG Service card.

    2. Select Switches with Most Unestablished Sessions from the filter above the table.

      The table content is sorted by this characteristic, listing nodes with the most unestablished MLAG sessions at the top. Scroll down to view those with the fewest unestablished sessions.

    Where to go next depends on what data you see, but a few options include:

    View Devices with the Most MLAG Alarms

    Switches experiencing a large number of MLAG alarms may indicate a configuration or performance issue that needs further investigation. You can view the switches sorted by the number of MLAG alarms and then use the Switches card workflow or the Alarms card workflow to gather more information about possible causes for the alarms.

    To view switches with the most MLAG alarms:

    1. Open the large MLAG Service card.

    2. Hover over the header and click .

    3. Select Events by Most Active Device from the filter above the table.

      The table content is sorted by this characteristic, listing nodes with the most MLAG alarms at the top. Scroll down to view those with the fewest alarms.

    Where to go next depends on what data you see, but a few options include:

    View All MLAG Events

    The MLAG Service card workflow enables you to view all of the MLAG events in the designated time period.

    To view all MLAG events:

    1. Open the full screen MLAG Service card.

    2. Click the All Alarms tab.
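
    You can also list MLAG events from the CLI. This is a minimal sketch, assuming the netq show events filtering options in this release; check netq help for the exact time and type arguments:

      cumulus@switch:~$ netq show events type clag between now and 24h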

    Where to go next depends on what data you see, but a few options include:

    View Details About All Switches Running MLAG

    You can view all stored attributes of all switches running MLAG in your network in the full-screen card.

    To view all switch details, open the full screen MLAG Service card, and click the All Switches tab.

    To return to your workbench, click in the top right corner.

    Use the icons above the table to select/deselect, filter, and export items in the list. Refer to Table Settings for more detail. To return to original display of results, click the associated tab.

    Monitor a Single MLAG Session

    With NetQ, you can monitor the number of nodes running the MLAG service, view switches with the most peers alive and not alive, and view alarms triggered by the MLAG service. For an overview and how to configure MLAG in your data center network, refer to Multi-Chassis Link Aggregation - MLAG.

    To access the single session cards, you must open the full screen MLAG Service card, click the All Sessions tab, select the desired session, then click (Open Cards).

    Granularity of Data Shown Based on Time Period

    On the medium and large single MLAG session cards, the status of the peers is represented in heat maps stacked vertically; one for peers that are reachable (alive), and one for peers that are unreachable (not alive). Depending on the time period of data on the card, the number of smaller time blocks used to indicate the status varies. A vertical stack of time blocks, one from each map, includes the results from all checks during that time. The results are shown by how saturated the color is for each block. If all peers during that time period were alive for the entire time block, then the top block is 100% saturated (white) and the not alive block is zero percent saturated (gray). As peers that are not alive increase in saturation, the peers that are alive block is proportionally reduced in saturation. An example heat map for a time period of 24 hours is shown here with the most common time periods in the table showing the resulting time blocks.

    Time Period | Number of Runs | Number of Time Blocks | Amount of Time in Each Block
    6 hours | 18 | 6 | 1 hour
    12 hours | 36 | 12 | 1 hour
    24 hours | 72 | 24 | 1 hour
    1 week | 504 | 7 | 1 day
    1 month | 2,086 | 30 | 1 day
    1 quarter | 7,000 | 13 | 1 week

    MLAG Session Card Workflow Summary

    The small MLAG Session card displays:

    Indicates data is for a single session of a Network Service or Protocol
    Title: CLAG Session
    Device identifiers (hostname, IP address, or MAC address) for host and peer in session.
    Indication of host role, primary or secondary

    The medium MLAG Session card displays:

    Time period (in header): Range of time in which the displayed data was collected; applies to all card sizes
    Indicates data is for a single session of a Network Service or Protocol
    Title: Network Services | MLAG Session
    Device identifiers (hostname, IP address, or MAC address) for host and peer in session. Arrow points from the host to the peer. Click to open associated device card.
    Indication of host role, primary or secondary
    Time period (above chart): Range of time for data displayed in peer status chart
    Peer Status chart: Distribution of peer availability, alive or not alive, during the designated time period. The number of time segments in a time period varies according to the length of the time period.
    Role: Role that the host device is playing. Values include primary and secondary.
    CLAG sysmac: System MAC address of the MLAG session
    Peer Role: Role that the peer device is playing. Values include primary and secondary.
    Peer State: Operational state of the peer, up (true) or down (false)

    The large MLAG Session card contains two tabs.

    The Session Summary tab displays:

    Time period: Range of time in which the displayed data was collected; applies to all card sizes
    Indicates data is for a single session of a Network Service or Protocol
    Title: (Network Services | MLAG Session) Session Summary
    Device identifiers (hostname, IP address, or MAC address) for host and peer in session. Arrow points from the host to the peer. Click to open associated device card.
    Indication of host role, primary or secondary
    Alarm Count Chart: Distribution and count of CLAG alarm events over the given time period
    Info Count Chart: Distribution and count of CLAG info events over the given time period
    Peer Status chart: Distribution of peer availability, alive or not alive, during the designated time period. The number of time segments in a time period varies according to the length of the time period.
    Backup IP: IP address of the interface to use if the peerlink (or bond) goes down
    Backup IP Active: Indicates whether the backup IP address is configured
    CLAG SysMAC: System MAC address of the MLAG session
    Peer State: Operational state of the peer, up (true) or down (false)
    Count of Dual Bonds: Number of bonds connecting to both switches
    Count of Single Bonds: Number of bonds connecting to only one switch
    Count of Protocol Down Bonds: Number of bonds with interfaces that were brought down by the clagd service
    Count of Conflicted Bonds: Number of bonds which have a set of interfaces that are not the same on both switches

    The Configuration File Evolution tab displays:

    Time period: Range of time in which the displayed data was collected; applies to all card sizes
    Indicates configuration file information for a single session of a Network Service or Protocol
    Title: (Network Services | MLAG Session) Configuration File Evolution
    Device identifiers (hostname, IP address, or MAC address) for host and peer in session. Arrow points from the host to the peer. Click to open associated device card.
    Indication of host role, primary or secondary
    Timestamps: When changes to the configuration file have occurred, the date and time are indicated. Click the time to see the changed file.
    Configuration File

    When File is selected, the configuration file as it was at the selected time is shown.

    When Diff is selected, the configuration file at the selected time is shown on the left and the configuration file at the previous timestamp is shown on the right. Differences are highlighted.

    The full screen MLAG Session card provides tabs for all MLAG sessions and all events.

    Title: Network Services | MLAG
    Closes full screen card and returns to workbench
    Time period: Range of time in which the displayed data was collected; applies to all card sizes; select an alternate time period by clicking
    Displays data refresh status. Click to pause data refresh. Click to resume data refresh. Current refresh rate is visible by hovering over icon.
    Results: Number of results found for the selected tab
    All MLAG Sessions tab: Displays all MLAG sessions associated with the devices in the given session. By default, the session list is sorted by hostname. This tab provides the following additional data about each session:
    • Backup Ip: IP address of the interface to use if the peerlink (or bond) goes down
    • Backup Ip Active: Indicates whether the backup IP address has been specified and is active (true) or not (false)
    • Bonds
      • Conflicted: Identifies the set of interfaces in a bond that do not match on each end of the bond
      • Single: Identifies a set of interfaces connecting to only one of the two switches
      • Dual: Identifies a set of interfaces connecting to both switches
      • Proto Down: Interface on the switch brought down by the clagd service. Value is blank if no interfaces are down due to clagd service.
    • Mlag Sysmac: Unique MAC address for each bond interface pair. Note: Must be a value between 44:38:39:ff:00:00 and 44:38:39:ff:ff:ff.
    • Peer:
      • If: Name of the peer interface
      • Role: Role of the peer device. Values include primary and secondary.
      • State: Indicates if peer device is up (true) or down (false)
    • Role: Role of the host device. Values include primary and secondary.
    • Timestamp: Date and time the MLAG session was started, deleted, updated, or marked dead (device went down)
    • Vxlan Anycast: Anycast IP address used for VXLAN termination
    All Events tab: Displays all events network-wide. By default, the event list is sorted by time, with the most recent events listed first. The tab provides the following additional data about each event:
    • Message: Text description of an event. Example: Clag conflicted bond changed from swp7 swp8 to swp9 swp10
    • Source: Hostname of network device that generated the event
    • Severity: Importance of the event. Values include critical, warning, info, and debug.
    • Type: Network protocol or service generating the event. This always has a value of clag in this card workflow.
    Table Actions: Select, export, or filter the list. Refer to Table Settings.

    View Session Status Summary

    A summary of the MLAG session is available from the MLAG Session card workflow, showing the node and its peer and current status.

    To view the summary:

    1. Open the full screen MLAG Service card.

    2. Select a session from the listing to view.

    3. Close the full screen card to view the medium MLAG Session card.

      In the left example, we see that the tor1 switch plays the secondary role in this session with the switch at 44:38:39:ff:01:01. In the right example, we see that the leaf03 switch plays the primary role in this session with leaf04.

    View MLAG Session Peering State Changes

    You can view the peering state for a given MLAG session from the medium and large MLAG Session cards. For a given time period, you can determine the stability of the MLAG session between two devices. If you experienced connectivity issues at a particular time, you can use these cards to help verify the state of the peer. If the peer was not alive more than it was alive, you can then investigate further into possible causes.

    To view the state transitions for a given MLAG session:

    1. Open the full screen MLAG Service card.

    2. Select a session from the listing to view.

    3. Close the full screen card to view the medium MLAG Session card.

      In this example, the peer switch has been alive for the entire 24-hour period.

    From this card, you can also view the node role, peer role and state, and MLAG system MAC address which identify the session in more detail.

    To view the peering state transitions for a given MLAG session on the large MLAG Session card, open that card.

    From this card, you can also view the alarm and info event counts, node role, peer role, state, and interface, MLAG system MAC address, active backup IP address, single, dual, conflicted, and protocol down bonds, and the VXLAN anycast address identifying the session in more detail.
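
    The same session attributes are available from the CLI on a per-device basis. A minimal sketch, assuming the hostname-scoped form of the command in this release; leaf01 is a placeholder hostname, so substitute one of the devices in the session:

      cumulus@switch:~$ netq leaf01 show clag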

    View Changes to the MLAG Service Configuration File

    Each time a change is made to the configuration file for the MLAG service, NetQ logs the change and enables you to compare it with the last version. This can be useful when you are troubleshooting potential causes for alarms or sessions losing their connections.

    To view the configuration file changes:

    1. Open the large MLAG Session card.

    2. Hover over the card and click to open the Configuration File Evolution tab.

    3. Select the time of interest on the left, when a change may have impacted performance. Scroll down if needed.

    4. Choose between the File view and the Diff view (selected option is dark; File by default).

      The File view displays the content of the file for you to review.

      The Diff view displays the changes between this version (on left) and the most recent version (on right) side by side. The changes are highlighted in red and green. In this example, we don’t have any changes after this first creation, so the same file is shown on both sides and no highlighting is present.
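
    Conceptually, the Diff view is the same as comparing two saved copies of the MLAG configuration (on Cumulus Linux, the clagd settings live in /etc/network/interfaces) with a standard diff tool. A hypothetical example with two snapshot files:

      cumulus@switch:~$ diff -u interfaces.snapshot1 interfaces.snapshot2

    Lines removed from the older version are prefixed with -, and lines added in the newer version with +.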

    View All MLAG Session Details

    You can view all stored attributes of all of the MLAG sessions associated with the two devices on this card.

    To view all session details, open the full screen MLAG Session card, and click the All MLAG Sessions tab.

    Where to go next depends on what data you see, but a few options include:

    View All MLAG Session Events

    You can view all of the alarm and info events for the two devices on this card.

    To view all events, open the full screen MLAG Session card, and click the All Events tab.

    Where to go next depends on what data you see, but a few options include:

    Monitor the OSPF Service

    The Cumulus NetQ UI enables operators to view the health of the OSPF service on a network-wide and a per session basis, giving greater insight into all aspects of the service. This is accomplished through two card workflows, one for the service and one for the session. They are described separately here.

    Monitor the OSPF Service (All Sessions)

    With NetQ, you can monitor the number of nodes running the OSPF service, view switches with the most full and unestablished OSPF sessions, and view alarms triggered by the OSPF service. For an overview and how to configure OSPF to run in your data center network, refer to Open Shortest Path First - OSPF or Open Shortest Path First v3 - OSPFv3.

    OSPF Service Card Workflow

    The small OSPF Service card displays:

    Indicates data is for all sessions of a Network Service or Protocol
    Title: OSPF: All OSPF Sessions, or the OSPF Service
    Total number of switches and hosts with the OSPF service enabled during the designated time period
    Total number of OSPF-related alarms received during the designated time period
    Chart: Distribution of OSPF-related alarms received during the designated time period

    The medium OSPF Service card displays:

    Time period: Range of time in which the displayed data was collected; applies to all card sizes
    Indicates data is for all sessions of a Network Service or Protocol
    Title: Network Services | All OSPF Sessions
    Total number of switches and hosts with the OSPF service enabled during the designated time period
    Total number of OSPF-related alarms received during the designated time period
    Total Nodes Running chart

    Distribution of switches and hosts with the OSPF service enabled during the designated time period, and a total number of nodes running the service currently.

    Note: The node count here may be different than the count in the summary bar. For example, the number of nodes running OSPF last week or last month might be more or less than the number of nodes running OSPF currently.

    Total Sessions Not Established chart

    Distribution of unestablished OSPF sessions during the designated time period, and the total number of unestablished sessions in the network currently.

    Note: The count here may be different than the count in the summary bar. For example, the number of unestablished sessions last week or last month might be more or less than the number of unestablished sessions currently.

    Total Sessions chart: Distribution of OSPF sessions during the designated time period, and the total number of sessions running on the network currently.

    The large OSPF service card contains two tabs.

    The Sessions Summary tab displays:

    Time period: Range of time in which the displayed data was collected; applies to all card sizes
    Indicates data is for all sessions of a Network Service or Protocol
    Title: Sessions Summary (visible when you hover over card)
    Total number of switches and hosts with the OSPF service enabled during the designated time period
    Total number of OSPF-related alarms received during the designated time period
    Total Nodes Running chart

    Distribution of switches and hosts with the OSPF service enabled during the designated time period, and a total number of nodes running the service currently.

    Note: The node count here may be different than the count in the summary bar. For example, the number of nodes running OSPF last week or last month might be more or less than the number of nodes running OSPF currently.

    Total Sessions chart: Distribution of OSPF sessions during the designated time period, and the total number of sessions running on the network currently.
    Total Sessions Not Established chart

    Distribution of unestablished OSPF sessions during the designated time period, and the total number of unestablished sessions in the network currently.

    Note: The count here may be different than the count in the summary bar. For example, the number of unestablished sessions last week or last month might be more or less than the number of unestablished sessions currently.

    Table/Filter options

    When the Switches with Most Sessions filter option is selected, the table displays the switches and hosts running OSPF sessions in decreasing order of session count; devices with the largest number of sessions are listed first

    When the Switches with Most Unestablished Sessions filter option is selected, the table displays the switches and hosts running OSPF sessions in decreasing order of unestablished session count; devices with the largest number of unestablished sessions are listed first

    Show All Sessions: Link to view data for all OSPF sessions in the full screen card

    The Alarms tab displays:

    Time period: Range of time in which the displayed data was collected; applies to all card sizes
    (in header): Indicates alarm data for all OSPF sessions
    Title: Alarms (visible when you hover over card)
    Total number of switches and hosts with the OSPF service enabled during the designated time period
    (in summary bar): Total number of OSPF-related alarms received during the designated time period
    Total Alarms chart

    Distribution of OSPF-related alarms received during the designated time period, and the total number of current OSPF-related alarms in the network.

    Note: The alarm count here may be different than the count in the summary bar. For example, the number of new alarms received in this time period does not take into account alarms that have already been received and are still active. You might have no new alarms, but still have a total number of alarms present on the network of 10.

    Table/Filter options: When the Switches with Most Alarms filter option is selected, the table displays switches and hosts running OSPF in decreasing order of alarm count; devices with the largest number of OSPF alarms are listed first
    Show All Sessions: Link to view data for all OSPF sessions in the full screen card

    The full screen OSPF Service card provides tabs for all switches, all sessions, and all alarms.

    Title: Network Services | OSPF
    Closes full screen card and returns to workbench
    Time period: Range of time in which the displayed data was collected; applies to all card sizes; select an alternate time period by clicking
    Displays data refresh status. Click to pause data refresh. Click to resume data refresh. Current refresh rate is visible by hovering over icon.
    Results: Number of results found for the selected tab
    All Switches tab: Displays all switches and hosts running the OSPF service. By default, the device list is sorted by hostname. This tab provides the following additional data about each device:
    • Agent
      • State: Indicates communication state of the NetQ Agent on a given device. Values include Fresh (heard from recently) and Rotten (not heard from recently).
      • Version: Software version number of the NetQ Agent on a given device. This should match the version number of the NetQ software loaded on your server or appliance; for example, 2.1.0.
    • ASIC
      • Core BW: Maximum sustained/rated bandwidth. Example values include 2.0 T and 720 G.
      • Model: Chip family. Example values include Tomahawk, Trident, and Spectrum.
      • Model Id: Identifier of networking ASIC model. Example values include BCM56960 and BCM56854.
      • Ports: Indicates port configuration of the switch. Example values include 32 x 100G-QSFP28, 48 x 10G-SFP+, and 6 x 40G-QSFP+.
      • Vendor: Manufacturer of the chip. Example values include Broadcom and Mellanox.
    • CPU
      • Arch: Microprocessor architecture type. Values include x86_64 (Intel), ARMv7 (AMD), and PowerPC.
      • Max Freq: Highest rated frequency for CPU. Example values include 2.40 GHz and 1.74 GHz.
      • Model: Chip family. Example values include Intel Atom C2538 and Intel Atom C2338.
      • Nos: Number of cores. Example values include 2, 4, and 8.
    • Disk Total Size: Total amount of storage space in physical disks (not total available). Example values: 10 GB, 20 GB, 30 GB.
    • License State: Indicator of validity. Values include ok and bad.
    • Memory Size: Total amount of local RAM. Example values include 8192 MB and 2048 MB.
    • OS
      • Vendor: Operating System manufacturer. Values include Cumulus Networks, RedHat, Ubuntu, and CentOS.
      • Version: Software version number of the OS. Example values include 3.7.3, 2.5.x, 16.04, 7.1.
      • Version Id: Identifier of the OS version. For Cumulus, this is the same as the Version (3.7.x).
    • Platform
      • Date: Date and time the platform was manufactured. Example values include 7/12/18 and 10/29/2015.
      • MAC: System MAC address. Example value: 17:01:AB:EE:C3:F5.
      • Model: Manufacturer's model name. Example values include AS7712-32X and S4048-ON.
      • Number: Manufacturer part number. Example values include FP3ZZ7632014A, 0J09D3.
      • Revision: Release version of the platform
      • Series: Manufacturer serial number. Example values include D2060B2F044919GD000060, CN046MRJCES0085E0004.
      • Vendor: Manufacturer of the platform. Example values include Cumulus Express, Dell, EdgeCore, Lenovo, Mellanox.
    • Time: Date and time the data was collected from device.
    All Sessions tab: Displays all OSPF sessions network-wide. By default, the session list is sorted by hostname. This tab provides the following additional data about each session:
    • Area: Routing domain for this host device. Example values include 0.0.0.1, 0.0.0.23.
    • Ifname: Name of the interface on host device where session resides. Example values include swp5, peerlink-1.
    • Is IPv6: Indicates whether the address of the host device is IPv6 (true) or IPv4 (false)
    • Peer
      • Address: IPv4 or IPv6 address of the peer device
      • Hostname: User-defined name for peer device
      • ID: Network subnet address of router with access to the peer device
    • State: Current state of OSPF. Values include Full, 2-way, Attempt, Down, Exchange, Exstart, Init, and Loading.
    • Timestamp: Date and time session was started, deleted, updated or marked dead (device is down)
    All Alarms tab: Displays all OSPF events network-wide. By default, the event list is sorted by time, with the most recent events listed first. The tab provides the following additional data about each event:
    • Message: Text description of an OSPF-related event. Example: swp4 area ID mismatch with peer leaf02
    • Source: Hostname of network device that generated the event
    • Severity: Importance of the event. Values include critical, warning, info, and debug.
    • Type: Network protocol or service generating the event. This always has a value of OSPF in this card workflow.
    Table Actions: Select, export, or filter the list. Refer to Table Settings.

    View Service Status Summary

    A summary of the OSPF service is available from the Network Services card workflow, including the number of nodes running the service, the number of OSPF-related alarms, and a distribution of those alarms.

    To view the summary, open the small OSPF Service card.

    For more detail, select a different size OSPF Service card.
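
    The CLI offers a similar summary. A minimal sketch, assuming the netq show ospf command available in this release; verify the exact options with netq help:

      cumulus@switch:~$ netq show ospf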

    View the Distribution of Sessions

    It is useful to know the number of network nodes running the OSPF protocol over a period of time, as it gives you insight into the amount of traffic associated with and breadth of use of the protocol. It is also useful to view the health of the sessions.

    To view these distributions, open the medium OSPF Service card.

    You can dig a little deeper with the large OSPF Service card tabs.

    View Devices with the Most OSPF Sessions

    You can view the load from OSPF on your switches and hosts using the large Network Services card. This data enables you to see which switches are handling the most OSPF traffic currently, validate that this is what is expected based on your network design, and compare it with data from an earlier time to look for any differences.

    To view switches and hosts with the most OSPF sessions:

    1. Open the large OSPF Service card.

    2. Select Switches with Most Sessions from the filter above the table.

      The table content is sorted by this characteristic, listing nodes running the most OSPF sessions at the top. Scroll down to view those with the fewest sessions.

    To compare this data with the same data at a previous time:

    1. Open another large OSPF Service card.

    2. Move the new card next to the original card if needed.

    3. Change the time period for the data on the new card by hovering over the card and clicking .

    4. Select the time period that you want to compare with the original time. We chose Past Week for this example.

      You can now see whether there are significant differences between this time and the original time. If the changes are unexpected, you can investigate further by looking at another timeframe, determining if more nodes are now running OSPF than previously, looking for changes in the topology, and so forth.

    View Devices with the Most Unestablished OSPF Sessions

    You can identify switches and hosts that are experiencing difficulties establishing OSPF sessions, both currently and in the past.

    To view switches with the most unestablished OSPF sessions:

    1. Open the large OSPF Service card.

    2. Select Switches with Most Unestablished Sessions from the filter above the table.

      The table content is sorted by this characteristic, listing nodes with the most unestablished OSPF sessions at the top. Scroll down to view those with the fewest unestablished sessions.

    Where to go next depends on what data you see, but a couple of options include:

    View Devices with the Most OSPF Alarms

    Switches or hosts experiencing a large number of OSPF alarms may indicate a configuration or performance issue that needs further investigation. You can view the devices sorted by the number of OSPF alarms and then use the Switches card workflow or the Alarms card workflow to gather more information about possible causes for the alarms. You can also compare the number of nodes running OSPF with unestablished sessions against the alarms present at the same time to determine whether there is any correlation between the issues and the ability to establish an OSPF session.

    To view switches with the most OSPF alarms:

    1. Open the large OSPF Service card.

    2. Hover over the header and click .

    3. Select Switches with Most Alarms from the filter above the table.

      The table content is sorted by this characteristic, listing nodes with the most OSPF alarms at the top. Scroll down to view those with the fewest alarms.

    Where to go next depends on what data you see, but a few options include:

    View All OSPF Events

    The OSPF Network Services card workflow enables you to view all of the OSPF events in the designated time period.

    To view all OSPF events:

    1. Open the full screen OSPF Service card.

    2. Click the All Alarms tab in the navigation panel. By default, events are listed from most recent to least recent.
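
    You can also list OSPF events from the CLI. A minimal sketch, assuming the same netq show events filtering options noted earlier:

      cumulus@switch:~$ netq show events type ospf between now and 24h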

    Where to go next depends on what data you see, but a couple of options include:

    View Details for All Devices Running OSPF

    You can view all stored attributes of all switches and hosts running OSPF in your network in the full screen card.

    To view all device details, open the full screen OSPF Service card and click the All Switches tab.

    To return to your workbench, click in the top right corner.

    View Details for All OSPF Sessions

    You can view all stored attributes of all OSPF sessions in your network in the full-screen card.

    To view all session details, open the full screen OSPF Service card and click the All Sessions tab.

    To return to your workbench, click in the top right corner.

    Use the icons above the table to select/deselect, filter, and export items in the list. Refer to Table Settings for more detail. To return to original display of results, click the associated tab.

    Monitor a Single OSPF Session

    With NetQ, you can monitor a single session of the OSPF service, view session state changes, and compare with alarms occurring at the same time, as well as monitor the running OSPF configuration and changes to the configuration file. For an overview and how to configure OSPF to run in your data center network, refer to Open Shortest Path First - OSPF or Open Shortest Path First v3 - OSPFv3.

    To access the single session cards, you must open the full screen OSPF Service card, click the All Sessions tab, select the desired session, then click (Open Cards).

    Granularity of Data Shown Based on Time Period

    On the medium and large single OSPF session cards, the status of the sessions is represented in heat maps stacked vertically; one for established sessions, and one for unestablished sessions. Depending on the time period of data on the card, the number of smaller time blocks used to indicate the status varies. A vertical stack of time blocks, one from each map, includes the results from all checks during that time. The results are shown by how saturated the color is for each block. If all sessions during that time period were established for the entire time block, then the top block is 100% saturated (white) and the not established block is zero percent saturated (gray). As sessions that are not established increase in saturation, the sessions that are established block is proportionally reduced in saturation. An example heat map for a time period of 24 hours is shown here with the most common time periods in the table showing the resulting time blocks.

    Time Period | Number of Runs | Number of Time Blocks | Amount of Time in Each Block
    6 hours | 18 | 6 | 1 hour
    12 hours | 36 | 12 | 1 hour
    24 hours | 72 | 24 | 1 hour
    1 week | 504 | 7 | 1 day
    1 month | 2,086 | 30 | 1 day
    1 quarter | 7,000 | 13 | 1 week

    OSPF Session Card Workflow Summary

    The small OSPF Session card displays:

    Indicates data is for a single session of a Network Service or Protocol
    Title: OSPF Session
    Hostnames of the two devices in a session. Host appears on top with peer below.
    Current state of OSPF: Full or 2-way, Attempt, Down, Exchange, Exstart, Init, or Loading.

    The medium OSPF Session card displays:

    Time period: Range of time in which the displayed data was collected; applies to all card sizes
    Indicates data is for a single session of a Network Service or Protocol
    Title: Network Services | OSPF Session
    Hostnames of the two devices in a session. Host appears on top with peer below.
    Current state of OSPF: Full or 2-way, Attempt, Down, Exchange, Exstart, Init, or Loading.
    Time period for chart: Time period for the chart data
    Session State Changes Chart: Heat map of the state of the given session over the given time period. The status is sampled at a rate consistent with the time period. For example, for a 24 hour period, a status is collected every hour. Refer to Granularity of Data Shown Based on Time Period.
    Ifname: Interface name on, or hostname for, the host device where the session resides
    Peer Address: IP address of the peer device
    Peer ID: IP address of the router with access to the peer device

    The large OSPF Session card contains two tabs.

    The Session Summary tab displays:

    Time period: Range of time in which the displayed data was collected; applies to all card sizes
    Indicates data is for a single session of a Network Service or Protocol
    Title: Session Summary (Network Services | OSPF Session)
    Summary bar

    Hostnames of the two devices in a session. Arrow points in the direction of the session.

    Current state of OSPF: Full or 2-way, Attempt, Down, Exchange, Exstart, Init, or Loading.

    Session State Changes Chart: Heat map of the state of the given session over the given time period. The status is sampled at a rate consistent with the time period. For example, for a 24 hour period, a status is collected every hour. Refer to Granularity of Data Shown Based on Time Period.
    Alarm Count Chart: Distribution and count of OSPF alarm events over the given time period
    Info Count Chart: Distribution and count of OSPF info events over the given time period
    Ifname: Name of the interface on the host device where the session resides
    State: Current state of OSPF. Values include Full or 2-way, Attempt, Down, Exchange, Exstart, Init, and Loading.
    Is Unnumbered: Indicates if the session is part of an unnumbered OSPF configuration (true) or a numbered OSPF configuration (false)
    Nbr Count: Number of routers in the OSPF configuration
    Is Passive: Indicates if the host is in a passive state (true) or an active state (false)
    Peer ID: IP address of the router with access to the peer device
    Is IPv6: Indicates if the IP address of the host device is IPv6 (true) or IPv4 (false)
    If Up: Indicates if the interface on the host is up (true) or down (false)
    Nbr Adj Count: Number of adjacent routers for this host
    MTU: Maximum transmission unit (MTU) on the shortest path between the host and peer
    Peer Address: IP address of the peer device
    Area: Routing domain of the host device
    Network Type: Architectural design of the network. Values include Point-to-Point and Broadcast.
    Cost: Shortest path cost through the network between the host and peer devices
    Dead Time: Countdown timer, starting at 40 seconds, that is constantly reset as messages are heard from the neighbor. If the dead time reaches zero, the neighbor is presumed dead, the adjacency is torn down, and the link is removed from SPF calculations in the OSPF database.

    The Configuration File Evolution tab displays:

    Time period: Range of time in which the displayed data was collected; applies to all card sizes
    Indicates configuration file information for a single session of a Network Service or Protocol
    Title: (Network Services | OSPF Session) Configuration File Evolution
    Device identifiers (hostname, IP address, or MAC address) for host and peer in session. Arrow points from the host to the peer. Click to open associated device card.
    Current state of OSPF: Full or 2-way, Attempt, Down, Exchange, Exstart, Init, or Loading.
    Timestamps: When changes to the configuration file have occurred, the date and time are indicated. Click the time to see the changed file.
    Configuration File

    When File is selected, the configuration file as it was at the selected time is shown.

    When Diff is selected, the configuration file at the selected time is shown on the left and the configuration file at the previous timestamp is shown on the right. Differences are highlighted.

    The full screen OSPF Session card provides tabs for all OSPF sessions and all events.

    Title: Network Services | OSPF
    Closes full screen card and returns to workbench
    Time period: Range of time in which the displayed data was collected; applies to all card sizes; select an alternate time period by clicking
    Displays data refresh status. Click to pause data refresh. Click to resume data refresh. Current refresh rate is visible by hovering over icon.
    Results: Number of results found for the selected tab
    All OSPF Sessions tab: Displays all OSPF sessions running on the host device. The session list is sorted by hostname by default. This tab provides the following additional data about each session:
    • Area: Routing domain for this host device. Example values include 0.0.0.1, 0.0.0.23.
    • Ifname: Name of the interface on host device where session resides. Example values include swp5, peerlink-1.
    • Is IPv6: Indicates whether the address of the host device is IPv6 (true) or IPv4 (false)
    • Peer
      • Address: IPv4 or IPv6 address of the peer device
      • Hostname: User-defined name for peer device
      • ID: Network subnet address of router with access to the peer device
    • State: Current state of OSPF. Values include Full, 2-way, Attempt, Down, Exchange, Exstart, Init, and Loading.
    • Timestamp: Date and time session was started, deleted, updated or marked dead (device is down)
    All Events tab: Displays all events network-wide. By default, the event list is sorted by time, with the most recent events listed first. The tab provides the following additional data about each event:
    • Message: Text description of an OSPF-related event. Example: OSPF session with peer tor-1 swp7 vrf default state changed from failed to Established
    • Source: Hostname of network device that generated the event
    • Severity: Importance of the event. Values include critical, warning, info, and debug.
    • Type: Network protocol or service generating the event. This always has a value of OSPF in this card workflow.
    Table Actions: Select, export, or filter the list. Refer to Table Settings.

    View Session Status Summary

    A summary of the OSPF session is available from the OSPF Session card workflow, showing the node and its peer and current status.

    To view the summary:

    1. Add the Network Services | All OSPF Sessions card.

    2. Switch to the full screen card.

    3. Click the All Sessions tab.

    4. Double-click the session of interest. The full screen card closes automatically.

    5. Optionally, switch to the small OSPF Session card.

    View OSPF Session State Changes

    You can view the state of a given OSPF session from the medium and large OSPF Session Network Service cards. For a given time period, you can determine the stability of the OSPF session between two devices. If you experienced connectivity issues at a particular time, you can use these cards to help verify the state of the session. If it was not established more than it was established, you can then investigate further into possible causes.

    To view the state transitions for a given OSPF session, on the medium OSPF Session card:

    1. Add the Network Services | All OSPF Sessions card.

    2. Switch to the full screen card.

    3. Open the large OSPF Service card.

    4. Click the All Sessions tab.

    5. Double-click the session of interest. The full screen card closes automatically.

    The heat map indicates the status of the session over the designated time period. In this example, the session has been established for the entire time period.

    From this card, you can also view the interface name, peer address, and peer id identifying the session in more detail.

    To view the state transitions for a given OSPF session on the large OSPF Session card, follow the same steps to open the medium OSPF Session card and then switch to the large card.

    From this card, you can view the alarm and info event counts, interface name, peer address and peer id, state, and several other parameters identifying the session in more detail.
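
    These details are also available from the CLI on a per-device basis. A minimal sketch, assuming the hostname-scoped form of the command; leaf01 is a placeholder hostname:

      cumulus@switch:~$ netq leaf01 show ospf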

    View Changes to the OSPF Service Configuration File

    Each time a change is made to the configuration file for the OSPF service, NetQ logs the change and enables you to compare it with the last version. This can be useful when you are troubleshooting potential causes for alarms or sessions losing their connections.

    To view the configuration file changes:

    1. Open the large OSPF Session card.

    2. Hover over the card and click to open the Configuration File Evolution tab.

    3. Select the time of interest on the left, when a change may have impacted performance. Scroll down if needed.

    4. Choose between the File view and the Diff view (selected option is dark; File by default).

      The File view displays the content of the file for you to review.

      The Diff view displays the changes between this version (on left) and the most recent version (on right) side by side. The changes are highlighted in red and green. In this example, we don’t have a change to highlight, so it shows the same file on both sides.
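
    As with MLAG, the Diff view is conceptually a side-by-side comparison of two saved versions of the routing configuration (on Cumulus Linux, OSPF is configured in /etc/frr/frr.conf under FRRouting). A hypothetical example with two snapshot files:

      cumulus@switch:~$ diff -u frr.conf.snapshot1 frr.conf.snapshot2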

    View All OSPF Session Details

    You can view all stored attributes of all of the OSPF sessions associated with the two devices on this card.

    To view all session details, open the full screen OSPF Session card, and click the All OSPF Sessions tab.

    To return to your workbench, click in the top right corner.

    View All Events

    You can view all of the alarm and info events for the two devices on this card.

    To view all events, open the full screen OSPF Session card, and click the All Events tab.

    To return to your workbench, click in the top right corner.

    Monitor Network Connectivity

    It is helpful to verify that communications are flowing freely between the various devices in your network. You can verify the connectivity between two devices both in an ad hoc fashion and by defining connectivity checks to occur on a scheduled basis. Three card workflows enable you to view connectivity: the Trace Request, On-demand Trace Results, and Scheduled Trace Results cards.

    Create a Trace Request

    Two types of connectivity checks can be run: an immediate (on-demand) trace and a scheduled trace. The Trace Request card workflow is used to configure and run both of these trace types.

    Trace Request Card Workflow Summary

    The small Trace Request card displays:

    Indicates a trace request
    Select Trace list: Select a scheduled trace request from the list
    Go: Click to start the trace now

    The medium Trace Request card displays:

    Indicates a trace request
    Title: New Trace Request
    New Trace Request: Create a new layer 3 trace request. Use the large Trace Request card to create a new layer 2 or 3 request.
    Source: (Required) Hostname or IP address of the device where the trace begins
    Destination: (Required) IP address of the device where the trace ends
    Run Now: Start the trace now

    The large Trace Request card displays:

    Indicates a trace request
    Title: New Trace Request
    Trace selection: Leave New Trace Request selected to create a new request, or choose a scheduled request from the list.
    Source: (Required) Hostname or IP address of the device where the trace begins.
    Destination: (Required) Ending point for the trace. For layer 2 traces, the value must be a MAC address. For layer 3 traces, the value must be an IP address.
    VRF: Optional for layer 3 traces. Virtual Route Forwarding (VRF) interface to be used as part of the trace path.
    VLAN ID: Required for layer 2 traces. Virtual LAN to be used as part of the trace path.
    Schedule: Sets the frequency with which to run a new trace (Run every) and when to start the trace for the first time (Starting).
    Run Now: Start the trace now
    Update: Update is available when a scheduled trace request is selected from the dropdown list and you make a change to its configuration. Clicking Update saves the changes to the existing scheduled trace.
    Save As New: Save As New is available in two instances:
    • When you enter a source, destination, and schedule for a new trace. Clicking Save As New in this instance saves the new scheduled trace.
    • When changes are made to a selected scheduled trace request. Clicking Save As New in this instance saves the modified scheduled trace without changing the original trace on which it was based.

    The full screen Trace Request card displays:

    Title: Trace Request
    Closes full screen card and returns to workbench
    Time period: Range of time in which the displayed data was collected; applies to all card sizes; select an alternate time period by clicking
    Results: Number of results found for the selected tab
    Schedule Preview tab: Displays all scheduled trace requests for the given user. By default, the listing is sorted by Start Time, with the most recently started traces listed at the top. The tab provides the following additional data about each event:
    • Action: Indicates latest action taken on the trace job. Values include Add, Deleted, Update.
    • Frequency: How often the trace is scheduled to run
    • Active: Indicates if trace is actively running (true), or stopped from running (false)
    • ID: Internal system identifier for the trace job
    • Trace Name: User-defined name for a trace
    • Trace Params: Indicates source and destination, optional VLAN or VRF specified, and whether to alert on failure
    Table Actions: Select, export, or filter the list. Refer to Table Settings.

    Create a Layer 3 On-demand Trace Request

    It is helpful to verify the connectivity between two devices when you suspect an issue is preventing proper communication between them. If you cannot find a path through layer 3, you might also try checking connectivity through a layer 2 path.

    To create a layer 3 trace request:

    1. Open the medium Trace Request card.

    2. In the Source field, enter the hostname or IP address of the device where you want to start the trace.

    3. In the Destination field, enter the IP address of the device where you want to end the trace.

      In this example, we are starting our trace at server02 and ending it at 10.1.3.103.

      If you mistype an address, you must double-click it, or backspace over the error, and retype the address. You cannot select the address by dragging over it as this action attempts to move the card to another location.

    4. Click Run Now. A corresponding Trace Results card is opened on your workbench. Refer to View Layer 3 Trace Results for details.
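
    The NetQ CLI can run the equivalent on-demand trace. A minimal sketch using the same example values, assuming the netq trace command form in this release:

      cumulus@switch:~$ netq trace 10.1.3.103 from server02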

    Create a Layer 3 Trace Through a Given VRF

    If you want to guide a trace through a particular VRF interface, you can do so using the large New Trace Request card.

    To create the trace request:

    1. Open the large Trace Request card.

    2. In the Source field, enter the hostname or IP address of the device where you want to start the trace.

    3. In the Destination field, enter the IP address of the device where you want to end the trace.

    4. In the VRF field, enter the identifier for the VRF interface you want to use.

      In this example, we are starting our trace at leaf01 and ending it at 10.1.3.103 using VRF vrf1.

    5. Click Run Now. A corresponding Trace Results card is opened on your workbench. Refer to View Layer 3 Trace Results for details.
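
    From the NetQ CLI, the equivalent VRF-scoped trace for this example would look like the following (the prompt hostname is illustrative):

      cumulus@host:~$ netq trace 10.1.3.103 from leaf01 vrf vrf1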

    Create a Layer 2 Trace

    It is helpful to verify the connectivity between two devices when you suspect an issue is preventing proper communication between them. If you cannot find a layer 2 path, you might also try checking connectivity through a layer 3 path.

    To create a layer 2 trace request:

    1. Open the large Trace Request card.

    2. In the Source field, enter the hostname or IP address of the device where you want to start the trace.

    3. In the Destination field, enter the MAC address for where you want to end the trace.

    4. In the VLAN ID field, enter the identifier for the VLAN you want to use.

      In this example, we are starting our trace at leaf01 and ending it at 00:03:00:33:33:01 using VLAN 13.

    5. Click Run Now. A corresponding Trace Results card is opened on your workbench. Refer to View Layer 2 Trace Results for details.
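
    In the NetQ CLI, a layer 2 trace takes the destination MAC address and the VLAN. The equivalent of this example would be (the prompt hostname is illustrative):

      cumulus@host:~$ netq trace 00:03:00:33:33:01 vlan 13 from leaf01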

    Create a Trace to Run on a Regular Basis (Scheduled Trace)

    There may be paths through your network that are critical to everyday operations or otherwise particularly important. In that case, it can be useful to create one or more traces that periodically confirm at least one path is available between the relevant devices. You can schedule a trace request from the large Trace Request card.

    To schedule a trace:

    1. Open the large Trace Request card.

    2. In the Source field, enter the hostname or IP address of the device where you want to start the trace.

    3. In the Destination field, enter the MAC address (layer 2) or IP address (layer 3) of the device where you want to end the trace.

    4. Optionally, enter a VLAN ID (layer 2) or VRF interface (layer 3).

    5. Select a timeframe under Schedule to specify how often you want to run the trace.

    6. Accept the default starting time, or click in the Starting field to specify the day you want the trace to run for the first time.

    7. Click Next.

    8. Click the time you want the trace to run for the first time.

    9. Click OK.

    10. Verify your entries are correct, then click Save As New.

    11. Provide a name for the trace. Note: This name must be unique for a given user.

    12. Click Save. You can now run this trace on demand by selecting it from the dropdown list, or wait for it to run on its defined schedule.

    Run a Scheduled Trace on Demand

    You may find that, although you have a schedule for a particular trace, you want to have visibility into the connectivity data now. You can run a scheduled trace on demand from the small, medium and large Trace Request cards.

    To run a scheduled trace now:

    1. Open the small, medium, or large Trace Request card.

    2. Select the scheduled trace from the Select Trace or New Trace Request list. Note: In the medium and large cards, the trace details are filled in on selection of the scheduled trace.

    3. Click Go or Run Now. A corresponding Trace Results card is opened on your workbench.

    View On-demand Trace Results

    Once you have started an on-demand trace, the results are displayed in the medium Trace Results card by default. You may view the results in more or less detail by switching to the large or small Trace Results card, respectively.

    On-demand Trace Results Card Workflow Summary

    The small On-demand Trace Results card displays:

    Icon: Indicates an on-demand trace result
    Source/Destination: Source and destination of the trace, identified by their address or hostname. The source is listed on top with an arrow pointing to the destination.
    Status icons: Indicate success or failure of the trace request. A successful result implies all paths were successful without any warnings or failures. A failure result implies there was at least one path with warnings or errors.

    The medium On-demand Trace Results card displays:

    Icon: Indicates an on-demand trace result
    Title: On-demand Trace Result
    Source/Destination: Source and destination of the trace, identified by their address or hostname. The source is listed on top with an arrow pointing to the destination.
    Status icons: Indicate success or failure of the trace request. A successful result implies all paths were successful without any warnings or failures. A failure result implies there was at least one path with warnings or errors.
    Total Paths Found: Number of paths found between the two devices
    MTU Overall: Average size of the maximum transmission unit for all paths
    Minimum Hops: Smallest number of hops along a path between the devices
    Maximum Hops: Largest number of hops along a path between the devices

    The large On-demand Trace Results card contains two tabs.

    The On-demand Trace Result tab displays:

    Icon: Indicates an on-demand trace result
    Title: On-demand Trace Result
    Status icons: Indicate success or failure of the trace request. A successful result implies all paths were successful without any warnings or failures. A failure result implies there was at least one path with warnings or errors.
    Source/Destination: Source and destination of the trace, identified by their address or hostname. The source is listed on top with an arrow pointing to the destination.
    Distribution by Hops chart: Displays the distribution of hop counts across the available paths
    Distribution by MTU chart: Displays the distribution of MTUs on the interfaces used in the available paths
    Table: Provides detailed path information, sorted by the route identifier, including:
    • Route ID: Identifier of each path
    • MTU: Average maximum transmission unit of the interfaces used
    • Hops: Number of hops to get from the source to the destination device
    • Warnings: Number of warnings encountered during the trace on a given path
    • Errors: Number of errors encountered during the trace on a given path
    Total Paths Found: Number of paths found between the two devices
    MTU Overall: Average size of the maximum transmission unit for all paths
    Minimum Hops: Smallest number of hops along a path between the devices

    The On-demand Trace Settings tab displays:

    Icon: Indicates on-demand trace settings
    Title: On-demand Trace Settings
    Source: Starting point for the trace
    Destination: Ending point for the trace
    Schedule: Does not apply to on-demand traces
    VRF: Associated virtual route forwarding interface, when used with layer 3 traces
    VLAN: Associated virtual local area network, when used with layer 2 traces
    Job ID: Identifier of the job; used internally
    Re-run Trace: Clicking this button runs the trace again

    The full screen On-demand Trace Results card displays:

    Title: On-demand Trace Results
    Close icon: Closes the full-screen card and returns to the workbench
    Time period: Range of time in which the displayed data was collected; applies to all card sizes. Click to select an alternate time period.
    Results: Number of results found for the selected tab
    Trace Results tab: Provides detailed path information, sorted by Resolution Time (the date and time the results completed), including:
    • SRC.IP: Source IP address
    • DST.IP: Destination IP address
    • Max Hop Count: Largest number of hops along a path between the devices
    • Min Hop Count: Smallest number of hops along a path between the devices
    • Total Paths: Number of paths found between the two devices
    • PMTU: Average size of the maximum transmission unit for all interfaces along the paths
    • Errors: Message provided for analysis when a trace fails
    Table Actions: Select, export, or filter the list. Refer to Table Settings.
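
    If you want these result fields in machine-readable form rather than exporting them from the table, the CLI trace command accepts a json option. For example, re-running the earlier layer 3 trace (the prompt hostname is illustrative):

      cumulus@host:~$ netq trace 10.1.3.103 from server02 json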

    View Layer 2 Trace Results

    When you start the trace, the corresponding results card opens on your workbench. While the trace is in progress, the card displays a notice indicating that it is running.

    Once the job is completed, the results are displayed.

    In this example, we see that the trace was successful. Four paths were found between the devices, each with four hops and an overall MTU of 1500. If there were a difference between the minimum and maximum number of hops, or any warnings or failures, viewing the results on the large card would provide additional information.

    In our example, we can verify that every path option had four hops since the distribution chart only shows one hop count and the table indicates each path had a value of four hops. Similarly, you can view the MTU data. If there had been any warnings, the count would have been visible in the table.

    View Layer 3 Trace Results

    When you start the trace, the corresponding results card opens on your workbench. While the trace is in progress, the card displays a notice indicating that it is running.

    Once results are obtained, the card displays them. Using our earlier example, the following results are shown:

    In this example, we see that the trace was successful. Six paths were found between the devices, each with five hops and an overall MTU of 1500. If there were a difference between the minimum and maximum number of hops, or any warnings or failures, viewing the results on the large card would provide additional information.

    In our example, we can verify that every path option had five hops since the distribution chart only shows one hop count and the table indicates each path had a value of five hops. Similarly, you can view the MTU data. If there had been any warnings, the count would have been visible in the table.

    View Detailed On-demand Trace Results

    After the trace request has completed, the results are available in the corresponding medium Trace Results card.

    To view more detail:

    1. Open the full-screen Trace Results card for the trace of interest.

    2. Double-click on the trace of interest to open the detail view.

      The tabular view enables you to walk through the trace path, host by host, viewing the interfaces, ports, tunnels, VLANs, and so forth used to traverse the network from the source to the destination.

    3. If the trace was run on a Mellanox switch and drops were detected by the What Just Happened feature, they are identified above the path. Click the down arrow to view the list of drops and their details. Click the up arrow to close the list.

    View Scheduled Trace Results

    You can view the results of scheduled traces at any time. Results are displayed on the Scheduled Trace Results cards.

    Scheduled Trace Results Card Workflow Summary

    The small Scheduled Trace Results card displays:

    Icon: Indicates a scheduled trace result
    Source/Destination: Source and destination of the trace, identified by their address or hostname. The source is listed on the left with an arrow pointing to the destination.
    Results: Summary of trace results. A successful result implies all paths were successful without any warnings or failures; a failure result implies there was at least one path with warnings or errors. The summary includes:
    • Number of trace runs completed in the designated time period
    • Number of runs with warnings
    • Number of runs with errors

    The medium Scheduled Trace Results card displays:

    Time period: Range of time in which the displayed data was collected; applies to all card sizes
    Icon: Indicates a scheduled trace result
    Title: Scheduled Trace Result
    Summary: Name of the scheduled trace and a summary of its results. A successful result implies all paths were successful without any warnings or failures; a failure result implies there was at least one path with warnings or errors. The summary includes:
    • Number of trace runs completed in the designated time period
    • Number of runs with warnings
    • Number of runs with errors
    Charts:
    • Heat map: A time-segmented view of the results. For each time segment, the color represents the percentage of warning and failed results. Refer to Granularity of Data Shown Based on Time Period for details on how to interpret the results.
    • Unique Bad Nodes: Distribution of unique nodes that generated the indicated warnings and/or failures

    The large Scheduled Trace Results card contains two tabs.

    The Results tab displays:

    Time period: Range of time in which the displayed data was collected; applies to all card sizes
    Icon: Indicates a scheduled trace result
    Title: Scheduled Trace Result
    Summary: Name of the scheduled trace and a summary of its results. A successful result implies all paths were successful without any warnings or failures; a failure result implies there was at least one path with warnings or errors. The summary includes:
    • Number of trace runs completed in the designated time period
    • Number of runs with warnings
    • Number of runs with errors
    Charts:
    • Heat map: A time-segmented view of the results. For each time segment, the color represents the percentage of warning and failed results. Refer to Granularity of Data Shown Based on Time Period for details on how to interpret the results.
    • Small charts: Display counts for each item during the same time period, for the purpose of correlating with the warnings and errors shown in the heat map.
    Table/Filter options:
    • When the Failures filter option is selected, the table displays the failure messages received for each run.
    • When the Paths filter option is selected, the table displays all of the paths tried during each run.
    • When the Warning filter option is selected, the table displays the warning messages received for each run.

    The Configuration tab displays: