NVIDIA Cumulus Linux

NVIDIA NetQ 4.4 User Guide

NVIDIA® NetQ™ is a scalable, modern network operations tool set that provides visibility into your overlay and underlay networks, enabling troubleshooting in real-time. NetQ delivers data and statistics about the health of your data center—from the container, virtual machine, or host, all the way to the switch and port. NetQ correlates configuration and operational status, and tracks state changes while simplifying management for the entire Linux-based data center. With NetQ, network operations change from a manual, reactive, node-by-node approach to an automated, informed, and agile one. Visit Network Operations with NetQ to learn more.

This user guide provides documentation for network administrators who are responsible for deploying, configuring, monitoring, and troubleshooting the network in their data center or campus environment.

For a list of the new features in this release, see What's New. For bug fixes and known issues, refer to the release notes.

What's New

This page summarizes new features and improvements for the NetQ 4.4 release. For a complete list of open and fixed issues, see the release notes.

What’s New in NetQ 4.4.1

This release includes important security updates. NVIDIA recommends upgrading to this release to improve software security and reliability.

What’s New in NetQ 4.4.0

This release includes several performance and infrastructure improvements that make NetQ faster and more reliable. It also includes extensive security enhancements and bug fixes. NVIDIA recommends upgrading to this release to improve software security and reliability. Additional updates include:

Command line updates:

User guide updates:

Upgrade Paths

You can upgrade to NetQ 4.4 directly from versions 4.1.0 or later. Upgrades from releases earlier than NetQ 4.1.0 require a fresh installation or an incremental upgrade to version 4.1.0 first.

NetQ no longer supports the Admin UI for installation and upgrades. Follow the updated instructions according to your deployment model.

Compatible Agent Versions

NetQ 4.4 is compatible with NetQ Agent versions 4.3.0 and above. You can install NetQ Agents on switches and servers running:

NetQ Overview

This section describes NetQ components and deployment models. It also outlines how to get started with the NetQ user interface and command line.

NetQ Basics

This section provides an overview of the NetQ hardware, software, and deployment models.

NetQ Components

NetQ contains the following applications and key components:

While these functions apply to both the on-premises and cloud solutions, they are configured differently, as shown in the following diagrams.

diagram of NetQ on-premises configuration
diagram of NetQ cloud configuration

NetQ Agents

NetQ Agents are software that runs on every monitored node in the network—including Cumulus® Linux® switches, Linux bare metal hosts, and virtual machines. The NetQ Agents push network data regularly, and event information immediately, to the NetQ Platform.

Switch Agents

The NetQ Agents running on Cumulus Linux or SONiC switches gather the following network data via Netlink:

for the following protocols:

Host Agents

The NetQ Agents running on hosts gather the same information as that for switches, plus the following network data:

The NetQ Agent obtains container information by listening to the Kubernetes orchestration tool.

The NetQ Agent is supported on hosts running Ubuntu 18.04, Red Hat® Enterprise Linux 7, and CentOS 7.

NetQ Core

The NetQ core performs the data collection, storage, and processing for delivery to various user interfaces. It consists of a collection of scalable components running entirely within a single server. The NetQ software queries this server, rather than individual devices, enabling greater system scalability.

Data Aggregation

The data aggregation component collects data coming from all of the NetQ Agents. It then filters, compresses, and forwards the data to the streaming component. The server monitors for missing messages and also monitors the NetQ Agents themselves, sending notifications about events when appropriate. In addition to the telemetry data collected from the NetQ Agents, the aggregation component collects information from the switches and hosts, such as vendor, model, version, and basic operational state.

Data Stores

NetQ uses two types of data stores. The first stores the raw data, data aggregations, and discrete events needed for quick response to data requests. The second stores data based on correlations, transformations, and raw-data processing.

Real-time Streaming

The streaming component processes the incoming raw data from the aggregation server in real time. It reads the metrics and stores them as a time series, and triggers alarms based on anomaly detection, thresholds, and events.

Network Services

The network services component monitors the operation of protocols and services, both individually and networkwide, and stores status details.

User Interfaces

NetQ data is available through several interfaces:

The CLI and UI query the RESTful API to present data. NetQ can integrate with event notification applications and third-party analytics tools.

Data Center Network Deployments

This section describes three common data center deployment types for network management:

NetQ operates over layer 3, and can operate in both layer 2 bridged and layer 3 routed environments. NVIDIA recommends a layer 3 routed environment whenever possible.

Out-of-band Management Deployment

NVIDIA recommends deploying NetQ on an out-of-band (OOB) management network to separate network management traffic from standard network data traffic.

The physical network hardware includes:

The following figure shows an example of a Clos network fabric design for a data center, with an OOB management network (where NetQ resides) overlaid on top. The physical connections (shown as gray lines) run between Spine 01 and four Leaf devices and two Exit devices, and between Spine 02 and the same four Leaf devices and two Exit devices. Leaf 01 and Leaf 02 connect to each other over a peerlink and act as an MLAG pair for Server 01 and Server 02. Leaf 03 and Leaf 04 connect to each other over a peerlink and act as an MLAG pair for Server 03 and Server 04. The Edge connects to both Exit devices, and the Internet node connects to Exit 01.

diagram of a Clos network displaying connections between spine switches, leafs, servers, and exit switches.

The physical management hardware includes:

These switches connect to each physical network device through a virtual network overlay, shown with purple lines.

diagram displaying connections between physical network hardware and physical management hardware with a virtual network overlay

In-band Management Deployment

While not recommended, you can implement NetQ within your data network. In this scenario, there is no overlay and all traffic to and from the NetQ Agents and the NetQ Platform traverses the data paths along with your regular network traffic. The roles of the switches in the Clos network are the same, except that the NetQ Platform performs the aggregation function that the OOB management switch performed. If your network goes down, you might not have access to the NetQ Platform for troubleshooting.

diagram of an in-band management deployment. The NetQ Platform interacts with one border leaf.

High Availability Deployment

NetQ supports a high availability deployment for users who prefer a solution in which the collected data and processing provided by the NetQ Platform remain available through alternate equipment should the platform fail for any reason. In this configuration, three NetQ Platforms are deployed, with one as the master and two as workers (or replicas). NetQ Agents send data to all three servers so that if the master NetQ Platform fails, one of the replicas automatically becomes the master and continues to store and provide the telemetry data. The following example is based on an OOB-management configuration, modified to support high availability for NetQ.

diagram of a high availability deployment with one master and two worker NetQ platforms.

NetQ Operation

In either in-band or out-of-band deployments, NetQ offers networkwide configuration and device management, proactive monitoring capabilities, and network performance diagnostics.

The NetQ Agent

From a software perspective, a network switch has software associated with the hardware platform, the operating system, and communications. For data centers, the software on a network switch is similar to the following diagram:

diagram illustrating how the NetQ Agent interacts with a switch or host.

The NetQ Agent interacts with the various components and software on switches and hosts and provides the gathered information to the NetQ Platform. You can view the data using the NetQ CLI or UI.

The NetQ Agent polls the user space applications for information about the performance of the various routing protocols and services that are running on the switch. Cumulus Linux supports BGP and OSPF routing protocols as well as static addressing through FRRouting (FRR). Cumulus Linux also supports LLDP and MSTP among other protocols, and a variety of services such as systemd and sensors. SONiC supports BGP and LLDP.

For hosts, the NetQ Agent also polls for performance of containers managed with Kubernetes. This information is used to calculate the network’s health and check if the network is configured and operating correctly.

The NetQ Agent interacts with the Netlink communications between the Linux kernel and the user space, listening for changes to the network state, configurations, routes, and MAC addresses. NetQ sends notifications about these changes so that network operators and administrators can respond quickly when changes are not expected or favorable.

The NetQ Agent also interacts with the hardware platform to obtain performance information about various physical components on the switch, such as fans and power supplies. The agent measures operational states and temperatures, and collects cabling information, to allow for proactive maintenance.

The NetQ Platform

After the collected data is sent to and stored in the NetQ database, you can:

Validate Configurations

The NetQ CLI lets you validate your network’s health through two sets of commands: netq check and netq show. These commands extract information from the network service component and the event service. The network service component continually validates the connectivity and configuration of the devices and protocols running on the network. The netq check and netq show commands display the status of the various components and services networkwide and across the complete software stack. netq check and netq show commands are available for the following components and services:

Component or Service  Check  Show   Component or Service  Check  Show
Agents                X      X      LLDP                         X
BGP                   X      X      MACs                         X
MLAG (CLAG)           X      X      MTU                   X
Events                       X      NTP                   X      X
EVPN                  X      X      OSPF                  X      X
Interfaces            X      X      Sensors               X      X
Inventory                    X      Services                     X
IPv4/v6                      X      VLAN                  X      X
Kubernetes                   X      VXLAN                 X      X

Monitor Communication Paths

The trace engine validates the available communication paths between two network devices. The corresponding netq trace command enables you to view all of the paths between the two devices and whether there are any breaks in the paths. For more information about trace requests, refer to Verify Network Connectivity.
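
The following example (using addresses from the trace examples later in this guide) shows how you might trace the paths between two devices with pretty output:

cumulus@switch:~$ netq trace 10.0.0.11 from 10.0.0.14 pretty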

View Historical State and Configuration Info

You can run all check, show, and trace commands for current and past statuses. To investigate past issues, use the netq check command and look for configuration or operational issues around the time that NetQ timestamped event messages. Then use the netq show commands to view information about device configurations. You can also use the netq trace command to see what the connectivity looked like between any problematic nodes at a particular time.

For example, the following diagram shows issues on spine01, leaf04, and server03:

network diagram displaying issues on spine01, leaf04, and server03

An administrator can run the following commands from any switch in the network to determine the cause of a BGP error on spine01:

cumulus@switch:~$ netq check bgp around 30m
Total Nodes: 25, Failed Nodes: 3, Total Sessions: 220 , Failed Sessions: 24,
Hostname          VRF             Peer Name         Peer Hostname     Reason                                        Last Changed
----------------- --------------- ----------------- ----------------- --------------------------------------------- -------------------------
exit-1            DataVrf1080     swp6.2            firewall-1        BGP session with peer firewall-1 swp6.2: AFI/ 1d:2h:6m:21s
                                                                      SAFI evpn not activated on peer              
exit-1            DataVrf1080     swp7.2            firewall-2        BGP session with peer firewall-2 (swp7.2 vrf  1d:1h:59m:43s
                                                                      DataVrf1080) failed,                         
                                                                      reason: Peer not configured                  
exit-1            DataVrf1081     swp6.3            firewall-1        BGP session with peer firewall-1 swp6.3: AFI/ 1d:2h:6m:21s
                                                                      SAFI evpn not activated on peer              
exit-1            DataVrf1081     swp7.3            firewall-2        BGP session with peer firewall-2 (swp7.3 vrf  1d:1h:59m:43s
                                                                      DataVrf1081) failed,                         
                                                                      reason: Peer not configured                  
exit-1            DataVrf1082     swp6.4            firewall-1        BGP session with peer firewall-1 swp6.4: AFI/ 1d:2h:6m:21s
                                                                      SAFI evpn not activated on peer              
exit-1            DataVrf1082     swp7.4            firewall-2        BGP session with peer firewall-2 (swp7.4 vrf  1d:1h:59m:43s
                                                                      DataVrf1082) failed,                         
                                                                      reason: Peer not configured                  
exit-1            default         swp6              firewall-1        BGP session with peer firewall-1 swp6: AFI/SA 1d:2h:6m:21s
                                                                      FI evpn not activated on peer                
exit-1            default         swp7              firewall-2        BGP session with peer firewall-2 (swp7 vrf de 1d:1h:59m:43s
...
 
cumulus@switch:~$ netq exit-1 show bgp
Matching bgp records:
Hostname          Neighbor                     VRF             ASN        Peer ASN   PfxRx        Last Changed
----------------- ---------------------------- --------------- ---------- ---------- ------------ -------------------------
exit-1            swp3(spine-1)                default         655537     655435     27/24/412    Fri Feb 15 17:20:00 2019
exit-1            swp3.2(spine-1)              DataVrf1080     655537     655435     14/12/0      Fri Feb 15 17:20:00 2019
exit-1            swp3.3(spine-1)              DataVrf1081     655537     655435     14/12/0      Fri Feb 15 17:20:00 2019
exit-1            swp3.4(spine-1)              DataVrf1082     655537     655435     14/12/0      Fri Feb 15 17:20:00 2019
exit-1            swp4(spine-2)                default         655537     655435     27/24/412    Fri Feb 15 17:20:00 2019
exit-1            swp4.2(spine-2)              DataVrf1080     655537     655435     14/12/0      Fri Feb 15 17:20:00 2019
exit-1            swp4.3(spine-2)              DataVrf1081     655537     655435     14/12/0      Fri Feb 15 17:20:00 2019
exit-1            swp4.4(spine-2)              DataVrf1082     655537     655435     13/12/0      Fri Feb 15 17:20:00 2019
exit-1            swp5(spine-3)                default         655537     655435     28/24/412    Fri Feb 15 17:20:00 2019
exit-1            swp5.2(spine-3)              DataVrf1080     655537     655435     14/12/0      Fri Feb 15 17:20:00 2019
exit-1            swp5.3(spine-3)              DataVrf1081     655537     655435     14/12/0      Fri Feb 15 17:20:00 2019
exit-1            swp5.4(spine-3)              DataVrf1082     655537     655435     14/12/0      Fri Feb 15 17:20:00 2019
exit-1            swp6(firewall-1)             default         655537     655539     73/69/-      Fri Feb 15 17:22:10 2019
exit-1            swp6.2(firewall-1)           DataVrf1080     655537     655539     73/69/-      Fri Feb 15 17:22:10 2019
exit-1            swp6.3(firewall-1)           DataVrf1081     655537     655539     73/69/-      Fri Feb 15 17:22:10 2019
exit-1            swp6.4(firewall-1)           DataVrf1082     655537     655539     73/69/-      Fri Feb 15 17:22:10 2019
exit-1            swp7                         default         655537     -          NotEstd      Fri Feb 15 17:28:48 2019
exit-1            swp7.2                       DataVrf1080     655537     -          NotEstd      Fri Feb 15 17:28:48 2019
exit-1            swp7.3                       DataVrf1081     655537     -          NotEstd      Fri Feb 15 17:28:48 2019
exit-1            swp7.4                       DataVrf1082     655537     -          NotEstd      Fri Feb 15 17:28:48 2019

Manage Network Events

The NetQ notifier lets you capture and filter events for devices, components, protocols, and services. This is especially useful when an interface or routing protocol goes down and you want to get it back up and running as quickly as possible. You can improve resolution time significantly by creating filters that focus on topics appropriate for a particular group of users. You can create filters for events related to BGP and MLAG session states, interfaces, links, NTP and other services, fans, power supplies, and physical sensor measurements.
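
For example, a notification integration might combine a channel, a rule, and a filter as in the following sketch. The channel command matches the PagerDuty example later in this guide; the rule and filter names, keys, and values are illustrative, so refer to Events and Notifications for the complete syntax:

cumulus@switch:~$ netq add notification channel pagerduty pd-netq-events integration-key <integration-key>
cumulus@switch:~$ netq add notification rule bgpHostname key hostname value spine01
cumulus@switch:~$ netq add notification filter bgpSpine rule bgpHostname channel pd-netq-events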

The following is an example of a Slack message received on a netq-notifier channel indicating that the BGP session on switch leaf04 interface swp2 has gone down:

example Slack message from netq notifier indicating session failures

For more information, refer to Events and Notifications.

Timestamps in NetQ

Every event or entry in the NetQ database is stored with a timestamp that reports when the NetQ Agent captured an event on the switch or server. This timestamp is based on the switch or server time where the NetQ Agent is running, and is pushed in UTC format.

Interface state, IP addresses, routes, ARP/ND table (IP neighbor) entries and MAC table entries carry a timestamp that represents the time an event occurred (such as when a route is deleted or an interface comes up).

Data that is captured and saved based on polling has a timestamp according to when the information was captured rather than when the event actually happened, though NetQ compensates for this when the extracted data provides additional information to compute a more precise time for the event. For example, NetQ can use BGP uptime in conjunction with the timestamp to determine when the event actually happened.

Restarting a NetQ Agent on a device does not update the timestamps for existing objects to reflect the new restart time; NetQ preserves their timestamps relative to the original start time of the Agent. A rare exception occurs if you reboot the device during the window between the Agent stopping and restarting; in this case, the time is still relative to the start time of the Agent.

Exporting NetQ Data

You can export data from the NetQ Platform in the CLI or UI:

Important File Locations

The following configuration and log files can help with troubleshooting:

File                       Description
/etc/netq/netq.yml         The NetQ configuration file. This file appears only if you installed either the netq-apps package or the NetQ Agent on the system.
/var/log/netqd.log         The NetQ daemon log file for the NetQ CLI. This log file appears only if you installed the netq-apps package on the system.
/var/log/netq-agent.log    The NetQ Agent log file. This log file appears only if you installed the NetQ Agent on the system.
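
For example, to review recent NetQ Agent activity on a switch or host, you can view the agent log file listed above using standard Linux tools:

cumulus@switch:~$ sudo tail -n 20 /var/log/netq-agent.log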

NetQ User Interface Overview

The NetQ user interface (UI) lets you access NetQ through a web browser, where you can visualize your network and interact with the display using a keyboard and mouse.

The NetQ UI is supported on Google Chrome and Mozilla Firefox. It is designed to be viewed on a display with a minimum resolution of 1920 × 1080 pixels.

Access the NetQ UI

This page describes how to log in and out of NetQ.

Log In to NetQ

  1. Open a new Chrome or Firefox browser window or tab.

  2. Enter the following URL into the address bar:

    • NetQ On-premises Appliance or VM: https://<hostname-or-ipaddress>:443
    • NetQ Cloud Appliance or VM: https://netq.nvidia.com
    NetQ login screen
  3. Log in.

    The following are the default usernames and passwords for UI access:

    • NetQ On-premises: admin, admin
    • NetQ Cloud: Use the credentials you created during setup. You should receive an email from NVIDIA titled NetQ Access Link.

Enter your username and password to log in. You can also log in with SSO if your company has enabled it.

Username and Password

  1. Locate the email you received from NVIDIA titled NetQ Access Link. Select Create Password.

  2. Enter a new password, then enter it again to confirm it.

  3. Log in using your email address and new password.

  4. Accept the Terms of Use after reading them.

    The default workbench opens, with your username and premises shown in the upper-right corner of the application.

SSO

  1. Follow the steps above until you reach the NetQ login screen.

  2. Select Sign up for SSO and enter your organization’s name.

  1. Enter your username and password.

  2. Create a new password and enter the new password again to confirm it.

  3. Click Update and Accept after reading the Terms of Use.

    The default workbench opens, with your username shown in the upper-right corner of the application.

  1. Enter your username.

  2. Enter your password.

    The user-specified home workbench is displayed. If a home workbench is not specified, then the default workbench is displayed.

Any workbench can be set as the home workbench. Click the User Settings icon, click Profile & Preferences, then on the Workbenches card, click to the left of the workbench name you want to be your home workbench.

Log Out of NetQ

  1. Select the profile icon in the upper-right corner of the application.

  2. Select Log Out.

Application Layout

The NetQ UI contains two main areas:

workbench displaying task bar and 5 cards

In the application header, click Menu to navigate to:

  • Search: a search bar to quickly find an item on the main menu
  • Favorites: contains links to the user-defined favorite workbenches; Home points to the NetQ Workbench until reset by a user
  • Workbenches: contains links to all workbenches
  • Network: contains links to tabular data about various network elements and the What Just Happened feature
  • Notifications: contains links to threshold-based event rules and notification channel specifications
  • Admin: contains links to application management and lifecycle management features (only visible to users with the admin access role)

You can search for devices and cards in the Global Search field in the header. It behaves like most searches and can help you quickly find device information.

Clicking the NVIDIA logo takes you to your favorite workbench. For details about specifying your favorite workbench, refer to Set User Preferences.

Validation Summary View

The validation summary chart in the header provides an at-a-glance view of the health of your network.

On initial startup of the application, it can take up to an hour to reach an accurate health indication because some processes run only every 30 minutes.

Workbenches

A workbench comprises a given set of cards. A pre-configured default workbench, NetQ Workbench, is available to get you started. You can create your own workbenches and add or remove cards to meet your particular needs. For more detail about managing your data using workbenches, refer to Focus Your Monitoring Using Workbenches.

Cards

Cards present information about your network for monitoring and troubleshooting. This is where you can expect to spend most of your time. Each card describes a particular aspect of the network. Cards are available in multiple sizes, from small to full screen. The level of the content on a card varies in accordance with the size of the card, with the highest level of information on the smallest card to the most detailed information on the full-screen view. Cards are collected onto a workbench where you see all of the data relevant to a task or set of tasks. You can add and remove cards from a workbench, move between cards and card sizes, and make copies of cards to show different levels of data at the same time. For details about working with cards, refer to Access Data with Cards.

User Settings

Each user can customize the NetQ application display, time zone and date format; change their account password; and manage their workbenches. This is all performed from User Settings > Profile & Preferences. For details, refer to Set User Preferences.

Focus Your Monitoring Using Workbenches

Workbenches are dashboards where you collect and view data. Two types of workbenches are available:

Both types of workbenches display a set of cards. Default workbenches are public (accessible to all users), whereas custom workbenches are private (viewing is restricted to the user who created them).

Default Workbenches

The default workbench contains Device Inventory, Switch Inventory, Events, and Validation Summary cards, giving you a high-level view of how your network is operating.

default netq workbench

Upon initial login, the NetQ Workbench opens. Upon subsequent logins, the last workbench you viewed opens.

Custom Workbenches

Users with either administrator or user roles can create and save an unlimited number of custom workbenches. For example, you might create a workbench that:

Create a Workbench

  1. Select New in the workbench header.

  2. Enter a name for the workbench and choose whether to set it as your default home workbench.

  3. Select the cards you would like to display on your new workbench.

    interface displaying the cards a user can select to add to their workbench
  4. Click Create to create your new workbench.

Refer to Access Data with Cards for information about interacting with cards on your workbenches.

Clone a Workbench

To create a duplicate of an existing workbench:

  1. Select Clone in the workbench header.

  2. Name the cloned workbench and select Clone.

Remove a Workbench

Admin accounts can remove any workbench, except for the default NetQ Workbench. User accounts can only remove workbenches they have created.

To remove a workbench:

  1. Select the profile icon in the upper-right corner to open the User Settings options.

  2. Select Profile & Preferences.

  3. Locate the Workbenches card.

  4. Hover over the workbench you want to remove, and click Delete.

Open an Existing Workbench

There are several options for opening workbenches:

Manage Auto-refresh for Your Workbenches

You can specify how often to update the data displayed on your workbenches. Three refresh rates are available:

By default, auto-refresh is enabled and configured to update every 30 seconds.

Change Settings

To modify the auto-refresh setting:

  1. Select the dropdown next to Refresh.

  2. Select the refresh rate. A check mark is shown next to the current selection. The new refresh rate is applied immediately.

    refresh rate dropdown listing rate options of 30 seconds, 1 minute, and 2 minutes

Disable/Enable Auto-refresh

When you are troubleshooting and do not want the displayed data to update, you can disable auto-refresh then enable it when you are finished.

To disable or pause auto-refresh, select the pause icon above Refresh in the workbench header. When you’re ready for the data to refresh, select the play icon.

Access Data with Cards

Cards present information about your network for monitoring and troubleshooting; each card describes a particular aspect of the network. Cards are collected onto a workbench where all data relevant to a task or set of tasks is visible. You can add and remove cards from a workbench, increase or decrease their sizes, change the time period of the data shown on a card, and make copies of cards to show different levels of data at the same time.

Available Cards

Each card focuses on a particular aspect of your network. They include:

There are five additional network services cards for session monitoring: BGP, MLAG, EVPN, OSPF, and LLDP.

Card Sizes

Cards are available in four sizes. The granularity of the content on a card varies with the size of the card, from the highest-level information on the smallest card to the most detailed information on the full-screen card.

Card Size Summary

Card Size      Primary Purpose
Small          • Quick view of status, typically at the level of good or bad
Medium         • View key performance parameters or statistics
               • Perform quick actions
               • Monitor for potential issues
Large          • View detailed performance and statistics
               • Perform actions
               • Compare and review related information
Full Screen    • View all attributes for given network aspect
               • Analyze and visualize detailed data
               • Export and filter data

Card Actions

Add Cards to Your Workbench

  1. Click the add icon in the header.

  2. Locate and select the card(s) you want to add to your workbench.

  3. When you have selected the cards you want to add to your workbench, select Open cards:

    1 selected card to be added to a workbench

The cards are placed at the end of the set of cards currently on the workbench. You might need to scroll down to see them. Drag and drop the cards on the workbench to rearrange them.

Add Switch Cards to Your Workbench

You can add switch cards to a workbench through the Devices icon on the header or by searching for the switch in the Global Search field. To add a switch card from the header:

  1. Click the Devices icon, then select Open a device card.

  2. Enter the first few letters of the switch’s hostname.

  3. Select the device from the suggestions that appear:

    dropdown displaying switches
  4. Choose the card’s size, then select Add.

Remove Cards from Your Workbench

To remove all the cards from your workbench, click the Clear icon in the header. To remove an individual card:

  1. Hover over the card you want to remove.

  2. Click the More Actions menu.

  3. Select Remove.

The card is removed from the workbench, but not from the application.

Change the Size of the Card

  1. Hover over the top portion of the card until you see a rectangle. This is the size picker.

  2. Hover over the size picker and move the cursor right or left until the desired size option is highlighted.

    One-quarter width opens a small card. One-half width opens a medium card. Three-quarters width opens a large card. Full width opens a full-screen card.

  3. Select the size. When the card changes to the selected size, it might move to a different area on the workbench.

Change the Time Period for the Card Data

All cards have a default time period for the data shown on the card, typically the last 24 hours. You can change the time period to view the data during a different time range to aid analysis of previous or existing issues.

To change the time period for a card:

  1. Hover over the top portion of the card and select the time period dropdown.

  2. Select a time period from the dropdown list.

    time options

Changing the time period in this manner only changes the time period for the given card.

Table Settings

You can manipulate the tabular data displayed in a full-screen card by filtering and sorting the columns. Hover over the column header and select it to sort the column. The data is sorted in ascending or descending order: A-Z, Z-A, 1-n, or n-1. The number of rows that can be sorted is limited to 10,000.

To reposition the columns, drag and drop them using your mouse. You can also export the data presented in the table by selecting the Export icon.

The following icons are common in the full-screen card view:

Action                      Description
Select All                  Selects all items in the list.
Clear All                   Clears all existing selections in the list.
Add Item                    Adds item to the list.
Edit                        Edits the selected item.
Delete                      Removes the selected items.
Filter                      Filters the list using available parameters.
Generate/Delete AuthKeys    Creates or removes NetQ CLI authorization keys.
Open Cards                  Opens the corresponding validation or trace card(s).
Assign role                 Opens role assignment options for switches.
Export                      Exports selected data into either a .csv or JSON-formatted file.

When there are many items in a table, NetQ loads up to 25 rows by default and provides the rest in additional table pages, accessible through the pagination controls. Pagination is displayed under the table.

Set User Preferences

Each user can customize the NetQ application display, change their account password, and manage their workbenches.

Configure Display Settings

The Display card contains the options for setting the application theme (light or dark), language, time zone, and date formats.

To configure the display settings:

  1. Select the User Settings icon in the application header to open the User Settings options.

  2. Select Profile & Preferences.

  3. Locate the Display card:

    display card with fields specifying theme, language, time zone, and date format.
  4. In the Theme field, click to select either dark or light theme. The following figure shows the light theme:

    NetQ workbench displayed in light theme
  5. In the Time Zone field, click to change the time zone from the default.

    By default, the time zone is set to the user’s local time zone. If a time zone has not been selected, NetQ defaults to the current local time zone where NetQ is installed. All time values are based on this setting. This is displayed (and can also be changed) in the application header, and is based on Greenwich Mean Time (GMT). If your deployment is not local to you (for example, you want to view the data from the perspective of a data center in another time zone) you can change the display to a different time zone.

  6. In the Date Format field, select the date and time format you want displayed on the cards.

Change Your Password

  1. Click the User Settings icon in the application header to open the User Settings options.

  2. Click Profile & Preferences.

  3. In the Basic Account Info card, select Change password.

  4. Enter your current password, followed by your new password.

  5. Select Save.

To reset the password for an admin account, follow these instructions.

Manage Your Workbenches

A workbench is similar to a dashboard. This is where you collect and view the data that is important to you. You can have more than one workbench and manage them with the Workbenches card located in Profile & Preferences. From the Workbenches card, you can view, sort, and delete workbenches. For a detailed overview of workbenches, see Focus Your Monitoring Using Workbenches.

NetQ Command Line Overview

The NetQ CLI provides access to all network state and event information collected by NetQ Agents. It behaves similarly to typical CLIs, with groups of commands that display related information, and help commands that provide additional information. There are four command categories: check, show, config, and trace.

The NetQ command line interface only runs on switches and server hosts implemented with Intel x86 or ARM-based architectures.

CLI Access

When you install or upgrade NetQ, you can also install and enable the CLI on your NetQ server or appliance and hosts.

To access the CLI from a switch or server:

  1. Log in to the device. The following example uses the default username of cumulus and a hostname of switch:

    <computer>:~<username>$ ssh cumulus@switch
    
  2. Enter your password to reach the command prompt. The default password is CumulusLinux!

    Enter passphrase for key '/Users/<username>/.ssh/id_rsa': <enter CumulusLinux! here>
    Welcome to Ubuntu 16.04.3 LTS (GNU/Linux 4.4.0-112-generic x86_64)
        * Documentation:  https://help.ubuntu.com
        * Management:     https://landscape.canonical.com
        * Support:        https://ubuntu.com/advantage
    Last login: Tue Sep 15 09:28:12 2019 from 10.0.0.14
    cumulus@switch:~$
    
  3. You can now run commands:

    cumulus@switch:~$ netq show agents
    cumulus@switch:~$ netq check bgp
    

Command Line Basics

This section describes the core structure and behavior of the NetQ CLI. It includes the following:

Command Line Structure

The NetQ command line has a flat structure rather than a modal structure: you can run all commands at the same level from the standard command prompt, instead of having to enter a specific mode first.

Command Syntax

All NetQ CLI commands begin with netq. NetQ commands fall into one of four syntax categories: validation (check), monitoring (show), configuration, and trace.

netq check <network-protocol-or-service> [options]
netq show <network-protocol-or-service> [options]
netq config <action> <object> [options]
netq trace <destination> from <source> [options]
Symbol                   Meaning
Parentheses ( )          Grouping of required parameters. Choose one.
Square brackets [ ]      Single or group of optional parameters. If more than one object or keyword is available, choose one.
Angle brackets < >       Required variable. Value for a keyword or option; enter according to your deployment nomenclature.
Pipe |                   Separates object and keyword options, also separates value options; enter one object or keyword and zero or one value.

For example, in the netq check command:

Examples of valid commands include:
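
cumulus@switch:~$ netq check bgp
cumulus@switch:~$ netq show agents
cumulus@switch:~$ netq config add agent sensors
cumulus@switch:~$ netq trace 10.0.0.11 from 10.0.0.14 pretty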

Command Output

Many commands present their output in color. Results with errors appear in red, and warnings appear in yellow. Results without errors or warnings appear in either black or green. VTEPs appear in blue. A node in the pretty output appears in bold, and angle brackets (< >) wrap around a router interface. To view the output with only black text, run the netq config del color command. You can view output with colors again by running netq config add color.

All check and show commands have a default timeframe of now to one hour ago, unless you specify an approximate time using the around keyword or a range using the between keyword. For example, running netq check bgp shows the status of BGP over the last hour. Running netq show bgp around 3h shows the status of BGP three hours ago.

When entering a time value, you must include a numeric value and the unit of measure:

  • w: weeks
  • d: days
  • h: hours
  • m: minutes
  • s: seconds
  • now

When using the between option, you can enter the start time (text-time) and end time (text-endtime) values as most recent first and least recent second, or vice versa. The values do not have to have the same unit of measure. Use the around option to view information for a particular time.
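
For example, assuming the same BGP setup as the earlier examples and using illustrative time values, the following commands apply the around and between options:

cumulus@switch:~$ netq check bgp around 30m
cumulus@switch:~$ netq show bgp between now and 24h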

Command Prompts

NetQ code examples use the following prompts:

To use the NetQ CLI, the switches must be running the Cumulus Linux or SONiC operating system (OS), NetQ Platform or NetQ Collector software, the NetQ Agent, and the NetQ CLI. The hosts must be running CentOS, RHEL, or Ubuntu OS, the NetQ Agent, and the NetQ CLI. Refer to Install NetQ for additional information.

Command Completion

As you enter commands, you can get help with the valid keywords or options using the tab key. For example, using tab completion with netq check displays the possible objects for the command, and returns you to the command prompt to complete the command:

cumulus@switch:~$ netq check <<press Tab>>
    agents      :  Netq agent
    bgp         :  BGP info
    cl-version  :  Cumulus Linux version
    clag        :  Cumulus Multi-chassis LAG
    evpn        :  EVPN
    interfaces  :  network interface port
    mlag        :  Multi-chassis LAG (alias of clag)
    mtu         :  Link MTU
    ntp         :  NTP
    ospf        :  OSPF info
    sensors     :  Temperature/Fan/PSU sensors
    vlan        :  VLAN
    vxlan       :  VXLAN data path
cumulus@switch:~$ netq check

Command Help

As you enter commands, you can get help with command syntax by entering help at various points within a command entry. For example, to find out what options are available for a BGP check, enter help after entering some of the netq check command. In the following example, you can see that there are no additional required parameters and you can use three optional parameters — hostnames, vrf, and around — with a BGP check:

cumulus@switch:~$ netq check bgp help
Commands:
    netq check bgp [label <text-label-name> | hostnames <text-list-hostnames>] [vrf <vrf>] [check_filter_id <text-check-filter-id>] [include <bgp-number-range-list> | exclude <bgp-number-range-list>] [around <text-time>] [json | summary]
   netq show unit-tests bgp [check_filter_id <text-check-filter-id>] [json]

To see an exhaustive list of commands, run:

cumulus@switch:~$ netq help list

To get usage information for NetQ, run:

cumulus@switch:~$ netq help verbose

Command History

The CLI stores commands issued within a session, which lets you review and rerun commands that you already ran. At the command prompt, press the Up Arrow and Down Arrow keys to move back and forth through the list of commands previously entered. When you have found a given command, you can run the command by pressing Enter, just as you would if you had entered it manually. You can also modify the command before you run it.

Command Categories

While the CLI has a flat structure, NetQ commands are conceptually grouped into the following functional categories:

Validation Commands

The netq check commands validate the current or historical state of the network by looking for errors and misconfigurations in the network. The commands run fabric-wide validations against various configured protocols and services to determine how well the network is operating. You can perform validation checks for the following:

The commands take the form of netq check <network-protocol-or-service> [options], where the options vary according to the protocol or service.

Example check command

The following example shows the output for the netq check bgp command. If there were any failures, they would appear below the summary results in the text output, or in the failed_node_set section of the JSON output.

cumulus@switch:~$ netq check bgp
bgp check result summary:

Checked nodes       : 8
Total nodes         : 8
Rotten nodes        : 0
Failed nodes        : 0
Warning nodes       : 0

Additional summary:
Total Sessions      : 30
Failed Sessions     : 0

Session Establishment Test   : passed
Address Families Test        : passed
Router ID Test               : passed

Example check command in JSON format
cumulus@switch:~$ netq check bgp json
{
    "tests":{
        "Session Establishment":{
            "suppressed_warnings":0,
            "errors":[

            ],
            "suppressed_errors":0,
            "passed":true,
            "warnings":[

            ],
            "duration":0.0000853539,
            "enabled":true,
            "suppressed_unverified":0,
            "unverified":[

            ]
        },
        "Address Families":{
            "suppressed_warnings":0,
            "errors":[

            ],
            "suppressed_errors":0,
            "passed":true,
            "warnings":[

            ],
            "duration":0.0002634525,
            "enabled":true,
            "suppressed_unverified":0,
            "unverified":[

            ]
        },
        "Router ID":{
            "suppressed_warnings":0,
            "errors":[

            ],
            "suppressed_errors":0,
            "passed":true,
            "warnings":[

            ],
            "duration":0.0001821518,
            "enabled":true,
            "suppressed_unverified":0,
            "unverified":[

            ]
        }
    },
    "failed_node_set":[

    ],
    "summary":{
        "checked_cnt":8,
        "total_cnt":8,
        "rotten_node_cnt":0,
        "failed_node_cnt":0,
        "warn_node_cnt":0
    },
    "rotten_node_set":[

    ],
    "warn_node_set":[

    ],
    "additional_summary":{
        "total_sessions":30,
        "failed_sessions":0
    },
    "validation":"bgp"
}

Monitoring Commands

The netq show commands let you view details about the current or historical configuration and status of various protocols and services. You can view the configuration and status for the following:

The commands take the form of netq [<hostname>] show <network-protocol-or-service> [options], where the options vary according to the protocol or service. You can restrict the commands from showing the information for all devices to showing information only for a selected device using the hostname option.

Example show command

The following example shows the standard output for the netq show agents command:

cumulus@switch:~$ netq show agents
Matching agents records:
Hostname          Status           NTP Sync Version                              Sys Uptime                Agent Uptime              Reinitialize Time          Last Changed
----------------- ---------------- -------- ------------------------------------ ------------------------- ------------------------- -------------------------- -------------------------
border01          Fresh            yes      3.2.0-cl4u30~1601410518.104fb9ed     Mon Sep 21 17:04:54 2020  Tue Sep 29 21:24:58 2020  Tue Sep 29 21:24:58 2020   Thu Oct  1 16:07:38 2020
border02          Fresh            yes      3.2.0-cl4u30~1601410518.104fb9ed     Mon Sep 21 17:04:57 2020  Tue Sep 29 21:24:58 2020  Tue Sep 29 21:24:58 2020   Thu Oct  1 16:07:33 2020
fw1               Fresh            yes      3.2.0-cl4u30~1601410518.104fb9ed     Mon Sep 21 17:04:44 2020  Tue Sep 29 21:24:48 2020  Tue Sep 29 21:24:48 2020   Thu Oct  1 16:07:26 2020
fw2               Fresh            yes      3.2.0-cl4u30~1601410518.104fb9ed     Mon Sep 21 17:04:42 2020  Tue Sep 29 21:24:48 2020  Tue Sep 29 21:24:48 2020   Thu Oct  1 16:07:22 2020
leaf01            Fresh            yes      3.2.0-cl4u30~1601410518.104fb9ed     Mon Sep 21 16:49:04 2020  Tue Sep 29 21:24:49 2020  Tue Sep 29 21:24:49 2020   Thu Oct  1 16:07:10 2020
leaf02            Fresh            yes      3.2.0-cl4u30~1601410518.104fb9ed     Mon Sep 21 17:03:14 2020  Tue Sep 29 21:24:49 2020  Tue Sep 29 21:24:49 2020   Thu Oct  1 16:07:30 2020
leaf03            Fresh            yes      3.2.0-cl4u30~1601410518.104fb9ed     Mon Sep 21 17:03:37 2020  Tue Sep 29 21:24:49 2020  Tue Sep 29 21:24:49 2020   Thu Oct  1 16:07:24 2020
leaf04            Fresh            yes      3.2.0-cl4u30~1601410518.104fb9ed     Mon Sep 21 17:03:35 2020  Tue Sep 29 21:24:58 2020  Tue Sep 29 21:24:58 2020   Thu Oct  1 16:07:13 2020
oob-mgmt-server   Fresh            yes      3.1.1-ub18.04u29~1599111022.78b9e43  Mon Sep 21 16:43:58 2020  Mon Sep 21 17:55:00 2020  Mon Sep 21 17:55:00 2020   Thu Oct  1 16:07:31 2020
server01          Fresh            yes      3.2.0-ub18.04u30~1601393774.104fb9e  Mon Sep 21 17:19:57 2020  Tue Sep 29 21:13:07 2020  Tue Sep 29 21:13:07 2020   Thu Oct  1 16:07:16 2020
server02          Fresh            yes      3.2.0-ub18.04u30~1601393774.104fb9e  Mon Sep 21 17:19:57 2020  Tue Sep 29 21:13:07 2020  Tue Sep 29 21:13:07 2020   Thu Oct  1 16:07:24 2020
server03          Fresh            yes      3.2.0-ub18.04u30~1601393774.104fb9e  Mon Sep 21 17:19:56 2020  Tue Sep 29 21:13:07 2020  Tue Sep 29 21:13:07 2020   Thu Oct  1 16:07:12 2020
server04          Fresh            yes      3.2.0-ub18.04u30~1601393774.104fb9e  Mon Sep 21 17:19:57 2020  Tue Sep 29 21:13:07 2020  Tue Sep 29 21:13:07 2020   Thu Oct  1 16:07:17 2020
server05          Fresh            yes      3.2.0-ub18.04u30~1601393774.104fb9e  Mon Sep 21 17:19:57 2020  Tue Sep 29 21:13:10 2020  Tue Sep 29 21:13:10 2020   Thu Oct  1 16:07:25 2020
server06          Fresh            yes      3.2.0-ub18.04u30~1601393774.104fb9e  Mon Sep 21 17:19:57 2020  Tue Sep 29 21:13:10 2020  Tue Sep 29 21:13:10 2020   Thu Oct  1 16:07:21 2020
server07          Fresh            yes      3.2.0-ub18.04u30~1601393774.104fb9e  Mon Sep 21 17:06:48 2020  Tue Sep 29 21:13:10 2020  Tue Sep 29 21:13:10 2020   Thu Oct  1 16:07:28 2020
server08          Fresh            yes      3.2.0-ub18.04u30~1601393774.104fb9e  Mon Sep 21 17:06:45 2020  Tue Sep 29 21:13:10 2020  Tue Sep 29 21:13:10 2020   Thu Oct  1 16:07:31 2020
spine01           Fresh            yes      3.2.0-cl4u30~1601410518.104fb9ed     Mon Sep 21 17:03:34 2020  Tue Sep 29 21:24:58 2020  Tue Sep 29 21:24:58 2020   Thu Oct  1 16:07:20 2020
spine02           Fresh            yes      3.2.0-cl4u30~1601410518.104fb9ed     Mon Sep 21 17:03:33 2020  Tue Sep 29 21:24:58 2020  Tue Sep 29 21:24:58 2020   Thu Oct  1 16:07:16 2020
spine03           Fresh            yes      3.2.0-cl4u30~1601410518.104fb9ed     Mon Sep 21 17:03:34 2020  Tue Sep 29 21:25:07 2020  Tue Sep 29 21:25:07 2020   Thu Oct  1 16:07:20 2020
spine04           Fresh            yes      3.2.0-cl4u30~1601410518.104fb9ed     Mon Sep 21 17:03:32 2020  Tue Sep 29 21:25:07 2020  Tue Sep 29 21:25:07 2020   Thu Oct  1 16:07:33 2020
Example show command with filtered output

The following example shows the filtered output for the netq show agents command:

cumulus@switch:~$ netq leaf01 show agents
Matching agents records:
Hostname          Status           NTP Sync Version                              Sys Uptime                Agent Uptime              Reinitialize Time          Last Changed
----------------- ---------------- -------- ------------------------------------ ------------------------- ------------------------- -------------------------- -------------------------
leaf01            Fresh            yes      3.2.0-cl4u30~1601410518.104fb9ed     Mon Sep 21 16:49:04 2020  Tue Sep 29 21:24:49 2020  Tue Sep 29 21:24:49 2020   Thu Oct  1 16:26:33 2020

Configuration Commands

Various commands—including netq config, netq notification, and netq install—allow you to manage NetQ Agent and CLI server configurations, configure lifecycle management, set up container monitoring, and manage notifications.

NetQ Agent Configuration

The agent commands configure individual NetQ Agents.

The agent configuration commands can add and remove agents from switches and hosts, start and stop agent operations, debug the agent, specify default commands, and enable or disable a variety of monitoring features (including Kubernetes, sensors, FRR (FRRouting), CPU usage limit, and What Just Happened).

Commands apply to one agent at a time. Run them from the switch or host where the NetQ Agent resides.

The agent configuration commands include:

netq config (add|del|show) agent
netq config (start|stop|status|restart) agent

The following example shows how to configure the agent to send sensor data:

cumulus@switch:~$ netq config add agent sensors

The following example shows how to start monitoring with Kubernetes:

cumulus@switch:~$ netq config add agent kubernetes-monitor poll-period 15

The following example shows how to view the NetQ Agent configuration:

cumulus@switch:~$ netq config show agent
netq-agent             value      default
---------------------  ---------  ---------
enable-opta-discovery  True       True
exhibitport
agenturl
server                 127.0.0.1  127.0.0.1
exhibiturl
vrf                    default    default
agentport              8981       8981
port                   31980      31980

After making configuration changes to your agents, you must restart the agent for the changes to take effect. Use the netq config restart agent command.
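
For example:

cumulus@switch:~$ netq config restart agent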

Refer to Manage NetQ Agents and Install NetQ Agents for additional examples.

CLI Configuration

The netq config cli commands configure and manage the CLI component. You can add or remove the CLI (essentially enabling or disabling the service), start and restart it, and view the configuration of the service.

Commands apply to one device at a time, and you run them from the switch or host where you run the CLI.

The CLI configuration commands include:

netq config add cli server
netq config del cli server
netq config show cli premises [json]
netq config show (cli|all) [json]
netq config (status|restart) cli
netq config select cli premise

The following example shows how to restart the CLI instance:

cumulus@switch:~$ netq config restart cli

The following example shows how to enable the CLI on a NetQ on-premises appliance or virtual machine (VM):

cumulus@switch:~$ netq config add cli server 10.1.3.101

The following example shows how to enable the CLI on a NetQ Cloud Appliance or VM for the Chicago premises and the default port:

netq config add cli server api.netq.cumulusnetworks.com access-key <user-access-key> secret-key <user-secret-key> premises chicago port 443

NetQ System Configuration Commands

Use the following commands to manage the NetQ system itself:

The following example shows how to bootstrap a single server or master server in a server cluster:

cumulus@switch:~$ netq bootstrap master interface eth0 tarball /mnt/installables/netq-bootstrap-4.1.0.tgz

The following example shows how to decommission a switch named leaf01:

cumulus@netq-appliance:~$ netq decommission leaf01

For information and examples on installing and upgrading the NetQ system, see Install NetQ and Upgrade NetQ.

Event Notification Commands

The notification configuration commands can add, remove, and show notification application integrations. These commands create the channels, filters, and rules needed to control event messaging. The commands include:

netq (add|del|show) notification channel
netq (add|del|show) notification rule
netq (add|del|show) notification filter
netq (add|del|show) notification proxy

An integration includes at least one channel (PagerDuty, Slack, or syslog), at least one filter (defined by rules you create), and at least one rule.

The following example shows how to configure a PagerDuty channel:

cumulus@switch:~$ netq add notification channel pagerduty pd-netq-events integration-key c6d666e210a8425298ef7abde0d1998
Successfully added/updated channel pd-netq-events

Refer to Configure System Event Notifications for additional examples.

Threshold-based Event Notification Commands

NetQ supports threshold-crossing alert (TCA) events, a set of events that are triggered when a user-defined threshold is crossed. Configure and manage TCA events using the following commands:

netq add tca [event_id <text-event-id-anchor>] [tca_id <text-tca-id-anchor>] [scope <text-scope-anchor>] [severity info | severity error] [is_active true | is_active false] [suppress_until <text-suppress-ts>] [threshold_type user_set | threshold_type vendor_set] [ threshold <text-threshold-value> ] [channel <text-channel-name-anchor> | channel drop <text-drop-channel-name>]
netq del tca tca_id <text-tca-id-anchor>
netq show tca [tca_id <text-tca-id-anchor>] [json]
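
The following sketch follows the syntax above to create a TCA event and send it to the pd-netq-events channel configured earlier; the event ID, scope, and threshold value are illustrative, and valid values depend on your deployment:

cumulus@switch:~$ netq add tca event_id TCA_SENSOR_TEMPERATURE_UPPER scope leaf01 severity info threshold 90 channel pd-netq-events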

Lifecycle Management Commands

The netq lcm lifecycle management commands help you efficiently manage the deployment of NVIDIA product software onto your network devices (servers, appliances, and switches).

LCM commands allow you to:

The following example shows the NetQ configuration profiles:

cumulus@switch:~$ netq lcm show netq-config
ID                        Name            Default Profile                VRF             WJH       CPU Limit Log Level Last Changed
------------------------- --------------- ------------------------------ --------------- --------- --------- --------- -------------------------
config_profile_3289efda36 NetQ default co Yes                            mgmt            Disable   Disable   info      Tue Apr 27 22:42:05 2021
db4065d56f91ebbd34a523b45 nfig
944fbfd10c5d75f9134d42023
eb2b

The following example shows how to add a Cumulus Linux installation image to the NetQ repository on the switch:

netq lcm add cl-image /path/to/download/cumulus-linux-4.3.0-mlnx-amd64.bin

Trace Commands

The trace commands let you view the available paths between two nodes on the network, both currently and at a time in the past. You can perform a layer 2 or layer 3 trace, and view the output in one of three formats: JSON, pretty, or detail. JSON output provides the results in a JSON file format for ease of importing to other applications or software. Pretty output lines up the paths in a pseudo-graphical manner to help visualize multiple paths. Detail output is useful for traces with higher hop counts, where the pretty output wraps lines and makes the results harder to interpret. The detail output displays a table with a row for each path.

The trace command syntax is:

netq trace <mac> [vlan <1-4096>] from (<src-hostname>|<ip-src>) [vrf <vrf>] [around <text-time>] [json|detail|pretty] [debug]
netq trace <ip> from (<src-hostname>|<ip-src>) [vrf <vrf>] [around <text-time>] [json|detail|pretty] [debug]
Example trace command with pretty output

The following example shows how to run a trace based on the destination IP address, in pretty output with a small number of resulting paths:

cumulus@switch:~$ netq trace 10.0.0.11 from 10.0.0.14 pretty
Number of Paths: 6
    Inconsistent PMTU among paths
Number of Paths with Errors: 0
Number of Paths with Warnings: 0
Path MTU: 9000
    leaf04 swp52 -- swp4 spine02 swp2 -- swp52 leaf02 peerlink.4094 -- peerlink.4094 leaf01 lo
                                                    peerlink.4094 -- peerlink.4094 leaf01 lo
    leaf04 swp51 -- swp4 spine01 swp2 -- swp51 leaf02 peerlink.4094 -- peerlink.4094 leaf01 lo
                                                    peerlink.4094 -- peerlink.4094 leaf01 lo
    leaf04 swp52 -- swp4 spine02 swp1 -- swp52 leaf01 lo
    leaf04 swp51 -- swp4 spine01 swp1 -- swp51 leaf01 lo
Example trace command with detail output

This example shows how to run a trace based on the destination IP address, in detail output with a small number of resulting paths:

cumulus@switch:~$ netq trace 10.0.0.11 from 10.0.0.14 detail
Number of Paths: 6
    Inconsistent PMTU among paths
Number of Paths with Errors: 0
Number of Paths with Warnings: 0
Path MTU: 9000
Id  Hop Hostname        InPort          InVlan InTunnel              InRtrIf         InVRF           OutRtrIf        OutVRF          OutTunnel             OutPort         OutVlan
--- --- --------------- --------------- ------ --------------------- --------------- --------------- --------------- --------------- --------------------- --------------- -------
1   1   leaf04                                                                                       swp52           default                               swp52
    2   spine02         swp4                                         swp4            default         swp2            default                               swp2
    3   leaf02          swp52                                        swp52           default         peerlink.4094   default                               peerlink.4094
    4   leaf01          peerlink.4094                                peerlink.4094   default                                                               lo
--- --- --------------- --------------- ------ --------------------- --------------- --------------- --------------- --------------- --------------------- --------------- -------
2   1   leaf04                                                                                       swp52           default                               swp52
    2   spine02         swp4                                         swp4            default         swp2            default                               swp2
    3   leaf02          swp52                                        swp52           default         peerlink.4094   default                               peerlink.4094
    4   leaf01          peerlink.4094                                peerlink.4094   default                                                               lo
--- --- --------------- --------------- ------ --------------------- --------------- --------------- --------------- --------------- --------------------- --------------- -------
3   1   leaf04                                                                                       swp51           default                               swp51
    2   spine01         swp4                                         swp4            default         swp2            default                               swp2
    3   leaf02          swp51                                        swp51           default         peerlink.4094   default                               peerlink.4094
    4   leaf01          peerlink.4094                                peerlink.4094   default                                                               lo
--- --- --------------- --------------- ------ --------------------- --------------- --------------- --------------- --------------- --------------------- --------------- -------
4   1   leaf04                                                                                       swp51           default                               swp51
    2   spine01         swp4                                         swp4            default         swp2            default                               swp2
    3   leaf02          swp51                                        swp51           default         peerlink.4094   default                               peerlink.4094
    4   leaf01          peerlink.4094                                peerlink.4094   default                                                               lo
--- --- --------------- --------------- ------ --------------------- --------------- --------------- --------------- --------------- --------------------- --------------- -------
5   1   leaf04                                                                                       swp52           default                               swp52
    2   spine02         swp4                                         swp4            default         swp1            default                               swp1
    3   leaf01          swp52                                        swp52           default                                                               lo
--- --- --------------- --------------- ------ --------------------- --------------- --------------- --------------- --------------- --------------------- --------------- -------
6   1   leaf04                                                                                       swp51           default                               swp51
    2   spine01         swp4                                         swp4            default         swp1            default                               swp1
    3   leaf01          swp51                                        swp51           default                                                               lo
--- --- --------------- --------------- ------ --------------------- --------------- --------------- --------------- --------------- --------------------- --------------- -------
Example trace command on destination MAC address

This example shows how to run a trace based on the destination MAC address, in pretty output:

cumulus@switch:~$ netq trace A0:00:00:00:00:11 vlan 1001 from Server03 pretty
Number of Paths: 6
Number of Paths with Errors: 0
Number of Paths with Warnings: 0
Path MTU: 9152
    
    Server03 bond1.1001 -- swp7 <vlan1001> Leaf02 vni: 34 swp5 -- swp4 Spine03 swp7 -- swp5 vni: 34 Leaf04 swp6 -- swp1.1001 Server03 <swp1.1001>
                                                        swp4 -- swp4 Spine02 swp7 -- swp4 vni: 34 Leaf04 swp6 -- swp1.1001 Server03 <swp1.1001>
                                                        swp3 -- swp4 Spine01 swp7 -- swp3 vni: 34 Leaf04 swp6 -- swp1.1001 Server03 <swp1.1001>
            bond1.1001 -- swp7 <vlan1001> Leaf01 vni: 34 swp5 -- swp3 Spine03 swp7 -- swp5 vni: 34 Leaf04 swp6 -- swp1.1001 Server03 <swp1.1001>
                                                        swp4 -- swp3 Spine02 swp7 -- swp4 vni: 34 Leaf04 swp6 -- swp1.1001 Server03 <swp1.1001>
                                                        swp3 -- swp3 Spine01 swp7 -- swp3 vni: 34 Leaf04 swp6 -- swp1.1001 Server03 <swp1.1001>

Installation Management

This section describes how to install, configure, and upgrade NetQ.

Before you begin, review the release notes for this version.

Before You Install

This overview is designed to help you understand the various NetQ deployment and installation options.

Installation Overview

Consider the following before you install the NetQ system:

  1. Determine whether to deploy the solution fully on premises or as a remote solution.
  2. Decide whether to deploy a virtual machine on your own hardware or use one of the NetQ appliances.
  3. Choose whether to install the software on a single server or as a server cluster.

The following decision tree reflects these steps:

NetQ system deployment options

Deployment Type: On Premises or Remote

You can deploy NetQ in one of two ways: fully on premises, where data is collected, processed, and stored at your site, or as a remote solution, where data is processed and stored either at a central premises or in the NVIDIA cloud.

With either deployment model, the NetQ Agents reside on the switches and hosts they monitor in your network.
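
For example, on a monitored Cumulus Linux switch, the agent is pointed at the NetQ server or appliance and then restarted so it begins reporting (the IP address and VRF shown are placeholders for your environment):

cumulus@switch:~$ netq config add agent server 192.168.1.254 vrf mgmt
cumulus@switch:~$ netq config restart agent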

System: Virtual Machine or NetQ Appliances

The next installation consideration is whether you plan to use NetQ Cloud Appliances or your own servers with VMs. Both options provide the same services and features. The difference is in the implementation. When you install NetQ software on your own hardware, you create and maintain a KVM or VMware VM, and the software runs from there. This requires you to scope and order an appropriate hardware server to support the NetQ requirements, but might allow you to reuse an existing server in your stock.

When you choose to purchase and install NetQ Cloud Appliances, the initial configuration of the server with Ubuntu OS is already done for you, and the NetQ software components are pre-loaded, saving you time during the physical deployment.

Data Flow

The flow of data differs based on your deployment model.

For the on-premises deployment, the NetQ Agents collect and transmit data from the switches and hosts back to the NetQ On-premises Appliance or virtual machine running the NetQ Platform software, which in turn processes and stores the data in its database. This data is then displayed through the user interface.

on-premises deployment type displaying data transmission between the agents, the platform, and the user interface.

For the remote, multi-site NetQ implementation, the NetQ Agents at each premises collect and transmit data from the switches and hosts at that premises to its NetQ Cloud Appliance or virtual machine running the NetQ Collector software. The NetQ Collectors then transmit this data to the common NetQ Cloud Appliance or virtual machine and database at one of your premises for processing and storage.

For the remote, cloud-service implementation, the NetQ Agents collect and transmit data from the switches and hosts to the NetQ Cloud Appliance or virtual machine running the NetQ Collector software. The NetQ Collector then transmits this data to the NVIDIA cloud-based infrastructure for further processing and storage.

For either remote solution, telemetry data is displayed through the same user interfaces as the on-premises solution. When using the cloud service implementation of the remote solution, the browser interface can be pointed to the local NetQ Cloud Appliance or VM, or directly to netq.nvidia.com.

Server Arrangement: Single or Cluster

The next installation step is deciding whether to deploy a single server or a server cluster. Both options provide the same services and features. The biggest difference is the number of servers deployed and the continued availability of services running on those servers should hardware failures occur.

A single server is easier to set up, configure, and manage, but can limit your ability to scale your network monitoring quickly. Deploying multiple servers is a bit more complicated, but you limit potential downtime and increase availability by having more than one server that can run the software and store the data. Select the standalone single-server arrangement for smaller, simpler deployments. Be sure to consider the capabilities and resources needed on this server to support the size of your final deployment.

Select the server cluster arrangement to obtain scalability and high availability for your network. The default clustering implementation has three servers: 1 master and 2 workers. However, NetQ supports up to 10 worker nodes in a cluster. When you configure the cluster, configure the NetQ Agents to connect to these three nodes in the cluster first by providing the IP addresses as a comma-separated list. If you decide to add additional nodes to the cluster, you do not need to configure these nodes again.
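
As a sketch, assuming the three cluster nodes use the placeholder addresses shown and the agent's cluster-servers option, the agents can be given the full list of nodes and then restarted:

cumulus@switch:~$ netq config add agent cluster-servers 10.0.0.21,10.0.0.22,10.0.0.23 vrf mgmt
cumulus@switch:~$ netq config restart agent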

Cluster Deployments and Kubernetes

NetQ also monitors Kubernetes containers. If the master node ever goes down, all NetQ services should continue to work. However, keep in mind that the master hosts the Kubernetes control plane so anything that requires connectivity with the Kubernetes cluster—such as upgrading NetQ or rescheduling pods to other workers if a worker goes down—will not work.

Cluster Deployments and Load Balancers

You need a load balancer for high availability for the NetQ API and the NetQ UI.

However, you need to be mindful of where you install the certificates for the NetQ UI (port 443); otherwise, you cannot access the NetQ UI.

If you are using a load balancer in your deployment, we recommend you install the certificates directly on the load balancer for SSL offloading. However, if you install the certificates on the master node, then configure the load balancer to allow for SSL passthrough.

Where to Go Next

After you’ve decided on your deployment type, you’re ready to install NetQ.

Install NetQ

The following sections provide installation instructions for the NetQ system and software. To install NetQ:

  1. Visit Before You Install to understand the various NetQ deployments.

  2. After deciding your deployment model, prepare your devices and install NetQ.

  3. Next, install and configure the NetQ Agents on switches and hosts.

  4. Finally, install and configure the NetQ CLI on switches and hosts.

Install the NetQ System

The following table lists the various ways to install NetQ. If you are unsure which option is best for your network, refer to Before You Install.

Deployment Type   Server Arrangement   System                        Hypervisor   Installation Instructions
On premises       Single server        NVIDIA NetQ Appliance         N/A          Start Install
On premises       Single server        Own hardware plus VM          KVM          Start Install
On premises       Single server        Own hardware plus VM          VMware       Start Install
On premises       Server cluster       NVIDIA NetQ Appliance         N/A          Start Install
On premises       Server cluster       Own hardware plus VM          KVM          Start Install
On premises       Server cluster       Own hardware plus VM          VMware       Start Install
Cloud             Single server        NVIDIA NetQ Cloud Appliance   N/A          Start Install
Cloud             Single server        Own hardware plus VM          KVM          Start Install
Cloud             Single server        Own hardware plus VM          VMware       Start Install
Cloud             Server cluster       NVIDIA NetQ Cloud Appliance   N/A          Start Install
Cloud             Server cluster       Own hardware plus VM          KVM          Start Install
Cloud             Server cluster       Own hardware plus VM          VMware       Start Install

Set Up Your VMware Virtual Machine for a Single On-premises Server

Follow these steps to set up and configure your VM on a single server in an on-premises deployment:

  1. Verify that your system meets the VM requirements.

    Resource                  Minimum Requirements
    Processor                 Sixteen (16) virtual CPUs
    Memory                    64 GB RAM
    Local disk storage        500 GB SSD with minimum disk IOPS of 1000 for a standard 4kb block size
                              (Note: This must be an SSD; use of other storage options can lead to system instability and are not supported.)
    Network interface speed   1 Gb NIC
    Hypervisor                VMware ESXi™ 6.5 or later (OVA image) for servers running Cumulus Linux, CentOS, Ubuntu, and RedHat operating systems
  2. Confirm that the needed ports are open for communications.

    You must open the following ports on your NetQ on-premises server:
    Port or Protocol Number   Protocol      Component Access
    4                         IP Protocol   Calico networking (IP-in-IP Protocol)
    22                        TCP           SSH
    80                        TCP           Nginx
    179                       TCP           Calico networking (BGP)
    443                       TCP           NetQ UI
    2379                      TCP           etcd datastore
    4789                      UDP           Calico networking (VxLAN)
    5000                      TCP           Docker registry
    6443                      TCP           kube-apiserver
    30001                     TCP           DPU communication
    31980                     TCP           NetQ Agent communication
    31982                     TCP           NetQ Agent SSL communication
    32708                     TCP           API Gateway
  3. Download the NetQ Platform image.

    1. On the NVIDIA Application Hub, log in to your account.
    2. Select NVIDIA Licensing Portal.
    3. Select Software Downloads from the menu.
    4. Click Product Family and select NetQ.
    5. Locate the NetQ SW 4.4 VMWare image and select Download.
    6. If prompted, read the license agreement and proceed with the download.

    For enterprise customers, if you do not see a link to the NVIDIA Licensing Portal on the NVIDIA Application Hub, contact NVIDIA support.


    For NVIDIA employees, download NetQ directly from the NVIDIA Licensing Portal.

  4. Set up and configure your VM.

    VMware Example Configuration

    This example shows the VM setup process using an OVA file with VMware ESXi.
    1. Enter the address of the hardware in your browser.

    2. Log in to VMware using credentials with root access.

    3. Click Storage in the Navigator to verify you have an SSD installed.

    4. Click Create/Register VM at the top of the right pane.

    5. Select Deploy a virtual machine from an OVF or OVA file, and click Next.

    6. Provide a name for the VM, for example NetQ.

      Tip: Make note of the name used during install as this is needed in a later step.

    7. Drag and drop the NetQ Platform image file you downloaded in Step 3 above.

  5. Click Next.

  6. Select the storage type and data store for the image to use, then click Next. In this example, only one is available.

  7. Accept the default deployment options or modify them according to your network needs. Click Next when you are finished.

  8. Review the configuration summary. Click Back to change any of the settings, or click Finish to continue with the creation of the VM.

    The progress of the request is shown in the Recent Tasks window at the bottom of the application. This may take some time, so continue with your other work until the upload finishes.

  9. Once completed, view the full details of the VM and hardware.

  • Log in to the VM and change the password.

    Use the default credentials to log in the first time:

    • Username: cumulus
    • Password: cumulus
    $ ssh cumulus@<ipaddr>
    Warning: Permanently added '<ipaddr>' (ECDSA) to the list of known hosts.
    Ubuntu 20.04 LTS
    cumulus@<ipaddr>'s password:
    You are required to change your password immediately (root enforced)
    System information as of Thu Dec  3 21:35:42 UTC 2020
    System load:  0.09              Processes:           120
    Usage of /:   8.1% of 61.86GB   Users logged in:     0
    Memory usage: 5%                IP address for eth0: <ipaddr>
    Swap usage:   0%
    WARNING: Your password has expired.
    You must change your password now and login again!
    Changing password for cumulus.
    (current) UNIX password: cumulus
    Enter new UNIX password:
    Retype new UNIX password:
    passwd: password updated successfully
    Connection to <ipaddr> closed.
    

    Log in again with your new password.

    $ ssh cumulus@<ipaddr>
    Warning: Permanently added '<ipaddr>' (ECDSA) to the list of known hosts.
    Ubuntu 20.04 LTS
    cumulus@<ipaddr>'s password:
      System information as of Thu Dec  3 21:35:59 UTC 2020
      System load:  0.07              Processes:           121
      Usage of /:   8.1% of 61.86GB   Users logged in:     0
      Memory usage: 5%                IP address for eth0: <ipaddr>
      Swap usage:   0%
    Last login: Thu Dec  3 21:35:43 2020 from <local-ipaddr>
    cumulus@ubuntu:~$
    
  • Verify the platform is ready for installation. Fix any errors indicated before installing the NetQ software.

    cumulus@hostname:~$ sudo opta-check
  • Change the hostname for the VM from the default value.

    The default hostname for the NetQ Virtual Machines is ubuntu. Change the hostname to fit your naming conventions while meeting Internet and Kubernetes naming standards.

    Kubernetes requires that hostnames are composed of a sequence of labels concatenated with dots. For example, “en.wikipedia.org” is a hostname. Each label must be from 1 to 63 characters long. The entire hostname, including the delimiting dots, has a maximum of 253 ASCII characters.

    The Internet standards (RFCs) for protocols specify that labels may contain only the ASCII letters a through z (in lower case), the digits 0 through 9, and the hyphen-minus character ('-').

    Use the following command:

    cumulus@hostname:~$ sudo hostnamectl set-hostname NEW_HOSTNAME

    Add the same NEW_HOSTNAME value to /etc/hosts on your VM for the localhost entry. Example:

    127.0.0.1 localhost NEW_HOSTNAME
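
    For example, one way to update the localhost entry in place (NEW_HOSTNAME is a placeholder for your chosen hostname):

    cumulus@hostname:~$ sudo sed -i 's/^127\.0\.0\.1\s\+localhost.*/127.0.0.1 localhost NEW_HOSTNAME/' /etc/hosts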
  • The final step is to install and activate the NetQ software using the CLI:

  • Run the following command on your NetQ platform server or NetQ Appliance:

    cumulus@hostname:~$ netq install standalone full interface eth0 bundle /mnt/installables/NetQ-4.4.0.tgz

    You can specify the IP address instead of the interface name here: use ip-addr <IP address> in place of interface <ifname> above.
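
    For example, using a placeholder address:

    cumulus@hostname:~$ netq install standalone full ip-addr 192.168.1.10 bundle /mnt/installables/NetQ-4.4.0.tgz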

    If you have changed the IP address or hostname of the NetQ On-premises VM after this step, you need to re-register this address with NetQ as follows:

    Reset the VM, indicating whether you want to purge any NetQ DB data or keep it.

    cumulus@hostname:~$ netq bootstrap reset [purge-db|keep-db]

    Re-run the install CLI on the appliance. This example uses interface eno1. Replace this with your updated IP address, hostname or interface using the interface or ip-addr option.

    cumulus@hostname:~$ netq install standalone full interface eno1 bundle /mnt/installables/NetQ-4.4.0.tgz

    If this step fails for any reason, you can run netq bootstrap reset and then try again.

    Verify Installation Status

    To view the status of the installation, use the netq show status [verbose] command. The following example shows a successful on-premises installation:

    State: Active
        Version: 4.4.0
        Installer Version: 4.4.0
        Installation Type: Standalone
        Activation Key: PKrgipMGEhVuZXRxLWVuZHBvaW50LWdhdGV3YXkYsagDIixUQmFLTUhzZU80RUdTL3pOT01uQ2lnRnrrUhTbXNPUGRXdnUwTVo5SEpBPTIHZGVmYXVsdDoHbmV0cWRldgz=
        Master SSH Public Key: a3NoLXJzYSBBQUFBQjNOemFDMXljMkVBQUFBREFRQUJBQUFCQVFEazliekZDblJUajkvQVhOZ0hteXByTzZIb3Y2cVZBWFdsNVNtKzVrTXo3dmMrcFNZTGlOdWl1bEhZeUZZVDhSNmU3bFdqS3NrSE10bzArNFJsQVd6cnRvbVVzLzlLMzQ4M3pUMjVZQXpIU2N1ZVhBSE1TdTZHZ0JyUkpXYUpTNjJ2RTkzcHBDVjBxWWJvUFo3aGpCY3ozb0VVWnRsU1lqQlZVdjhsVjBNN3JEWW52TXNGSURWLzJ2eks3K0x2N01XTG5aT054S09hdWZKZnVOT0R4YjFLbk1mN0JWK3hURUpLWW1mbTY1ckoyS1ArOEtFUllrr5TkF3bFVRTUdmT3daVHF2RWNoZnpQajMwQ29CWDZZMzVST2hDNmhVVnN5OEkwdjVSV0tCbktrWk81MWlMSDAyZUpJbXJHUGdQa2s1SzhJdGRrQXZISVlTZ0RwRlpRb3Igcm9vdEBucXRzLTEwLTE4OC00NC0xNDc=
        Is Cloud: False
        
        Cluster Status:
        IP Address     Hostname       Role    Status
        -------------  -------------  ------  --------
        10.188.44.147  10.188.44.147  Role    Ready
        
        NetQ... Active
        

    Run the netq show opta-health command to verify all applications are operating properly. Allow 10-15 minutes for all applications to come up and report their status.

    cumulus@hostname:~$ netq show opta-health
        Application                                            Status    Namespace      Restarts    Timestamp
        -----------------------------------------------------  --------  -------------  ----------  ------------------------
        cassandra-rc-0-w7h4z                                   READY     default        0           Fri Apr 10 16:08:38 2020
        cp-schema-registry-deploy-6bf5cbc8cc-vwcsx             READY     default        0           Fri Apr 10 16:08:38 2020
        kafka-broker-rc-0-p9r2l                                READY     default        0           Fri Apr 10 16:08:38 2020
        kafka-connect-deploy-7799bcb7b4-xdm5l                  READY     default        0           Fri Apr 10 16:08:38 2020
        netq-api-gateway-deploy-55996ff7c8-w4hrs               READY     default        0           Fri Apr 10 16:08:38 2020
        netq-app-address-deploy-66776ccc67-phpqk               READY     default        0           Fri Apr 10 16:08:38 2020
        netq-app-admin-oob-mgmt-server                         READY     default        0           Fri Apr 10 16:08:38 2020
        netq-app-bgp-deploy-7dd4c9d45b-j9bfr                   READY     default        0           Fri Apr 10 16:08:38 2020
        netq-app-clagsession-deploy-69564895b4-qhcpr           READY     default        0           Fri Apr 10 16:08:38 2020
        netq-app-configdiff-deploy-ff54c4cc4-7rz66             READY     default        0           Fri Apr 10 16:08:38 2020
        ...
        

    If any of the applications or services display Status as DOWN after 30 minutes, open a support ticket and attach the output of the opta-support command.
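
    For reference, generate that output by running the command on the NetQ server or appliance (a sketch; the command reports where it writes its archive):

    cumulus@hostname:~$ sudo opta-support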

    After NetQ is installed, you can log in to NetQ from your browser.

    Set Up Your VMware Virtual Machine for a Single Cloud Server

    Follow these steps to set up and configure your VM for a cloud deployment:

    1. Verify that your system meets the VM requirements.

      Resource                  Minimum Requirements
      Processor                 Four (4) virtual CPUs
      Memory                    8 GB RAM
      Local disk storage        64 GB
      Network interface speed   1 Gb NIC
      Hypervisor                VMware ESXi™ 6.5 or later (OVA image) for servers running Cumulus Linux, CentOS, Ubuntu, and RedHat operating systems
    2. Confirm that the needed ports are open for communications.

      You must open the following ports on your NetQ on-premises server:
      Port or Protocol Number   Protocol      Component Access
      4                         IP Protocol   Calico networking (IP-in-IP Protocol)
      22                        TCP           SSH
      80                        TCP           Nginx
      179                       TCP           Calico networking (BGP)
      443                       TCP           NetQ UI
      2379                      TCP           etcd datastore
      4789                      UDP           Calico networking (VxLAN)
      5000                      TCP           Docker registry
      6443                      TCP           kube-apiserver
      30001                     TCP           DPU communication
      31980                     TCP           NetQ Agent communication
      31982                     TCP           NetQ Agent SSL communication
      32708                     TCP           API Gateway
    3. Download the NetQ Platform image.

      1. On the NVIDIA Application Hub, log in to your account.
      2. Select NVIDIA Licensing Portal.
      3. Select Software Downloads from the menu.
      4. Click Product Family and select NetQ.
      5. Locate the NetQ SW 4.4 VMWare Cloud image and select Download.
      6. If prompted, read the license agreement and proceed with the download.

      For enterprise customers, if you do not see a link to the NVIDIA Licensing Portal on the NVIDIA Application Hub, contact NVIDIA support.


      For NVIDIA employees, download NetQ directly from the NVIDIA Licensing Portal.

    4. Set up and configure your VM.

      VMware Example Configuration

      This example shows the VM setup process using an OVA file with VMware ESXi.
      1. Enter the address of the hardware in your browser.

      2. Log in to VMware using credentials with root access.

      3. Click Storage in the Navigator to verify you have an SSD installed.

      4. Click Create/Register VM at the top of the right pane.

      5. Select Deploy a virtual machine from an OVF or OVA file, and click Next.

      6. Provide a name for the VM, for example NetQ.

        Tip: Make note of the name used during install as this is needed in a later step.

      7. Drag and drop the NetQ Platform image file you downloaded in Step 3 above.

    5. Click Next.

    6. Select the storage type and data store for the image to use, then click Next. In this example, only one is available.

    7. Accept the default deployment options or modify them according to your network needs. Click Next when you are finished.

    8. Review the configuration summary. Click Back to change any of the settings, or click Finish to continue with the creation of the VM.

      The progress of the request is shown in the Recent Tasks window at the bottom of the application. This may take some time, so continue with your other work until the upload finishes.

    9. Once completed, view the full details of the VM and hardware.

  • Log in to the VM and change the password.

    Use the default credentials to log in the first time:

    • Username: cumulus
    • Password: cumulus
    $ ssh cumulus@<ipaddr>
    Warning: Permanently added '<ipaddr>' (ECDSA) to the list of known hosts.
    Ubuntu 20.04 LTS
    cumulus@<ipaddr>'s password:
    You are required to change your password immediately (root enforced)
    System information as of Thu Dec  3 21:35:42 UTC 2020
    System load:  0.09              Processes:           120
    Usage of /:   8.1% of 61.86GB   Users logged in:     0
    Memory usage: 5%                IP address for eth0: <ipaddr>
    Swap usage:   0%
    WARNING: Your password has expired.
    You must change your password now and login again!
    Changing password for cumulus.
    (current) UNIX password: cumulus
    Enter new UNIX password:
    Retype new UNIX password:
    passwd: password updated successfully
    Connection to <ipaddr> closed.
    

    Log in again with your new password.

    $ ssh cumulus@<ipaddr>
    Warning: Permanently added '<ipaddr>' (ECDSA) to the list of known hosts.
    Ubuntu 20.04 LTS
    cumulus@<ipaddr>'s password:
      System information as of Thu Dec  3 21:35:59 UTC 2020
      System load:  0.07              Processes:           121
      Usage of /:   8.1% of 61.86GB   Users logged in:     0
      Memory usage: 5%                IP address for eth0: <ipaddr>
      Swap usage:   0%
    Last login: Thu Dec  3 21:35:43 2020 from <local-ipaddr>
    cumulus@ubuntu:~$
    
  • Verify the platform is ready for installation. Fix any errors indicated before installing the NetQ software.

    cumulus@hostname:~$ sudo opta-check-cloud
  • Change the hostname for the VM from the default value.

    The default hostname for the NetQ Virtual Machines is ubuntu. Change the hostname to fit your naming conventions while meeting Internet and Kubernetes naming standards.

    Kubernetes requires that hostnames are composed of a sequence of labels concatenated with dots. For example, “en.wikipedia.org” is a hostname. Each label must be from 1 to 63 characters long. The entire hostname, including the delimiting dots, has a maximum of 253 ASCII characters.

    The Internet standards (RFCs) for protocols specify that labels may contain only the ASCII letters a through z (in lower case), the digits 0 through 9, and the hyphen-minus character ('-').

    Use the following command:

    cumulus@hostname:~$ sudo hostnamectl set-hostname NEW_HOSTNAME

    Add the same NEW_HOSTNAME value to /etc/hosts on your VM for the localhost entry. Example:

    127.0.0.1 localhost NEW_HOSTNAME
  • The final step is to install and activate the NetQ software using the CLI:

  • Run the following command on your NetQ Cloud Appliance with the config-key obtained from the email you received from NVIDIA titled NetQ Access Link. You can also obtain the configuration key through the NetQ UI in the premise management configuration.

    cumulus@<hostname>:~$ netq install opta standalone full interface eth0 bundle /mnt/installables/NetQ-4.4.0-opta.tgz config-key <your-config-key> [proxy-host <proxy-hostname> proxy-port <proxy-port>]
    

    You can specify the IP address instead of the interface name here: use ip-addr <IP address> in place of interface <ifname> above.

    If you have changed the IP address or hostname of the NetQ OPTA after this step, you need to re-register this address with NetQ as follows:

    Reset the VM:

    cumulus@hostname:~$ netq bootstrap reset

    Re-run the install CLI on the appliance. This example uses interface eno1. Replace this with your updated IP address, hostname or interface using the interface or ip-addr option.

    cumulus@hostname:~$ netq install opta standalone full interface eno1 bundle /mnt/installables/NetQ-4.4.0-opta.tgz config-key <your-config-key> [proxy-host <proxy-hostname> proxy-port <proxy-port>]

    If this step fails for any reason, you can run netq bootstrap reset and then try again.

    Consider the following for container environments, and make adjustments as needed.

    Flannel Virtual Networks

    If you are using Flannel with a container environment on your network, you may need to change its default IP address ranges if they conflict with other addresses on your network. This can only be done one time during the first installation.

    The address range is 10.244.0.0/16. NetQ overrides the original Flannel default, which is 10.1.0.0/16.

    To change the default address range, use the install CLI with the pod-ip-range option. For example:

    cumulus@hostname:~$ netq install opta standalone full interface eth0 bundle /mnt/installables/NetQ-4.4.0-opta.tgz config-key <your-config-key> pod-ip-range 10.255.0.0/16

    Docker Default Bridge Interface

    The default Docker bridge interface is disabled in NetQ. If you need to reenable the interface, contact support.

    Verify Installation Status

    To view the status of the installation, use the netq show status [verbose] command. The following example shows a successful installation:

    State: Active
        Version: 4.4.0
        Installer Version: 4.4.0
        Installation Type: Standalone
        Activation Key: PKrgipMGEhVuZXRxLWVuZHBvaW50LWdhdGV3YXkYsagDIixUQmFLTUhzZU80RUdTL3pOT01uQ2lnRnrrUhTbXNPUGRXdnUwTVo5SEpBPTIHZGVmYXVsdDoHbmV0cWRldgz=
        Master SSH Public Key: a3NoLXJzYSBBQUFBQjNOemFDMXljMkVBQUFBREFRQUJBQUFCQVFEazliekZDblJUajkvQVhOZ0hteXByTzZIb3Y2cVZBWFdsNVNtKzVrTXo3dmMrcFNZTGlOdWl1bEhZeUZZVDhSNmU3bFdqS3NrSE10bzArNFJsQVd6cnRvbVVzLzlLMzQ4M3pUMjVZQXpIU2N1ZVhBSE1TdTZHZ0JyUkpXYUpTNjJ2RTkzcHBDVjBxWWJvUFo3aGpCY3ozb0VVWnRsU1lqQlZVdjhsVjBNN3JEWW52TXNGSURWLzJ2eks3K0x2N01XTG5aT054S09hdWZKZnVOT0R4YjFLbk1mN0JWK3hURUpLWW1mbTY1ckoyS1ArOEtFUllrr5TkF3bFVRTUdmT3daVHF2RWNoZnpQajMwQ29CWDZZMzVST2hDNmhVVnN5OEkwdjVSV0tCbktrWk81MWlMSDAyZUpJbXJHUGdQa2s1SzhJdGRrQXZISVlTZ0RwRlpRb3Igcm9vdEBucXRzLTEwLTE4OC00NC0xNDc=
        Is Cloud: False
        
        Cluster Status:
        IP Address     Hostname       Role    Status
        -------------  -------------  ------  --------
        10.188.44.147  10.188.44.147  Role    Ready
        
        NetQ... Active
        

    Run the netq show opta-health command to verify all applications are operating properly. Allow 10-15 minutes for all applications to come up and report their status.

    cumulus@hostname:~$ netq show opta-health
        Application                                            Status    Namespace      Restarts    Timestamp
        -----------------------------------------------------  --------  -------------  ----------  ------------------------
        cassandra-rc-0-w7h4z                                   READY     default        0           Fri Apr 10 16:08:38 2020
        cp-schema-registry-deploy-6bf5cbc8cc-vwcsx             READY     default        0           Fri Apr 10 16:08:38 2020
        kafka-broker-rc-0-p9r2l                                READY     default        0           Fri Apr 10 16:08:38 2020
        kafka-connect-deploy-7799bcb7b4-xdm5l                  READY     default        0           Fri Apr 10 16:08:38 2020
        netq-api-gateway-deploy-55996ff7c8-w4hrs               READY     default        0           Fri Apr 10 16:08:38 2020
        netq-app-address-deploy-66776ccc67-phpqk               READY     default        0           Fri Apr 10 16:08:38 2020
        netq-app-admin-oob-mgmt-server                         READY     default        0           Fri Apr 10 16:08:38 2020
        netq-app-bgp-deploy-7dd4c9d45b-j9bfr                   READY     default        0           Fri Apr 10 16:08:38 2020
        netq-app-clagsession-deploy-69564895b4-qhcpr           READY     default        0           Fri Apr 10 16:08:38 2020
        netq-app-configdiff-deploy-ff54c4cc4-7rz66             READY     default        0           Fri Apr 10 16:08:38 2020
        ...
        

    If any of the applications or services display Status as DOWN after 30 minutes, open a support ticket and attach the output of the opta-support command.

    After NetQ is installed, you can log in to NetQ from your browser.

    Set Up Your VMware Virtual Machine for an On-premises Server Cluster

    First configure the VM on the master node, and then configure the VM on each worker node.

    Follow these steps to set up and configure your VM cluster for an on-premises deployment:

    1. Verify that your master node meets the VM requirements.

      Resource                  Minimum Requirements
      Processor                 Sixteen (16) virtual CPUs
      Memory                    64 GB RAM
      Local disk storage        500 GB SSD with minimum disk IOPS of 1000 for a standard 4kb block size
                                (Note: This must be an SSD; use of other storage options can lead to system instability and are not supported.)
      Network interface speed   1 Gb NIC
      Hypervisor                VMware ESXi™ 6.5 or later (OVA image) for servers running Cumulus Linux, CentOS, Ubuntu, and RedHat operating systems
    2. Confirm that the needed ports are open for communications.

      You must open the following ports on your NetQ on-premises servers:
      Port or Protocol Number   Protocol      Component Access
      4                         IP Protocol   Calico networking (IP-in-IP Protocol)
      22                        TCP           SSH
      80                        TCP           Nginx
      179                       TCP           Calico networking (BGP)
      443                       TCP           NetQ UI
      2379                      TCP           etcd datastore
      4789                      UDP           Calico networking (VxLAN)
      5000                      TCP           Docker registry
      6443                      TCP           kube-apiserver
      30001                     TCP           DPU communication
      31980                     TCP           NetQ Agent communication
      31982                     TCP           NetQ Agent SSL communication
      32708                     TCP           API Gateway
      Additionally, for internal cluster communication, you must open these ports:
      Port    Protocol   Component Access
      8080    TCP        Admin API
      5000    TCP        Docker registry
      6443    TCP        Kubernetes API server
      10250   TCP        kubelet health probe
      2379    TCP        etcd
      2380    TCP        etcd
      7072    TCP        Kafka JMX monitoring
      9092    TCP        Kafka client
      7071    TCP        Cassandra JMX monitoring
      7000    TCP        Cassandra cluster communication
      9042    TCP        Cassandra client
      7073    TCP        Zookeeper JMX monitoring
      2888    TCP        Zookeeper cluster communication
      3888    TCP        Zookeeper cluster communication
      2181    TCP        Zookeeper client
      36443   TCP        Kubernetes control plane
    3. Download the NetQ Platform image.

      1. On the NVIDIA Application Hub, log in to your account.
      2. Select NVIDIA Licensing Portal.
      3. Select Software Downloads from the menu.
      4. Click Product Family and select NetQ.
      5. Locate the NetQ SW 4.4 VMWare image and select Download.
      6. If prompted, read the license agreement and proceed with the download.

      For enterprise customers, if you do not see a link to the NVIDIA Licensing Portal on the NVIDIA Application Hub, contact NVIDIA support.


      For NVIDIA employees, download NetQ directly from the NVIDIA Licensing Portal.

    4. Set up and configure your VM.

      VMware Example Configuration

      This example shows the VM setup process using an OVA file with VMware ESXi.
      1. Enter the address of the hardware in your browser.

      2. Log in to VMware using credentials with root access.

      3. Click Storage in the Navigator to verify you have an SSD installed.

      4. Click Create/Register VM at the top of the right pane.

      5. Select Deploy a virtual machine from an OVF or OVA file, and click Next.

      6. Provide a name for the VM, for example NetQ.

        Tip: Make note of the name used during install as this is needed in a later step.

      7. Drag and drop the NetQ Platform image file you downloaded in Step 3 above.

    5. Click Next.

    6. Select the storage type and data store for the image to use, then click Next. In this example, only one is available.

    7. Accept the default deployment options or modify them according to your network needs. Click Next when you are finished.

    8. Review the configuration summary. Click Back to change any of the settings, or click Finish to continue with the creation of the VM.

      The progress of the request is shown in the Recent Tasks window at the bottom of the application. This may take some time, so continue with your other work until the upload finishes.

    9. Once completed, view the full details of the VM and hardware.

  • Log in to the VM and change the password.

    Use the default credentials to log in the first time:

    • Username: cumulus
    • Password: cumulus
    $ ssh cumulus@<ipaddr>
    Warning: Permanently added '<ipaddr>' (ECDSA) to the list of known hosts.
    Ubuntu 20.04 LTS
    cumulus@<ipaddr>'s password:
    You are required to change your password immediately (root enforced)
    System information as of Thu Dec  3 21:35:42 UTC 2020
    System load:  0.09              Processes:           120
    Usage of /:   8.1% of 61.86GB   Users logged in:     0
    Memory usage: 5%                IP address for eth0: <ipaddr>
    Swap usage:   0%
    WARNING: Your password has expired.
    You must change your password now and login again!
    Changing password for cumulus.
    (current) UNIX password: cumulus
    Enter new UNIX password:
    Retype new UNIX password:
    passwd: password updated successfully
    Connection to <ipaddr> closed.
    

    Log in again with your new password.

    $ ssh cumulus@<ipaddr>
    Warning: Permanently added '<ipaddr>' (ECDSA) to the list of known hosts.
    Ubuntu 20.04 LTS
    cumulus@<ipaddr>'s password:
      System information as of Thu Dec  3 21:35:59 UTC 2020
      System load:  0.07              Processes:           121
      Usage of /:   8.1% of 61.86GB   Users logged in:     0
      Memory usage: 5%                IP address for eth0: <ipaddr>
      Swap usage:   0%
    Last login: Thu Dec  3 21:35:43 2020 from <local-ipaddr>
    cumulus@ubuntu:~$
    
  • Verify the master node is ready for installation. Fix any errors indicated before installing the NetQ software.

    cumulus@hostname:~$ sudo opta-check
  • Change the hostname for the VM from the default value.

    The default hostname for the NetQ Virtual Machines is ubuntu. Change the hostname to fit your naming conventions while meeting Internet and Kubernetes naming standards.

    Kubernetes requires that hostnames are composed of a sequence of labels concatenated with dots. For example, “en.wikipedia.org” is a hostname. Each label must be from 1 to 63 characters long. The entire hostname, including the delimiting dots, has a maximum of 253 ASCII characters.

    The Internet standards (RFCs) for protocols specify that labels may contain only the ASCII letters a through z (in lower case), the digits 0 through 9, and the hyphen-minus character ('-').

    Use the following command:

    cumulus@hostname:~$ sudo hostnamectl set-hostname NEW_HOSTNAME

    Add the same NEW_HOSTNAME value to /etc/hosts on your VM for the localhost entry. Example:

    127.0.0.1 localhost NEW_HOSTNAME
  • Verify that your first worker node meets the VM requirements, as described in Step 1.

  • Confirm that the needed ports are open for communications, as described in Step 2.

  • Open your hypervisor and set up the VM in the same manner as for the master node.

    Make a note of the private IP address you assign to the worker node. You need it for later installation steps.

  • Verify the worker node is ready for installation. Fix any errors indicated before installing the NetQ software.

    cumulus@hostname:~$ sudo opta-check-cloud
  • Repeat Steps 8 through 11 for each additional worker node you want in your cluster.

  • The final step is to install and activate the NetQ software using the CLI:

  • Run the following command on your master node to initialize the cluster. Copy the output of the command to use on your worker nodes:

    cumulus@<hostname>:~$ netq install cluster master-init
        Please run the following command on all worker nodes:
        netq install cluster worker-init c3NoLXJzYSBBQUFBQjNOemFDMXljMkVBQUFBREFRQUJBQUFCQVFDM2NjTTZPdVVUWWJ5c2Q3NlJ4SHdseHBsOHQ4N2VMRWVGR05LSWFWVnVNcy94OEE4RFNMQVhKOHVKRjVLUXBnVjdKM2lnMGJpL2hDMVhmSVVjU3l3ZmhvVDVZM3dQN1oySVZVT29ZTi8vR1lOek5nVlNocWZQMDNDRW0xNnNmSzVvUWRQTzQzRFhxQ3NjbndIT3dwZmhRYy9MWTU1a
    

    Run the netq install cluster worker-init <ssh-key> on each of your worker nodes.
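
    For illustration, the run on each worker looks like the following, where worker01 is a placeholder hostname and <ssh-key> is the string copied from the master-init output above:

    cumulus@worker01:~$ netq install cluster worker-init <ssh-key>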

    Run the following commands on your master node, using the IP addresses of your worker nodes:

    cumulus@<hostname>:~$ netq install cluster full interface eth0 bundle /mnt/installables/NetQ-4.4.0.tgz workers <worker-1-ip> <worker-2-ip>

    You can specify the IP address instead of the interface name here: use ip-addr <IP address> in place of interface <ifname> above.

    If you have changed the IP address or hostname of the NetQ On-premises VM after this step, you need to re-register this address with NetQ as follows:

    Reset the VM, indicating whether you want to purge any NetQ DB data or keep it.

    cumulus@hostname:~$ netq bootstrap reset [purge-db|keep-db]

    Re-run the install CLI on the appliance. This example uses interface eno1. Replace this with your updated IP address, hostname or interface using the interface or ip-addr option.

    cumulus@hostname:~$ netq install standalone full interface eno1 bundle /mnt/installables/NetQ-4.4.0.tgz

    If this step fails for any reason, you can run netq bootstrap reset and then try again.

    Verify Installation Status

    To view the status of the installation, use the netq show status [verbose] command. The following example shows a successful on-premises installation:

    State: Active
        Version: 4.4.0
        Installer Version: 4.4.0
        Installation Type: Standalone
        Activation Key: PKrgipMGEhVuZXRxLWVuZHBvaW50LWdhdGV3YXkYsagDIixUQmFLTUhzZU80RUdTL3pOT01uQ2lnRnrrUhTbXNPUGRXdnUwTVo5SEpBPTIHZGVmYXVsdDoHbmV0cWRldgz=
        Master SSH Public Key: a3NoLXJzYSBBQUFBQjNOemFDMXljMkVBQUFBREFRQUJBQUFCQVFEazliekZDblJUajkvQVhOZ0hteXByTzZIb3Y2cVZBWFdsNVNtKzVrTXo3dmMrcFNZTGlOdWl1bEhZeUZZVDhSNmU3bFdqS3NrSE10bzArNFJsQVd6cnRvbVVzLzlLMzQ4M3pUMjVZQXpIU2N1ZVhBSE1TdTZHZ0JyUkpXYUpTNjJ2RTkzcHBDVjBxWWJvUFo3aGpCY3ozb0VVWnRsU1lqQlZVdjhsVjBNN3JEWW52TXNGSURWLzJ2eks3K0x2N01XTG5aT054S09hdWZKZnVOT0R4YjFLbk1mN0JWK3hURUpLWW1mbTY1ckoyS1ArOEtFUllrr5TkF3bFVRTUdmT3daVHF2RWNoZnpQajMwQ29CWDZZMzVST2hDNmhVVnN5OEkwdjVSV0tCbktrWk81MWlMSDAyZUpJbXJHUGdQa2s1SzhJdGRrQXZISVlTZ0RwRlpRb3Igcm9vdEBucXRzLTEwLTE4OC00NC0xNDc=
        Is Cloud: False
        
        Cluster Status:
        IP Address     Hostname       Role    Status
        -------------  -------------  ------  --------
        10.188.44.147  10.188.44.147  Role    Ready
        
        NetQ... Active
        

    Run the netq show opta-health command to verify all applications are operating properly. Allow 10-15 minutes for all applications to come up and report their status.

    cumulus@hostname:~$ netq show opta-health
        Application                                            Status    Namespace      Restarts    Timestamp
        -----------------------------------------------------  --------  -------------  ----------  ------------------------
        cassandra-rc-0-w7h4z                                   READY     default        0           Fri Apr 10 16:08:38 2020
        cp-schema-registry-deploy-6bf5cbc8cc-vwcsx             READY     default        0           Fri Apr 10 16:08:38 2020
        kafka-broker-rc-0-p9r2l                                READY     default        0           Fri Apr 10 16:08:38 2020
        kafka-connect-deploy-7799bcb7b4-xdm5l                  READY     default        0           Fri Apr 10 16:08:38 2020
        netq-api-gateway-deploy-55996ff7c8-w4hrs               READY     default        0           Fri Apr 10 16:08:38 2020
        netq-app-address-deploy-66776ccc67-phpqk               READY     default        0           Fri Apr 10 16:08:38 2020
        netq-app-admin-oob-mgmt-server                         READY     default        0           Fri Apr 10 16:08:38 2020
        netq-app-bgp-deploy-7dd4c9d45b-j9bfr                   READY     default        0           Fri Apr 10 16:08:38 2020
        netq-app-clagsession-deploy-69564895b4-qhcpr           READY     default        0           Fri Apr 10 16:08:38 2020
        netq-app-configdiff-deploy-ff54c4cc4-7rz66             READY     default        0           Fri Apr 10 16:08:38 2020
        ...
        

    If any of the applications or services display Status as DOWN after 30 minutes, open a support ticket and attach the output of the opta-support command.

    After NetQ is installed, you can log in to NetQ from your browser.

    Set Up Your VMware Virtual Machine for a Cloud Server Cluster

    First configure the VM on the master node, and then configure the VM on each worker node.

    Follow these steps to set up and configure your VM on a cluster of servers in a cloud deployment:

    1. Verify that your master node meets the VM requirements.

      Resource                  Minimum Requirements
      Processor                 Four (4) virtual CPUs
      Memory                    8 GB RAM
      Local disk storage        64 GB
      Network interface speed   1 Gb NIC
      Hypervisor                VMware ESXi™ 6.5 or later (OVA image) for servers running Cumulus Linux, CentOS, Ubuntu, and RedHat operating systems
    2. Confirm that the needed ports are open for communications.

      You must open the following ports on your NetQ on-premises servers:
      Port or Protocol Number   Protocol      Component Access
      4                         IP Protocol   Calico networking (IP-in-IP Protocol)
      22                        TCP           SSH
      80                        TCP           Nginx
      179                       TCP           Calico networking (BGP)
      443                       TCP           NetQ UI
      2379                      TCP           etcd datastore
      4789                      UDP           Calico networking (VxLAN)
      5000                      TCP           Docker registry
      6443                      TCP           kube-apiserver
      30001                     TCP           DPU communication
      31980                     TCP           NetQ Agent communication
      31982                     TCP           NetQ Agent SSL communication
      32708                     TCP           API Gateway
      Additionally, for internal cluster communication, you must open these ports:
      Port    Protocol   Component Access
      8080    TCP        Admin API
      5000    TCP        Docker registry
      6443    TCP        Kubernetes API server
      10250   TCP        kubelet health probe
      2379    TCP        etcd
      2380    TCP        etcd
      7072    TCP        Kafka JMX monitoring
      9092    TCP        Kafka client
      7071    TCP        Cassandra JMX monitoring
      7000    TCP        Cassandra cluster communication
      9042    TCP        Cassandra client
      7073    TCP        Zookeeper JMX monitoring
      2888    TCP        Zookeeper cluster communication
      3888    TCP        Zookeeper cluster communication
      2181    TCP        Zookeeper client
      36443   TCP        Kubernetes control plane
    3. Download the NetQ Platform image.

      1. On the NVIDIA Application Hub, log in to your account.
      2. Select NVIDIA Licensing Portal.
      3. Select Software Downloads from the menu.
      4. Click Product Family and select NetQ.
      5. Locate the NetQ SW 4.4 VMWare Cloud image and select Download.
      6. If prompted, read the license agreement and proceed with the download.

      For enterprise customers, if you do not see a link to the NVIDIA Licensing Portal on the NVIDIA Application Hub, contact NVIDIA support.


      For NVIDIA employees, download NetQ directly from the NVIDIA Licensing Portal.

    4. Set up and configure your VM.

      VMware Example Configuration

      This example shows the VM setup process using an OVA file with VMware ESXi.
      1. Enter the address of the hardware in your browser.

      2. Log in to VMware using credentials with root access.

      3. Click Storage in the Navigator to verify you have an SSD installed.

      4. Click Create/Register VM at the top of the right pane.

      5. Select Deploy a virtual machine from an OVF or OVA file, and click Next.

      6. Provide a name for the VM, for example NetQ.

        Tip: Make note of the name used during install as this is needed in a later step.

      7. Drag and drop the NetQ Platform image file you downloaded in Step 3 above.

    5. Click Next.

    6. Select the storage type and data store for the image to use, then click Next. In this example, only one is available.

    7. Accept the default deployment options or modify them according to your network needs. Click Next when you are finished.

    8. Review the configuration summary. Click Back to change any of the settings, or click Finish to continue with the creation of the VM.

      The progress of the request is shown in the Recent Tasks window at the bottom of the application. This may take some time, so continue with your other work until the upload finishes.

    9. Once completed, view the full details of the VM and hardware.

  • Log in to the VM and change the password.

    Use the default credentials to log in the first time:

    • Username: cumulus
    • Password: cumulus
    $ ssh cumulus@<ipaddr>
    Warning: Permanently added '<ipaddr>' (ECDSA) to the list of known hosts.
    Ubuntu 20.04 LTS
    cumulus@<ipaddr>'s password:
    You are required to change your password immediately (root enforced)
    System information as of Thu Dec  3 21:35:42 UTC 2020
    System load:  0.09              Processes:           120
    Usage of /:   8.1% of 61.86GB   Users logged in:     0
    Memory usage: 5%                IP address for eth0: <ipaddr>
    Swap usage:   0%
    WARNING: Your password has expired.
    You must change your password now and login again!
    Changing password for cumulus.
    (current) UNIX password: cumulus
    Enter new UNIX password:
    Retype new UNIX password:
    passwd: password updated successfully
    Connection to <ipaddr> closed.
    

    Log in again with your new password.

    $ ssh cumulus@<ipaddr>
    Warning: Permanently added '<ipaddr>' (ECDSA) to the list of known hosts.
    Ubuntu 20.04 LTS
    cumulus@<ipaddr>'s password:
      System information as of Thu Dec  3 21:35:59 UTC 2020
      System load:  0.07              Processes:           121
      Usage of /:   8.1% of 61.86GB   Users logged in:     0
      Memory usage: 5%                IP address for eth0: <ipaddr>
      Swap usage:   0%
    Last login: Thu Dec  3 21:35:43 2020 from <local-ipaddr>
    cumulus@ubuntu:~$
    
  • Verify the master node is ready for installation. Fix any errors indicated before installing the NetQ software.

    cumulus@hostname:~$ sudo opta-check-cloud
  • Change the hostname for the VM from the default value.

    The default hostname for the NetQ Virtual Machines is ubuntu. Change the hostname to fit your naming conventions while meeting Internet and Kubernetes naming standards.

    Kubernetes requires that hostnames are composed of a sequence of labels concatenated with dots. For example, “en.wikipedia.org” is a hostname. Each label must be from 1 to 63 characters long. The entire hostname, including the delimiting dots, has a maximum of 253 ASCII characters.

    The Internet standards (RFCs) for protocols specify that labels may contain only the ASCII letters a through z (in lower case), the digits 0 through 9, and the hyphen-minus character ('-').

    Use the following command:

    cumulus@hostname:~$ sudo hostnamectl set-hostname NEW_HOSTNAME

    Add the same NEW_HOSTNAME value to /etc/hosts on your VM for the localhost entry. Example:

    127.0.0.1 localhost NEW_HOSTNAME
  • Verify that your first worker node meets the VM requirements, as described in Step 1.

  • Confirm that the needed ports are open for communications, as described in Step 2.

  • Open your hypervisor and set up the VM in the same manner as for the master node.

    Make a note of the private IP address you assign to the worker node. You need it for later installation steps.

  • Verify the worker node is ready for installation. Fix any errors indicated before installing the NetQ software.

    cumulus@hostname:~$ sudo opta-check-cloud
  • Repeat Steps 8 through 11 for each additional worker node you want in your cluster.

  • The final step is to install and activate the NetQ software using the CLI:

  • Run the following command on your master node to initialize the cluster. Copy the output of the command to use on your worker nodes:

    cumulus@<hostname>:~$ netq install cluster master-init
        Please run the following command on all worker nodes:
        netq install cluster worker-init c3NoLXJzYSBBQUFBQjNOemFDMXljMkVBQUFBREFRQUJBQUFCQVFDM2NjTTZPdVVUWWJ5c2Q3NlJ4SHdseHBsOHQ4N2VMRWVGR05LSWFWVnVNcy94OEE4RFNMQVhKOHVKRjVLUXBnVjdKM2lnMGJpL2hDMVhmSVVjU3l3ZmhvVDVZM3dQN1oySVZVT29ZTi8vR1lOek5nVlNocWZQMDNDRW0xNnNmSzVvUWRQTzQzRFhxQ3NjbndIT3dwZmhRYy9MWTU1a
        

    Run the netq install cluster worker-init <ssh-key> on each of your worker nodes.

    Run the following command on your NetQ Cloud Appliance with the config-key obtained from the email you received from NVIDIA titled NetQ Access Link. You can also obtain the configuration key through the NetQ UI in the premise management configuration.

    cumulus@<hostname>:~$ netq install opta cluster full interface eth0 bundle /mnt/installables/NetQ-4.4.0-opta.tgz config-key <your-config-key> workers <worker-1-ip> <worker-2-ip> [proxy-host <proxy-hostname> proxy-port <proxy-port>]
        

    You can specify the IP address instead of the interface name here: use ip-addr <IP address> in place of interface <ifname> above.

    If you have changed the IP address or hostname of the NetQ OPTA after this step, you need to re-register this address with NetQ as follows:

    Reset the VM:

    cumulus@hostname:~$ netq bootstrap reset

    Re-run the install CLI on the appliance. This example uses interface eth0. Replace this with your updated IP address, hostname or interface using the interface or ip-addr option.

    cumulus@hostname:~$ netq install opta standalone full interface eth0 bundle /mnt/installables/NetQ-4.4.0-opta.tgz config-key <your-config-key> [proxy-host <proxy-hostname> proxy-port <proxy-port>]

    If this step fails for any reason, you can run netq bootstrap reset and then try again.

    Consider the following for container environments, and make adjustments as needed.

    Flannel Virtual Networks

    If you are using Flannel with a container environment on your network, you may need to change its default IP address ranges if they conflict with other addresses on your network. This can only be done one time during the first installation.

    The NetQ default address range is 10.244.0.0/16; this overrides the original Flannel default of 10.1.0.0/16.

    To change the default address range, use the install CLI with the pod-ip-range option. For example:

    cumulus@hostname:~$ netq install opta standalone full interface eth0 bundle /mnt/installables/NetQ-4.4.0-opta.tgz config-key <your-config-key> pod-ip-range 10.255.0.0/16
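
    Before choosing a replacement range, you can check whether a candidate range is already in use on the server. The following optional, minimal sketch uses the 10.255.0.0/16 range from the example above; any output suggests a potential conflict:

    cumulus@hostname:~$ ip route | grep '10\.255\.'                  # existing routes that fall inside the candidate range
    cumulus@hostname:~$ ip -4 -brief addr show | grep '10\.255\.'    # locally configured addresses inside the candidate range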

    Docker Default Bridge Interface

    The default Docker bridge interface is disabled in NetQ. If you need to reenable the interface, contact support.

    Verify Installation Status

    To view the status of the installation, use the netq show status [verbose] command. The following example shows the output for a successful installation:

    State: Active
        Version: 4.4.0
        Installer Version: 4.4.0
        Installation Type: Standalone
        Activation Key: PKrgipMGEhVuZXRxLWVuZHBvaW50LWdhdGV3YXkYsagDIixUQmFLTUhzZU80RUdTL3pOT01uQ2lnRnrrUhTbXNPUGRXdnUwTVo5SEpBPTIHZGVmYXVsdDoHbmV0cWRldgz=
        Master SSH Public Key: a3NoLXJzYSBBQUFBQjNOemFDMXljMkVBQUFBREFRQUJBQUFCQVFEazliekZDblJUajkvQVhOZ0hteXByTzZIb3Y2cVZBWFdsNVNtKzVrTXo3dmMrcFNZTGlOdWl1bEhZeUZZVDhSNmU3bFdqS3NrSE10bzArNFJsQVd6cnRvbVVzLzlLMzQ4M3pUMjVZQXpIU2N1ZVhBSE1TdTZHZ0JyUkpXYUpTNjJ2RTkzcHBDVjBxWWJvUFo3aGpCY3ozb0VVWnRsU1lqQlZVdjhsVjBNN3JEWW52TXNGSURWLzJ2eks3K0x2N01XTG5aT054S09hdWZKZnVOT0R4YjFLbk1mN0JWK3hURUpLWW1mbTY1ckoyS1ArOEtFUllrr5TkF3bFVRTUdmT3daVHF2RWNoZnpQajMwQ29CWDZZMzVST2hDNmhVVnN5OEkwdjVSV0tCbktrWk81MWlMSDAyZUpJbXJHUGdQa2s1SzhJdGRrQXZISVlTZ0RwRlpRb3Igcm9vdEBucXRzLTEwLTE4OC00NC0xNDc=
        Is Cloud: False
        
        Cluster Status:
        IP Address     Hostname       Role    Status
        -------------  -------------  ------  --------
        10.188.44.147  10.188.44.147  Role    Ready
        
        NetQ... Active
        

    Run the netq show opta-health command to verify all applications are operating properly. Allow 10-15 minutes for all applications to come up and report their status.

    cumulus@hostname:~$ netq show opta-health
        Application                                            Status    Namespace      Restarts    Timestamp
        -----------------------------------------------------  --------  -------------  ----------  ------------------------
        cassandra-rc-0-w7h4z                                   READY     default        0           Fri Apr 10 16:08:38 2020
        cp-schema-registry-deploy-6bf5cbc8cc-vwcsx             READY     default        0           Fri Apr 10 16:08:38 2020
        kafka-broker-rc-0-p9r2l                                READY     default        0           Fri Apr 10 16:08:38 2020
        kafka-connect-deploy-7799bcb7b4-xdm5l                  READY     default        0           Fri Apr 10 16:08:38 2020
        netq-api-gateway-deploy-55996ff7c8-w4hrs               READY     default        0           Fri Apr 10 16:08:38 2020
        netq-app-address-deploy-66776ccc67-phpqk               READY     default        0           Fri Apr 10 16:08:38 2020
        netq-app-admin-oob-mgmt-server                         READY     default        0           Fri Apr 10 16:08:38 2020
        netq-app-bgp-deploy-7dd4c9d45b-j9bfr                   READY     default        0           Fri Apr 10 16:08:38 2020
        netq-app-clagsession-deploy-69564895b4-qhcpr           READY     default        0           Fri Apr 10 16:08:38 2020
        netq-app-configdiff-deploy-ff54c4cc4-7rz66             READY     default        0           Fri Apr 10 16:08:38 2020
        ...
        

    If any of the applications or services display Status as DOWN after 30 minutes, open a support ticket and attach the output of the opta-support command.
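
    For example, the following sketch filters the health output down to applications that are not READY and then runs the opta-support command whose output you attach to the ticket (the grep filter is illustrative, not part of NetQ):

    cumulus@hostname:~$ netq show opta-health | grep -vi ready     # list only entries whose Status is not READY
    cumulus@hostname:~$ opta-support                               # collect the support information referenced above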

    After NetQ is installed, you can log in to NetQ from your browser.

    Set Up Your KVM Virtual Machine for a Single On-premises Server

    Follow these steps to set up and configure your VM on a single server in an on-premises deployment:

    1. Verify that your system meets the VM requirements.

      Resource                 Minimum Requirements
      -----------------------  --------------------
      Processor                Sixteen (16) virtual CPUs
      Memory                   64 GB RAM
      Local disk storage       500 GB SSD with minimum disk IOPS of 1000 for a standard 4 KB block size
                               (Note: This must be an SSD; use of other storage options can lead to system instability and is not supported.)
      Network interface speed  1 Gb NIC
      Hypervisor               KVM/QCOW (QEMU Copy on Write) image for servers running CentOS, Ubuntu, and Red Hat operating systems
    2. Confirm that the required ports are open for communications.

      You must open the following ports on your NetQ on-premises server:
      Port or Protocol Number  Protocol     Component Access
      -----------------------  -----------  --------------------------------------
      4                        IP Protocol  Calico networking (IP-in-IP Protocol)
      22                       TCP          SSH
      80                       TCP          Nginx
      179                      TCP          Calico networking (BGP)
      443                      TCP          NetQ UI
      2379                     TCP          etcd datastore
      4789                     UDP          Calico networking (VxLAN)
      5000                     TCP          Docker registry
      6443                     TCP          kube-apiserver
      30001                    TCP          DPU communication
      31980                    TCP          NetQ Agent communication
      31982                    TCP          NetQ Agent SSL communication
      32708                    TCP          API Gateway
    3. Download the NetQ Platform image.

      1. On the NVIDIA Application Hub, log in to your account.
      2. Select NVIDIA Licensing Portal.
      3. Select Software Downloads from the menu.
      4. Click Product Family and select NetQ.
      5. Locate the NetQ SW 4.4 KVM image and select Download.
      6. If prompted, agree to the license agreement and proceed with the download.

      For enterprise customers, if you do not see a link to the NVIDIA Licensing Portal on the NVIDIA Application Hub, contact NVIDIA support.


      For NVIDIA employees, download NetQ directly from the NVIDIA Licensing Portal.

    4. Set up and configure your VM.

      KVM Example Configuration

      This example shows the VM setup process for a system with Libvirt and KVM/QEMU installed.

      1. Confirm that the SHA256 checksum matches the one posted on the NVIDIA Application Hub to ensure the image download has not been corrupted.

        $ sha256sum ./Downloads/netq-4.4.0-ubuntu-18.04-ts-qemu.qcow2
        0A00383666376471A8190E2367B27068B81D6EE00FDE885C68F4E3B3025A00B6 ./Downloads/netq-4.4.0-ubuntu-18.04-ts-qemu.qcow2
      2. Copy the QCOW2 image to a directory where you want to run it.

        Tip: Copy, instead of moving, the original QCOW2 image that was downloaded to avoid re-downloading it again later should you need to perform this process again.

        $ sudo mkdir /vms
        $ sudo cp ./Downloads/netq-4.4.0-ubuntu-18.04-ts-qemu.qcow2 /vms/ts.qcow2
      3. Create the VM.

        For a Direct VM, where the VM uses a MACVLAN interface to sit on the host interface for its connectivity:

        $ virt-install --name=netq_ts --vcpus=16 --memory=65536 --os-type=linux --os-variant=generic --disk path=/vms/ts.qcow2,format=qcow2,bus=virtio,cache=none --network=type=direct,source=eth0,model=virtio --import --noautoconsole

        Replace the disk path value with the location where the QCOW2 image is to reside. Replace the network source value (eth0 in the above example) with the name of the host interface that connects the VM to the external network.

        Or, for a Bridged VM, where the VM attaches to a bridge which has already been setup to allow for external access:

        $ virt-install --name=netq_ts --vcpus=16 --memory=65536 --os-type=linux --os-variant=generic --disk path=/vms/ts.qcow2,format=qcow2,bus=virtio,cache=none --network=bridge=br0,model=virtio --import --noautoconsole

        Replace network bridge value (br0 in the above example) with the name of the (pre-existing) bridge interface where the VM is connected to the external network.

        Make note of the name used during install as this is needed in a later step.

      4. Watch the boot process in another terminal window.
        $ virsh console netq_ts
    5. Log in to the VM and change the password.

      Use the default credentials to log in the first time:

      • Username: cumulus
      • Password: cumulus
      $ ssh cumulus@<ipaddr>
      Warning: Permanently added '<ipaddr>' (ECDSA) to the list of known hosts.
      Ubuntu 20.04 LTS
      cumulus@<ipaddr>'s password:
      You are required to change your password immediately (root enforced)
      System information as of Thu Dec  3 21:35:42 UTC 2020
      System load:  0.09              Processes:           120
      Usage of /:   8.1% of 61.86GB   Users logged in:     0
      Memory usage: 5%                IP address for eth0: <ipaddr>
      Swap usage:   0%
      WARNING: Your password has expired.
      You must change your password now and login again!
      Changing password for cumulus.
      (current) UNIX password: cumulus
      Enter new UNIX password:
      Retype new UNIX password:
      passwd: password updated successfully
      Connection to <ipaddr> closed.
      

      Log in again with your new password.

      $ ssh cumulus@<ipaddr>
      Warning: Permanently added '<ipaddr>' (ECDSA) to the list of known hosts.
      Ubuntu 20.04 LTS
      cumulus@<ipaddr>'s password:
        System information as of Thu Dec  3 21:35:59 UTC 2020
        System load:  0.07              Processes:           121
        Usage of /:   8.1% of 61.86GB   Users logged in:     0
        Memory usage: 5%                IP address for eth0: <ipaddr>
        Swap usage:   0%
      Last login: Thu Dec  3 21:35:43 2020 from <local-ipaddr>
      cumulus@ubuntu:~$
      
    6. Verify the platform is ready for installation. Fix any errors indicated before installing the NetQ software.

      cumulus@hostname:~$ sudo opta-check
    7. Change the hostname for the VM from the default value.

      The default hostname for the NetQ Virtual Machines is ubuntu. Change the hostname to fit your naming conventions while meeting Internet and Kubernetes naming standards.

      Kubernetes requires that hostnames are composed of a sequence of labels concatenated with dots. For example, “en.wikipedia.org” is a hostname. Each label must be from 1 to 63 characters long. The entire hostname, including the delimiting dots, has a maximum of 253 ASCII characters.

      The Internet standards (RFCs) for protocols specify that labels may contain only the ASCII letters a through z (in lower case), the digits 0 through 9, and the hyphen-minus character ('-').

      Use the following command:

      cumulus@hostname:~$ sudo hostnamectl set-hostname NEW_HOSTNAME

      Add the same NEW_HOSTNAME value to /etc/hosts on your VM for the localhost entry. Example:

      127.0.0.1 localhost NEW_HOSTNAME
    8. The final step is to install and activate the NetQ software:

    Run the following command on your NetQ platform server or NetQ Appliance:

    cumulus@hostname:~$ netq install standalone full interface eth0 bundle /mnt/installables/NetQ-4.4.0.tgz

    You can specify the IP address instead of the interface name here: use ip-addr <IP address> in place of interface <ifname> above.
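
    For example, if eth0 is assigned 192.0.2.10 (a placeholder address for illustration), the equivalent command is:

    cumulus@hostname:~$ netq install standalone full ip-addr 192.0.2.10 bundle /mnt/installables/NetQ-4.4.0.tgz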

    If you have changed the IP address or hostname of the NetQ On-premises VM after this step, you need to re-register this address with NetQ as follows:

    Reset the VM, indicating whether you want to purge any NetQ DB data or keep it.

    cumulus@hostname:~$ netq bootstrap reset [purge-db|keep-db]
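
    For example, to reset the VM while preserving the existing NetQ database:

    cumulus@hostname:~$ netq bootstrap reset keep-db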

    Re-run the install CLI on the appliance. This example uses interface eno1. Replace this with your updated IP address, hostname or interface using the interface or ip-addr option.

    cumulus@hostname:~$ netq install standalone full interface eno1 bundle /mnt/installables/NetQ-4.4.0.tgz

    If this step fails for any reason, you can run netq bootstrap reset and then try again.

    Verify Installation Status

    To view the status of the installation, use the netq show status [verbose] command. The following example shows a successful on-premises installation:

    State: Active
        Version: 4.4.0
        Installer Version: 4.4.0
        Installation Type: Standalone
        Activation Key: PKrgipMGEhVuZXRxLWVuZHBvaW50LWdhdGV3YXkYsagDIixUQmFLTUhzZU80RUdTL3pOT01uQ2lnRnrrUhTbXNPUGRXdnUwTVo5SEpBPTIHZGVmYXVsdDoHbmV0cWRldgz=
        Master SSH Public Key: a3NoLXJzYSBBQUFBQjNOemFDMXljMkVBQUFBREFRQUJBQUFCQVFEazliekZDblJUajkvQVhOZ0hteXByTzZIb3Y2cVZBWFdsNVNtKzVrTXo3dmMrcFNZTGlOdWl1bEhZeUZZVDhSNmU3bFdqS3NrSE10bzArNFJsQVd6cnRvbVVzLzlLMzQ4M3pUMjVZQXpIU2N1ZVhBSE1TdTZHZ0JyUkpXYUpTNjJ2RTkzcHBDVjBxWWJvUFo3aGpCY3ozb0VVWnRsU1lqQlZVdjhsVjBNN3JEWW52TXNGSURWLzJ2eks3K0x2N01XTG5aT054S09hdWZKZnVOT0R4YjFLbk1mN0JWK3hURUpLWW1mbTY1ckoyS1ArOEtFUllrr5TkF3bFVRTUdmT3daVHF2RWNoZnpQajMwQ29CWDZZMzVST2hDNmhVVnN5OEkwdjVSV0tCbktrWk81MWlMSDAyZUpJbXJHUGdQa2s1SzhJdGRrQXZISVlTZ0RwRlpRb3Igcm9vdEBucXRzLTEwLTE4OC00NC0xNDc=
        Is Cloud: False
        
        Cluster Status:
        IP Address     Hostname       Role    Status
        -------------  -------------  ------  --------
        10.188.44.147  10.188.44.147  Role    Ready
        
        NetQ... Active
        

    Run the netq show opta-health command to verify all applications are operating properly. Allow 10-15 minutes for all applications to come up and report their status.

    cumulus@hostname:~$ netq show opta-health
        Application                                            Status    Namespace      Restarts    Timestamp
        -----------------------------------------------------  --------  -------------  ----------  ------------------------
        cassandra-rc-0-w7h4z                                   READY     default        0           Fri Apr 10 16:08:38 2020
        cp-schema-registry-deploy-6bf5cbc8cc-vwcsx             READY     default        0           Fri Apr 10 16:08:38 2020
        kafka-broker-rc-0-p9r2l                                READY     default        0           Fri Apr 10 16:08:38 2020
        kafka-connect-deploy-7799bcb7b4-xdm5l                  READY     default        0           Fri Apr 10 16:08:38 2020
        netq-api-gateway-deploy-55996ff7c8-w4hrs               READY     default        0           Fri Apr 10 16:08:38 2020
        netq-app-address-deploy-66776ccc67-phpqk               READY     default        0           Fri Apr 10 16:08:38 2020
        netq-app-admin-oob-mgmt-server                         READY     default        0           Fri Apr 10 16:08:38 2020
        netq-app-bgp-deploy-7dd4c9d45b-j9bfr                   READY     default        0           Fri Apr 10 16:08:38 2020
        netq-app-clagsession-deploy-69564895b4-qhcpr           READY     default        0           Fri Apr 10 16:08:38 2020
        netq-app-configdiff-deploy-ff54c4cc4-7rz66             READY     default        0           Fri Apr 10 16:08:38 2020
        ...
        

    If any of the applications or services display Status as DOWN after 30 minutes, open a support ticket and attach the output of the opta-support command.

    After NetQ is installed, you can log in to NetQ from your browser.
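
    The UI is served over HTTPS on port 443, as listed in the port table above. As an optional quick check from another machine before opening a browser (the address shown is a placeholder), confirm that the server responds:

    $ curl -k -s -o /dev/null -w '%{http_code}\n' https://192.0.2.10/    # a 2xx or 3xx status code indicates the UI is reachable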

    Set Up Your KVM Virtual Machine for a Single Cloud Server

    Follow these steps to set up and configure your VM on a single server in a cloud deployment:

    1. Verify that your system meets the VM requirements.

      Resource                 Minimum Requirements
      -----------------------  --------------------
      Processor                Four (4) virtual CPUs
      Memory                   8 GB RAM
      Local disk storage       64 GB
      Network interface speed  1 Gb NIC
      Hypervisor               KVM/QCOW (QEMU Copy on Write) image for servers running CentOS, Ubuntu, and Red Hat operating systems
    2. Confirm that the needed ports are open for communications.

      You must open the following ports on your NetQ cloud server:
      Port or Protocol Number  Protocol     Component Access
      -----------------------  -----------  --------------------------------------
      4                        IP Protocol  Calico networking (IP-in-IP Protocol)
      22                       TCP          SSH
      80                       TCP          Nginx
      179                      TCP          Calico networking (BGP)
      443                      TCP          NetQ UI
      2379                     TCP          etcd datastore
      4789                     UDP          Calico networking (VxLAN)
      5000                     TCP          Docker registry
      6443                     TCP          kube-apiserver
      30001                    TCP          DPU communication
      31980                    TCP          NetQ Agent communication
      31982                    TCP          NetQ Agent SSL communication
      32708                    TCP          API Gateway
    3. Download the NetQ Cloud image.

      1. On the NVIDIA Application Hub, log in to your account.
      2. Select NVIDIA Licensing Portal.
      3. Select Software Downloads from the menu.
      4. Click Product Family and select NetQ.
      5. Locate the NetQ SW 4.4 KVM Cloud image and select Download.
      6. If prompted, agree to the license agreement and proceed with the download.

      For enterprise customers, if you do not see a link to the NVIDIA Licensing Portal on the NVIDIA Application Hub, contact NVIDIA support.


      For NVIDIA employees, download NetQ directly from the NVIDIA Licensing Portal.

    4. Set up and configure your VM.

      KVM Example Configuration

      This example shows the VM setup process for a system with Libvirt and KVM/QEMU installed.

      1. Confirm that the SHA256 checksum matches the one posted on the NVIDIA Application Hub to ensure the image download has not been corrupted.

        $ sha256sum ./Downloads/netq-4.0.0-ubuntu-18.04-tscloud-qemu.qcow2
        FE353FC06D3F843F4041D74C853D38B0A56036C5886F6233A3ED1A9464AEB783 ./Downloads/netq-4.0.0-ubuntu-18.04-tscloud-qemu.qcow2
      2. Copy the QCOW2 image to a directory where you want to run it.

        Tip: Copy, instead of moving, the original QCOW2 image that was downloaded to avoid re-downloading it again later should you need to perform this process again.

        $ sudo mkdir /vms
        $ sudo cp ./Downloads/netq-4.0.0-ubuntu-18.04-tscloud-qemu.qcow2 /vms/ts.qcow2
      3. Create the VM.

        For a Direct VM, where the VM uses a MACVLAN interface to sit on the host interface for its connectivity:

        $ virt-install --name=netq_ts --vcpus=4 --memory=8192 --os-type=linux --os-variant=generic --disk path=/vms/ts.qcow2,format=qcow2,bus=virtio,cache=none --network=type=direct,source=eth0,model=virtio --import --noautoconsole

        Replace the disk path value with the location where the QCOW2 image is to reside. Replace the network source value (eth0 in the above example) with the name of the host interface that connects the VM to the external network.

        Or, for a Bridged VM, where the VM attaches to a bridge which has already been setup to allow for external access:

        $ virt-install --name=netq_ts --vcpus=4 --memory=8192 --os-type=linux --os-variant=generic --disk path=/vms/ts.qcow2,format=qcow2,bus=virtio,cache=none --network=bridge=br0,model=virtio --import --noautoconsole

        Replace network bridge value (br0 in the above example) with the name of the (pre-existing) bridge interface where the VM is connected to the external network.

        Make note of the name used during install as this is needed in a later step.

      4. Watch the boot process in another terminal window.
        $ virsh console netq_ts
    5. Log in to the VM and change the password.

      Use the default credentials to log in the first time:

      • Username: cumulus
      • Password: cumulus
      $ ssh cumulus@<ipaddr>
      Warning: Permanently added '<ipaddr>' (ECDSA) to the list of known hosts.
      Ubuntu 20.04 LTS
      cumulus@<ipaddr>'s password:
      You are required to change your password immediately (root enforced)
      System information as of Thu Dec  3 21:35:42 UTC 2020
      System load:  0.09              Processes:           120
      Usage of /:   8.1% of 61.86GB   Users logged in:     0
      Memory usage: 5%                IP address for eth0: <ipaddr>
      Swap usage:   0%
      WARNING: Your password has expired.
      You must change your password now and login again!
      Changing password for cumulus.
      (current) UNIX password: cumulus
      Enter new UNIX password:
      Retype new UNIX password:
      passwd: password updated successfully
      Connection to <ipaddr> closed.
      

      Log in again with your new password.

      $ ssh cumulus@<ipaddr>
      Warning: Permanently added '<ipaddr>' (ECDSA) to the list of known hosts.
      Ubuntu 20.04 LTS
      cumulus@<ipaddr>'s password:
        System information as of Thu Dec  3 21:35:59 UTC 2020
        System load:  0.07              Processes:           121
        Usage of /:   8.1% of 61.86GB   Users logged in:     0
        Memory usage: 5%                IP address for eth0: <ipaddr>
        Swap usage:   0%
      Last login: Thu Dec  3 21:35:43 2020 from <local-ipaddr>
      cumulus@ubuntu:~$
      
    6. Verify the platform is ready for installation. Fix any errors indicated before installing the NetQ software.

      cumulus@hostname:~$ sudo opta-check-cloud
    7. Change the hostname for the VM from the default value.

      The default hostname for the NetQ Virtual Machines is ubuntu. Change the hostname to fit your naming conventions while meeting Internet and Kubernetes naming standards.

      Kubernetes requires that hostnames are composed of a sequence of labels concatenated with dots. For example, “en.wikipedia.org” is a hostname. Each label must be from 1 to 63 characters long. The entire hostname, including the delimiting dots, has a maximum of 253 ASCII characters.

      The Internet standards (RFCs) for protocols specify that labels may contain only the ASCII letters a through z (in lower case), the digits 0 through 9, and the hyphen-minus character ('-').

      Use the following command:

      cumulus@hostname:~$ sudo hostnamectl set-hostname NEW_HOSTNAME

      Add the same NEW_HOSTNAME value to /etc/hosts on your VM for the localhost entry. Example:

      127.0.0.1 localhost NEW_HOSTNAME
    8. The final step is to install and activate the NetQ software using the CLI:

    Run the following command on your NetQ Cloud Appliance with the config-key obtained from the email you received from NVIDIA titled NetQ Access Link. You can also obtain the configuration key through the NetQ UI in the premise management configuration.

    cumulus@<hostname>:~$ netq install opta standalone full interface eth0 bundle /mnt/installables/NetQ-4.4.0-opta.tgz config-key <your-config-key> [proxy-host <proxy-hostname> proxy-port <proxy-port>]
    

    You can specify the IP address instead of the interface name here: use ip-addr <IP address> in place of interface <ifname> above.

    If you have changed the IP address or hostname of the NetQ OPTA after this step, you need to re-register this address with NetQ as follows:

    Reset the VM:

    cumulus@hostname:~$ netq bootstrap reset

    Re-run the install CLI on the appliance. This example uses interface eno1. Replace this with your updated IP address, hostname or interface using the interface or ip-addr option.

    cumulus@hostname:~$ netq install opta standalone full interface eno1 bundle /mnt/installables/NetQ-4.4.0-opta.tgz config-key <your-config-key> [proxy-host <proxy-hostname> proxy-port <proxy-port>]

    If this step fails for any reason, you can run netq bootstrap reset and then try again.

    Consider the following for container environments, and make adjustments as needed.

    Flannel Virtual Networks

    If you are using Flannel with a container environment on your network, you may need to change its default IP address ranges if they conflict with other addresses on your network. This can only be done one time during the first installation.

    The NetQ default address range is 10.244.0.0/16; this overrides the original Flannel default of 10.1.0.0/16.

    To change the default address range, use the install CLI with the pod-ip-range option. For example:

    cumulus@hostname:~$ netq install opta standalone full interface eth0 bundle /mnt/installables/NetQ-4.4.0-opta.tgz config-key <your-config-key> pod-ip-range 10.255.0.0/16

    Docker Default Bridge Interface

    The default Docker bridge interface is disabled in NetQ. If you need to reenable the interface, contact support.

    Verify Installation Status

    To view the status of the installation, use the netq show status [verbose] command. The following example shows the output for a successful installation:

    State: Active
        Version: 4.4.0
        Installer Version: 4.4.0
        Installation Type: Standalone
        Activation Key: PKrgipMGEhVuZXRxLWVuZHBvaW50LWdhdGV3YXkYsagDIixUQmFLTUhzZU80RUdTL3pOT01uQ2lnRnrrUhTbXNPUGRXdnUwTVo5SEpBPTIHZGVmYXVsdDoHbmV0cWRldgz=
        Master SSH Public Key: a3NoLXJzYSBBQUFBQjNOemFDMXljMkVBQUFBREFRQUJBQUFCQVFEazliekZDblJUajkvQVhOZ0hteXByTzZIb3Y2cVZBWFdsNVNtKzVrTXo3dmMrcFNZTGlOdWl1bEhZeUZZVDhSNmU3bFdqS3NrSE10bzArNFJsQVd6cnRvbVVzLzlLMzQ4M3pUMjVZQXpIU2N1ZVhBSE1TdTZHZ0JyUkpXYUpTNjJ2RTkzcHBDVjBxWWJvUFo3aGpCY3ozb0VVWnRsU1lqQlZVdjhsVjBNN3JEWW52TXNGSURWLzJ2eks3K0x2N01XTG5aT054S09hdWZKZnVOT0R4YjFLbk1mN0JWK3hURUpLWW1mbTY1ckoyS1ArOEtFUllrr5TkF3bFVRTUdmT3daVHF2RWNoZnpQajMwQ29CWDZZMzVST2hDNmhVVnN5OEkwdjVSV0tCbktrWk81MWlMSDAyZUpJbXJHUGdQa2s1SzhJdGRrQXZISVlTZ0RwRlpRb3Igcm9vdEBucXRzLTEwLTE4OC00NC0xNDc=
        Is Cloud: False
        
        Cluster Status:
        IP Address     Hostname       Role    Status
        -------------  -------------  ------  --------
        10.188.44.147  10.188.44.147  Role    Ready
        
        NetQ... Active
        

    Run the netq show opta-health command to verify all applications are operating properly. Allow 10-15 minutes for all applications to come up and report their status.

    cumulus@hostname:~$ netq show opta-health
        Application                                            Status    Namespace      Restarts    Timestamp
        -----------------------------------------------------  --------  -------------  ----------  ------------------------
        cassandra-rc-0-w7h4z                                   READY     default        0           Fri Apr 10 16:08:38 2020
        cp-schema-registry-deploy-6bf5cbc8cc-vwcsx             READY     default        0           Fri Apr 10 16:08:38 2020
        kafka-broker-rc-0-p9r2l                                READY     default        0           Fri Apr 10 16:08:38 2020
        kafka-connect-deploy-7799bcb7b4-xdm5l                  READY     default        0           Fri Apr 10 16:08:38 2020
        netq-api-gateway-deploy-55996ff7c8-w4hrs               READY     default        0           Fri Apr 10 16:08:38 2020
        netq-app-address-deploy-66776ccc67-phpqk               READY     default        0           Fri Apr 10 16:08:38 2020
        netq-app-admin-oob-mgmt-server                         READY     default        0           Fri Apr 10 16:08:38 2020
        netq-app-bgp-deploy-7dd4c9d45b-j9bfr                   READY     default        0           Fri Apr 10 16:08:38 2020
        netq-app-clagsession-deploy-69564895b4-qhcpr           READY     default        0           Fri Apr 10 16:08:38 2020
        netq-app-configdiff-deploy-ff54c4cc4-7rz66             READY     default        0           Fri Apr 10 16:08:38 2020
        ...
        

    If any of the applications or services display Status as DOWN after 30 minutes, open a support ticket and attach the output of the opta-support command.

    After NetQ is installed, you can log in to NetQ from your browser.

    Set Up Your KVM Virtual Machine for an On-premises Server Cluster

    First configure the VM on the master node, and then configure the VM on each worker node.

    Follow these steps to set up and configure your VM on a cluster of servers in an on-premises deployment:

    1. Verify that your master node meets the VM requirements.

      Resource                 Minimum Requirements
      -----------------------  --------------------
      Processor                Sixteen (16) virtual CPUs
      Memory                   64 GB RAM
      Local disk storage       500 GB SSD with minimum disk IOPS of 1000 for a standard 4 KB block size
                               (Note: This must be an SSD; use of other storage options can lead to system instability and is not supported.)
      Network interface speed  1 Gb NIC
      Hypervisor               KVM/QCOW (QEMU Copy on Write) image for servers running CentOS, Ubuntu, and Red Hat operating systems
    2. Confirm that the needed ports are open for communications.

      You must open the following ports on your NetQ on-premises servers:
      Port or Protocol Number  Protocol     Component Access
      -----------------------  -----------  --------------------------------------
      4                        IP Protocol  Calico networking (IP-in-IP Protocol)
      22                       TCP          SSH
      80                       TCP          Nginx
      179                      TCP          Calico networking (BGP)
      443                      TCP          NetQ UI
      2379                     TCP          etcd datastore
      4789                     UDP          Calico networking (VxLAN)
      5000                     TCP          Docker registry
      6443                     TCP          kube-apiserver
      30001                    TCP          DPU communication
      31980                    TCP          NetQ Agent communication
      31982                    TCP          NetQ Agent SSL communication
      32708                    TCP          API Gateway
      Additionally, for internal cluster communication, you must open these ports:
      Port   Protocol  Component Access
      -----  --------  -------------------------------
      8080   TCP       Admin API
      5000   TCP       Docker registry
      6443   TCP       Kubernetes API server
      10250  TCP       kubelet health probe
      2379   TCP       etcd
      2380   TCP       etcd
      7072   TCP       Kafka JMX monitoring
      9092   TCP       Kafka client
      7071   TCP       Cassandra JMX monitoring
      7000   TCP       Cassandra cluster communication
      9042   TCP       Cassandra client
      7073   TCP       Zookeeper JMX monitoring
      2888   TCP       Zookeeper cluster communication
      3888   TCP       Zookeeper cluster communication
      2181   TCP       Zookeeper client
      36443  TCP       Kubernetes control plane
    3. Download the NetQ Platform image.

      1. On the NVIDIA Application Hub, log in to your account.
      2. Select NVIDIA Licensing Portal.
      3. Select Software Downloads from the menu.
      4. Click Product Family and select NetQ.
      5. Locate the NetQ SW 4.4 KVM image and select Download.
      6. If prompted, agree to the license agreement and proceed with the download.

      For enterprise customers, if you do not see a link to the NVIDIA Licensing Portal on the NVIDIA Application Hub, contact NVIDIA support.


      For NVIDIA employees, download NetQ directly from the NVIDIA Licensing Portal.

    4. Set up and configure your VM.

      KVM Example Configuration

      This example shows the VM setup process for a system with Libvirt and KVM/QEMU installed.

      1. Confirm that the SHA256 checksum matches the one posted on the NVIDIA Application Hub to ensure the image download has not been corrupted.

        $ sha256sum ./Downloads/netq-4.4.0-ubuntu-18.04-ts-qemu.qcow2
        0A00383666376471A8190E2367B27068B81D6EE00FDE885C68F4E3B3025A00B6 ./Downloads/netq-4.4.0-ubuntu-18.04-ts-qemu.qcow2
      2. Copy the QCOW2 image to a directory where you want to run it.

        Tip: Copy, instead of moving, the original QCOW2 image that was downloaded to avoid re-downloading it again later should you need to perform this process again.

        $ sudo mkdir /vms
        $ sudo cp ./Downloads/netq-4.4.0-ubuntu-18.04-ts-qemu.qcow2 /vms/ts.qcow2
      3. Create the VM.

        For a Direct VM, where the VM uses a MACVLAN interface to sit on the host interface for its connectivity:

        $ virt-install --name=netq_ts --vcpus=16 --memory=65536 --os-type=linux --os-variant=generic --disk path=/vms/ts.qcow2,format=qcow2,bus=virtio,cache=none --network=type=direct,source=eth0,model=virtio --import --noautoconsole

        Replace the disk path value with the location where the QCOW2 image is to reside. Replace the network source value (eth0 in the above example) with the name of the host interface that connects the VM to the external network.

        Or, for a Bridged VM, where the VM attaches to a bridge which has already been setup to allow for external access:

        $ virt-install --name=netq_ts --vcpus=16 --memory=65536 --os-type=linux --os-variant=generic --disk path=/vms/ts.qcow2,format=qcow2,bus=virtio,cache=none --network=bridge=br0,model=virtio --import --noautoconsole

        Replace network bridge value (br0 in the above example) with the name of the (pre-existing) bridge interface where the VM is connected to the external network.

        Make note of the name used during install as this is needed in a later step.

      4. Watch the boot process in another terminal window.
        $ virsh console netq_ts
    5. Log in to the VM and change the password.

      Use the default credentials to log in the first time:

      • Username: cumulus
      • Password: cumulus
      $ ssh cumulus@<ipaddr>
      Warning: Permanently added '<ipaddr>' (ECDSA) to the list of known hosts.
      Ubuntu 20.04 LTS
      cumulus@<ipaddr>'s password:
      You are required to change your password immediately (root enforced)
      System information as of Thu Dec  3 21:35:42 UTC 2020
      System load:  0.09              Processes:           120
      Usage of /:   8.1% of 61.86GB   Users logged in:     0
      Memory usage: 5%                IP address for eth0: <ipaddr>
      Swap usage:   0%
      WARNING: Your password has expired.
      You must change your password now and login again!
      Changing password for cumulus.
      (current) UNIX password: cumulus
      Enter new UNIX password:
      Retype new UNIX password:
      passwd: password updated successfully
      Connection to <ipaddr> closed.
      

      Log in again with your new password.

      $ ssh cumulus@<ipaddr>
      Warning: Permanently added '<ipaddr>' (ECDSA) to the list of known hosts.
      Ubuntu 20.04 LTS
      cumulus@<ipaddr>'s password:
        System information as of Thu Dec  3 21:35:59 UTC 2020
        System load:  0.07              Processes:           121
        Usage of /:   8.1% of 61.86GB   Users logged in:     0
        Memory usage: 5%                IP address for eth0: <ipaddr>
        Swap usage:   0%
      Last login: Thu Dec  3 21:35:43 2020 from <local-ipaddr>
      cumulus@ubuntu:~$
      
    6. Verify the master node is ready for installation. Fix any errors indicated before installing the NetQ software.

      cumulus@hostname:~$ sudo opta-check
    7. Change the hostname for the VM from the default value.

      The default hostname for the NetQ Virtual Machines is ubuntu. Change the hostname to fit your naming conventions while meeting Internet and Kubernetes naming standards.

      Kubernetes requires that hostnames are composed of a sequence of labels concatenated with dots. For example, “en.wikipedia.org” is a hostname. Each label must be from 1 to 63 characters long. The entire hostname, including the delimiting dots, has a maximum of 253 ASCII characters.

      The Internet standards (RFCs) for protocols specify that labels may contain only the ASCII letters a through z (in lower case), the digits 0 through 9, and the hyphen-minus character ('-').

      Use the following command:

      cumulus@hostname:~$ sudo hostnamectl set-hostname NEW_HOSTNAME

      Add the same NEW_HOSTNAME value to /etc/hosts on your VM for the localhost entry. Example:

      127.0.0.1 localhost NEW_HOSTNAME
    8. Verify that your first worker node meets the VM requirements, as described in Step 1.

    9. Confirm that the needed ports are open for communications, as described in Step 2.

    10. Open your hypervisor and set up the VM in the same manner as for the master node.

      Make a note of the private IP address you assign to the worker node. You need it for later installation steps.

    11. Verify the worker node is ready for installation. Fix any errors indicated before installing the NetQ software.

      cumulus@hostname:~$ sudo opta-check
    12. Repeat Steps 8 through 11 for each additional worker node you want in your cluster.

    13. The final step is to install and activate the NetQ software using the CLI:

    Run the following command on your master node to initialize the cluster. Copy the output of the command to use on your worker nodes:

    cumulus@<hostname>:~$ netq install cluster master-init
        Please run the following command on all worker nodes:
        netq install cluster worker-init c3NoLXJzYSBBQUFBQjNOemFDMXljMkVBQUFBREFRQUJBQUFCQVFDM2NjTTZPdVVUWWJ5c2Q3NlJ4SHdseHBsOHQ4N2VMRWVGR05LSWFWVnVNcy94OEE4RFNMQVhKOHVKRjVLUXBnVjdKM2lnMGJpL2hDMVhmSVVjU3l3ZmhvVDVZM3dQN1oySVZVT29ZTi8vR1lOek5nVlNocWZQMDNDRW0xNnNmSzVvUWRQTzQzRFhxQ3NjbndIT3dwZmhRYy9MWTU1a
    

    Run the netq install cluster worker-init <ssh-key> on each of your worker nodes.
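
    If you prefer to push that command from the master node instead of logging in to each worker, the following minimal sketch loops over the worker addresses (placeholders shown) and runs the worker-init command printed by master-init; it assumes you can SSH to each worker as the cumulus user:

    cumulus@<hostname>:~$ for worker in <worker-1-ip> <worker-2-ip>; do ssh cumulus@$worker "netq install cluster worker-init <ssh-key>"; done    # prompts for each worker's password unless SSH keys are configured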

    Run the following command on your master node, using the IP addresses of your worker nodes:

    cumulus@<hostname>:~$ netq install cluster full interface eth0 bundle /mnt/installables/NetQ-4.4.0.tgz workers <worker-1-ip> <worker-2-ip>

    You can specify the IP address instead of the interface name here: use ip-addr <IP address> in place of interface <ifname> above.

    If you have changed the IP address or hostname of the NetQ On-premises VM after this step, you need to re-register this address with NetQ as follows:

    Reset the VM, indicating whether you want to purge any NetQ DB data or keep it.

    cumulus@hostname:~$ netq bootstrap reset [purge-db|keep-db]

    Re-run the install CLI on the appliance. This example uses interface eno1. Replace this with your updated IP address, hostname or interface using the interface or ip-addr option.

    cumulus@hostname:~$ netq install standalone full interface eno1 bundle /mnt/installables/NetQ-4.4.0.tgz

    If this step fails for any reason, you can run netq bootstrap reset and then try again.

    Verify Installation Status

    To view the status of the installation, use the netq show status [verbose] command. The following example shows a successful on-premises installation:

    State: Active
        Version: 4.4.0
        Installer Version: 4.4.0
        Installation Type: Standalone
        Activation Key: PKrgipMGEhVuZXRxLWVuZHBvaW50LWdhdGV3YXkYsagDIixUQmFLTUhzZU80RUdTL3pOT01uQ2lnRnrrUhTbXNPUGRXdnUwTVo5SEpBPTIHZGVmYXVsdDoHbmV0cWRldgz=
        Master SSH Public Key: a3NoLXJzYSBBQUFBQjNOemFDMXljMkVBQUFBREFRQUJBQUFCQVFEazliekZDblJUajkvQVhOZ0hteXByTzZIb3Y2cVZBWFdsNVNtKzVrTXo3dmMrcFNZTGlOdWl1bEhZeUZZVDhSNmU3bFdqS3NrSE10bzArNFJsQVd6cnRvbVVzLzlLMzQ4M3pUMjVZQXpIU2N1ZVhBSE1TdTZHZ0JyUkpXYUpTNjJ2RTkzcHBDVjBxWWJvUFo3aGpCY3ozb0VVWnRsU1lqQlZVdjhsVjBNN3JEWW52TXNGSURWLzJ2eks3K0x2N01XTG5aT054S09hdWZKZnVOT0R4YjFLbk1mN0JWK3hURUpLWW1mbTY1ckoyS1ArOEtFUllrr5TkF3bFVRTUdmT3daVHF2RWNoZnpQajMwQ29CWDZZMzVST2hDNmhVVnN5OEkwdjVSV0tCbktrWk81MWlMSDAyZUpJbXJHUGdQa2s1SzhJdGRrQXZISVlTZ0RwRlpRb3Igcm9vdEBucXRzLTEwLTE4OC00NC0xNDc=
        Is Cloud: False
        
        Cluster Status:
        IP Address     Hostname       Role    Status
        -------------  -------------  ------  --------
        10.188.44.147  10.188.44.147  Role    Ready
        
        NetQ... Active
        

    Run the netq show opta-health command to verify all applications are operating properly. Allow 10-15 minutes for all applications to come up and report their status.

    cumulus@hostname:~$ netq show opta-health
        Application                                            Status    Namespace      Restarts    Timestamp
        -----------------------------------------------------  --------  -------------  ----------  ------------------------
        cassandra-rc-0-w7h4z                                   READY     default        0           Fri Apr 10 16:08:38 2020
        cp-schema-registry-deploy-6bf5cbc8cc-vwcsx             READY     default        0           Fri Apr 10 16:08:38 2020
        kafka-broker-rc-0-p9r2l                                READY     default        0           Fri Apr 10 16:08:38 2020
        kafka-connect-deploy-7799bcb7b4-xdm5l                  READY     default        0           Fri Apr 10 16:08:38 2020
        netq-api-gateway-deploy-55996ff7c8-w4hrs               READY     default        0           Fri Apr 10 16:08:38 2020
        netq-app-address-deploy-66776ccc67-phpqk               READY     default        0           Fri Apr 10 16:08:38 2020
        netq-app-admin-oob-mgmt-server                         READY     default        0           Fri Apr 10 16:08:38 2020
        netq-app-bgp-deploy-7dd4c9d45b-j9bfr                   READY     default        0           Fri Apr 10 16:08:38 2020
        netq-app-clagsession-deploy-69564895b4-qhcpr           READY     default        0           Fri Apr 10 16:08:38 2020
        netq-app-configdiff-deploy-ff54c4cc4-7rz66             READY     default        0           Fri Apr 10 16:08:38 2020
        ...
        

    If any of the applications or services display Status as DOWN after 30 minutes, open a support ticket and attach the output of the opta-support command.

    After NetQ is installed, you can log in to NetQ from your browser.

    Set Up Your KVM Virtual Machine for a Cloud Server Cluster

    First configure the VM on the master node, and then configure the VM on each worker node.

    Follow these steps to set up and configure your VM on a cluster of servers in a cloud deployment:

    1. Verify that your master node meets the VM requirements.

      Resource                 Minimum Requirements
      -----------------------  --------------------
      Processor                Four (4) virtual CPUs
      Memory                   8 GB RAM
      Local disk storage       64 GB
      Network interface speed  1 Gb NIC
      Hypervisor               KVM/QCOW (QEMU Copy on Write) image for servers running CentOS, Ubuntu, and Red Hat operating systems
    2. Confirm that the needed ports are open for communications.

      You must open the following ports on your NetQ cloud servers:
      Port or Protocol Number  Protocol     Component Access
      -----------------------  -----------  --------------------------------------
      4                        IP Protocol  Calico networking (IP-in-IP Protocol)
      22                       TCP          SSH
      80                       TCP          Nginx
      179                      TCP          Calico networking (BGP)
      443                      TCP          NetQ UI
      2379                     TCP          etcd datastore
      4789                     UDP          Calico networking (VxLAN)
      5000                     TCP          Docker registry
      6443                     TCP          kube-apiserver
      30001                    TCP          DPU communication
      31980                    TCP          NetQ Agent communication
      31982                    TCP          NetQ Agent SSL communication
      32708                    TCP          API Gateway
      Additionally, for internal cluster communication, you must open these ports:
      Port   Protocol  Component Access
      -----  --------  -------------------------------
      8080   TCP       Admin API
      5000   TCP       Docker registry
      6443   TCP       Kubernetes API server
      10250  TCP       kubelet health probe
      2379   TCP       etcd
      2380   TCP       etcd
      7072   TCP       Kafka JMX monitoring
      9092   TCP       Kafka client
      7071   TCP       Cassandra JMX monitoring
      7000   TCP       Cassandra cluster communication
      9042   TCP       Cassandra client
      7073   TCP       Zookeeper JMX monitoring
      2888   TCP       Zookeeper cluster communication
      3888   TCP       Zookeeper cluster communication
      2181   TCP       Zookeeper client
      36443  TCP       Kubernetes control plane
    3. Download the NetQ Cloud image.

      1. On the NVIDIA Application Hub, log in to your account.
      2. Select NVIDIA Licensing Portal.
      3. Select Software Downloads from the menu.
      4. Click Product Family and select NetQ.
      5. Locate the NetQ SW 4.4 KVM Cloud image and select Download.
      6. If prompted, agree to the license agreement and proceed with the download.

      For enterprise customers, if you do not see a link to the NVIDIA Licensing Portal on the NVIDIA Application Hub, contact NVIDIA support.


      For NVIDIA employees, download NetQ directly from the NVIDIA Licensing Portal.

    4. Set up and configure your VM.

      KVM Example Configuration

      This example shows the VM setup process for a system with Libvirt and KVM/QEMU installed.

      1. Confirm that the SHA256 checksum matches the one posted on the NVIDIA Application Hub to ensure the image download has not been corrupted.

        $ sha256sum ./Downloads/netq-4.0.0-ubuntu-18.04-tscloud-qemu.qcow2
        FE353FC06D3F843F4041D74C853D38B0A56036C5886F6233A3ED1A9464AEB783 ./Downloads/netq-4.0.0-ubuntu-18.04-tscloud-qemu.qcow2
      2. Copy the QCOW2 image to a directory where you want to run it.

        Tip: Copy, instead of moving, the original QCOW2 image that was downloaded to avoid re-downloading it again later should you need to perform this process again.

        $ sudo mkdir /vms
        $ sudo cp ./Downloads/netq-4.0.0-ubuntu-18.04-tscloud-qemu.qcow2 /vms/ts.qcow2
      3. Create the VM.

        For a Direct VM, where the VM uses a MACVLAN interface to sit on the host interface for its connectivity:

        $ virt-install --name=netq_ts --vcpus=4 --memory=8192 --os-type=linux --os-variant=generic --disk path=/vms/ts.qcow2,format=qcow2,bus=virtio,cache=none --network=type=direct,source=eth0,model=virtio --import --noautoconsole

        Replace the disk path value with the location where the QCOW2 image is to reside. Replace the network source value (eth0 in the above example) with the name of the host interface that connects the VM to the external network.

        Or, for a Bridged VM, where the VM attaches to a bridge which has already been setup to allow for external access:

        $ virt-install --name=netq_ts --vcpus=4 --memory=8192 --os-type=linux --os-variant=generic --disk path=/vms/ts.qcow2,format=qcow2,bus=virtio,cache=none --network=bridge=br0,model=virtio --import --noautoconsole

        Replace network bridge value (br0 in the above example) with the name of the (pre-existing) bridge interface where the VM is connected to the external network.

        Make note of the name used during install as this is needed in a later step.

      4. Watch the boot process in another terminal window.
        $ virsh console netq_ts
    5. Log in to the VM and change the password.

      Use the default credentials to log in the first time:

      • Username: cumulus
      • Password: cumulus
      $ ssh cumulus@<ipaddr>
      Warning: Permanently added '<ipaddr>' (ECDSA) to the list of known hosts.
      Ubuntu 20.04 LTS
      cumulus@<ipaddr>'s password:
      You are required to change your password immediately (root enforced)
      System information as of Thu Dec  3 21:35:42 UTC 2020
      System load:  0.09              Processes:           120
      Usage of /:   8.1% of 61.86GB   Users logged in:     0
      Memory usage: 5%                IP address for eth0: <ipaddr>
      Swap usage:   0%
      WARNING: Your password has expired.
      You must change your password now and login again!
      Changing password for cumulus.
      (current) UNIX password: cumulus
      Enter new UNIX password:
      Retype new UNIX password:
      passwd: password updated successfully
      Connection to <ipaddr> closed.
      

      Log in again with your new password.

      $ ssh cumulus@<ipaddr>
      Warning: Permanently added '<ipaddr>' (ECDSA) to the list of known hosts.
      Ubuntu 20.04 LTS
      cumulus@<ipaddr>'s password:
        System information as of Thu Dec  3 21:35:59 UTC 2020
        System load:  0.07              Processes:           121
        Usage of /:   8.1% of 61.86GB   Users logged in:     0
        Memory usage: 5%                IP address for eth0: <ipaddr>
        Swap usage:   0%
      Last login: Thu Dec  3 21:35:43 2020 from <local-ipaddr>
      cumulus@ubuntu:~$
      
    6. Verify the master node is ready for installation. Fix any errors indicated before installing the NetQ software.

      cumulus@hostname:~$ sudo opta-check-cloud
    7. Change the hostname for the VM from the default value.

      The default hostname for the NetQ Virtual Machines is ubuntu. Change the hostname to fit your naming conventions while meeting Internet and Kubernetes naming standards.

      Kubernetes requires that hostnames are composed of a sequence of labels concatenated with dots. For example, “en.wikipedia.org” is a hostname. Each label must be from 1 to 63 characters long. The entire hostname, including the delimiting dots, has a maximum of 253 ASCII characters.

      The Internet standards (RFCs) for protocols specify that labels may contain only the ASCII letters a through z (in lower case), the digits 0 through 9, and the hyphen-minus character ('-').

      Use the following command:

      cumulus@hostname:~$ sudo hostnamectl set-hostname NEW_HOSTNAME

      Add the same NEW_HOSTNAME value to /etc/hosts on your VM for the localhost entry. Example:

      127.0.0.1 localhost NEW_HOSTNAME
    8. Verify that your first worker node meets the VM requirements, as described in Step 1.

    9. Confirm that the needed ports are open for communications, as described in Step 2.

    10. Open your hypervisor and set up the VM in the same manner as for the master node.

      Make a note of the private IP address you assign to the worker node. You need it for later installation steps.

    11. Verify the worker node is ready for installation. Fix any errors indicated before installing the NetQ software.

      cumulus@hostname:~$ sudo opta-check-cloud
    12. Repeat Steps 8 through 11 for each additional worker node you want in your cluster.

    13. The final step is to install and activate the NetQ software using the CLI:

    Run the following command on your master node to initialize the cluster. Copy the output of the command to use on your worker nodes:

    cumulus@<hostname>:~$ netq install cluster master-init
        Please run the following command on all worker nodes:
        netq install cluster worker-init c3NoLXJzYSBBQUFBQjNOemFDMXljMkVBQUFBREFRQUJBQUFCQVFDM2NjTTZPdVVUWWJ5c2Q3NlJ4SHdseHBsOHQ4N2VMRWVGR05LSWFWVnVNcy94OEE4RFNMQVhKOHVKRjVLUXBnVjdKM2lnMGJpL2hDMVhmSVVjU3l3ZmhvVDVZM3dQN1oySVZVT29ZTi8vR1lOek5nVlNocWZQMDNDRW0xNnNmSzVvUWRQTzQzRFhxQ3NjbndIT3dwZmhRYy9MWTU1a
        

    Run the netq install cluster worker-init <ssh-key> on each of your worker nodes.

    Run the following command on your NetQ Cloud Appliance with the config-key obtained from the email you received from NVIDIA titled NetQ Access Link. You can also obtain the configuration key through the NetQ UI in the premise management configuration.

    cumulus@<hostname>:~$ netq install opta cluster full interface eth0 bundle /mnt/installables/NetQ-4.4.0-opta.tgz config-key <your-config-key> workers <worker-1-ip> <worker-2-ip> [proxy-host <proxy-hostname> proxy-port <proxy-port>]
        

    You can specify the IP address instead of the interface name here: use ip-addr <IP address> in place of interface <ifname> above.
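
    For example, if eth0 on the master node is assigned 192.0.2.10 (a placeholder address for illustration), the equivalent command is:

    cumulus@<hostname>:~$ netq install opta cluster full ip-addr 192.0.2.10 bundle /mnt/installables/NetQ-4.4.0-opta.tgz config-key <your-config-key> workers <worker-1-ip> <worker-2-ip>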

    If you have changed the IP address or hostname of the NetQ OPTA after this step, you need to re-register this address with NetQ as follows:

    Reset the VM:

    cumulus@hostname:~$ netq bootstrap reset

    Re-run the install CLI on the appliance. This example uses interface eth0. Replace this with your updated IP address, hostname or interface using the interface or ip-addr option.

    cumulus@hostname:~$ netq install opta standalone full interface eth0 bundle /mnt/installables/NetQ-4.4.0-opta.tgz config-key <your-config-key> [proxy-host <proxy-hostname> proxy-port <proxy-port>]

    If this step fails for any reason, you can run netq bootstrap reset and then try again.

    Consider the following for container environments, and make adjustments as needed.

    Flannel Virtual Networks

    If you are using Flannel with a container environment on your network, you may need to change its default IP address ranges if they conflict with other addresses on your network. This can only be done one time during the first installation.

    The NetQ default address range is 10.244.0.0/16; this overrides the original Flannel default of 10.1.0.0/16.

    To change the default address range, use the install CLI with the pod-ip-range option. For example:

    cumulus@hostname:~$ netq install opta standalone full interface eth0 bundle /mnt/installables/NetQ-4.4.0-opta.tgz config-key <your-config-key> pod-ip-range 10.255.0.0/16

    Docker Default Bridge Interface

    The default Docker bridge interface is disabled in NetQ. If you need to reenable the interface, contact support.

    Verify Installation Status

    To view the status of the installation, use the netq show status [verbose] command. The following example shows the output for a successful installation:

    State: Active
        Version: 4.4.0
        Installer Version: 4.4.0
        Installation Type: Standalone
        Activation Key: PKrgipMGEhVuZXRxLWVuZHBvaW50LWdhdGV3YXkYsagDIixUQmFLTUhzZU80RUdTL3pOT01uQ2lnRnrrUhTbXNPUGRXdnUwTVo5SEpBPTIHZGVmYXVsdDoHbmV0cWRldgz=
        Master SSH Public Key: a3NoLXJzYSBBQUFBQjNOemFDMXljMkVBQUFBREFRQUJBQUFCQVFEazliekZDblJUajkvQVhOZ0hteXByTzZIb3Y2cVZBWFdsNVNtKzVrTXo3dmMrcFNZTGlOdWl1bEhZeUZZVDhSNmU3bFdqS3NrSE10bzArNFJsQVd6cnRvbVVzLzlLMzQ4M3pUMjVZQXpIU2N1ZVhBSE1TdTZHZ0JyUkpXYUpTNjJ2RTkzcHBDVjBxWWJvUFo3aGpCY3ozb0VVWnRsU1lqQlZVdjhsVjBNN3JEWW52TXNGSURWLzJ2eks3K0x2N01XTG5aT054S09hdWZKZnVOT0R4YjFLbk1mN0JWK3hURUpLWW1mbTY1ckoyS1ArOEtFUllrr5TkF3bFVRTUdmT3daVHF2RWNoZnpQajMwQ29CWDZZMzVST2hDNmhVVnN5OEkwdjVSV0tCbktrWk81MWlMSDAyZUpJbXJHUGdQa2s1SzhJdGRrQXZISVlTZ0RwRlpRb3Igcm9vdEBucXRzLTEwLTE4OC00NC0xNDc=
        Is Cloud: False
        
        Cluster Status:
        IP Address     Hostname       Role    Status
        -------------  -------------  ------  --------
        10.188.44.147  10.188.44.147  Role    Ready
        
        NetQ... Active
        

    Run the netq show opta-health command to verify all applications are operating properly. Allow 10-15 minutes for all applications to come up and report their status.

    cumulus@hostname:~$ netq show opta-health
        Application                                            Status    Namespace      Restarts    Timestamp
        -----------------------------------------------------  --------  -------------  ----------  ------------------------
        cassandra-rc-0-w7h4z                                   READY     default        0           Fri Apr 10 16:08:38 2020
        cp-schema-registry-deploy-6bf5cbc8cc-vwcsx             READY     default        0           Fri Apr 10 16:08:38 2020
        kafka-broker-rc-0-p9r2l                                READY     default        0           Fri Apr 10 16:08:38 2020
        kafka-connect-deploy-7799bcb7b4-xdm5l                  READY     default        0           Fri Apr 10 16:08:38 2020
        netq-api-gateway-deploy-55996ff7c8-w4hrs               READY     default        0           Fri Apr 10 16:08:38 2020
        netq-app-address-deploy-66776ccc67-phpqk               READY     default        0           Fri Apr 10 16:08:38 2020
        netq-app-admin-oob-mgmt-server                         READY     default        0           Fri Apr 10 16:08:38 2020
        netq-app-bgp-deploy-7dd4c9d45b-j9bfr                   READY     default        0           Fri Apr 10 16:08:38 2020
        netq-app-clagsession-deploy-69564895b4-qhcpr           READY     default        0           Fri Apr 10 16:08:38 2020
        netq-app-configdiff-deploy-ff54c4cc4-7rz66             READY     default        0           Fri Apr 10 16:08:38 2020
        ...
        

    If any of the applications or services display Status as DOWN after 30 minutes, open a support ticket and attach the output of the opta-support command.

    After NetQ is installed, you can log in to NetQ from your browser.

    Install the NetQ On-premises Appliance

    This topic describes how to prepare your single, NetQ On-premises Appliance for installation of the NetQ Platform software.

    Each system shipped to you contains:

    For more detail about hardware specifications (including LED layouts and FRUs like the power supply or fans, and accessories like included cables) or safety and environmental information, refer to the user manual and quick reference guide.

    Install the Appliance

    After you unbox the appliance:
    1. Mount the appliance in the rack.
    2. Connect it to power following the procedures described in your appliance's user manual.
    3. Connect the Ethernet cable to the 1G management port (eno1).
    4. Power on the appliance.

    If your network runs DHCP, you can configure NetQ over the network. If DHCP is not enabled, configure the appliance using the provided console cable.

    Configure the Password, Hostname and IP Address

    Change the password and specify the hostname and IP address for the appliance before installing the NetQ software.

    1. Log in to the appliance using the default login credentials:

      • Username: cumulus
      • Password: cumulus
    2. Change the password using the passwd command:

      cumulus@hostname:~$ passwd
      Changing password for cumulus.
      (current) UNIX password: cumulus
      Enter new UNIX password:
      Retype new UNIX password:
      passwd: password updated successfully
      
    3. The default hostname for the NetQ On-premises Appliance is netq-appliance. Change the hostname to fit your naming conventions while meeting Internet and Kubernetes naming standards.

      Kubernetes requires that hostnames comprise a sequence of labels concatenated with dots. For example, en.wikipedia.org is a hostname. Each label must be from 1 to 63 characters long. The entire hostname, including the delimiting dots, has a maximum of 253 ASCII characters.

      The Internet standards (RFCs) for protocols specify that labels can contain only the ASCII letters a through z (in lower case), the digits 0 through 9, and the hyphen-minus character ('-').

      Use the following command:

      cumulus@hostname:~$ sudo hostnamectl set-hostname NEW_HOSTNAME
      
    4. Identify the IP address.

      The appliance contains two Ethernet ports. It uses port eno1 for out-of-band management. This is where NetQ Agents should send the telemetry data collected from your monitored switches and hosts. By default, eno1 uses DHCPv4 to get its IP address. You can view the assigned IP address using the following command:

      cumulus@hostname:~$ ip -4 -brief addr show eno1
      eno1             UP             10.20.16.248/24
      

      Alternately, you can configure the interface with a static IP address by editing the /etc/netplan/01-ethernet.yaml Ubuntu Netplan configuration file.

      For example, to set your network interface eno1 to a static IP address of 192.168.1.222 with gateway 192.168.1.1 and DNS servers 8.8.8.8 and 8.8.4.4:

      # This file describes the network interfaces available on your system
      # For more information, see netplan(5).
      network:
          version: 2
          renderer: networkd
          ethernets:
              eno1:
                  dhcp4: no
                  addresses: [192.168.1.222/24]
                  gateway4: 192.168.1.1
                  nameservers:
                      addresses: [8.8.8.8,8.8.4.4]
      

      Apply the settings.

      cumulus@hostname:~$ sudo netplan apply
      

    Verify NetQ Software and Appliance Readiness

    Now that the appliance is up and running, verify that the software is available and the appliance is ready for installation.

    1. Verify that the needed packages are present and of the correct release, version 4.4 and update 40.

      cumulus@hostname:~$ dpkg -l | grep netq
      ii  netq-agent   4.4.0-ub18.04u40~1667493385.97ef4c9_amd64  Cumulus NetQ Telemetry Agent for Ubuntu
      ii  netq-apps    4.4.0-ub18.04u40~1667493385.97ef4c9_amd64  Cumulus NetQ Fabric Validation Application for Ubuntu
    2. Verify the installation images are present and of the correct release, version 4.4.

      cumulus@hostname:~$ cd /mnt/installables/
      cumulus@hostname:/mnt/installables$ ls
      NetQ-4.4.0.tgz
    3. Verify the appliance is ready for installation. Fix any errors indicated before installing the NetQ software.

      cumulus@hostname:~$ sudo opta-check
    4. The final step is to install and activate the NetQ software using the CLI:

    Run the following command on your NetQ platform server or NetQ Appliance:

    cumulus@hostname:~$ netq install standalone full interface eth0 bundle /mnt/installables/NetQ-4.4.0.tgz

    You can specify the IP address instead of the interface name here: use ip-addr <IP address> in place of interface <ifname> above.
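
    For example, if the appliance's management address is 10.20.16.248 (a placeholder value for illustration), the equivalent command would look similar to this:

    cumulus@hostname:~$ netq install standalone full ip-addr 10.20.16.248 bundle /mnt/installables/NetQ-4.4.0.tgz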

    If you have changed the IP address or hostname of the NetQ On-premises VM after this step, you need to re-register this address with NetQ as follows:

    Reset the VM, indicating whether you want to purge any NetQ DB data or keep it.

    cumulus@hostname:~$ netq bootstrap reset [purge-db|keep-db]

    Re-run the install CLI on the appliance. This example uses interface eno1. Replace this with your updated IP address, hostname or interface using the interface or ip-addr option.

    cumulus@hostname:~$ netq install standalone full interface eno1 bundle /mnt/installables/NetQ-4.4.0.tgz

    If this step fails for any reason, you can run netq bootstrap reset and then try again.

    Verify Installation Status

    To view the status of the installation, use the netq show status [verbose] command. The following example shows a successful on-premises installation:

    State: Active
        Version: 4.4.0
        Installer Version: 4.4.0
        Installation Type: Standalone
        Activation Key: PKrgipMGEhVuZXRxLWVuZHBvaW50LWdhdGV3YXkYsagDIixUQmFLTUhzZU80RUdTL3pOT01uQ2lnRnrrUhTbXNPUGRXdnUwTVo5SEpBPTIHZGVmYXVsdDoHbmV0cWRldgz=
        Master SSH Public Key: a3NoLXJzYSBBQUFBQjNOemFDMXljMkVBQUFBREFRQUJBQUFCQVFEazliekZDblJUajkvQVhOZ0hteXByTzZIb3Y2cVZBWFdsNVNtKzVrTXo3dmMrcFNZTGlOdWl1bEhZeUZZVDhSNmU3bFdqS3NrSE10bzArNFJsQVd6cnRvbVVzLzlLMzQ4M3pUMjVZQXpIU2N1ZVhBSE1TdTZHZ0JyUkpXYUpTNjJ2RTkzcHBDVjBxWWJvUFo3aGpCY3ozb0VVWnRsU1lqQlZVdjhsVjBNN3JEWW52TXNGSURWLzJ2eks3K0x2N01XTG5aT054S09hdWZKZnVOT0R4YjFLbk1mN0JWK3hURUpLWW1mbTY1ckoyS1ArOEtFUllrr5TkF3bFVRTUdmT3daVHF2RWNoZnpQajMwQ29CWDZZMzVST2hDNmhVVnN5OEkwdjVSV0tCbktrWk81MWlMSDAyZUpJbXJHUGdQa2s1SzhJdGRrQXZISVlTZ0RwRlpRb3Igcm9vdEBucXRzLTEwLTE4OC00NC0xNDc=
        Is Cloud: False
        
        Cluster Status:
        IP Address     Hostname       Role    Status
        -------------  -------------  ------  --------
        10.188.44.147  10.188.44.147  Role    Ready
        
        NetQ... Active
        

    Run the netq show opta-health command to verify all applications are operating properly. Allow 10-15 minutes for all applications to come up and report their status.

    cumulus@hostname:~$ netq show opta-health
        Application                                            Status    Namespace      Restarts    Timestamp
        -----------------------------------------------------  --------  -------------  ----------  ------------------------
        cassandra-rc-0-w7h4z                                   READY     default        0           Fri Apr 10 16:08:38 2020
        cp-schema-registry-deploy-6bf5cbc8cc-vwcsx             READY     default        0           Fri Apr 10 16:08:38 2020
        kafka-broker-rc-0-p9r2l                                READY     default        0           Fri Apr 10 16:08:38 2020
        kafka-connect-deploy-7799bcb7b4-xdm5l                  READY     default        0           Fri Apr 10 16:08:38 2020
        netq-api-gateway-deploy-55996ff7c8-w4hrs               READY     default        0           Fri Apr 10 16:08:38 2020
        netq-app-address-deploy-66776ccc67-phpqk               READY     default        0           Fri Apr 10 16:08:38 2020
        netq-app-admin-oob-mgmt-server                         READY     default        0           Fri Apr 10 16:08:38 2020
        netq-app-bgp-deploy-7dd4c9d45b-j9bfr                   READY     default        0           Fri Apr 10 16:08:38 2020
        netq-app-clagsession-deploy-69564895b4-qhcpr           READY     default        0           Fri Apr 10 16:08:38 2020
        netq-app-configdiff-deploy-ff54c4cc4-7rz66             READY     default        0           Fri Apr 10 16:08:38 2020
        ...
        

    If any of the applications or services display Status as DOWN after 30 minutes, open a support ticket and attach the output of the opta-support command.

    After NetQ is installed, you can log in to NetQ from your browser.

    Install the NetQ Cloud Appliance

    This topic describes how to prepare your single, NetQ Cloud Appliance for installation of the NetQ Collector software.

    Each system shipped to you contains:

    For more detail about hardware specifications (including LED layouts and FRUs like the power supply or fans, and accessories like included cables) or safety and environmental information, refer to the appliance's user manual.

    Install the Appliance

    After you unbox the appliance:
    1. Mount the appliance in the rack.
    2. Connect it to power following the procedures described in your appliance's user manual.
    3. Connect the Ethernet cable to the 1G management port (eno1).
    4. Power on the appliance.

    If your network runs DHCP, you can configure NetQ over the network. If DHCP is not enabled, then you configure the appliance using the console cable provided.

    Configure the Password, Hostname and IP Address

    1. Log in to the appliance using the default login credentials:

      • Username: cumulus
      • Password: cumulus
    2. Change the password using the passwd command:

      cumulus@hostname:~$ passwd
      Changing password for cumulus.
      (current) UNIX password: cumulus
      Enter new UNIX password:
      Retype new UNIX password:
      passwd: password updated successfully
      
    3. The default hostname for the NetQ Cloud Appliance is netq-appliance. Change the hostname to fit your naming conventions while meeting Internet and Kubernetes naming standards.

      Kubernetes requires that hostnames comprise a sequence of labels concatenated with dots. For example, en.wikipedia.org is a hostname. Each label must be from 1 to 63 characters long. The entire hostname, including the delimiting dots, has a maximum of 253 ASCII characters.

      The Internet standards (RFCs) for protocols specify that labels can contain only the ASCII letters a through z (in lower case), the digits 0 through 9, and the hyphen-minus character ('-').

      Use the following command:

      cumulus@hostname:~$ sudo hostnamectl set-hostname NEW_HOSTNAME
      
    4. Identify the IP address.

      The appliance contains two Ethernet ports. It uses port eno1 for out-of-band management. This is where NetQ Agents should send the telemetry data collected from your monitored switches and hosts. By default, eno1 uses DHCPv4 to get its IP address. You can view the assigned IP address using the following command:

      cumulus@hostname:~$ ip -4 -brief addr show eno1
      eno1             UP             10.20.16.248/24
      

      Alternately, you can configure the interface with a static IP address by editing the /etc/netplan/01-ethernet.yaml Ubuntu Netplan configuration file.

      For example, to set your network interface eno1 to a static IP address of 192.168.1.222 with gateway 192.168.1.1 and DNS servers 8.8.8.8 and 8.8.4.4:

      # This file describes the network interfaces available on your system
      # For more information, see netplan(5).
      network:
          version: 2
          renderer: networkd
          ethernets:
              eno1:
                  dhcp4: no
                  addresses: [192.168.1.222/24]
                  gateway4: 192.168.1.1
                  nameservers:
                      addresses: [8.8.8.8,8.8.4.4]
      

      Apply the settings.

      cumulus@hostname:~$ sudo netplan apply
      

    Verify NetQ Software and Appliance Readiness

    Now that the appliance is up and running, verify that the software is available and the appliance is ready for installation.

    1. Verify that the required packages are present and reflect the most current version.

      cumulus@hostname:~$ dpkg -l | grep netq
      ii  netq-agent   4.4.0-ub18.04u40~1667493385.97ef4c9_amd64  Cumulus NetQ Telemetry Agent for Ubuntu
      ii  netq-apps    4.4.0-ub18.04u40~1667493385.97ef4c9_amd64  Cumulus NetQ Fabric Validation Application for Ubuntu
    2. Verify the installation images are present and reflect the most current version.

      cumulus@hostname:~$ cd /mnt/installables/
      cumulus@hostname:/mnt/installables$ ls
      NetQ-4.4.0-opta.tgz
    3. Verify the appliance is ready for installation. Fix any errors before installing the NetQ software.

      cumulus@hostname:~$ sudo opta-check-cloud
    4. Install and activate the NetQ software using the CLI:

    Run the following command on your NetQ Cloud Appliance with the config-key obtained from the email you received from NVIDIA titled NetQ Access Link. You can also obtain the configuration key through the NetQ UI in the premise management configuration.

    cumulus@<hostname>:~$ netq install opta standalone full interface eth0 bundle /mnt/installables/NetQ-4.4.0-opta.tgz config-key <your-config-key> [proxy-host <proxy-hostname> proxy-port <proxy-port>]
    

    You can specify the IP address instead of the interface name here: use ip-addr <IP address> in place of interface <ifname> above.
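
    For example, if the appliance reaches the NetQ Cloud through a proxy at proxy.example.com on port 3128 (placeholder values for illustration), the command would look similar to this:

    cumulus@hostname:~$ netq install opta standalone full interface eth0 bundle /mnt/installables/NetQ-4.4.0-opta.tgz config-key <your-config-key> proxy-host proxy.example.com proxy-port 3128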

    If you have changed the IP address or hostname of the NetQ OPTA after this step, you need to re-register this address with NetQ as follows:

    Reset the VM:

    cumulus@hostname:~$ netq bootstrap reset

    Re-run the install CLI on the appliance. This example uses interface eno1. Replace this with your updated IP address, hostname or interface using the interface or ip-addr option.

    cumulus@hostname:~$ netq install opta standalone full interface eno1 bundle /mnt/installables/NetQ-4.4.0-opta.tgz config-key <your-config-key> [proxy-host <proxy-hostname> proxy-port <proxy-port>]

    If this step fails for any reason, you can run netq bootstrap reset and then try again.

    Consider the following for container environments, and make adjustments as needed.

    Flannel Virtual Networks

    If you are using Flannel with a container environment on your network, you may need to change its default IP address ranges if they conflict with other addresses on your network. This can only be done one time during the first installation.

    The address range is 10.244.0.0/16. NetQ overrides the original Flannel default, which is 10.1.0.0/16.

    To change the default address range, use the install CLI with the pod-ip-range option. For example:

    cumulus@hostname:~$ netq install opta standalone full interface eth0 bundle /mnt/installables/NetQ-4.4.0-opta.tgz config-key <your-config-key> pod-ip-range 10.255.0.0/16

    Docker Default Bridge Interface

    The default Docker bridge interface is disabled in NetQ. If you need to reenable the interface, contact support.

    Verify Installation Status

    To view the status of the installation, use the netq show status [verbose] command. The following example shows a successful installation:

    State: Active
        Version: 4.4.0
        Installer Version: 4.4.0
        Installation Type: Standalone
        Activation Key: PKrgipMGEhVuZXRxLWVuZHBvaW50LWdhdGV3YXkYsagDIixUQmFLTUhzZU80RUdTL3pOT01uQ2lnRnrrUhTbXNPUGRXdnUwTVo5SEpBPTIHZGVmYXVsdDoHbmV0cWRldgz=
        Master SSH Public Key: a3NoLXJzYSBBQUFBQjNOemFDMXljMkVBQUFBREFRQUJBQUFCQVFEazliekZDblJUajkvQVhOZ0hteXByTzZIb3Y2cVZBWFdsNVNtKzVrTXo3dmMrcFNZTGlOdWl1bEhZeUZZVDhSNmU3bFdqS3NrSE10bzArNFJsQVd6cnRvbVVzLzlLMzQ4M3pUMjVZQXpIU2N1ZVhBSE1TdTZHZ0JyUkpXYUpTNjJ2RTkzcHBDVjBxWWJvUFo3aGpCY3ozb0VVWnRsU1lqQlZVdjhsVjBNN3JEWW52TXNGSURWLzJ2eks3K0x2N01XTG5aT054S09hdWZKZnVOT0R4YjFLbk1mN0JWK3hURUpLWW1mbTY1ckoyS1ArOEtFUllrr5TkF3bFVRTUdmT3daVHF2RWNoZnpQajMwQ29CWDZZMzVST2hDNmhVVnN5OEkwdjVSV0tCbktrWk81MWlMSDAyZUpJbXJHUGdQa2s1SzhJdGRrQXZISVlTZ0RwRlpRb3Igcm9vdEBucXRzLTEwLTE4OC00NC0xNDc=
        Is Cloud: False
        
        Cluster Status:
        IP Address     Hostname       Role    Status
        -------------  -------------  ------  --------
        10.188.44.147  10.188.44.147  Role    Ready
        
        NetQ... Active
        

    Run the netq show opta-health command to verify all applications are operating properly. Allow 10-15 minutes for all applications to come up and report their status.

    cumulus@hostname:~$ netq show opta-health
        Application                                            Status    Namespace      Restarts    Timestamp
        -----------------------------------------------------  --------  -------------  ----------  ------------------------
        cassandra-rc-0-w7h4z                                   READY     default        0           Fri Apr 10 16:08:38 2020
        cp-schema-registry-deploy-6bf5cbc8cc-vwcsx             READY     default        0           Fri Apr 10 16:08:38 2020
        kafka-broker-rc-0-p9r2l                                READY     default        0           Fri Apr 10 16:08:38 2020
        kafka-connect-deploy-7799bcb7b4-xdm5l                  READY     default        0           Fri Apr 10 16:08:38 2020
        netq-api-gateway-deploy-55996ff7c8-w4hrs               READY     default        0           Fri Apr 10 16:08:38 2020
        netq-app-address-deploy-66776ccc67-phpqk               READY     default        0           Fri Apr 10 16:08:38 2020
        netq-app-admin-oob-mgmt-server                         READY     default        0           Fri Apr 10 16:08:38 2020
        netq-app-bgp-deploy-7dd4c9d45b-j9bfr                   READY     default        0           Fri Apr 10 16:08:38 2020
        netq-app-clagsession-deploy-69564895b4-qhcpr           READY     default        0           Fri Apr 10 16:08:38 2020
        netq-app-configdiff-deploy-ff54c4cc4-7rz66             READY     default        0           Fri Apr 10 16:08:38 2020
        ...
        

    If any of the applications or services display Status as DOWN after 30 minutes, open a support ticket and attach the output of the opta-support command.

    After NetQ is installed, you can log in to NetQ from your browser.

    Install a NetQ On-premises Appliance Cluster

    This topic describes how to prepare your cluster of NetQ On-premises Appliances for installation of the NetQ Platform software.

    Each system shipped to you contains:

    For more detail about hardware specifications (including LED layouts and FRUs like the power supply or fans, and accessories like included cables) or safety and environmental information, refer to the user manual and quick reference guide.

    Install Each Appliance

    After you unbox the appliance:
    1. Mount the appliance in the rack.
    2. Connect it to power following the procedures described in your appliance's user manual.
    3. Connect the Ethernet cable to the 1G management port (eno1).
    4. Power on the appliance.

    If your network runs DHCP, you can configure NetQ over the network. If DHCP is not enabled, then you configure the appliance using the console cable provided.

    Configure the Password, Hostname and IP Address

    Change the password and specify the hostname and IP address for each appliance before installing the NetQ software.

    1. Log in to the appliance that you intend to use as your master node using the default login credentials:

      • Username: cumulus
      • Password: cumulus
    2. Change the password using the passwd command:

      cumulus@hostname:~$ passwd
      Changing password for cumulus.
      (current) UNIX password: cumulus
      Enter new UNIX password:
      Retype new UNIX password:
      passwd: password updated successfully
      
    3. The default hostname for the NetQ On-premises Appliance is netq-appliance. Change the hostname to fit your naming conventions while meeting Internet and Kubernetes naming standards.

      Kubernetes requires that hostnames comprise a sequence of labels concatenated with dots. For example, “en.wikipedia.org” is a hostname. Each label must be from 1 to 63 characters long. The entire hostname, including the delimiting dots, has a maximum of 253 ASCII characters.

      The Internet standards (RFCs) for protocols specify that labels can contain only the ASCII letters a through z (in lower case), the digits 0 through 9, and the hyphen-minus character ('-').

      Use the following command:

      cumulus@hostname:~$ sudo hostnamectl set-hostname NEW_HOSTNAME
      
    4. Identify the IP address.

      The appliance contains two Ethernet ports. It uses port eno1 for out-of-band management. This is where NetQ Agents should send the telemetry data collected from your monitored switches and hosts. By default, eno1 uses DHCPv4 to get its IP address. You can view the assigned IP address using the following command:

      cumulus@hostname:~$ ip -4 -brief addr show eno1
      eno1             UP             10.20.16.248/24
      

      Alternately, you can configure the interface with a static IP address by editing the /etc/netplan/01-ethernet.yaml Ubuntu Netplan configuration file.

      For example, to set your network interface eno1 to a static IP address of 192.168.1.222 with gateway 192.168.1.1 and DNS servers 8.8.8.8 and 8.8.4.4:

      # This file describes the network interfaces available on your system
      # For more information, see netplan(5).
      network:
          version: 2
          renderer: networkd
          ethernets:
              eno1:
                  dhcp4: no
                  addresses: [192.168.1.222/24]
                  gateway4: 192.168.1.1
                  nameservers:
                      addresses: [8.8.8.8,8.8.4.4]
      

      Apply the settings.

      cumulus@hostname:~$ sudo netplan apply
      
    5. Repeat these steps for each of the worker node appliances.

    Verify NetQ Software and Appliance Readiness

    Now that the appliances are up and running, verify that the software is available and the appliance is ready for installation.

    1. On the master node, verify that the needed packages are present and of the correct release, version 4.4.

      cumulus@hostname:~$ dpkg -l | grep netq
      ii  netq-agent   4.4.0-ub18.04u40~1667493385.97ef4c9_amd64  Cumulus NetQ Telemetry Agent for Ubuntu
      ii  netq-apps    4.4.0-ub18.04u40~1667493385.97ef4c9_amd64  Cumulus NetQ Fabric Validation Application for Ubuntu
    2. Verify the installation images are present and of the correct release, version 4.4.

      cumulus@hostname:~$ cd /mnt/installables/
      cumulus@hostname:/mnt/installables$ ls
      NetQ-4.4.0.tgz
    3. Verify the master node is ready for installation. Fix any errors indicated before installing the NetQ software.

      cumulus@hostname:~$ sudo opta-check
    4. On one of your worker nodes, verify that the needed packages are present and of the correct release, version 4.4 and update 40 or later.

      cumulus@hostname:~$ dpkg -l | grep netq
      ii  netq-agent   4.4.0-ub18.04u40~1667493385.97ef4c9_amd64  Cumulus NetQ Telemetry Agent for Ubuntu
      ii  netq-apps    4.4.0-ub18.04u40~1667493385.97ef4c9_amd64  Cumulus NetQ Fabric Validation Application for Ubuntu
    5. Configure the IP address, hostname, and password using the same steps as for the master node. Refer to Configure the Password, Hostname and IP Address.

      Make a note of the private IP addresses you assign to the master and worker nodes. You need them for later installation steps.

    6. Verify that the needed packages are present and of the correct release, version 4.4 and update 40.

      cumulus@hostname:~$ dpkg -l | grep netq
      ii  netq-agent   4.4.0-ub18.04u40~1667493385.97ef4c9_amd64  Cumulus NetQ Telemetry Agent for Ubuntu
      ii  netq-apps    4.4.0-ub18.04u40~1667493385.97ef4c9_amd64  Cumulus NetQ Fabric Validation Application for Ubuntu
    7. Verify that the needed files are present and of the correct release.

      cumulus@hostname:~$ cd /mnt/installables/
      cumulus@hostname:/mnt/installables$ ls
      NetQ-4.4.0.tgz
    8. Verify the appliance is ready for installation. Fix any errors indicated before installing the NetQ software.

      cumulus@hostname:~$ sudo opta-check
    9. Repeat Steps 4-8 for each additional worker node (NetQ On-premises Appliance).

    10. The final step is to install and activate the NetQ software using the CLI:

    Run the following command on your master node to initialize the cluster. Copy the output of the command to use on your worker nodes:

    cumulus@<hostname>:~$ netq install cluster master-init
        Please run the following command on all worker nodes:
        netq install cluster worker-init c3NoLXJzYSBBQUFBQjNOemFDMXljMkVBQUFBREFRQUJBQUFCQVFDM2NjTTZPdVVUWWJ5c2Q3NlJ4SHdseHBsOHQ4N2VMRWVGR05LSWFWVnVNcy94OEE4RFNMQVhKOHVKRjVLUXBnVjdKM2lnMGJpL2hDMVhmSVVjU3l3ZmhvVDVZM3dQN1oySVZVT29ZTi8vR1lOek5nVlNocWZQMDNDRW0xNnNmSzVvUWRQTzQzRFhxQ3NjbndIT3dwZmhRYy9MWTU1a
    

    Run the netq install cluster worker-init <ssh-key> command on each of your worker nodes.

    Run the following command on your master node, using the IP addresses of your worker nodes:

    cumulus@<hostname>:~$ netq install cluster full interface eth0 bundle /mnt/installables/NetQ-4.4.0.tgz workers <worker-1-ip> <worker-2-ip>
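
    For example, with hypothetical worker node addresses of 192.168.0.21 and 192.168.0.22, the command would look similar to this:

    cumulus@<hostname>:~$ netq install cluster full interface eth0 bundle /mnt/installables/NetQ-4.4.0.tgz workers 192.168.0.21 192.168.0.22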

    You can specify the IP address instead of the interface name here: use ip-addr <IP address> in place of interface <ifname> above.

    If you have changed the IP address or hostname of the NetQ On-premises VM after this step, you need to re-register this address with NetQ as follows:

    Reset the VM, indicating whether you want to purge any NetQ DB data or keep it.

    cumulus@hostname:~$ netq bootstrap reset [purge-db|keep-db]

    Re-run the install CLI on the appliance. This example uses interface eno1. Replace this with your updated IP address, hostname or interface using the interface or ip-addr option.

    cumulus@hostname:~$ netq install standalone full interface eno1 bundle /mnt/installables/NetQ-4.4.0.tgz

    If this step fails for any reason, you can run netq bootstrap reset and then try again.

    Verify Installation Status

    To view the status of the installation, use the netq show status [verbose] command. The following example shows a successful on-premises installation:

    State: Active
        Version: 4.4.0
        Installer Version: 4.4.0
        Installation Type: Standalone
        Activation Key: PKrgipMGEhVuZXRxLWVuZHBvaW50LWdhdGV3YXkYsagDIixUQmFLTUhzZU80RUdTL3pOT01uQ2lnRnrrUhTbXNPUGRXdnUwTVo5SEpBPTIHZGVmYXVsdDoHbmV0cWRldgz=
        Master SSH Public Key: a3NoLXJzYSBBQUFBQjNOemFDMXljMkVBQUFBREFRQUJBQUFCQVFEazliekZDblJUajkvQVhOZ0hteXByTzZIb3Y2cVZBWFdsNVNtKzVrTXo3dmMrcFNZTGlOdWl1bEhZeUZZVDhSNmU3bFdqS3NrSE10bzArNFJsQVd6cnRvbVVzLzlLMzQ4M3pUMjVZQXpIU2N1ZVhBSE1TdTZHZ0JyUkpXYUpTNjJ2RTkzcHBDVjBxWWJvUFo3aGpCY3ozb0VVWnRsU1lqQlZVdjhsVjBNN3JEWW52TXNGSURWLzJ2eks3K0x2N01XTG5aT054S09hdWZKZnVOT0R4YjFLbk1mN0JWK3hURUpLWW1mbTY1ckoyS1ArOEtFUllrr5TkF3bFVRTUdmT3daVHF2RWNoZnpQajMwQ29CWDZZMzVST2hDNmhVVnN5OEkwdjVSV0tCbktrWk81MWlMSDAyZUpJbXJHUGdQa2s1SzhJdGRrQXZISVlTZ0RwRlpRb3Igcm9vdEBucXRzLTEwLTE4OC00NC0xNDc=
        Is Cloud: False
        
        Cluster Status:
        IP Address     Hostname       Role    Status
        -------------  -------------  ------  --------
        10.188.44.147  10.188.44.147  Role    Ready
        
        NetQ... Active
        

    Run the netq show opta-health command to verify all applications are operating properly. Allow 10-15 minutes for all applications to come up and report their status.

    cumulus@hostname:~$ netq show opta-health
        Application                                            Status    Namespace      Restarts    Timestamp
        -----------------------------------------------------  --------  -------------  ----------  ------------------------
        cassandra-rc-0-w7h4z                                   READY     default        0           Fri Apr 10 16:08:38 2020
        cp-schema-registry-deploy-6bf5cbc8cc-vwcsx             READY     default        0           Fri Apr 10 16:08:38 2020
        kafka-broker-rc-0-p9r2l                                READY     default        0           Fri Apr 10 16:08:38 2020
        kafka-connect-deploy-7799bcb7b4-xdm5l                  READY     default        0           Fri Apr 10 16:08:38 2020
        netq-api-gateway-deploy-55996ff7c8-w4hrs               READY     default        0           Fri Apr 10 16:08:38 2020
        netq-app-address-deploy-66776ccc67-phpqk               READY     default        0           Fri Apr 10 16:08:38 2020
        netq-app-admin-oob-mgmt-server                         READY     default        0           Fri Apr 10 16:08:38 2020
        netq-app-bgp-deploy-7dd4c9d45b-j9bfr                   READY     default        0           Fri Apr 10 16:08:38 2020
        netq-app-clagsession-deploy-69564895b4-qhcpr           READY     default        0           Fri Apr 10 16:08:38 2020
        netq-app-configdiff-deploy-ff54c4cc4-7rz66             READY     default        0           Fri Apr 10 16:08:38 2020
        ...
        

    If any of the applications or services display Status as DOWN after 30 minutes, open a support ticket and attach the output of the opta-support command.

    After NetQ is installed, you can log in to NetQ from your browser.

    Install a NetQ Cloud Appliance Cluster

    This topic describes how to prepare your cluster of NetQ Cloud Appliances for installation of the NetQ Collector software.

    Each system shipped to you contains:

    For more detail about hardware specifications (including LED layouts and FRUs like the power supply or fans, and accessories like included cables) or safety and environmental information, refer to the user manual.

    Install Each Appliance

    After you unbox the appliance:
    1. Mount the appliance in the rack.
    2. Connect it to power following the procedures described in your appliance's user manual.
    3. Connect the Ethernet cable to the 1G management port (eno1).
    4. Power on the appliance.

    If your network runs DHCP, you can configure NetQ over the network. If DHCP is not enabled, then you configure the appliance using the console cable provided.

    Configure the Password, Hostname and IP Address

    Change the password and specify the hostname and IP address for each appliance before installing the NetQ software.

    1. Log in to the appliance that you intend to use as your master node using the default login credentials:

      • Username: cumulus
      • Password: cumulus
    2. Change the password using the passwd command:

      cumulus@hostname:~$ passwd
      Changing password for cumulus.
      (current) UNIX password: cumulus
      Enter new UNIX password:
      Retype new UNIX password:
      passwd: password updated successfully
      
    3. The default hostname for the NetQ Cloud Appliance is netq-appliance. Change the hostname to fit your naming conventions while meeting Internet and Kubernetes naming standards.

      Kubernetes requires that hostnames comprise a sequence of labels concatenated with dots. For example, en.wikipedia.org is a hostname. Each label must be from 1 to 63 characters long. The entire hostname, including the delimiting dots, has a maximum of 253 ASCII characters.

      The Internet standards (RFCs) for protocols specify that labels can contain only the ASCII letters a through z (in lower case), the digits 0 through 9, and the hyphen-minus character ('-').

      Use the following command:

      cumulus@hostname:~$ sudo hostnamectl set-hostname NEW_HOSTNAME
      
    4. Identify the IP address.

      The appliance contains two Ethernet ports. It uses port eno1 for out-of-band management. This is where NetQ Agents should send the telemetry data collected from your monitored switches and hosts. By default, eno1 uses DHCPv4 to get its IP address. You can view the assigned IP address using the following command:

      cumulus@hostname:~$ ip -4 -brief addr show eno1
      eno1             UP             10.20.16.248/24
      

      Alternately, you can configure the interface with a static IP address by editing the /etc/netplan/01-ethernet.yaml Ubuntu Netplan configuration file.

      For example, to set your network interface eno1 to a static IP address of 192.168.1.222 with gateway 192.168.1.1 and DNS servers 8.8.8.8 and 8.8.4.4:

      # This file describes the network interfaces available on your system
      # For more information, see netplan(5).
      network:
          version: 2
          renderer: networkd
          ethernets:
              eno1:
                  dhcp4: no
                  addresses: [192.168.1.222/24]
                  gateway4: 192.168.1.1
                  nameservers:
                      addresses: [8.8.8.8,8.8.4.4]
      

      Apply the settings.

      cumulus@hostname:~$ sudo netplan apply
      
    5. Repeat these steps for each of the worker node appliances.

    Verify NetQ Software and Appliance Readiness

    Now that the appliances are up and running, verify that the software is available and each appliance is ready for installation.

    1. On the master NetQ Cloud Appliance, verify that the needed packages are present and of the correct release, version 4.4.

      cumulus@hostname:~$ dpkg -l | grep netq
      ii  netq-agent   4.4.0-ub18.04u40~1667493385.97ef4c9_amd64  Cumulus NetQ Telemetry Agent for Ubuntu
      ii  netq-apps    4.4.0-ub18.04u40~1667493385.97ef4c9_amd64  Cumulus NetQ Fabric Validation Application for Ubuntu
    2. Verify the installation images are present and of the correct release, version 4.4.

      cumulus@hostname:~$ cd /mnt/installables/
      cumulus@hostname:/mnt/installables$ ls
      NetQ-4.4.0-opta.tgz
    3. Verify the master NetQ Cloud Appliance is ready for installation. Fix any errors indicated before installing the NetQ software.

      cumulus@hostname:~$ sudo opta-check-cloud
    4. On one of your worker NetQ Cloud Appliances, verify that the needed packages are present and of the correct release, version 4.4 and update 40.

      cumulus@hostname:~$ dpkg -l | grep netq
      ii  netq-agent   4.4.0-ub18.04u40~1667493385.97ef4c9_amd64  Cumulus NetQ Telemetry Agent for Ubuntu
      ii  netq-apps    4.4.0-ub18.04u40~1667493385.97ef4c9_amd64  Cumulus NetQ Fabric Validation Application for Ubuntu
    5. Configure the IP address, hostname, and password using the same steps as for the master node. Refer to Configure the Password, Hostname, and IP Address.

      Make a note of the private IP addresses you assign to the master and worker nodes. You need them for later installation steps.

    6. Verify that the needed packages are present and of the correct release, version 4.4.

      cumulus@hostname:~$ dpkg -l | grep netq
      ii  netq-agent   4.4.0-ub18.04u40~1667493385.97ef4c9_amd64  Cumulus NetQ Telemetry Agent for Ubuntu
      ii  netq-apps    4.4.0-ub18.04u40~1667493385.97ef4c9_amd64  Cumulus NetQ Fabric Validation Application for Ubuntu
    7. Verify that the needed files are present and of the correct release.

      cumulus@hostname:~$ cd /mnt/installables/
      cumulus@hostname:/mnt/installables$ ls
      NetQ-4.4.0-opta.tgz
    8. Verify the appliance is ready for installation. Fix any errors indicated before installing the NetQ software.

      cumulus@hostname:~$ sudo opta-check-cloud
    9. Repeat Steps 4-8 for each additional worker NetQ Cloud Appliance.

    10. The final step is to install and activate the NetQ software using the CLI:

    Run the following command on your master node to initialize the cluster. Copy the output of the command to use on your worker nodes:

    cumulus@<hostname>:~$ netq install cluster master-init
        Please run the following command on all worker nodes:
        netq install cluster worker-init c3NoLXJzYSBBQUFBQjNOemFDMXljMkVBQUFBREFRQUJBQUFCQVFDM2NjTTZPdVVUWWJ5c2Q3NlJ4SHdseHBsOHQ4N2VMRWVGR05LSWFWVnVNcy94OEE4RFNMQVhKOHVKRjVLUXBnVjdKM2lnMGJpL2hDMVhmSVVjU3l3ZmhvVDVZM3dQN1oySVZVT29ZTi8vR1lOek5nVlNocWZQMDNDRW0xNnNmSzVvUWRQTzQzRFhxQ3NjbndIT3dwZmhRYy9MWTU1a
        

    Run the netq install cluster worker-init <ssh-key> command on each of your worker nodes.

    Run the following command on your NetQ Cloud Appliance with the config-key obtained from the email you received from NVIDIA titled NetQ Access Link. You can also obtain the configuration key through the NetQ UI in the premise management configuration.

    cumulus@<hostname>:~$ netq install opta cluster full interface eth0 bundle /mnt/installables/NetQ-4.4.0-opta.tgz config-key <your-config-key> workers <worker-1-ip> <worker-2-ip> [proxy-host <proxy-hostname> proxy-port <proxy-port>]
        

    You can specify the IP address instead of the interface name here: use ip-addr <IP address> in place of interface <ifname> above.

    If you have changed the IP address or hostname of the NetQ OPTA after this step, you need to re-register this address with NetQ as follows:

    Reset the VM:

    cumulus@hostname:~$ netq bootstrap reset

    Re-run the install CLI on the appliance. This example uses interface eth0. Replace this with your updated IP address, hostname or interface using the interface or ip-addr option.

    cumulus@hostname:~$ netq install opta standalone full interface eth0 bundle /mnt/installables/NetQ-4.4.0-opta.tgz config-key <your-config-key> [proxy-host <proxy-hostname> proxy-port <proxy-port>]

    If this step fails for any reason, you can run netq bootstrap reset and then try again.

    Consider the following for container environments, and make adjustments as needed.

    Flannel Virtual Networks

    If you are using Flannel with a container environment on your network, you may need to change its default IP address ranges if they conflict with other addresses on your network. This can only be done one time during the first installation.

    The address range is 10.244.0.0/16. NetQ overrides the original Flannel default, which is 10.1.0.0/16.

    To change the default address range, use the install CLI with the pod-ip-range option. For example:

    cumulus@hostname:~$ netq install opta standalone full interface eth0 bundle /mnt/installables/NetQ-4.4.0-opta.tgz config-key <your-config-key> pod-ip-range 10.255.0.0/16

    Docker Default Bridge Interface

    The default Docker bridge interface is disabled in NetQ. If you need to reenable the interface, contact support.

    Verify Installation Status

    To view the status of the installation, use the netq show status [verbose] command. The following example shows a successful installation:

    State: Active
        Version: 4.4.0
        Installer Version: 4.4.0
        Installation Type: Standalone
        Activation Key: PKrgipMGEhVuZXRxLWVuZHBvaW50LWdhdGV3YXkYsagDIixUQmFLTUhzZU80RUdTL3pOT01uQ2lnRnrrUhTbXNPUGRXdnUwTVo5SEpBPTIHZGVmYXVsdDoHbmV0cWRldgz=
        Master SSH Public Key: a3NoLXJzYSBBQUFBQjNOemFDMXljMkVBQUFBREFRQUJBQUFCQVFEazliekZDblJUajkvQVhOZ0hteXByTzZIb3Y2cVZBWFdsNVNtKzVrTXo3dmMrcFNZTGlOdWl1bEhZeUZZVDhSNmU3bFdqS3NrSE10bzArNFJsQVd6cnRvbVVzLzlLMzQ4M3pUMjVZQXpIU2N1ZVhBSE1TdTZHZ0JyUkpXYUpTNjJ2RTkzcHBDVjBxWWJvUFo3aGpCY3ozb0VVWnRsU1lqQlZVdjhsVjBNN3JEWW52TXNGSURWLzJ2eks3K0x2N01XTG5aT054S09hdWZKZnVOT0R4YjFLbk1mN0JWK3hURUpLWW1mbTY1ckoyS1ArOEtFUllrr5TkF3bFVRTUdmT3daVHF2RWNoZnpQajMwQ29CWDZZMzVST2hDNmhVVnN5OEkwdjVSV0tCbktrWk81MWlMSDAyZUpJbXJHUGdQa2s1SzhJdGRrQXZISVlTZ0RwRlpRb3Igcm9vdEBucXRzLTEwLTE4OC00NC0xNDc=
        Is Cloud: False
        
        Cluster Status:
        IP Address     Hostname       Role    Status
        -------------  -------------  ------  --------
        10.188.44.147  10.188.44.147  Role    Ready
        
        NetQ... Active
        

    Run the netq show opta-health command to verify all applications are operating properly. Allow 10-15 minutes for all applications to come up and report their status.

    cumulus@hostname:~$ netq show opta-health
        Application                                            Status    Namespace      Restarts    Timestamp
        -----------------------------------------------------  --------  -------------  ----------  ------------------------
        cassandra-rc-0-w7h4z                                   READY     default        0           Fri Apr 10 16:08:38 2020
        cp-schema-registry-deploy-6bf5cbc8cc-vwcsx             READY     default        0           Fri Apr 10 16:08:38 2020
        kafka-broker-rc-0-p9r2l                                READY     default        0           Fri Apr 10 16:08:38 2020
        kafka-connect-deploy-7799bcb7b4-xdm5l                  READY     default        0           Fri Apr 10 16:08:38 2020
        netq-api-gateway-deploy-55996ff7c8-w4hrs               READY     default        0           Fri Apr 10 16:08:38 2020
        netq-app-address-deploy-66776ccc67-phpqk               READY     default        0           Fri Apr 10 16:08:38 2020
        netq-app-admin-oob-mgmt-server                         READY     default        0           Fri Apr 10 16:08:38 2020
        netq-app-bgp-deploy-7dd4c9d45b-j9bfr                   READY     default        0           Fri Apr 10 16:08:38 2020
        netq-app-clagsession-deploy-69564895b4-qhcpr           READY     default        0           Fri Apr 10 16:08:38 2020
        netq-app-configdiff-deploy-ff54c4cc4-7rz66             READY     default        0           Fri Apr 10 16:08:38 2020
        ...
        

    If any of the applications or services display Status as DOWN after 30 minutes, open a support ticket and attach the output of the opta-support command.

    After NetQ is installed, you can log in to NetQ from your browser.

    Install NetQ Agents

    After installing the NetQ software, install the NetQ Agent on each switch you want to monitor. You can install NetQ Agents on switches and servers running:

    Prepare for NetQ Agent Installation

    For switches running Cumulus Linux and SONiC, you need to:

    For servers running RHEL, CentOS, or Ubuntu, you need to:

    If your network uses a proxy server for external connections, you should first configure a global proxy so apt-get can access the software package in the NVIDIA networking repository.
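
    For example, one common way to configure a global apt proxy is with a file in /etc/apt/apt.conf.d/; the proxy name and port below are placeholders, so substitute the values for your environment:

    cumulus@switch:~$ echo 'Acquire::http::Proxy "http://proxy.example.com:3128/";' | sudo tee /etc/apt/apt.conf.d/99proxy
    cumulus@switch:~$ echo 'Acquire::https::Proxy "http://proxy.example.com:3128/";' | sudo tee -a /etc/apt/apt.conf.d/99proxy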

    Verify NTP Is Installed and Configured

    Verify that NTP is running on the switch. The switch must be in time synchronization with the NetQ Platform or NetQ Appliance to enable useful statistical analysis.

    cumulus@switch:~$ sudo systemctl status ntp
    [sudo] password for cumulus:
    ● ntp.service - LSB: Start NTP daemon
            Loaded: loaded (/etc/init.d/ntp; bad; vendor preset: enabled)
            Active: active (running) since Fri 2018-06-01 13:49:11 EDT; 2 weeks 6 days ago
              Docs: man:systemd-sysv-generator(8)
            CGroup: /system.slice/ntp.service
                    └─2873 /usr/sbin/ntpd -p /var/run/ntpd.pid -g -c /var/lib/ntp/ntp.conf.dhcp -u 109:114
    

    If NTP is not installed, install and configure it before continuing.

    If NTP is not running:

    • Verify the IP address or hostname of the NTP server in the /etc/ntp.conf file, and then
    • Reenable and start the NTP service using the systemctl [enable|start] ntp commands

    If you are running NTP in your out-of-band management network with VRF, specify the VRF (ntp@<vrf-name> versus just ntp) in the above commands.
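
    For example, if your management VRF is named mgmt (a common, but not universal, name), the commands would be:

    cumulus@switch:~$ sudo systemctl enable ntp@mgmt
    cumulus@switch:~$ sudo systemctl start ntp@mgmt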

    Obtain NetQ Agent Software Package

    To install the NetQ Agent you need to install netq-agent on each switch or host. This is available from the NVIDIA networking repository.

    To obtain the NetQ Agent package:

    Edit the /etc/apt/sources.list file to add the repository for NetQ.

    Note that NetQ has a separate repository from Cumulus Linux.

    cumulus@switch:~$ sudo nano /etc/apt/sources.list
    ...
    deb https://apps3.cumulusnetworks.com/repos/deb CumulusLinux-3 netq-4.4
    ...
    

    You can use the deb https://apps3.cumulusnetworks.com/repos/deb CumulusLinux-3 netq-latest repository if you want to always retrieve the latest posted version of NetQ.

    Cumulus Linux 4.4 and later includes the netq-agent package by default.

    To add the repository, uncomment or add the following line in /etc/apt/sources.list:

    cumulus@switch:~$ sudo nano /etc/apt/sources.list
    ...
    deb https://apps3.cumulusnetworks.com/repos/deb CumulusLinux-4 netq-4.4
    ...
    

    You can use the deb https://apps3.cumulusnetworks.com/repos/deb CumulusLinux-4 netq-latest repository if you want to always retrieve the latest posted version of NetQ.

    Add the apps3.cumulusnetworks.com authentication key to Cumulus Linux:

    cumulus@switch:~$ wget -qO - https://apps3.cumulusnetworks.com/setup/cumulus-apps-deb.pubkey | sudo apt-key add -
    

    Verify NTP Is Installed and Configured

    Verify that NTP is running on the switch. The switch must be in time synchronization with the NetQ Platform or NetQ Appliance to enable useful statistical analysis.

    admin@switch:~$ sudo systemctl status ntp
    ● ntp.service - Network Time Service
         Loaded: loaded (/lib/systemd/system/ntp.service; enabled; vendor preset: enabled)
         Active: active (running) since Tue 2021-06-08 14:56:16 UTC; 2min 18s ago
           Docs: man:ntpd(8)
        Process: 1444909 ExecStart=/usr/lib/ntp/ntp-systemd-wrapper (code=exited, status=0/SUCCESS)
       Main PID: 1444921 (ntpd)
          Tasks: 2 (limit: 9485)
         Memory: 1.9M
         CGroup: /system.slice/ntp.service
                 └─1444921 /usr/sbin/ntpd -p /var/run/ntpd.pid -x -u 106:112
    

    If NTP is not installed, install and configure it before continuing.

    If NTP is not running:

    • Verify the IP address or hostname of the NTP server in the /etc/sonic/config_db.json file, and then
    • Reenable and start the NTP service using the sudo config reload -n command

    Verify NTP is operating correctly. Look for an asterisk (*) or a plus sign (+) that indicates the clock synchronized with NTP.

    admin@switch:~$ show ntp
    MGMT_VRF_CONFIG is not present.
    synchronised to NTP server (104.194.8.227) at stratum 3
       time correct to within 2014 ms
       polling server every 64 s
         remote           refid      st t when poll reach   delay   offset  jitter
    ==============================================================================
    -144.172.118.20  139.78.97.128    2 u   26   64  377   47.023  -1798.1 120.803
    +208.67.75.242   128.227.205.3    2 u   32   64  377   72.050  -1939.3  97.869
    +216.229.4.66    69.89.207.99     2 u  160   64  374   41.223  -1965.9  83.585
    *104.194.8.227   164.67.62.212    2 u   33   64  377    9.180  -1934.4  97.376
    

    Obtain NetQ Agent Software Package

    To install the NetQ Agent you need to install netq-agent on each switch or host. This is available from the NVIDIA networking repository.

    Note that NetQ has a separate repository from SONiC.

    To obtain the NetQ Agent package:

    1. Install the wget utility so you can install the GPG keys in step 3.

      admin@switch:~$ sudo apt-get update
      admin@switch:~$ sudo apt-get install wget -y
      
    2. Edit the /etc/apt/sources.list file to add the SONiC repository:

      admin@switch:~$ sudo vi /etc/apt/sources.list
      ...
      deb https://apps3.cumulusnetworks.com/repos/deb buster netq-latest
      ...
      
    3. Add the SONiC repo key:

      admin@switch:~$ sudo wget -qO - https://apps3.cumulusnetworks.com/setup/cumulus-apps-deb.pubkey | sudo apt-key add -
      

    Verify Service Package Versions

    Before you install the NetQ Agent on a Red Hat or CentOS server, make sure you install and run at least the minimum versions of the following packages:

    • iproute-3.10.0-54.el7_2.1.x86_64
    • lldpd-0.9.7-5.el7.x86_64
    • ntp-4.2.6p5-25.el7.centos.2.x86_64
    • ntpdate-4.2.6p5-25.el7.centos.2.x86_64
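
    One quick way to confirm the installed versions is to query rpm directly; for example:

    root@rhel7:~# rpm -q iproute lldpd ntp ntpdate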

    Verify the Server is Running lldpd and wget

    Make sure you are running lldpd, not lldpad. CentOS does not include lldpd by default, nor does it include wget; however, the installation requires them.

    To install these packages, run the following commands:

    root@rhel7:~# sudo yum -y install epel-release
    root@rhel7:~# sudo yum -y install lldpd
    root@rhel7:~# sudo systemctl enable lldpd.service
    root@rhel7:~# sudo systemctl start lldpd.service
    root@rhel7:~# sudo yum install wget
    

    Install and Configure NTP

    If NTP is not already installed and configured, follow these steps:

    1. Install NTP on the server. Servers must be in time synchronization with the NetQ Platform or NetQ Appliance to enable useful statistical analysis.

      root@rhel7:~# sudo yum install ntp
      
    2. Configure the NTP server.

      1. Open the /etc/ntp.conf file in your text editor of choice.

      2. Under the Server section, specify the NTP server IP address or hostname.

    3. Enable and start the NTP service.

      root@rhel7:~# sudo systemctl enable ntp
      root@rhel7:~# sudo systemctl start ntp
      

    If you are running NTP in your out-of-band management network with VRF, specify the VRF (ntp@<vrf-name> versus just ntp) in the above commands.

    4. Verify NTP is operating correctly. Look for an asterisk (*) or a plus sign (+) that indicates the clock synchronized with NTP.

      root@rhel7:~# ntpq -pn
      remote           refid            st t when poll reach   delay   offset  jitter
      ==============================================================================
      +173.255.206.154 132.163.96.3     2 u   86  128  377   41.354    2.834   0.602
      +12.167.151.2    198.148.79.209   3 u  103  128  377   13.395   -4.025   0.198
      2a00:7600::41    .STEP.          16 u    - 1024    0    0.000    0.000   0.000
      *129.250.35.250  249.224.99.213   2 u  101  128  377   14.588   -0.299   0.243
      

    Obtain NetQ Agent Software Package

    To install the NetQ Agent you need to install netq-agent on each switch or host. This is available from the NVIDIA networking repository.

    To obtain the NetQ Agent package:

    1. Reference and update the local yum repository.

      root@rhel7:~# sudo rpm --import https://apps3.cumulusnetworks.com/setup/cumulus-apps-rpm.pubkey
      root@rhel7:~# sudo wget -O- https://apps3.cumulusnetworks.com/setup/cumulus-apps-rpm-el7.repo > /etc/yum.repos.d/cumulus-host-el.repo
      
    2. Edit /etc/yum.repos.d/cumulus-host-el.repo to set the enabled=1 flag for the two NetQ repositories.

      root@rhel7:~# vi /etc/yum.repos.d/cumulus-host-el.repo
      ...
      [cumulus-arch-netq-4.0]
      name=Cumulus netq packages
      baseurl=https://apps3.cumulusnetworks.com/repos/rpm/el/7/netq-4.0/$basearch
      gpgcheck=1
      enabled=1
      [cumulus-noarch-netq-4.0]
      name=Cumulus netq architecture-independent packages
      baseurl=https://apps3.cumulusnetworks.com/repos/rpm/el/7/netq-4.0/noarch
      gpgcheck=1
      enabled=1
      ...
      

    Verify Service Package Versions

    Before you install the NetQ Agent on an Ubuntu server, make sure you install and run at least the minimum versions of the following packages:

    • iproute 1:4.3.0-1ubuntu3.16.04.1 all
    • iproute2 4.3.0-1ubuntu3 amd64
    • lldpd 0.7.19-1 amd64
    • ntp 1:4.2.8p4+dfsg-3ubuntu5.6 amd64
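
    One quick way to confirm the installed versions is to query dpkg directly; for example:

    root@ubuntu:~# dpkg -l iproute iproute2 lldpd ntp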

    Verify the Server is Running lldpd

    Make sure you are running lldpd, not lldpad. Ubuntu does not include lldpd by default; however, the installation requires it.

    To install this package, run the following commands:

    root@ubuntu:~# sudo apt-get update
    root@ubuntu:~# sudo apt-get install lldpd
    root@ubuntu:~# sudo systemctl enable lldpd.service
    root@ubuntu:~# sudo systemctl start lldpd.service
    

    Install and Configure Network Time Server

    If NTP is not already installed and configured, follow these steps:

    1. Install NTP on the server, if not already installed. Servers must be in time synchronization with the NetQ Platform or NetQ Appliance to enable useful statistical analysis.

      root@ubuntu:~# sudo apt-get install ntp
      
    2. Configure the network time server.

    1. Open the /etc/ntp.conf file in your text editor of choice.

    2. Under the Server section, specify the NTP server IP address or hostname.

    3. Enable and start the NTP service.

      root@ubuntu:~# sudo systemctl enable ntp
      root@ubuntu:~# sudo systemctl start ntp
      

    If you are running NTP in your out-of-band management network with VRF, specify the VRF (ntp@<vrf-name> versus just ntp) in the above commands.

    4. Verify NTP is operating correctly. Look for an asterisk (*) or a plus sign (+) that indicates the clock synchronized with NTP.

      root@ubuntu:~# ntpq -pn
      remote           refid            st t when poll reach   delay   offset  jitter
      ==============================================================================
      +173.255.206.154 132.163.96.3     2 u   86  128  377   41.354    2.834   0.602
      +12.167.151.2    198.148.79.209   3 u  103  128  377   13.395   -4.025   0.198
      2a00:7600::41    .STEP.          16 u    - 1024    0    0.000    0.000   0.000
      *129.250.35.250  249.224.99.213   2 u  101  128  377   14.588   -0.299   0.243
      
    If you want to use chrony for time synchronization instead of NTP:

    1. Install chrony if needed.
    root@ubuntu:~# sudo apt install chrony
    
    2. Start the chrony service.
    root@ubuntu:~# sudo /usr/local/sbin/chronyd
    
    3. Verify it installed successfully.
    root@ubuntu:~# chronyc activity
    200 OK
    8 sources online
    0 sources offline
    0 sources doing burst (return to online)
    0 sources doing burst (return to offline)
    0 sources with unknown address
    
    4. View the time servers chrony is using.
    root@ubuntu:~# chronyc sources
    210 Number of sources = 8
    MS Name/IP address         Stratum Poll Reach LastRx Last sample
    ===============================================================================
    ^+ golem.canonical.com           2   6   377    39  -1135us[-1135us] +/-   98ms
    ^* clock.xmission.com            2   6   377    41  -4641ns[ +144us] +/-   41ms
    ^+ ntp.ubuntu.net              2   7   377   106   -746us[ -573us] +/-   41ms
    ...
    

    Open the chrony.conf configuration file (by default at /etc/chrony/) and edit if needed.

    Example with individual servers specified:

    server golem.canonical.com iburst
    server clock.xmission.com iburst
    server ntp.ubuntu.com iburst
    driftfile /var/lib/chrony/drift
    makestep 1.0 3
    rtcsync
    

    Example when using a pool of servers:

    pool pool.ntp.org iburst
    driftfile /var/lib/chrony/drift
    makestep 1.0 3
    rtcsync
    
    5. View the server chrony is currently tracking.
    root@ubuntu:~# chronyc tracking
    Reference ID    : 5BBD59C7 (golem.canonical.com)
    Stratum         : 3
    Ref time (UTC)  : Mon Feb 10 14:35:18 2020
    System time     : 0.0000046340 seconds slow of NTP time
    Last offset     : -0.000123459 seconds
    RMS offset      : 0.007654410 seconds
    Frequency       : 8.342 ppm slow
    Residual freq   : -0.000 ppm
    Skew            : 26.846 ppm
    Root delay      : 0.031207654 seconds
    Root dispersion : 0.001234590 seconds
    Update interval : 115.2 seconds
    Leap status     : Normal
    

    Obtain NetQ Agent Software Package

    To install the NetQ Agent you need to install netq-agent on each server. This is available from the NVIDIA networking repository.

    To obtain the NetQ Agent package:

    1. Reference and update the local apt repository.
    root@ubuntu:~# sudo wget -O- https://apps3.cumulusnetworks.com/setup/cumulus-apps-deb.pubkey | apt-key add -
    
    2. Add the Ubuntu repository:

    Create the file /etc/apt/sources.list.d/cumulus-apps-deb-bionic.list and add the following line:

    root@ubuntu:~# vi /etc/apt/sources.list.d/cumulus-apps-deb-bionic.list
    ...
    deb [arch=amd64] https://apps3.cumulusnetworks.com/repos/deb bionic netq-latest
    ...
    

    The use of netq-latest in these examples means that an update from the repository always retrieves the latest version of NetQ, even for a major version update. If you want to keep the repository on a specific version, such as netq-4.4, use that instead.
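
    For example, to pin the repository to the 4.4 release, the line would read:

    deb [arch=amd64] https://apps3.cumulusnetworks.com/repos/deb bionic netq-4.4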

    Install NetQ Agent

    After completing the preparation steps, install the agent onto your switch or host.

    Cumulus Linux 4.4 and later includes the netq-agent package by default. To install the NetQ Agent on earlier versions of Cumulus Linux:

    1. Update the local apt repository, then install the NetQ software on the switch.

      cumulus@switch:~$ sudo apt-get update
      cumulus@switch:~$ sudo apt-get install netq-agent
      
    2. Verify you have the correct version of the Agent.

      cumulus@switch:~$ dpkg-query -W -f '${Package}\t${Version}\n' netq-agent
      

      You should see version 4.4.0 and update 40 in the results.

      • Cumulus Linux 3.7.x
        • netq-agent_4.4.0-cl3u40~1666942953.97ef4c9_armel.deb
        • netq-agent_4.4.0-cl3u40~1667493479.97ef4c9_amd64.deb
      • Cumulus Linux 4.0.0 and later
        • netq-agent_4.4.0-cl4u40~1667496902.97ef4c9d_armel.deb
        • netq-agent_4.4.0-cl4u40~1667493586.97ef4c9d_amd64.deb
      3. Restart rsyslog so it sends log files to the correct destination.

        cumulus@switch:~$ sudo systemctl restart rsyslog.service
        
      4. Continue with NetQ Agent configuration in the next section.

      To install the NetQ Agent (the following example uses Cumulus Linux but the steps are the same for SONiC):

      1. Update the local apt repository, then install the NetQ software on the switch.

        admin@switch:~$ sudo apt-get update
        admin@switch:~$ sudo apt-get install netq-agent
        
      2. Verify you have the correct version of the Agent.

        admin@switch:~$ dpkg-query -W -f '${Package}\t${Version}\n' netq-agent
        
      3. Restart rsyslog so it sends log files to the correct destination.

        admin@switch:~$ sudo systemctl restart rsyslog.service
        
      4. Continue with NetQ Agent configuration in the next section.

      To install the NetQ Agent:

      1. Install the Bash completion and NetQ packages on the server.

        root@rhel7:~# sudo yum -y install bash-completion
        root@rhel7:~# sudo yum install netq-agent
        
      2. Verify you have the correct version of the Agent.

        root@rhel7:~# rpm -qa | grep -i netq
        

        You should see version 4.4.0 and update 40 in the results.

        • netq-agent-4.4.0-rh7u40~1667495055.97ef4c9.x86_64.rpm
        3. Restart rsyslog so it sends log files to the correct destination.

          root@rhel7:~# sudo systemctl restart rsyslog
          
        4. Continue with NetQ Agent Configuration in the next section.

        To install the NetQ Agent:

        1. Install the software packages on the server.

          root@ubuntu:~# sudo apt-get update
          root@ubuntu:~# sudo apt-get install netq-agent
          
        2. Verify you have the correct version of the Agent.

          root@ubuntu:~# dpkg-query -W -f '${Package}\t${Version}\n' netq-agent
          

          You should see version 4.4.0 and update 40 in the results.

          • netq-agent_4.4.0-ub18.04u40~1667493385.97ef4c9_amd64.deb
          3. Restart rsyslog so it sends log files to the correct destination.
          root@ubuntu:~# sudo systemctl restart rsyslog.service
          
          4. Continue with NetQ Agent Configuration in the next section.

          Configure NetQ Agent

          After you install the NetQ Agents on the switches you want to monitor, you must configure them to obtain useful and relevant data.

          The NetQ Agent is aware of and communicates through the designated VRF. If you do not specify one, it uses the default VRF (named default). If you later change the VRF configured for the NetQ Agent (using a lifecycle management configuration profile, for example), you might cause the NetQ Agent to lose communication.

          If you configure the NetQ Agent to communicate in a VRF that is not default or mgmt, the following line must be added to /etc/netq/netq.yml in the netq-agent section:

          netq-agent:
            netq_stream_address: 0.0.0.0
          

          Two methods are available for configuring a NetQ Agent:

          Configure NetQ Agents Using a Configuration File

          You can configure the NetQ Agent in the netq.yml configuration file contained in the /etc/netq/ directory.

          1. Open the netq.yml file using your text editor of choice. For example:

            sudo nano /etc/netq/netq.yml
            
          2. Locate the netq-agent section, or add it.

          3. Set the parameters for the agent as follows:

            • port: 31980 (default configuration)
            • server: IP address of the NetQ Appliance or VM where the agent should send its collected data
            • vrf: default (or one that you specify)

            Your configuration should be similar to this:

            netq-agent:
                port: 31980
                server: 127.0.0.1
                vrf: mgmt
            

          Configure NetQ Agents Using the NetQ CLI

          If you configured the NetQ CLI, you can use it to configure the NetQ Agent to send telemetry data to the NetQ Appliance or VM. To configure the NetQ CLI, refer to Install NetQ CLI.

          If you intend to use a VRF for agent communication (recommended), refer to Configure the Agent to Use VRF. If you intend to specify a port for communication, refer to Configure the Agent to Communicate over a Specific Port.

          Use the following command to configure the NetQ Agent:

          netq config add agent server <text-opta-ip> [port <text-opta-port>] [vrf <text-vrf-name>]
          

          This example uses an IP address of 192.168.1.254 and the default port and VRF for the NetQ Appliance or VM.

          sudo netq config add agent server 192.168.1.254
          Updated agent server 192.168.1.254 vrf default. Please restart netq-agent (netq config restart agent).
          sudo netq config restart agent
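
          To confirm the settings took effect, you can review the agent configuration. This is a quick check, assuming the netq config show agent command is available in your CLI version; the same settings are also visible in /etc/netq/netq.yml.

          sudo netq config show agent
          cat /etc/netq/netq.yml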
          

          Configure Advanced NetQ Agent Settings

          A couple of additional options are available for configuring the NetQ Agent. If you are using VRFs, you can configure the agent to communicate over a specific VRF. You can also configure the agent to use a particular port.

          Configure the Agent to Use a VRF

          By default, NetQ uses the default VRF for communication between the NetQ Appliance or VM and NetQ Agents. While optional, NVIDIA strongly recommends that you configure NetQ Agents to communicate with the NetQ Appliance or VM only via a VRF, including a management VRF. To do so, you need to specify the VRF name when configuring the NetQ Agent. For example, if you configured the management VRF and you want the agent to communicate with the NetQ Appliance or VM over it, configure the agent like this:

          sudo netq config add agent server 192.168.1.254 vrf mgmt
          sudo netq config restart agent
          

          If you later change the VRF configured for the NetQ Agent (using a lifecycle management configuration profile, for example), you might cause the NetQ Agent to lose communication.

          Configure the Agent to Communicate over a Specific Port

          By default, NetQ uses port 31980 for communication between the NetQ Appliance or VM and NetQ Agents. If you want the NetQ Agent to communicate with the NetQ Appliance or VM via a different port, you need to specify the port number when configuring the NetQ Agent, like this:

          sudo netq config add agent server 192.168.1.254 port 7379
          sudo netq config restart agent
          

          Configure the On-switch OPTA

          On-switch OPTA functionality is an early access feature, and it does not support Flow Analysis or LCM.

          On-switch OPTA is intended for use in small NetQ Cloud deployments where a dedicated OPTA might not be necessary. If you need help assessing the correct OPTA configuration for your deployment, contact your NVIDIA sales team.

          Instead of installing a dedicated OPTA appliance, you can enable the OPTA service on every switch in your environment that will send data to the NetQ Cloud. To configure a switch for OPTA functionality, install the netq-opta package.

          sudo apt-get update
          sudo apt-get install netq-opta
          

          After the netq-opta package is installed, add your OPTA configuration key. Run the following command with the config-key obtained from the email you received from NVIDIA titled NetQ Access Link. You can also obtain the configuration key through the NetQ UI in the premises management configuration. For more information, see First-time Login for NetQ Cloud.

          netq config add opta config-key <config_key> [vrf <vrf_name>] [proxy-host <text-proxy-host> proxy-port <text-proxy-port>] 
          

          The VRF name should be the VRF used to communicate with the NetQ Cloud. Specifying a proxy host and port is optional. For example:

          netq config add opta config-key tHkSI2d3LmRldjMubmV0cWRldi5jdW11bHVasdf29ya3MuY29tGLsDIiwzeUpNc3BwK1IyUjVXY2p2dDdPL3JHS3ZrZ1dDUkpFY2JkMVlQOGJZUW84PTIEZGV2MzoHbmV0cWRldr vrf mgmt
          

          You can also add a proxy host separately with the following command:

          netq config add opta proxy-host <text-proxy-host> proxy-port <text-proxy-port>
          

          The final step is configuring the local NetQ Agent on the switch to connect to the local OPTA service. Configure the agent on the switch to connect to localhost with the following command:

          netq config add agent server localhost vrf mgmt
          

          Install NetQ CLI

          Installing the NetQ CLI on your NetQ Appliances, VMs, switches, or hosts gives you access to new features and bug fixes, and allows you to manage your network from multiple points in the network.

          After installing the NetQ software and agent on each switch you want to monitor, you can also install the NetQ CLI on switches running:

          If your network uses a proxy server for external connections, you should first configure a global proxy so apt-get can access the software package in the NetQ repository.
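
          As a minimal sketch of one way to set a global apt proxy, assuming a hypothetical proxy at proxy.example.com on port 3128 (substitute your own proxy host and port):

          # Hypothetical proxy host and port; replace with your own values
          echo 'Acquire::http::Proxy "http://proxy.example.com:3128/";' | sudo tee /etc/apt/apt.conf.d/99proxy
          echo 'Acquire::https::Proxy "http://proxy.example.com:3128/";' | sudo tee -a /etc/apt/apt.conf.d/99proxy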

          Prepare for NetQ CLI Installation on a RHEL, CentOS, or Ubuntu Server

          For servers running RHEL 7, CentOS or Ubuntu OS, you need to:

          These steps are not required for Cumulus Linux or SONiC.

          Verify Service Package Versions

          • iproute-3.10.0-54.el7_2.1.x86_64
          • lldpd-0.9.7-5.el7.x86_64
          • ntp-4.2.6p5-25.el7.centos.2.x86_64
          • ntpdate-4.2.6p5-25.el7.centos.2.x86_64
          • iproute 1:4.3.0-1ubuntu3.16.04.1 all
          • iproute2 4.3.0-1ubuntu3 amd64
          • lldpd 0.7.19-1 amd64
          • ntp 1:4.2.8p4+dfsg-3ubuntu5.6 amd64

          Verify What CentOS and Ubuntu Are Running

          For CentOS and Ubuntu, make sure you are running lldpd, not lldpad. CentOS and Ubuntu do not include lldpd by default, even though the installation requires it. In addition, CentOS does not include wget, even though the installation requires it.

          To install these packages on CentOS, run the following commands:

          root@centos:~# sudo yum -y install epel-release
          root@centos:~# sudo yum -y install lldpd
          root@centos:~# sudo systemctl enable lldpd.service
          root@centos:~# sudo systemctl start lldpd.service
          root@centos:~# sudo yum install wget
          

          To install lldpd on Ubuntu, run the following commands:

          root@ubuntu:~# sudo apt-get update
          root@ubuntu:~# sudo apt-get install lldpd
          root@ubuntu:~# sudo systemctl enable lldpd.service
          root@ubuntu:~# sudo systemctl start lldpd.service
          

          Install and Configure NTP

          If NTP is not already installed and configured, follow these steps:

          1. Install NTP on the server. Servers must be in time synchronization with the NetQ Appliance or VM to enable useful statistical analysis.

            root@rhel7:~# sudo yum install ntp
            
          2. Configure the NTP server.

            1. Open the /etc/ntp.conf file in your text editor of choice.

            2. Under the Server section, specify the NTP server IP address or hostname (see the example entries after these steps).

          3. Enable and start the NTP service.

            root@rhel7:~# sudo systemctl enable ntp
            root@rhel7:~# sudo systemctl start ntp
            

          If you are running NTP in your out-of-band management network with VRF, specify the VRF (ntp@<vrf-name> versus just ntp) in the above commands.

          4. Verify NTP is operating correctly. Look for an asterisk (*) or a plus sign (+) that indicates the clock synchronized with NTP.

            root@rhel7:~# ntpq -pn
            remote           refid            st t when poll reach   delay   offset  jitter
            ==============================================================================
            +173.255.206.154 132.163.96.3     2 u   86  128  377   41.354    2.834   0.602
            +12.167.151.2    198.148.79.209   3 u  103  128  377   13.395   -4.025   0.198
            2a00:7600::41    .STEP.          16 u    - 1024    0    0.000    0.000   0.000
            *129.250.35.250 249.224.99.213   2 u  101  128  377   14.588   -0.299   0.243
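
          For reference, the server entries added in step 2 of this procedure might look like this in /etc/ntp.conf. These are example servers only; substitute your own NTP server addresses or hostnames:

            # Example entries only; replace with your NTP servers
            server 192.168.0.254 iburst
            server 0.pool.ntp.org iburst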
            
          1. Install NTP on the server, if not already installed. Servers must be in time synchronization with the NetQ Platform or NetQ Appliance to enable useful statistical analysis.

            root@ubuntu:~# sudo apt-get install ntp
            
          2. Configure the network time server.

          1. Open the /etc/ntp.conf file in your text editor of choice.

          2. Under the Server section, specify the NTP server IP address or hostname.

          3. Enable and start the NTP service.

            root@ubuntu:~# sudo systemctl enable ntp
            root@ubuntu:~# sudo systemctl start ntp
            

          If you are running NTP in your out-of-band management network with VRF, specify the VRF (ntp@<vrf-name> versus just ntp) in the above commands.

          4. Verify NTP is operating correctly. Look for an asterisk (*) or a plus sign (+) that indicates the clock synchronized with NTP.

            root@ubuntu:~# ntpq -pn
            remote           refid            st t when poll reach   delay   offset  jitter
            ==============================================================================
            +173.255.206.154 132.163.96.3     2 u   86  128  377   41.354    2.834   0.602
            +12.167.151.2    198.148.79.209   3 u  103  128  377   13.395   -4.025   0.198
            2a00:7600::41    .STEP.          16 u    - 1024    0    0.000    0.000   0.000
            *129.250.35.250 249.224.99.213   2 u  101  128  377   14.588   -0.299   0.243
            

          1. Install chrony if needed.

            root@ubuntu:~# sudo apt install chrony
            
          2. Start the chrony service.

            root@ubuntu:~# sudo /usr/local/sbin/chronyd
            
          3. Verify it installed successfully.

            root@ubuntu:~# chronyc activity
            200 OK
            8 sources online
            0 sources offline
            0 sources doing burst (return to online)
            0 sources doing burst (return to offline)
            0 sources with unknown address
            
          4. View the time servers chrony is using.

            root@ubuntu:~# chronyc sources
            210 Number of sources = 8
            

            MS Name/IP address         Stratum Poll Reach LastRx Last sample
            ===============================================================================
            ^+ golem.canonical.com           2   6   377    39  -1135us[-1135us] +/-   98ms
            ^* clock.xmission.com            2   6   377    41  -4641ns[ +144us] +/-   41ms
            ^+ ntp.ubuntu.net                2   7   377   106   -746us[ -573us] +/-   41ms
            ...

            Open the chrony.conf configuration file (by default at /etc/chrony/chrony.conf) and edit it if needed.

            Example with individual servers specified:

            server golem.canonical.com iburst
            server clock.xmission.com iburst
            server ntp.ubuntu.com iburst
            driftfile /var/lib/chrony/drift
            makestep 1.0 3
            rtcsync
            

            Example when using a pool of servers:

            pool pool.ntp.org iburst
            driftfile /var/lib/chrony/drift
            makestep 1.0 3
            rtcsync
            
          5. View the server chrony is currently tracking.

            root@ubuntu:~# chronyc tracking
            Reference ID    : 5BBD59C7 (golem.canonical.com)
            Stratum         : 3
            Ref time (UTC)  : Mon Feb 10 14:35:18 2020
            System time     : 0.0000046340 seconds slow of NTP time
            Last offset     : -0.000123459 seconds
            RMS offset      : 0.007654410 seconds
            Frequency       : 8.342 ppm slow
            Residual freq   : -0.000 ppm
            Skew            : 26.846 ppm
            Root delay      : 0.031207654 seconds
            Root dispersion : 0.001234590 seconds
            Update interval : 115.2 seconds
            Leap status     : Normal
            

          Get the NetQ CLI Software Package for Ubuntu

          To install the NetQ CLI on an Ubuntu server, you need to install netq-apps on each Ubuntu server. This is available from the NetQ repository.

          To get the NetQ CLI package:

          1. Reference and update the local apt repository.

            root@ubuntu:~# sudo wget -O- https://apps3.cumulusnetworks.com/setup/cumulus-apps-deb.pubkey | apt-key add -
            
          2. Add the Ubuntu repository:

            Create the file /etc/apt/sources.list.d/cumulus-host-ubuntu-bionic.list and add the following line:

            root@ubuntu:~# vi /etc/apt/sources.list.d/cumulus-apps-deb-bionic.list
            ...
            deb [arch=amd64] https://apps3.cumulusnetworks.com/repos/deb bionic netq-latest
            ...
            

            The use of netq-latest in these examples means that a pull from the repository always retrieves the latest version of NetQ, even for a major version update. If you want to keep the repository on a specific version, such as netq-4.4, use that instead.

          Install NetQ CLI

          Follow these steps to install the NetQ CLI on a switch or host.

          To install the NetQ CLI you need to install netq-apps on each switch. This is available from the NVIDIA networking repository.

          Cumulus Linux 4.4 and later includes the netq-apps package by default.

          If your network uses a proxy server for external connections, you should first configure a global proxy so apt-get can access the software package in the NVIDIA networking repository.

          To obtain the NetQ CLI package:

          Edit the /etc/apt/sources.list file to add the repository for NetQ.

          Note that NetQ has a separate repository from Cumulus Linux.

          cumulus@switch:~$ sudo nano /etc/apt/sources.list
          ...
          deb https://apps3.cumulusnetworks.com/repos/deb CumulusLinux-3 netq-4.4
          ...
          

          You can use the deb https://apps3.cumulusnetworks.com/repos/deb CumulusLinux-3 netq-latest repository if you want to always retrieve the latest posted version of NetQ.

          Cumulus Linux 4.4 and later includes the netq-apps package by default.

          To add the repository, uncomment or add the following line in /etc/apt/sources.list:

          cumulus@switch:~$ sudo nano /etc/apt/sources.list
          ...
          deb https://apps3.cumulusnetworks.com/repos/deb CumulusLinux-4 netq-4.4
          ...
          

          You can use the deb https://apps3.cumulusnetworks.com/repos/deb CumulusLinux-4 netq-latest repository if you want to always retrieve the latest posted version of NetQ.

          1. Update the local apt repository and install the software on the switch.

            cumulus@switch:~$ sudo apt-get update
            cumulus@switch:~$ sudo apt-get install netq-apps
            
          2. Verify you have the correct version of the CLI.

            cumulus@switch:~$ dpkg-query -W -f '${Package}\t${Version}\n' netq-apps
            

          You should see version 4.4.0 and update 40 in the results. For example:

          • Cumulus Linux 3.7
            • netq-apps_4.4.0-cl3u40~1666942953.97ef4c9_armel.deb
            • netq-apps_4.4.0-cl3u40~1667493479.97ef4c9_amd64.deb
          • Cumulus Linux 4.0.0 and later
            • netq-apps_4.4.0-cl4u40~1667496902.97ef4c9d_armel.deb
            • netq-apps_4.4.0-cl4u40~1667493586.97ef4c9d_amd64.deb

          3. Continue with NetQ CLI configuration in the next section.

          To install the NetQ CLI you need to install netq-apps on each switch. This is available from the NVIDIA networking repository.

          If your network uses a proxy server for external connections, you should first configure a global proxy so apt-get can access the software package in the NVIDIA networking repository.

          To obtain the NetQ CLI package:

          1. Edit the /etc/apt/sources.list file to add the repository for NetQ.

            admin@switch:~$ sudo nano /etc/apt/sources.list
            ...
            deb [arch=amd64] https://apps3.cumulusnetworks.com/repos/deb buster netq-4.4
            ...
            
          2. Update the local apt repository and install the software on the switch.

            admin@switch:~$ sudo apt-get update
            admin@switch:~$ sudo apt-get install netq-apps
            
          3. Verify you have the correct version of the CLI.

            admin@switch:~$ dpkg-query -W -f '${Package}\t${Version}\n' netq-apps
            

            You should see version 4.4.0 and update 40 in the results. For example:

            • netq-apps_4.4.0-deb10u40~1667495784.97ef4c9d_amd64.deb
          4. Continue with NetQ CLI configuration in the next section.

          1. Reference and update the local yum repository and key.

            root@rhel7:~# rpm --import https://apps3.cumulusnetworks.com/setup/cumulus-apps-rpm.pubkey
            root@rhel7:~# wget -O- https://apps3.cumulusnetworks.com/setup/cumulus-apps-rpm-el7.repo > /etc/yum.repos.d/cumulus-host-el.repo
            
          2. Edit /etc/yum.repos.d/cumulus-host-el.repo to set the enabled=1 flag for the two NetQ repositories.

            root@rhel7:~# vi /etc/yum.repos.d/cumulus-host-el.repo
            ...
            [cumulus-arch-netq-latest]
            name=Cumulus netq packages
            baseurl=https://apps3.cumulusnetworks.com/repos/rpm/el/7/netq-latest/$basearch
            gpgcheck=1
            enabled=1
            [cumulus-noarch-netq-latest]
            name=Cumulus netq architecture-independent packages
            baseurl=https://apps3.cumulusnetworks.com/repos/rpm/el/7/netq-latest/noarch
            gpgcheck=1
            enabled=1
            ...
            
          3. Install the Bash completion and CLI software on the server.

            root@rhel7:~# sudo yum -y install bash-completion
            root@rhel7:~# sudo yum install netq-apps
            
          4. Verify you have the correct version of the CLI.

            root@rhel7:~# rpm -q netq-apps
            
          5. Continue with the next section.
          1. Install the CLI software on the server.

            root@ubuntu:~# sudo apt-get update
            root@ubuntu:~# sudo apt-get install netq-apps
            
          2. Verify you have the correct version of the CLI.

            root@ubuntu:~# dpkg-query -W -f '${Package}\t${Version}\n' netq-apps
            

          You should see version 4.4.0 and update 40 in the results. For example:

          • netq-apps_4.4.0-ub18.04u40~1667493385.97ef4c9_amd64.deb

          3. Continue with NetQ CLI configuration in the next section.

          Configure the NetQ CLI

          By default, you do not configure the NetQ CLI during the NetQ installation. The configuration resides in the /etc/netq/netq.yml file. Until the CLI is configured on a device, you can only run netq config and netq help commands, and you must use sudo to run them.

          At minimum, you need to configure the NetQ CLI and NetQ Agent to communicate with the telemetry server. To do so, configure the NetQ Agent and the NetQ CLI so that they are running in the VRF where the routing tables have connectivity to the telemetry server (typically the management VRF).
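
          As a minimal sketch, assuming a hypothetical telemetry server at 192.168.1.254 that is reachable through the management VRF, the two configurations described in the following sections would look something like this (the CLI command may also require the AuthKeys covered below):

            # Agent: send telemetry over the mgmt VRF
            sudo netq config add agent server 192.168.1.254 vrf mgmt
            sudo netq config restart agent
            # CLI: query the same server over the mgmt VRF (AuthKeys omitted for brevity)
            sudo netq config add cli server 192.168.1.254 vrf mgmt
            sudo netq config restart cli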

          To access and configure the CLI for your on-premises NetQ deployment, you must generate AuthKeys. You’ll need your username and password to generate them. These keys provide authorized access (access key) and user authentication (secret key).

          To generate AuthKeys:

          1. Enter your on-premises NetQ appliance hostname or IP address into your browser to open the NetQ UI login page.

          2. Enter your username and password.

          3. Expand the Menu, and under Admin, select Management.

          1. Select Manage on the User Accounts card.

          2. Select your user and click above the table.

          3. Copy these keys to a safe place. Select Copy to obtain the CLI configuration command to use on your devices.

          The secret key is only shown once. If you do not copy these, you will need to regenerate them and reconfigure CLI access.

          You can also save these keys to a YAML file for easy reference, and to avoid having to type or copy the key values. You can:

          • store the file wherever you like, for example in /home/cumulus/ or /etc/netq
          • name the file whatever you like, for example credentials.yml, creds.yml, or keys.yml

          The file must have the following format:

          access-key: <user-access-key-value-here>
          secret-key: <user-secret-key-value-here>
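
          For example, a keys file using the sample key values that appear in the configuration commands below would contain:

          access-key: 123452d9bc2850a1726f55534279dd3c8b3ec55e8b25144d4739dfddabe8149e
          secret-key: /vAGywae2E4xVZg8F+HtS6h6yHliZbBP6HXU3J98765=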
          

          1. Configure the CLI on your device by running the configuration command you copied from the UI. Alternatively, use the following command syntax:

            netq config add cli server <text-gateway-dest> [access-key <text-access-key> secret-key <text-secret-key> premises <text-premises-name> | cli-keys-file <text-key-file> premises <text-premises-name>] [vrf <text-vrf-name>] [port <text-gateway-port>]
            
          2. Restart the CLI to activate the configuration.

            The following example uses the individual access key, a premises of datacenterwest, and the default port and VRF. Replace the key values with your generated keys if you are using this example on your server.

            sudo netq config add cli server netqhostname.labtest.net access-key 123452d9bc2850a1726f55534279dd3c8b3ec55e8b25144d4739dfddabe8149e secret-key /vAGywae2E4xVZg8F+HtS6h6yHliZbBP6HXU3J98765= premises datacenterwest
            Updated cli server netqhostname.labtest.net vrf default port 443. Please restart netqd (netq config restart cli)
            
            sudo netq config restart cli
            Restarting NetQ CLI... Success!
            

            This example uses an optional keys file. Replace the keys filename and path with the full path and name of your keys file, and the datacenterwest premises name with your premises name if you are using this example on your server.

            sudo netq config add cli server netqhostname.labtest.net cli-keys-file /home/netq/nq-cld-creds.yml premises datacenterwest
            Updated cli server netqhostname.labtest.net vrf default port 443. Please restart netqd (netq config restart cli)
            
            sudo netq config restart cli
            Restarting NetQ CLI... Success!
            

          If you have multiple premises and want to query data from a different premises than you originally configured, rerun the netq config add cli server command with the desired premises name. You can only view the data for one premises at a time with the CLI.
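
          For example, to point the CLI at a hypothetical second premises named datacentereast, rerun the command with that premises name and restart the CLI:

            sudo netq config add cli server netqhostname.labtest.net access-key <text-access-key> secret-key <text-secret-key> premises datacentereast
            sudo netq config restart cli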

          To access and configure the CLI for your NetQ cloud deployment, you must generate AuthKeys. You’ll need your username and password to generate them. These keys provide authorized access (access key) and user authentication (secret key). Your credentials and NetQ Cloud addresses were obtained during first login to the NetQ Cloud and premises activation.

          To generate AuthKeys:

          1. Enter netq.nvidia.com into your browser to open the NetQ UI login page.

          2. Enter your username and password.

          3. Expand the Menu, and under Admin, select Management.

          1. Select Manage on the User Accounts card.

          2. Select your user and click above the table.

          3. Copy these keys to a safe place. Select Copy to obtain the CLI configuration command to use on your devices.

          The secret key is only shown once. If you do not copy these, you will need to regenerate them and reconfigure CLI access.

          You can also save these keys to a YAML file for easy reference, and to avoid having to type or copy the key values. You can:

          • store the file wherever you like, for example in /home/cumulus/ or /etc/netq
          • name the file whatever you like, for example credentials.yml, creds.yml, or keys.yml

          The file must have the following format:

          access-key: <user-access-key-value-here>
          secret-key: <user-secret-key-value-here>
          

          1. Configure the CLI on your device by running the configuration command you copied from the UI. Alternatively, use the following command syntax:

            netq config add cli server <text-gateway-dest> [access-key <text-access-key> secret-key <text-secret-key> premises <text-premises-name> | cli-keys-file <text-key-file> premises <text-premises-name>] [vrf <text-vrf-name>] [port <text-gateway-port>]
            
          2. Restart the CLI to activate the configuration.

            The following example uses the individual access key, a premises of datacenterwest, and the default Cloud address, port and VRF. Replace the key values with your generated keys if you are using this example on your server.

            sudo netq config add cli server api.netq.cumulusnetworks.com access-key 123452d9bc2850a1726f55534279dd3c8b3ec55e8b25144d4739dfddabe8149e secret-key /vAGywae2E4xVZg8F+HtS6h6yHliZbBP6HXU3J98765= premises datacenterwest
            Successfully logged into NetQ cloud at api.netq.cumulusnetworks.com:443
            Updated cli server api.netq.cumulusnetworks.com vrf default port 443. Please restart netqd (netq config restart cli)
            
            sudo netq config restart cli
            Restarting NetQ CLI... Success!
            

            The following example uses an optional keys file. Replace the keys filename and path with the full path and name of your keys file, and the datacenterwest premises name with your premises name if you are using this example on your server.

            sudo netq config add cli server api.netq.cumulusnetworks.com cli-keys-file /home/netq/nq-cld-creds.yml premises datacenterwest
            Successfully logged into NetQ cloud at api.netq.cumulusnetworks.com:443
            Updated cli server api.netq.cumulusnetworks.com vrf default port 443. Please restart netqd (netq config restart cli)
            
            sudo netq config restart cli
            Restarting NetQ CLI... Success!
            

          If you have multiple premises and want to query data from a different premises than you originally configured, rerun the netq config add cli server command with the desired premises name. You can only view the data for one premises at a time with the CLI.

          Add More Nodes to Your Server Cluster

          You can add additional nodes to your server cluster for both on-premises and cloud deployments using the CLI:

          Run the following CLI command to add a new worker node for on-premises deployments:

          netq install cluster add-worker <text-worker-01>

          Run the following CLI command to add a new worker node for cloud deployments:

          netq install opta cluster add-worker <text-worker-01>
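
          For example, to add a worker node with a hypothetical IP address of 192.168.1.44 to an on-premises cluster:

          netq install cluster add-worker 192.168.1.44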

          Install a Custom Signed Certificate

          The NetQ UI ships with a self-signed certificate that is sufficient for non-production environments or cloud deployments. For on-premises deployments, however, you receive a warning from your browser that this default certificate is not trusted when you first log in to the NetQ UI. You can avoid this by installing your own signed certificate.

          If you already have a certificate installed and want to change or update it, run the kubectl delete secret netq-gui-ingress-tls --namespace default command.

          You need the following items to perform the certificate installation:

          Install a Certificate using the NetQ CLI

          1. Log in to the NetQ On-premises Appliance or VM via SSH and copy your certificate and key file there.

          2. Generate a Kubernetes secret called netq-gui-ingress-tls.

            cumulus@netq-ts:~$ kubectl create secret tls netq-gui-ingress-tls \
                --namespace default \
                --key <name of your key file>.key \
                --cert <name of your cert file>.crt
            
          3. Verify that you created the secret successfully.

            cumulus@netq-ts:~$ kubectl get secret
            
            NAME                               TYPE                                  DATA   AGE
            netq-gui-ingress-tls               kubernetes.io/tls                     2      5s
            
          4. Update the ingress rule file to install self-signed certificates.

            1. Create a new file called ingress.yaml.

            2. Copy and add this content to the file.

              apiVersion: extensions/v1beta1
              kind: Ingress
              metadata:
                annotations:
                  kubernetes.io/ingress.class: "ingress-nginx"
                  nginx.ingress.kubernetes.io/ssl-redirect: "true"
                  nginx.ingress.kubernetes.io/backend-protocol: "HTTPS"
                  nginx.ingress.kubernetes.io/proxy-connect-timeout: "3600"
                  nginx.ingress.kubernetes.io/proxy-read-timeout: "3600"
                  nginx.ingress.kubernetes.io/proxy-send-timeout: "3600"
                  nginx.ingress.kubernetes.io/proxy-body-size: 10g
                  nginx.ingress.kubernetes.io/proxy-request-buffering: "off"
                name: netq-gui-ingress-external
                namespace: default
              spec:
                rules:
                - host: <your-hostname>
                  http:
                    paths:
                    - backend:
                        serviceName: netq-gui
                        servicePort: 80
                tls:
                - hosts:
                  - <your-hostname>
                  secretName: netq-gui-ingress-tls
              
            3. Replace <your-hostname> with the FQDN of the NetQ On-premises Appliance or VM.

          5. Apply the new rule.

            cumulus@netq-ts:~$ kubectl apply -f ingress.yaml
            ingress.extensions/netq-gui-ingress-external configured
            

            A message like the one above appears if your ingress rule is successfully configured.

          6. Configure the NetQ API to use the new certificate.

            Edit the netq-swagger-ingress-external service:

            kubectl edit ingress netq-swagger-ingress-external
            

            Add the tls: section in the spec: stanza, referencing your configured hostname and the netq-gui-ingress-tls secretName:

            spec:
              rules:
              - host: <hostname>
                http:
                  paths:
                  - backend:
                      serviceName: swagger-ui
                      servicePort: 8080
                    path: /swagger(/|$)(.*)
              tls:
              - hosts:
                - <hostname>
                secretName: netq-gui-ingress-tls
            

            After saving your changes, delete the current swagger-ui pod to restart the service:

            cumulus@netq-ts:~$ kubectl delete pod -l app=swagger-ui
            pod "swagger-ui-deploy-69cfff7b45-cj6r6" deleted
            

          Your custom certificate should now be working. Verify this by opening the NetQ UI at https://<your-hostname-or-ipaddr> in your browser.
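
          You can also inspect the served certificate from the command line. This is a quick sketch using openssl; replace <your-hostname> with the FQDN you configured:

            echo | openssl s_client -connect <your-hostname>:443 -servername <your-hostname> 2>/dev/null | openssl x509 -noout -subject -issuer -dates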

          Update Cloud Activation Key

          NVIDIA provides a cloud activation key when you set up your premises. You use the cloud activation key (called the config-key) to access the cloud services. Note that these authorization keys are different from the ones you use to configure the CLI.

          On occasion, you might want to update your cloud service activation key—for example, if you mistyped the key during installation and now your existing key does not work, or you received a new key for your premises from NVIDIA.

          Update the activation key using the NetQ CLI:

          Run the following command on your standalone or master NetQ Cloud Appliance or VM replacing text-opta-key with your new key.

          cumulus@<hostname>:~$ netq install standalone activate-job config-key <text-opta-key>
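
          For example, using the sample configuration key shown earlier in this guide:

          cumulus@<hostname>:~$ netq install standalone activate-job config-key tHkSI2d3LmRldjMubmV0cWRldi5jdW11bHVasdf29ya3MuY29tGLsDIiwzeUpNc3BwK1IyUjVXY2p2dDdPL3JHS3ZrZ1dDUkpFY2JkMVlQOGJZUW84PTIEZGV2MzoHbmV0cWRldr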
          

          Upgrade NetQ

          This section describes how to upgrade from your current installation to NetQ 4.4. Refer to the release notes before you upgrade.

          You must upgrade your NetQ On-premises or Cloud Appliances or virtual machines (VMs). While NetQ Agents from the previous release retain some backwards compatibility, NVIDIA recommends that you also upgrade them. If you want access to new and updated commands, you can upgrade the CLI on your physical servers or VMs, and monitored switches and hosts as well.

          To complete the upgrade for either an on-premises or a cloud deployment:

          Upgrade NetQ Appliances and Virtual Machines

          The first step in upgrading your NetQ installation to NetQ 4.4 is upgrading your NetQ appliances or VMs.

          Before You Upgrade

          Back up NetQ Data

          This is an optional step for on-premises deployments. Refer to Back Up and Restore NetQ. NetQ Cloud Appliances and VMs create backups automatically.

          Download Software and Update Debian Packages

          1. Download the relevant software.

            1. On the NVIDIA Application Hub, log in to your account.
            2. Select NVIDIA Licensing Portal.
            3. Select Software Downloads from the menu.
            4. Click Product Family and select NetQ.
            5. Select the relevant software for your hypervisor:
              If you are upgrading NetQ Platform software for a NetQ On-premises Appliance or VM, select NetQ SW 4.4 Appliance to download the NetQ-4.4.0.tgz file. If you are upgrading NetQ software for a NetQ Cloud Appliance or VM, select NetQ SW 4.4 Appliance Cloud to download the NetQ-4.4.0-opta.tgz file.
            6. If prompted, agree to the license agreement and proceed with the download.

            For enterprise customers, if you do not see a link to the NVIDIA Licensing Portal on the NVIDIA Application Hub, contact NVIDIA support.


            For NVIDIA employees, download NetQ directly from the NVIDIA Licensing Portal.

          2. Copy the file to the /mnt/installables/ directory on your appliance or VM.

          3. Update /etc/apt/sources.list.d/cumulus-netq.list to netq-4.4 as follows:

            cat /etc/apt/sources.list.d/cumulus-netq.list
            deb [arch=amd64] https://apps3.cumulusnetworks.com/repos/deb bionic netq-4.4
            
          4. Update the NetQ debian packages.

            cumulus@<hostname>:~$ sudo apt-get update
            Get:1 https://apps3.cumulusnetworks.com/repos/deb bionic InRelease [13.8 kB]
            Get:2 https://apps3.cumulusnetworks.com/repos/deb bionic/netq-4.4 amd64 Packages [758 B]
            Hit:3 http://archive.ubuntu.com/ubuntu bionic InRelease
            Get:4 http://security.ubuntu.com/ubuntu bionic-security InRelease [88.7 kB]
            Get:5 http://archive.ubuntu.com/ubuntu bionic-updates InRelease [88.7 kB]
            ...
            Get:24 http://archive.ubuntu.com/ubuntu bionic-backports/universe Translation-en [1900 B]
            Fetched 4651 kB in 3s (1605 kB/s)
            Reading package lists... Done
            
            cumulus@<hostname>:~$ sudo apt-get install -y netq-agent netq-apps
            Reading package lists... Done
            Building dependency tree
            Reading state information... Done
            ...
            The following NEW packages will be installed:
            netq-agent netq-apps
            ...
            Fetched 39.8 MB in 3s (13.5 MB/s)
            ...
            Unpacking netq-agent (4.4.0-ub18.04u40~1667493385.97ef4c9) ...
            ...
            Unpacking netq-apps (4.4.0-ub18.04u40~1667493385.97ef4c9) ...
            Setting up netq-apps (4.4.0-ub18.04u40~1667493385.97ef4c9) ...
            Setting up netq-agent (4.4.0-ub18.04u40~1667493385.97ef4c9) ...
            Processing triggers for rsyslog (8.32.0-1ubuntu4) ...
            Processing triggers for man-db (2.8.3-2ubuntu0.1) ...
            

          Run the Upgrade

          Perform the following steps using the cumulus user account.

          Pre-installation Checks

          Verify the following items before upgrading NetQ. For cluster deployments, verify steps 1 and 3 on all nodes in the cluster:

          1. Check if enough disk space is available before you proceed with the upgrade:
          cumulus@netq-appliance:~$ df -h /
          Filesystem      Size  Used Avail Use% Mounted on
          /dev/sda1       248G   70G  179G  28% /
          cumulus@netq-appliance:~$
          

          The recommended Use% to proceed with installation is under 70%. You can delete previous software tarballs in the /mnt/installables/ directory to regain some space. If you cannot bring disk space under 70% usage, contact the NVIDIA support team.

          2. Run the netq show opta-health command and check that all pods are in the READY state. If not, contact the NVIDIA support team.

          3. Check if the certificates have expired:

          cumulus@netq-appliance:~$ sudo grep client-certificate-data /etc/kubernetes/kubelet.conf | cut -d: -f2 | xargs | base64 -d | openssl x509 -dates -noout | grep notAfter | cut -f2 -d=
          Dec 18 17:53:16 2021 GMT
          cumulus@netq-appliance:~$
          

          If the date in the above output is in the past, run the following commands before proceeding with the upgrade:

          sudo cp /etc/kubernetes/kubelet.conf /etc/kubernetes/kubelet.conf.bak
          sudo sed -i 's/client-certificate-data.*/client-certificate: \/var\/lib\/kubelet\/pki\/kubelet-client-current.pem/g' /etc/kubernetes/kubelet.conf
          sudo sed -i 's/client-key.*/client-key: \/var\/lib\/kubelet\/pki\/kubelet-client-current.pem/g' /etc/kubernetes/kubelet.conf
          sudo systemctl restart kubelet
          

          Check if the kubelet process is running with the sudo systemctl status kubelet command before proceeding with the upgrade.

          If any issue occurs, contact the NVIDIA Support team.

          Upgrade Using the NetQ CLI

          After completing the preparation steps, upgrade your NetQ On-premises, Cloud Appliances, or VMs using the NetQ CLI.

          To upgrade your NetQ software:

          1. Run the appropriate netq upgrade command.
          netq upgrade bundle /mnt/installables/NetQ-4.4.0.tgz
          
          netq upgrade bundle /mnt/installables/NetQ-4.4.0-opta.tgz
          
          2. After the upgrade completes, confirm the upgrade was successful.

            cumulus@<hostname>:~$ cat /etc/app-release
            BOOTSTRAP_VERSION=4.4.0
            APPLIANCE_MANIFEST_HASH=d552ed2f70b56e31aad8f35cab9383af4b2fe61abe55939b19b491b4e480d737
            APPLIANCE_VERSION=4.4.0
            

          Upgrade NetQ Agents

          Upgrading the NetQ Agents is optional, but recommended.

          Upgrade NetQ Agent

          To upgrade the NetQ Agent:

          1. Log in to your switch or host.

          2. Update and install the new NetQ Debian package.

            sudo apt-get update
            sudo apt-get install -y netq-agent
            
            sudo yum update
            sudo yum install netq-agent
            
          3. Restart the NetQ Agent with the following command. The NetQ CLI must be installed for the command to run successfully.

            netq config restart agent
            

          Refer to Install NetQ Agents to complete the upgrade.

          Verify NetQ Agent Version

          You can verify the version of the agent software you have deployed as described in the following sections.

          Run the following command to view the NetQ Agent version.

          cumulus@switch:~$ dpkg-query -W -f '${Package}\t${Version}\n' netq-agent
          

          You should see version 4.4.0 and update 40 in the results.

          • Cumulus Linux 3.7.x
            • netq-agent_4.4.0-cl3u40~1666942953.97ef4c9_armel.deb
            • netq-agent_4.4.0-cl3u40~1667493479.97ef4c9_amd64.deb
          • Cumulus Linux 4.0.0 and later
            • netq-agent_4.4.0-cl4u40~1667496902.97ef4c9d_armel.deb
            • netq-agent_4.4.0-cl4u40~1667493586.97ef4c9d_amd64.deb

          root@ubuntu:~# dpkg-query -W -f '${Package}\t${Version}\n' netq-agent
          

          You should see version 4.4.0 and update 40 in the results.

          • netq-agent_4.4.0-ub18.04u40~1667493385.97ef4c9_amd64.deb

          root@rhel7:~# rpm -q netq-agent
          

          You should see version 4.4.0 and update 40 in the results.

          • netq-agent-4.4.0-rh7u40~1667495055.97ef4c9.x86_64.rpm

          If you see an older version, upgrade the NetQ Agent, as described above.

          Upgrade NetQ CLI

          Upgrading the NetQ CLI is optional, but recommended.

          To upgrade the NetQ CLI:

          1. Log in to your switch or host.

          2. Update and install the new NetQ Debian package.

            sudo apt-get update
            sudo apt-get install -y netq-apps
            
            sudo yum update
            sudo yum install netq-apps
            
          3. Restart the CLI.

            netq config restart cli
            

          To complete the upgrade, refer to Configure the NetQ CLI.

          Accounts and Roles

          NetQ accounts are assigned one of two roles: admin or user.

          Accounts with admin privileges can perform the same actions as user accounts. Additionally, admins can access a management dashboard in the UI. From this dashboard, admins can:

          The following image displays the management dashboard. Accounts with user privileges cannot perform the functions described above and do not have access to the management dashboard.

          netq management dashboard

          Add and Manage Accounts

          Sign in to NetQ as an admin to view and manage accounts. If you want to change individual preferences, visit Set User Preferences.

          Navigate to the NetQ management dashboard to complete the tasks outlined in this section. To get there, expand the Menu on the NetQ dashboard and under Admin, select Management.

          Add an Account

          This section outlines the steps to add a local user account. To add an LDAP account, refer to LDAP Authentication.

          To create a new account:

          1. On the User Accounts card, select Manage to open a table listing all accounts.

          2. Above the table, select add to add an account.

            card with empty fields prompting admin to create an account
          3. Enter the fields and select Save.

            Be especially careful entering the email address as you cannot change it once you save the account. If you save a mistyped email address, you must delete the account and create a new one.

          Edit an Account

          As an admin, you can:

          You cannot edit the email address associated with an account, because this is the identifier the system uses for authentication. If you need to change an email address, delete the account and create a new one.

          To edit an account:

          1. On the User Accounts card, select Manage to open a table listing all accounts.

          2. Select the account you’d like to edit. Above the table, click edit to edit the account’s information.

          Reset an Admin Password

          If your account is assigned an admin role, reset your password by restoring the default password, then changing the password:

          1. Run the following command on your on-premises appliance CLI:
          kubectl exec $(kubectl get pod -oname -l app=cassandra) -- cqlsh -e "INSERT INTO master.user(id,  cust_id,  first_name,  last_name,  password,     access_key,  role,  email,  is_ldap_user,  is_active,  terms_of_use_accepted,  enable_alarm_notifications,  default_workbench,  preferences,  creation_time,  last_login,  reset_password)     VALUES(  'admin',  0,  'Admin',  '',  '009413d86fd42592e0910bb2146815deaceaadf3a4667b728463c4bc170a6511',     null, 'admin',  null,  false,  true,  true,  true,  { workspace_id : 'DEFAULT', workbench_id : 'DEFAULT' },  '{}',  toUnixTimestamp(now()),  toUnixTimestamp(now()),  true )"
          
          2. Log in to the NetQ UI with the default username and password: admin, admin. After logging in, you will be prompted to change the password.

          To reset a password for cloud deployments:

          1. Enter https://netq.nvidia.com in your browser to open the login page.

          2. Click Forgot Password? and enter an email address. Look for a message with the subject NetQ Password Reset Link from netq-sre@cumulusnetworks.com.

          3. Select the link in the email and follow the instructions to create a new password.

          Delete an Account

          To delete one or more accounts:

          1. On the User Accounts card, select Manage to open a table listing all accounts.

          2. Select one or more accounts. Above the table, click delete to delete the selected account(s).

          View Account Activity

          Administrators can view account activity in the activity log.

          To view the log, expand the Menu on the NetQ dashboard and, under Admin, select Management. Select Activity Log to open a table listing account activity. Use the controls above the table to filter or export the data.

          activity log table

          Manage Login Policies

          Administrators can configure a session expiration time and the number of times users can refresh before requiring them to log in again to NetQ.

          To configure these login policies:

          1. On the Login Management card, select Manage.

          2. Select how long an account can be logged in before requiring a user to log in again:

          3. Click Update to save the changes.

            The Login Management card reflects the updated configuration.

          Premises Management

          Managing premises involves renaming existing premises or creating multiple premises.

          Configure Multiple Premises

          The NetQ management dashboard lets you configure a single NetQ UI and CLI for monitoring data from multiple premises. This means you do not need to log in to each premises to view the data.

          There are two ways to implement a multi-site, on-premises deployment: either as a full deployment at each premises or as a full deployment at the primary site with a smaller deployment at secondary sites.

          Full NetQ Deployment at Each Premises
          In this implementation, there is a NetQ appliance or VM running the NetQ Platform software with a database. Each premises operates independently, with its own NetQ UI and CLI. The NetQ appliance or VM at one of the deployments acts as the primary premises for the premises in the other deployments. A list of these secondary premises is stored with the primary deployment.

          Full NetQ Deployment at Primary Site and Smaller Deployment at Secondary Sites
          In this implementation, there is a NetQ appliance or VM at one of the deployments acting as the primary premises for the premises in the other deployments. The primary premises runs the NetQ Platform software (including the NetQ UI and CLI) and houses the database. All other deployments are secondary premises; they run the NetQ Controller software and send their data to the primary premises for storage and processing. A list of these secondary premises is stored with the primary deployment.

          After the multiple premises are configured, you can view this list of premises in the NetQ UI at the primary premises, change the name of premises on the list, and delete premises from the list.

          To configure secondary premises so that you can view their data using the primary site NetQ UI, follow the instructions for the relevant deployment type of the secondary premises.

          In this deployment model, each NetQ deployment can be installed separately. The data is stored and can be viewed from the NetQ UI at each premises.

          To configure these premises so that their data can be viewed from one premises:

          1. In the workbench header, select the Premises dropdown.

          2. Select Manage Premises, then External Premises.

          3. Select Add External Premises.

          4. Enter the IP address for the API gateway on the NetQ appliance or VM for one of the secondary premises.

          5. Enter the access credentials for this host then click Next.

          6. Select the premises you want to connect then click Finish.

          7. Add additional secondary premises by clicking .

          In this deployment model, the data is stored and can be viewed only from the NetQ UI at the primary premises.

          The primary NetQ premises must be installed and operational before the secondary premises can be added.

          1. In the workbench header, select the Premises dropdown.

          2. Click Manage Premises. Your primary premises (OPID0) is shown by default.

          3. Click (Add Premises).

          4. Enter the name of one of the secondary premises you want to add, then click Done.

          5. Select the premises you just created.

          6. Click to generate a configuration key.

          7. Click Copy and save the key to a safe place, or click e-mail to send it to yourself or another administrator as appropriate. Then click Done.

          Rename a Premises

          To rename an existing premises:

          1. In the workbench header, select the Premises dropdown, then Manage premises.

          2. Select a premises to rename, then click Edit.

          3. Enter the new name for the premises, then click Done.

          System Server Information

          To view the physical server or VM configuration:

          1. Expand the Menu.

          2. Under Admin, select Management.

          3. Locate the System Server Info card:

            system server info card displaying appliance version, IP address, OS version, and NetQ version

            If no data is present on this card, it is likely that the NetQ Agent on your server or VM is not running properly or the underlying streaming services are impaired.

          Back Up and Restore NetQ

          Back up your NetQ data according to your company policy. The following sections describe how to back up and restore your NetQ data for the NetQ On-premises Appliance and VMs.

          These procedures do not apply to your NetQ Cloud Appliance or VM. The NetQ cloud service handles data backups automatically.

          Back Up Your NetQ Data

          NetQ stores its data in a Cassandra database. You perform backups by running scripts provided with the software and located in the /usr/sbin directory. When you run a backup, it creates a single tar file named netq_master_snapshot_<timestamp>.tar.gz on a local drive that you specify. NetQ supports one backup file and includes the entire set of data tables. A new backup replaces the previous backup.

          If you select the rollback option during the lifecycle management upgrade process (the default behavior), LCM automatically creates a backup.

          To manually create a backup:

          1. Run the backup script to create a backup file in /opt/<backup-directory>. Replace backup-directory with the name of the directory you want to use for the backup file.

            cumulus@netq-appliance:~$ sudo /usr/sbin/backuprestore.sh --backup --localdir /opt/<backup-directory>
            

            You can abbreviate the backup and localdir options of this command to -b and -l to reduce typing. If the backup directory identified does not already exist, the script creates the directory during the backup process.

            This is a sample of what you see as the script is running:

            [Fri 26 Jul 2019 02:35:35 PM UTC] - Received Inputs for backup ...
            [Fri 26 Jul 2019 02:35:36 PM UTC] - Able to find cassandra pod: cassandra-0
            [Fri 26 Jul 2019 02:35:36 PM UTC] - Continuing with the procedure ...
            [Fri 26 Jul 2019 02:35:36 PM UTC] - Removing the stale backup directory from cassandra pod...
            [Fri 26 Jul 2019 02:35:36 PM UTC] - Able to successfully cleanup up /opt/backuprestore from cassandra pod ...
            [Fri 26 Jul 2019 02:35:36 PM UTC] - Copying the backup script to cassandra pod ....
            /opt/backuprestore/createbackup.sh: line 1: cript: command not found
            [Fri 26 Jul 2019 02:35:48 PM UTC] - Able to exeute /opt/backuprestore/createbackup.sh script on cassandra pod
            [Fri 26 Jul 2019 02:35:48 PM UTC] - Creating local directory:/tmp/backuprestore/ ...  
            Directory /tmp/backuprestore/ already exists..cleaning up
            [Fri 26 Jul 2019 02:35:48 PM UTC] - Able to copy backup from cassandra pod  to local directory:/tmp/backuprestore/ ...
            [Fri 26 Jul 2019 02:35:48 PM UTC] - Validate the presence of backup file in directory:/tmp/backuprestore/
            [Fri 26 Jul 2019 02:35:48 PM UTC] - Able to find backup file:netq_master_snapshot_2019-07-26_14_35_37_UTC.tar.gz
            [Fri 26 Jul 2019 02:35:48 PM UTC] - Backup finished successfully!
            
          2. Verify the backup file creation was successful.

            cumulus@netq-appliance:~$ cd /opt/<backup-directory>
            cumulus@netq-appliance:~/opt/<backup-directory># ls
            netq_master_snapshot_2019-06-04_07_24_50_UTC.tar.gz
            

          To create a scheduled backup, add sudo /usr/sbin/backuprestore.sh --backup --localdir /opt/<backup-directory> to an existing cron job, or create a new one.
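          For example, a daily entry in the root crontab might look like the following; the 2:00 AM schedule and the /opt/netq-backups directory name are placeholders to adjust to your own backup policy:

          cumulus@netq-appliance:~$ sudo crontab -e
          # Example entry: run a NetQ backup every day at 2:00 AM
          0 2 * * * /usr/sbin/backuprestore.sh --backup --localdir /opt/netq-backups
          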

          Restore Your NetQ Data

          Restore NetQ data with the backup file you created in the steps above. You can restore your instance to the same NetQ Platform or NetQ Appliance or to a new platform or appliance. You do not need to stop the server where the backup file resides to perform the restoration, but logins to the NetQ UI fail during the restoration process. The restore option of the backup script copies the data from the backup file to the database, decompresses it, verifies the restoration, and starts all necessary services. You should not see any data loss as a result of a restore operation.

          To restore NetQ on the same hardware where the backup file resides:

          Run the restore script. Replace backup-directory with the name of the directory where the backup file resides.

          cumulus@netq-appliance:~$ sudo /usr/sbin/backuprestore.sh --restore --localdir /opt/<backup-directory>
          

          You can abbreviate the restore and localdir options of this command to -r and -l to reduce typing.

          This is a sample of what you see while the script is running:

          [Fri 26 Jul 2019 02:37:49 PM UTC] - Received Inputs for restore ...
          WARNING: Restore procedure wipes out the existing contents of Database.
             Once the Database is restored you loose the old data and cannot be recovered.
          "Do you like to continue with Database restore:[Y(yes)/N(no)]. (Default:N)"
          

          You must answer the above question to continue the restoration. After entering Y or yes, the output continues as follows:

          [Fri 26 Jul 2019 02:37:50 PM UTC] - Able to find cassandra pod: cassandra-0
          [Fri 26 Jul 2019 02:37:50 PM UTC] - Continuing with the procedure ...
          [Fri 26 Jul 2019 02:37:50 PM UTC] - Backup local directory:/tmp/backuprestore/ exists....
          [Fri 26 Jul 2019 02:37:50 PM UTC] - Removing any stale restore directories ...
          Copying the file for restore to cassandra pod ....
          [Fri 26 Jul 2019 02:37:50 PM UTC] - Able to copy the local directory contents to cassandra pod in /tmp/backuprestore/.
          [Fri 26 Jul 2019 02:37:50 PM UTC] - copying the script to cassandra pod in dir:/tmp/backuprestore/....
          Executing the Script for restoring the backup ...
          /tmp/backuprestore//createbackup.sh: line 1: cript: command not found
          [Fri 26 Jul 2019 02:40:12 PM UTC] - Able to exeute /tmp/backuprestore//createbackup.sh script on cassandra pod
          [Fri 26 Jul 2019 02:40:12 PM UTC] - Restore finished successfully!
          

          To restore NetQ on new hardware:

          1. Copy the backup file from /opt/<backup-directory> on the older hardware to the backup directory on the new hardware.

          2. Run the restore script on the new hardware. Replace backup-directory with the name of the directory where the backup file resides.

            cumulus@netq-appliance:~$ sudo /usr/sbin/backuprestore.sh --restore --localdir /opt/<backup-directory>
            

          Post-installation Configurations

          This section describes the various integrations you can configure after installing NetQ.

          LDAP Authentication

          As an administrator, you can integrate the NetQ role-based access control (RBAC) with your lightweight directory access protocol (LDAP) server in on-premises deployments. NetQ maintains control over role-based permissions for the NetQ application. There are two roles: admin and user. With the RBAC integration, LDAP handles account authentication through your directory service, such as Microsoft Active Directory, Kerberos, OpenLDAP, or Red Hat Directory Service. A copy of each account from LDAP is stored in the local NetQ database.

          Integrating with an LDAP server does not prevent you from configuring local accounts (stored and managed in the NetQ database) as well.

          Get Started

          LDAP integration requires information about how to connect to your LDAP server, the type of authentication you plan to use, bind credentials, and, optionally, search attributes.

          Provide Your LDAP Server Information

          To connect to your LDAP server, you need the URI and bind credentials. The URI identifies the location of the LDAP server. It comprises an FQDN (fully qualified domain name) or IP address and the port on which the LDAP server listens for client connections. For example: myldap.mycompany.com or 192.168.10.2. Typically you use port 389 for connections over TCP or UDP. In production environments, you deploy a secure connection with SSL; in this case, the port is typically 636. Setting the Enable SSL toggle automatically sets the server port to 636.
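          Before configuring NetQ, you can optionally confirm that the LDAP server is reachable from the NetQ appliance or VM. The following check uses the standard ldapsearch utility, which is not part of NetQ and might need to be installed separately; substitute your own server, port, and base DN:

          cumulus@netq-appliance:~$ ldapsearch -x -H ldap://myldap.mycompany.com:389 -b "dc=mycompany,dc=com" -s base
          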

          Specify Your Authentication Method

          There are two types of user authentication: anonymous and basic.

          If you are unfamiliar with the configuration of your LDAP server, contact your administrator to ensure you select the appropriate authentication method and credentials.

          Define User Attributes

          You need the following two attributes to define a user entry in a directory:

          Optionally, you can specify the first name, last name, and email address of the user.

          Set Search Attributes

          While optional, specifying search scope indicates where to start and how deep a given user can search within the directory. You specify the data to search for in the search query.

          Search scope options include:

          A typical search query for users could be {userIdAttribute}={userId}.

          Create an LDAP Configuration

          You can configure one LDAP server per bind DN (distinguished name). After you configure LDAP, you can verify the connectivity and save the configuration.

          To create an LDAP configuration:

          1. Expand the Menu. Under Admin, select Management.

          2. Locate the LDAP Server Info card, and click Configure LDAP.

          3. Fill out the LDAP server configuration form according to your particular configuration.

          4. Click Save to complete the configuration, or click Cancel to discard the configuration.

          You cannot change an LDAP configuration after you create it. If you need to change the configuration, you must delete the current LDAP configuration and create a new one. Note that if you change the LDAP server configuration, all users created against that LDAP server remain in the NetQ database and continue to be visible, but can no longer authenticate. You must manually delete those users if you do not want to see them.

          Example LDAP Configurations

          A variety of example configurations are provided here. Scenarios 1-3 are based on using an OpenLDAP or similar authentication service. Scenario 4 is based on using the Active Directory service for authentication.

          Scenario 1: Base Configuration

          In this scenario, we are configuring the LDAP server with anonymous authentication, a User ID based on an email address, and a search scope of base.

          Parameter             Value
          Host Server URL       ldap1.mycompany.com
          Host Server Port      389
          Authentication        Anonymous
          Base DN               dc=mycompany,dc=com
          User ID               email
          Search Scope          Base
          Search Query          {userIdAttribute}={userId}

          Scenario 2: Basic Authentication and Subset of Users

          In this scenario, we are configuring the LDAP server with basic authentication, for access only by the persons in the network operators group, and a limited search scope.

          Parameter             Value
          Host Server URL       ldap1.mycompany.com
          Host Server Port      389
          Authentication        Basic
          Admin Bind DN         uid=admin,ou=netops,dc=mycompany,dc=com
          Admin Bind Password   nqldap!
          Base DN               dc=mycompany,dc=com
          User ID               UID
          Search Scope          One Level
          Search Query          {userIdAttribute}={userId}

          Scenario 3: Scenario 2 with Widest Search Capability

          In this scenario, we are configuring the LDAP server with basic authentication, for access only by the persons in the network administrators group, and an unlimited search scope.

          Parameter             Value
          Host Server URL       192.168.10.2
          Host Server Port      389
          Authentication        Basic
          Admin Bind DN         uid=admin,ou=netadmin,dc=mycompany,dc=com
          Admin Bind Password   1dap*netq
          Base DN               dc=mycompany,dc=net
          User ID               UID
          Search Scope          Subtree
          Search Query          {userIdAttribute}={userId}

          Scenario 4: Scenario 3 with Active Directory Service

          In this scenario, we are configuring the LDAP server with basic authentication, for access only by the persons in the given Active Directory group, and an unlimited search scope.

          Parameter             Value
          Host Server URL       192.168.10.2
          Host Server Port      389
          Authentication        Basic
          Admin Bind DN         cn=netq,ou=45,dc=mycompany,dc=com
          Admin Bind Password   nq&4mAd!
          Base DN               dc=mycompany,dc=net
          User ID               sAMAccountName
          Search Scope          Subtree
          Search Query          {userIdAttribute}={userId}

          Add LDAP Users to NetQ

          1. Click Menu. Under Admin, select Management.

          2. Locate the User Accounts card, and click Manage.

          3. On the User Accounts tab, click Add User.

          4. Select LDAP User, then enter the user’s ID.

          5. Enter your administrator password, then select Search.

          6. If the user is found, the email address, first name, and last name fields are automatically populated. If searching is not enabled on the LDAP server, you must enter the information manually.

            If the fields are not automatically filled in, and searching is enabled on the LDAP server, you might require changes to the mapping file.

          7. Select the role for this user, admin or user.

          8. Enter your admin password, and click Save, or click Cancel to discard the user account.

            LDAP user passwords are not stored in the NetQ database and are always authenticated against LDAP.

          9. Repeat these steps to add additional LDAP users.

          Remove LDAP Users from NetQ

          You can remove LDAP users in the same manner as local users.

          1. Expand the Menu. Under Admin, select Management.

          2. Locate the User Accounts card, and click Manage.

          3. Select the user(s) you want to remove, then select Delete.

          Deleting a user from the LDAP server does not automatically delete that user from NetQ; however, the deleted user's login credentials stop working immediately. You must remove the user from NetQ manually.

          Integrate NetQ with Grafana

          Switches collect statistics about the performance of their interfaces. The NetQ Agent on each switch collects these statistics every 15 seconds and then sends them to your NetQ Appliance or Virtual Machine.

          NetQ collects statistics for physical interfaces; it does not collect statistics for virtual interfaces, such as bonds, bridges, and VXLANs.

          NetQ displays:

          You can use Grafana, an open source analytics and monitoring tool, to view these statistics. The fastest way to achieve this is by installing Grafana on an application server or locally per user, and then installing the NetQ plugin.

          If you do not have Grafana installed already, refer to grafana.com for instructions on installing and configuring the Grafana tool.

          Install NetQ Plugin for Grafana

          Use the Grafana CLI to install the NetQ plugin. For more detail about this command, refer to the Grafana CLI documentation.

          The Grafana plugin comes unsigned. Before you can install it, you must update the grafana.ini file and then restart the Grafana service:

          1. Edit the /etc/grafana/grafana.ini file and add allow_loading_unsigned_plugins = netq-dashboard under the [plugins] section:

            cumulus@netq-appliance:~$ sudo nano /etc/grafana/grafana.ini
            ...
            [plugins]
            allow_loading_unsigned_plugins = netq-dashboard
            ...
            
            
          2. If you are using Grafana v11.0 or later, add support for AngularJS to the same file under the [security] section:

            cumulus@netq-appliance:~$ sudo nano /etc/grafana/grafana.ini
            ...
            [security]
            angular_support_enabled = true
            ...
            
            
          3. Restart the Grafana service:

            cumulus@netq-appliance:~$ sudo systemctl restart grafana-server.service
            

          Then install the plugin:

          cumulus@netq-appliance:~$ grafana-cli --pluginUrl https://netq-grafana-dsrc.s3-us-west-2.amazonaws.com/NetQ-DSplugin-3.3.1-plus.zip plugins install netq-dashboard
          installing netq-dashboard @
          from: https://netq-grafana-dsrc.s3-us-west-2.amazonaws.com/NetQ-DSplugin-3.3.1-plus.zip
          into: /usr/local/var/lib/grafana/plugins
          
          ✔ Installed netq-dashboard successfully
          

          After installing the plugin, you must restart Grafana, following the steps specific to your implementation.
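          For example, on a server where Grafana runs under systemd as grafana-server.service (as in the configuration steps above), the restart would be:

          cumulus@netq-appliance:~$ sudo systemctl restart grafana-server.service
          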

          Set Up the NetQ Data Source

          Now that you have the plugin installed, you need to configure access to the NetQ data source.

          1. Open the Grafana user interface and log in. Navigate to the Home Dashboard:

            Grafana Home Dashboard
          2. Click Add data source or > Data Sources.

          3. Enter Net-Q in the search box. Alternatively, scroll down to the Other category and select it from there.

          4. Enter Net-Q into the Name field.

          5. Enter the URL used to access the database:

          6. Select procdevstats from the Module dropdown.

          7. Enter your credentials (the ones used to log in).

          8. For NetQ cloud deployments only, if you have more than one premises configured, you can select the premises you want to view, as follows:

            • If you leave the Premises field blank, the first premises name is selected by default

            • If you enter a premises name, that premises is selected for viewing

              Note: If multiple premises are configured with the same name, then the first premises of that name is selected for viewing

          9. Click Save & Test.

          Create Your NetQ Dashboard

          With the data source configured, you can create a dashboard with the transmit and receive statistics of interest to you.

          Create a Dashboard

          1. Click to open a blank dashboard.

          2. Click (Dashboard Settings) at the top of the dashboard.

          Add Variables

          1. Click Variables.

          2. Enter hostname into the Name field.

          3. Enter hostname into the Label field.

          4. Select Net-Q from the Data source list.

          5. Select On Dashboard Load from the Refresh list.

          6. Enter hostname into the Query field.

          7. Click Add.

            You should see a preview at the bottom of the hostname values.

          8. Click Variables to add another variable for the interface name.

          9. Enter ifname into the Name field.

          10. Enter ifname into the Label field.

          11. Select Net-Q from the Data source list.

          12. Select On Dashboard Load from the Refresh list.

          13. Enter ifname into the Query field.

          14. Click Add.

            You should see a preview at the bottom of the ifname values.

          15. Click Variables to add another variable for metrics.

          16. Enter metrics into the Name field.

          17. Enter metrics into the Label field.

          18. Select Net-Q from the Data source list.

          19. Select On Dashboard Load from the Refresh list.

          20. Enter metrics into the Query field.

          21. Click Add.

            You should see a preview at the bottom of the metrics values.

          Add Charts

          1. Now that the variables are defined, click to return to the new dashboard.

          2. Click Add Query.

          3. Select Net-Q from the Query source list.

          4. Select the interface statistic you want to view from the Metric list.

          5. Click the General icon.

          6. Select hostname from the Repeat list.

          7. Set any other parameters around how to display the data.

          8. Return to the dashboard.

          9. Select one or more hostnames from the hostname list.

          10. Select one or more interface names from the ifname list.

          11. Select one or more metrics to display for these hostnames and interfaces from the metrics list.

          The following example shows a dashboard with two hostnames, two interfaces, and one metric selected. The more values you select from the variable options, the more charts appear on your dashboard.

          Analyze the Data

          When you have configured the dashboard, you can start analyzing the data. You can explore the data by modifying the viewing parameters in one of several ways using the dashboard tool set:

          SSO Authentication

          You can integrate your NetQ Cloud deployment with a Microsoft Azure Active Directory (AD) or Google Cloud authentication server to support single sign-on (SSO) to NetQ. NetQ supports integration with SAML (Security Assertion Markup Language), OAuth (Open Authorization), and multi-factor authentication (MFA). Only one SSO configuration can be configured at a time.

          You can create local accounts with default access roles by enabling SSO. After enabling SSO, users logging in for the first time can sign up for SSO through the NetQ login screen or with a link provided by an admin.

          Add SSO Configuration and Accounts

          To integrate your authentication server:

          1. Expand the Main Menu Menu on the NetQ dashboard.

          2. Under Admin, select Management. Locate the SSO Configuration card and select Manage.

          3. Select either SAML or OpenID (which uses OAuth with OpenID Connect)

          4. Specify the parameters:

            You need several pieces of data from your Microsoft Azure or Google account and authentication server to complete the integration.

            sso configuration card with open id configuration

            SSO Organization is typically a company’s name or a department. The name entered in this field will appear in the SSO signup URL.

            Role (either user or admin) is automatically assigned when the account is initialized via SSO login.

            Name is a unique name for the SSO configuration.

            Client ID is the identifier for your resource server.

            Client Secret is the secret key for your resource server.

            Authorization Endpoint is the URL of the authorization application.

            Token Endpoint is the URL of the authorization token.

            After you enter the fields, select Add.

            As indicated, copy the redirect URI (https://api.netq.nvidia.com/netq/auth/v1/sso-callback) into your OpenID Connect configuration.

            Select Test to verify the configuration and ensure that you can log in. If it is not working, you are logged out. Check your specification and retest the configuration until it is working properly.

            Select Close. The card reflects the configuration:

            sso config card displaying an Open ID configuration with a disabled status

            To require users to log in using this SSO configuration, select Change under the “Disabled” status and confirm. The card updates to reflect that SSO is enabled.

            After an admin has configured and enabled SSO, users logging in for the first time can sign up for SSO.

            Admins can also provide users with an SSO signup URL: https://netq.nvidia.com/signup?organization=SSO_Organization

            The SSO organization you entered during the configuration will replace SSO_Organization in the URL.

            You need several pieces of data from your Microsoft Azure or Google account and authentication server to complete the integration.

            sso configuration card with SAML configuration

            SSO Organization is typically a company’s name or a department. The name entered in this field will appear in the SSO signup URL.

            Role (either user or admin) is automatically assigned when the account is initialized via SSO login.

            Name is a unique name for the SSO configuration.

            Login URL is the URL for the authorization server login page.

            Identity Provider Identifier is the name of the authorization server.

            Service Provider Identifier is the name of the application server.

            Email Claim Key is an optional field. When left blank, the email address is captured.

            After you enter the fields, select Add.

            As indicated, copy the redirect URI (https://api.netq.nvidia.com/netq/auth/v1/sso-callback) into your identity provider configuration.

            Select Test to verify the configuration and ensure that you can log in. If it is not working, you are logged out. Check your specification and retest the configuration until it is working properly.

            Select Close. The card reflects the configuration:

            sso config card displaying a SAML configuration with a disabled status

            To require users to log in using this SSO configuration, select Change under the “Disabled” status and confirm. The card updates to reflect that SSO is enabled.

            Select Submit to enable the configuration. The SSO card reflects the “enabled” status.

            After an admin has configured and enabled SSO, users logging in for the first time can sign up for SSO.

            Admins can also provide users with an SSO signup URL: https://netq.nvidia.com/signup?organization=SSO_Organization

            The SSO organization you entered during the configuration will replace SSO_Organization in the URL.

          Modify Configuration

          You can change the specifications for SSO integration with your authentication server at any time, including changing to an alternate SSO type, disabling the existing configuration, or reconfiguring SSO.

          Change SSO Type

          From the SSO Configuration card:

          1. Select Disable, then Yes.

          2. Select Manage then select the desired SSO type and complete the form.

          3. Copy the redirect URL on the success dialog into your identity provider configuration.

          4. Select Test to verify that the login is working. Modify your specification and retest the configuration until it is working properly.

          5. Select Update.

          Disable SSO Configuration

          From the SSO Configuration card:

          1. Select Disable.

          2. Select Yes to disable the configuration, or Cancel to keep it enabled.

          Uninstall NetQ

          This page outlines how to remove the NetQ software from your system server and switches.

          Remove the NetQ Agent and CLI

          Use the apt-get purge command to remove the NetQ Agent or CLI package from a Cumulus Linux switch or an Ubuntu host:

          cumulus@switch:~$ sudo apt-get update
          cumulus@switch:~$ sudo apt-get purge netq-agent netq-apps
          Reading package lists... Done
          Building dependency tree
          Reading state information... Done
          The following packages will be REMOVED:
            netq-agent* netq-apps*
          0 upgraded, 0 newly installed, 2 to remove and 0 not upgraded.
          After this operation, 310 MB disk space will be freed.
          Do you want to continue? [Y/n] Y
          Creating pre-apt snapshot... 2 done.
          (Reading database ... 42026 files and directories currently installed.)
          Removing netq-agent (3.0.0-cl3u27~1587646213.c5bc079) ...
          /usr/sbin/policy-rc.d returned 101, not running 'stop netq-agent.service'
          Purging configuration files for netq-agent (3.0.0-cl3u27~1587646213.c5bc079) ...
          dpkg: warning: while removing netq-agent, directory '/etc/netq/config.d' not empty so not removed
          Removing netq-apps (3.0.0-cl3u27~1587646213.c5bc079) ...
          /usr/sbin/policy-rc.d returned 101, not running 'stop netqd.service'
          Purging configuration files for netq-apps (3.0.0-cl3u27~1587646213.c5bc079) ...
          dpkg: warning: while removing netq-apps, directory '/etc/netq' not empty so not removed
          Processing triggers for man-db (2.7.0.2-5) ...
          grep: extra.services.enabled: No such file or directory
          Creating post-apt snapshot... 3 done.
          

          If you only want to remove the agent or the CLI, but not both, specify just the relevant package in the apt-get purge command.
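          For example, to remove only the CLI package and leave the agent installed:

          cumulus@switch:~$ sudo apt-get purge netq-apps
          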

          To verify the removal of the packages from the switch, run:

          cumulus@switch:~$ dpkg-query -l netq-agent
          dpkg-query: no packages found matching netq-agent
          cumulus@switch:~$ dpkg-query -l netq-apps
          dpkg-query: no packages found matching netq-apps
          

          Use the yum remove command to remove the NetQ agent or CLI package from a RHEL7 or CentOS host:

          root@rhel7:~# sudo yum remove netq-agent netq-apps
          Loaded plugins: fastestmirror
          Resolving Dependencies
          --> Running transaction check
          ---> Package netq-agent.x86_64 0:3.1.0-rh7u28~1594097110.8f00ba1 will be erased
          --> Processing Dependency: netq-agent >= 3.2.0 for package: cumulus-netq-3.1.0-rh7u28~1594097110.8f00ba1.x86_64
          --> Running transaction check
          ---> Package cumulus-netq.x86_64 0:3.1.0-rh7u28~1594097110.8f00ba1 will be erased
          --> Finished Dependency Resolution
          
          Dependencies Resolved
          
          ...
          
          Removed:
            netq-agent.x86_64 0:3.1.0-rh7u28~1594097110.8f00ba1
          
          Dependency Removed:
            cumulus-netq.x86_64 0:3.1.0-rh7u28~1594097110.8f00ba1
          
          Complete!
          
          

          If you only want to remove the agent or the CLI, but not both, specify just the relevant package in the yum remove command.
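          For example, to remove only the NetQ Agent package and leave the CLI installed:

          root@rhel7:~# sudo yum remove netq-agent
          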

          To verify the removal of the packages from the switch, run:

          root@rhel7:~# rpm -q netq-agent
          package netq-agent is not installed
          root@rhel7:~# rpm -q netq-apps
          package netq-apps is not installed
          

          Uninstall NetQ from the System Server

          First remove the data collected to free up used disk space. Then remove the software.

          1. Log on to the NetQ system server.

          2. Remove the data:

          netq bootstrap reset purge-db
          
          3. Remove the software with apt-get purge:

            cumulus@switch:~$ sudo apt-get update
            cumulus@switch:~$ sudo apt-get purge netq-agent netq-apps
            
          4. Verify the removal of the packages from the switch:

            cumulus@switch:~$ dpkg-query -l netq-agent
            dpkg-query: no packages found matching netq-agent
            cumulus@switch:~$ dpkg-query -l netq-apps
            dpkg-query: no packages found matching netq-apps
            
          5. Delete the virtual machine according to the usual VMware or KVM practice.

          VMware: Delete a virtual machine from the host computer using one of the following methods:

          • Right-click the name of the virtual machine in the Favorites list, then select Delete from Disk.
          • Select the virtual machine and choose VM > Delete from disk.

          KVM: Delete a virtual machine from the host computer using one of the following methods:

          • Run virsh undefine <vm-domain> --remove-all-storage
          • Run virsh undefine <vm-domain> --wipe-storage

          Configuration Management

          The topics in this section provide instructions for admins responsible for managing user accounts, physical and software inventory, events and notifications, and lifecycle management (LCM).

          User Management

          As an admin, you can manage users and authentication settings from the NetQ management dashboard.

          Lifecycle Management

          Lifecycle management is enabled for on-premises deployments by default and disabled for cloud deployments by default. Contact your local NVIDIA sales representative or submit a support ticket to activate LCM on cloud deployments. Only administrative users can perform the tasks described in this topic.

          Using the NetQ UI or CLI, lifecycle management (LCM) allows you to:

          Access Lifecycle Management in the UI

          To access LCM, open the Manage Switch Assets page in one of the following ways:

          The Manage Switch Assets view provides access to switch management, image management, NetQ Agent configurations, and job history.

          dashboard displaying switch management tab

          LCM Summary

          To manage the various lifecycle management features using the NetQ CLI, use the netq lcm command set. The following table summarizes LCM’s capabilities:

          Function: Switch Management
          Description: Discover switches, view switch inventory, assign roles, set user access credentials, and perform software installation and upgrade networkwide.
          NetQ UI cards:
          • Switches
          • Access
          NetQ CLI commands:
          • netq lcm show switches
          • netq lcm add role
          • netq lcm upgrade
          • netq lcm add/del/show credentials
          • netq lcm discover

          Function: Image Management
          Description: View, add, and remove images for software installation and upgrade.
          NetQ UI cards:
          • Cumulus Linux Images
          • NetQ Images
          NetQ CLI commands:
          • netq lcm add/del/show netq-image
          • netq lcm add/del/show cl-images
          • netq lcm add/show default-version

          Function: Job History
          Description: View the results of installation, upgrade, and configuration assignment jobs.
          NetQ UI cards:
          • CL Upgrade History
          • NetQ Install and Upgrade History
          • Config Assignment History
          NetQ CLI commands:
          • netq lcm show status
          • netq lcm show upgrade-jobs

          NetQ and Network OS Images

          NetQ and network OS (Cumulus Linux and SONiC) images are managed with LCM. This section details how to check for missing images, upgrade images, and specify default images.

          View and Upload Missing Images

          You should upload images for each network OS and NetQ version currently installed in your inventory so you can support rolling back to a known good version should an installation or upgrade fail. If you have specified a default network OS and/or NetQ version, the NetQ UI also verifies that the necessary versions of the default image are available based on the known switch inventory, and if not, lists those that are missing.

          To upload missing network OS images:

          1. Expand the Menu. Under Admin, select Manage Switches. Select the Image Management tab.

          2. On the Cumulus Linux Images card, select View # missing CL images to see which images you need.

          cumulus linux images card with link to view missing images

          If you have already specified a default image, you must click Manage and then Missing to see the missing images.

          3. Select one or more of the missing images and make note of the version, ASIC vendor, and CPU architecture for each.

          4. Download the network OS disk images (.bin files) from the NVIDIA Enterprise Support Portal. Log in to the portal and from the Downloads tab, select Switches and Gateways. Under Switch Software, click All downloads next to Cumulus Linux for Mellanox Switches. Select the current version and the target version, then click Show Downloads Path. Download the file.

          5. Back in the UI, select (Add Image) above the table.

          dialog prompting the user to import the CL image

          6. Provide the .bin file from an external drive that matches the criteria for the selected image(s).

          7. Click Import.

            If the upload was not successful, an Image Import Failed message appears. Close the dialog and try uploading the file again.

          8. Click Done.

          9. (Optional) Click Uploaded to verify the image is in the repository.

          UI screen verifying that the image is in the repository

          10. Click close to return to the LCM dashboard.

            The Cumulus Linux Images card now reflects the number of images you uploaded.

          1. (Optional) Display a summary of Cumulus Linux images uploaded to the LCM repo on the NetQ appliance or VM:
          netq lcm show cl-images
          
          1. Download the network OS disk images (.bin files) from the NVIDIA Enterprise Support Portal. Log into the portal and from the Downloads tab, select Switches and Gateways. Under Switch Software, click All downloads next to Cumulus Linux for Mellanox Switches. Select the current version and the target version, then click Show Downloads Path. Download the file.

          2. Upload the images to the LCM repository. The following example uses a Cumulus Linux 4.2.0 disk image.

            cumulus@switch:~$ netq lcm add cl-image /path/to/download/cumulus-linux-4.2.0-mlnx-amd64.bin
            
          3. Repeat step 2 for each image you need to upload to the LCM repository.

          To upload missing NetQ images:

          1. Expand the Menu. Under Admin, select Manage Switches. Select the Image Management tab.

          2. On the NetQ Images card, select View # missing NetQ images to see which images you need.

          netq images card with link to view missing images

          If you have already specified a default image, you must click Manage and then Missing to see the missing images.

          3. Select one or all of the missing images and make note of the OS version, CPU architecture, and image type. Remember that you need both netq-apps and netq-agent for NetQ to perform the installation or upgrade.

          4. Download the NetQ Debian packages needed for upgrade from the NetQ repository, selecting the appropriate OS version and architecture. Place the files in an accessible part of your local network.

          5. Back in the UI, click (Add Image) above the table.

          dialog prompting the user to import the NetQ images

          6. Provide the .deb file(s) from an external drive that matches the criteria for the selected image.

          7. Click Import.

            If the upload was not successful, an Image Import Failed message appears. Close the Import Image dialog and try uploading the file again.

          8. Click Done.

          9. (Optional) Click Uploaded to verify the images are in the repository.

          10. Click to return to the LCM dashboard.

          The NetQ Images card now shows the number of images you uploaded.

          1. (Optional) Display a summary of NetQ images uploaded to the LCM repo on the NetQ appliance or VM:
          netq lcm show netq-images
          
          1. Download the NetQ Debian packages needed for upgrade from the NetQ repository, selecting the appropriate version and hypervisor/platform. Place them in an accessible part of your local network.

          2. Upload the images to the LCM repository. This example uploads the two packages (netq-agent and netq-apps) needed for NetQ version 4.0.0 for a NetQ appliance or VM running Ubuntu 18.04 with an x86 architecture.

            cumulus@switch:~$ netq lcm add netq-image /path/to/download/netq-agent_4.0.0-ub18.04u33~1614767175.886b337_amd64.deb
            cumulus@switch:~$ netq lcm add netq-image /path/to/download/netq-apps_4.0.0-ub18.04u33~1614767175.886b337_amd64.deb
            

          Upload Upgrade Images

          To upload the network OS or NetQ images that you want to use for upgrade, first download the Cumulus Linux or SONiC disk images (.bin files) and NetQ Debian packages needed for upgrade from the NVIDIA Enterprise Support Portal and NetQ repository, respectively. Place them in an accessible part of your local network.

          If you are upgrading the network OS on switches with different ASIC vendors or CPU architectures, you need more than one image. For NetQ, you need both the netq-apps and netq-agent packages for each variant.

          After obtaining the images, upload them to NetQ with the UI or CLI:

          1. Click Image Management.

          2. Click Add Image on the Cumulus Linux Images or NetQ Images card.

          3. Provide one or more images from an external drive.

          4. Click Import.

          5. Monitor the progress until it completes. Click Done.

          6. Click to return to the LCM dashboard.

          Use the netq lcm add cl-image <text-image-path> and netq lcm add netq-image <text-image-path> commands to upload the images. Run the relevant command for each image that needs to be uploaded.

          Network OS images:

          cumulus@switch:~$ netq lcm add cl-image /path/to/download/cumulus-linux-4.2.0-mlx-amd64.bin
          

          NetQ images:

          cumulus@switch:~$ netq lcm add netq-image /path/to/download/netq-agent_4.0.0-ub18.04u33~1614767175.886b337_amd64.deb
          cumulus@switch:~$ netq lcm add netq-image /path/to/download/netq-apps_4.0.0-ub18.04u33~1614767175.886b337_amd64.deb
          

          Specify a Default Upgrade Version

          Specifying a default upgrade version is optional, but recommended. You can assign a specific OS or NetQ version as the default version to use when installing or upgrading switches. The default is typically the newest version that you intend to install or upgrade on all, or the majority, of your switches. If necessary, you can override the default selection during the installation or upgrade process if an alternate version is needed for a given set of switches.

          To specify a default version in the NetQ UI:

          1. Click Image Management.

          2. Select the link in the relevant card.

            card highlighting link to set default CL version card highlighting link to set default NetQ version

          3. Select the version you want to use as the default for switch upgrades.

          4. Click Save. The default version is now displayed on the relevant Images card.

          To specify a default network OS version, run:

          cumulus@switch:~$ netq lcm add default-version cl-images <text-cumulus-linux-version>
          

          To specify a default NetQ version, run:

          cumulus@switch:~$ netq lcm add default-version netq-images <text-netq-version>
          

          In the CLI, you can check which network OS or NetQ version is the default with the netq lcm show default-version command.
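          For example, assuming the show form of the command mirrors the add form used above:

          cumulus@switch:~$ netq lcm show default-version cl-images
          cumulus@switch:~$ netq lcm show default-version netq-images
          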

          Remove Images from Local Repository

          After you upgrade all your switches beyond a particular release, you can remove images from the LCM repository to save space on the server. To remove images:

          1. Expand the Menu. Under Admin, select Manage Switches. Select the Image Management tab.

          2. Click Manage on the Cumulus Linux Images or NetQ Images card.

          3. On the Uploaded tab, select the images you want to remove.

          4. Click .

          To remove Cumulus Linux images, run:

          netq lcm show cl-images [json]
          netq lcm del cl-image <text-image-id>
          
          1. Determine the ID of the image you want to remove.

            cumulus@switch:~$ netq lcm show cl-images json
            [
                {
                    "id": "image_cc97be3955042ca41857c4d0fe95296bcea3e372b437a535a4ad23ca300d52c3",
                    "name": "cumulus-linux-4.2.0-vx-amd64-1594775435.dirtyzc24426ca.bin",
                    "clVersion": "4.2.0",
                    "cpu": "x86_64",
                    "asic": "VX",
                    "lastChanged": 1600726385400.0
                },
                {
                    "id": "image_c6e812f0081fb03b9b8625a3c0af14eb82c35d79997db4627c54c76c973ce1ce",
                    "name": "cumulus-linux-4.1.0-vx-amd64.bin",
                    "clVersion": "4.1.0",
                    "cpu": "x86_64",
                    "asic": "VX",
                    "lastChanged": 1600717860685.0
                }
            ]
            
          2. Remove the image you no longer need.

            cumulus@switch:~$ netq lcm del cl-image image_c6e812f0081fb03b9b8625a3c0af14eb82c35d79997db4627c54c76c973ce1ce
            
          3. Verify the command removed the image.

            cumulus@switch:~$ netq lcm show cl-images json
            [
                {
                    "id": "image_cc97be3955042ca41857c4d0fe95296bcea3e372b437a535a4ad23ca300d52c3",
                    "name": "cumulus-linux-4.2.0-vx-amd64-1594775435.dirtyzc24426ca.bin",
                    "clVersion": "4.2.0",
                    "cpu": "x86_64",
                    "asic": "VX",
                    "lastChanged": 1600726385400.0
                }
            ]
            

          To remove NetQ images, run:

          netq lcm show netq-images [json]
          netq lcm del netq-image <text-image-id>
          
          1. Determine the ID of the image you want to remove.

            cumulus@switch:~$ netq lcm show netq-images json
            [
                {
                    "id": "image_d23a9e006641c675ed9e152948a9d1589404e8b83958d53eb0ce7698512e7001",
                    "name": "netq-agent_4.0.0-cl4u32_1609391187.7df4e1d2_amd64.deb",
                    "netqVersion": "4.0.0",
                    "clVersion": "cl4u32",
                    "cpu": "x86_64",
                    "imageType": "NETQ_AGENT",
                    "lastChanged": 1609885430638.0
                }, 
                {
                    "id": "image_68db386683c796d86422f2172c103494fef7a820d003de71647315c5d774f834",
                    "name": "netq-apps_4.0.0-cl4u32_1609391187.7df4e1d2_amd64.deb",
                    "netqVersion": "4.0.0",
                    "clVersion": "cl4u32",
                    "cpu": "x86_64",
                    "imageType": "NETQ_CLI",
                    "lastChanged": 1609885434704.0
                }
            ]
            
          2. Remove the image you no longer need.

            cumulus@switch:~$ netq lcm del netq-image image_68db386683c796d86422f2172c103494fef7a820d003de71647315c5d774f834
            
          3. Verify the command removed the image.

            cumulus@switch:~$ netq lcm show netq-images json
            [
                {
                    "id": "image_d23a9e006641c675ed9e152948a9d1589404e8b83958d53eb0ce7698512e7001",
                    "name": "netq-agent_4.0.0-cl4u32_1609391187.7df4e1d2_amd64.deb",
                    "netqVersion": "4.0.0",
                    "clVersion": "cl4u32",
                    "cpu": "x86_64",
                    "imageType": "NETQ_AGENT",
                    "lastChanged": 1609885430638.0
                }
            ]
            

          Switch Credentials

          You must have switch access credentials to install and upgrade software on a switch. You can choose between basic authentication (a username and password used over SSH) and SSH public/private key authentication. These credentials apply to all switches. If some of your switches have alternate access credentials, you must change them or modify the credential information before attempting installations or upgrades via lifecycle management.

          Specify Switch Credentials

          Switch access credentials are not specified by default and must be added.

          To specify access credentials:

          1. Expand the Menu. Under Admin, select Manage Switches.

          2. Click the Click here to add Switch access link on the Access card:

          access card with highlighted link
          1. Select the authentication method you want to use: SSH or Basic Authentication.

          Be sure to use credentials for an account that has permission to configure switches.

          The default credentials for Cumulus Linux have changed from cumulus/CumulusLinux! to cumulus/cumulus for releases 4.2 and later. For details, read Cumulus Linux User Accounts.

          1. Enter a username and password.

          2. Click Save.

            The Access card now indicates your credential configuration:

          access card displaying basic credential configuration

          You must have sudoer permission to properly configure switches when using the SSH key method.

          1. Create a pair of SSH private and public keys:

            ssh-keygen -t rsa -C "<USER>"
            
          2. Copy the SSH public key to each switch that you want to upgrade using one of the following methods:

            • Manually copy the SSH public key to the /home/<USER>/.ssh/authorized_keys file on each switch, or
            • Run ssh-copy-id USER@<switch_ip> on the server where you generated the SSH key pair for each switch
          3. Copy the SSH private key into the entry field in the Create Switch Access card:

          card displaying private key pasted into field

          For security, your private key is stored in an encrypted format, and only provided to internal processes while encrypted.

          The Access card now indicates your credential configuration:

          access card displaying SSH credential configuration

          To configure basic authentication, run:

          cumulus@switch:~$ netq lcm add credentials username cumulus password cumulus
          

          The default credentials for Cumulus Linux have changed from cumulus/CumulusLinux! to cumulus/cumulus for releases 4.2 and later. For details, read Cumulus Linux User Accounts.

          To configure SSH authentication using a public/private key:

          You must have sudoer permission to properly configure switches when using the SSH Key method.

          1. If the keys do not yet exist, create a pair of SSH private and public keys.

            ssh-keygen -t rsa -C "<USER>"
            
          2. Copy the SSH public key to each switch that you want to upgrade using one of the following methods:

            • Manually copy the SSH public key to the /home/<USER>/.ssh/authorized_keys file on each switch, or
            • Run ssh-copy-id USER@<switch_ip> on the server where you generated the SSH key pair for each switch
          3. Add these credentials to the switch.

            cumulus@switch:~$ netq lcm add credentials ssh-key PUBLIC_SSH_KEY
            

          View Switch Credentials

          You can view the type of credentials used to access your switches in the NetQ UI. You can view the details of the credentials using the NetQ CLI.

          1. Open the LCM dashboard.

          2. On the Access card, select either Basic or SSH.

          To see the credentials, run netq lcm show credentials.

          If you use an SSH key for the credentials, the public key appears in the command output:

          cumulus@switch:~$ netq lcm show credentials
          Type             SSH Key        Username         Password         Last Changed
          ---------------- -------------- ---------------- ---------------- -------------------------
          SSH              MY-SSH-KEY                                       Tue Apr 28 19:08:52 2020
          

          If you use a username and password for the credentials, the username appears in the command output with the password masked:

          cumulus@switch:~$ netq lcm show credentials
          Type             SSH Key        Username         Password         Last Changed
          ---------------- -------------- ---------------- ---------------- -------------------------
          BASIC                           cumulus          **************   Tue Apr 28 19:10:27 2020
          

          Modify Switch Credentials

          To change your access credentials:

          1. Open the LCM dashboard.

          2. On the Access card, click the Click here to change access mode link in the center of the card.

          3. Select the authentication method you want to use: SSH or Basic Authentication.

          4. Based on your selection:

            • Basic: Enter a new username and/or password
            • SSH: Copy and paste a new SSH private key
          5. Click Save.

          To change the basic authentication credentials, run the add credentials command with the new username and/or password. This example changes the password for the cumulus account created above:

          cumulus@switch:~$ netq lcm add credentials username cumulus password Admin#123
          

          To configure SSH authentication using a public/private key:

          You must have sudoer permission to properly configure switches when using the SSH Key method.

          1. If the new keys do not yet exist, create a pair of SSH private and public keys:

            ssh-keygen -t rsa -C "<USER>"
            
          2. Copy the SSH public key to each switch that you want to upgrade using one of the following methods:

            • Manually copy the SSH public key to the /home/<USER>/.ssh/authorized_keys file on each switch, or
            • Run ssh-copy-id USER@<switch_ip> on the server where you generated the SSH key pair for each switch
          3. Add these new credentials to the switch:

            cumulus@switch:~$ netq lcm add credentials ssh-key PUBLIC_SSH_KEY
            

          Remove Switch Credentials

          You can remove the access credentials for switches using the NetQ CLI. Note that without valid credentials, you cannot upgrade your switches.

          To remove the credentials, run netq lcm del credentials. Verify their removal by running netq lcm show credentials.
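          For example:

          cumulus@switch:~$ netq lcm del credentials
          cumulus@switch:~$ netq lcm show credentials
          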

          Switch Inventory and Roles

          Upon installation, lifecycle management displays an inventory of switches that are available for software installation or upgrade through NetQ. This includes all switches running Cumulus Linux 3.7.12 or later, SONiC 202012 and 202106, and NetQ Agent 4.1.0 or later in your network. You can assign network roles to switches and select switches for software installation and upgrades from this inventory listing.

          View the LCM Switch Inventory

          The Switches card displays the number of switches that NetQ discovered and the network OS versions that are running on those switches:

          switches card displaying 12 discovered switches with Cumulus Linux version 4.1.0

          To view a list of all discovered switches, select Manage on the Switches card.

          Review the list:

          • Sort the list by any column; hover over a column title and click it to toggle between ascending and descending order
          • Filter the list: click Filter Switch List and enter the parameter value of interest

          If you have more than one network OS version running on your switches, you can click a version segment on the Switches card graph to open a list of switches pre-filtered by that version.

          To view a list of all switches discovered by lifecycle management, run:

          netq lcm show switches [version <text-cumulus-linux-version>] [json]
          

          Use the version option to only show switches with a given network OS version, X.Y.Z.
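          For example, to list only the switches running Cumulus Linux 4.1.0 (the version shown in the sample inventory below):

          cumulus@switch:~$ netq lcm show switches version 4.1.0
          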

          The following example shows all switches discovered by lifecycle management:

          cumulus@switch:~$ netq lcm show switches
          Hostname          Role       IP Address                MAC Address        CPU      CL Version           NetQ Version             Last Changed
          ----------------- ---------- ------------------------- ------------------ -------- -------------------- ------------------------ -------------------------
          leaf01            leaf       192.168.200.11            44:38:39:00:01:7A  x86_64   4.1.0                3.2.0-cl4u30~1601410518. Wed Sep 30 21:55:37 2020
                                                                                                                  104fb9ed
          spine04           spine      192.168.200.24            44:38:39:00:01:6C  x86_64   4.1.0                3.2.0-cl4u30~1601410518. Tue Sep 29 21:25:16 2020
                                                                                                                  104fb9ed
          leaf03            leaf       192.168.200.13            44:38:39:00:01:84  x86_64   4.1.0                3.2.0-cl4u30~1601410518. Wed Sep 30 21:55:56 2020
                                                                                                                  104fb9ed
          leaf04            leaf       192.168.200.14            44:38:39:00:01:8A  x86_64   4.1.0                3.2.0-cl4u30~1601410518. Wed Sep 30 21:55:07 2020
                                                                                                                  104fb9ed
          border02                     192.168.200.64            44:38:39:00:01:7C  x86_64   4.1.0                3.2.0-cl4u30~1601410518. Wed Sep 30 21:56:49 2020
                                                                                                                  104fb9ed
          border01                     192.168.200.63            44:38:39:00:01:74  x86_64   4.1.0                3.2.0-cl4u30~1601410518. Wed Sep 30 21:56:37 2020
                                                                                                                  104fb9ed
          fw2                          192.168.200.62            44:38:39:00:01:8E  x86_64   4.1.0                3.2.0-cl4u30~1601410518. Tue Sep 29 21:24:58 2020
                                                                                                                  104fb9ed
          spine01           spine      192.168.200.21            44:38:39:00:01:82  x86_64   4.1.0                3.2.0-cl4u30~1601410518. Tue Sep 29 21:25:07 2020
                                                                                                                  104fb9ed
          spine02           spine      192.168.200.22            44:38:39:00:01:92  x86_64   4.1.0                3.2.0-cl4u30~1601410518. Tue Sep 29 21:25:08 2020
                                                                                                                  104fb9ed
          spine03           spine      192.168.200.23            44:38:39:00:01:70  x86_64   4.1.0                3.2.0-cl4u30~1601410518. Tue Sep 29 21:25:16 2020
                                                                                                                  104fb9ed
          fw1                          192.168.200.61            44:38:39:00:01:8C  x86_64   4.1.0                3.2.0-cl4u30~1601410518. Tue Sep 29 21:24:58 2020
                                                                                                                  104fb9ed
          leaf02            leaf       192.168.200.12            44:38:39:00:01:78  x86_64   4.1.0                3.2.0-cl4u30~1601410518. Wed Sep 30 21:55:53 2020
                                                                                                                  104fb9ed
          

          This list is the starting point for network OS upgrades or NetQ installations and upgrades. If the switches you want to upgrade are not present in the list, you can:

          Role Management

          You can assign switches one of four roles: superspine, spine, leaf, and exit.

          Switch roles identify switch dependencies and determine the order in which switches are upgraded. The upgrade process begins with switches assigned the superspine role, then continues with the spine switches, leaf switches, exit switches, and finally, switches with no role assigned. Upgrades for all switches with a given role must be successful before the upgrade process for switches with the closest dependent role can begin.

          Role assignment is optional, but recommended. Using roles can prevent switches from becoming unreachable due to dependencies between switches or single attachments. Additionally, when you deploy MLAG pairs, assigned roles avoid upgrade conflicts.

          Assign Roles to Switches

          1. Expand the Menu. Under Admin, select Manage Switches.

          2. On the Switches card, click Manage.

          3. Select one switch or multiple switches to assign to the same role.

          4. Above the table, select Assign Role.

          5. Select the role that applies to the selected switch(es):

          dialog showing role options including superspine, leaf, spine, and exit
          6. Click Assign.

            Note that the Role column is updated with the role assigned to the selected switch(es). To return to the full list of switches, click All.

          table displaying role column with updated switch role assignments
          7. Continue selecting switches and assigning roles until most or all switches have roles assigned.

          To add a role to one or more switches, run:

          netq lcm add role (superspine | spine | leaf | exit) switches <text-switch-hostnames>
          

          For a single switch, run:

          netq lcm add role leaf switches leaf01
          

          To assign multiple switches to the same role, separate the hostnames with commas (no spaces). This example configures leaf01 through leaf04 switches with the leaf role:

          netq lcm add role leaf switches leaf01,leaf02,leaf03,leaf04
          

          View Switch Roles

          1. Expand the Menu. Under Admin, select Manage Switches.

          2. On the Switches card, click Manage. The assigned role appears in the table’s Role column.

          To view all switch roles, run:

          netq lcm show switches [version <text-cumulus-linux-version>] [json]
          

          Use the version option to show only switches running a given network OS version (X.Y.Z).
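
          For example, to list only switches running Cumulus Linux 4.1.0 (the version shown is illustrative):

          cumulus@switch:~$ netq lcm show switches version 4.1.0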

          This example shows the role of all switches in the Role column of the listing.

          cumulus@switch:~$ netq lcm show switches
          Hostname          Role       IP Address                MAC Address        CPU      CL Version           NetQ Version             Last Changed
          ----------------- ---------- ------------------------- ------------------ -------- -------------------- ------------------------ -------------------------
          leaf01            leaf       192.168.200.11            44:38:39:00:01:7A  x86_64   4.1.0                3.2.0-cl4u30~1601410518. Wed Sep 30 21:55:37 2020
                                                                                                                  104fb9ed
          spine04           spine      192.168.200.24            44:38:39:00:01:6C  x86_64   4.1.0                3.2.0-cl4u30~1601410518. Tue Sep 29 21:25:16 2020
                                                                                                                  104fb9ed
          leaf03            leaf       192.168.200.13            44:38:39:00:01:84  x86_64   4.1.0                3.2.0-cl4u30~1601410518. Wed Sep 30 21:55:56 2020
                                                                                                                  104fb9ed
          leaf04            leaf       192.168.200.14            44:38:39:00:01:8A  x86_64   4.1.0                3.2.0-cl4u30~1601410518. Wed Sep 30 21:55:07 2020
                                                                                                                  104fb9ed
          border02                     192.168.200.64            44:38:39:00:01:7C  x86_64   4.1.0                3.2.0-cl4u30~1601410518. Wed Sep 30 21:56:49 2020
                                                                                                                  104fb9ed
          border01                     192.168.200.63            44:38:39:00:01:74  x86_64   4.1.0                3.2.0-cl4u30~1601410518. Wed Sep 30 21:56:37 2020
                                                                                                                  104fb9ed
          fw2                          192.168.200.62            44:38:39:00:01:8E  x86_64   4.1.0                3.2.0-cl4u30~1601410518. Tue Sep 29 21:24:58 2020
                                                                                                                  104fb9ed
          spine01           spine      192.168.200.21            44:38:39:00:01:82  x86_64   4.1.0                3.2.0-cl4u30~1601410518. Tue Sep 29 21:25:07 2020
                                                                                                                  104fb9ed
          spine02           spine      192.168.200.22            44:38:39:00:01:92  x86_64   4.1.0                3.2.0-cl4u30~1601410518. Tue Sep 29 21:25:08 2020
                                                                                                                  104fb9ed
          spine03           spine      192.168.200.23            44:38:39:00:01:70  x86_64   4.1.0                3.2.0-cl4u30~1601410518. Tue Sep 29 21:25:16 2020
                                                                                                                  104fb9ed
          fw1                          192.168.200.61            44:38:39:00:01:8C  x86_64   4.1.0                3.2.0-cl4u30~1601410518. Tue Sep 29 21:24:58 2020
                                                                                                                  104fb9ed
          leaf02            leaf       192.168.200.12            44:38:39:00:01:78  x86_64   4.1.0                3.2.0-cl4u30~1601410518. Wed Sep 30 21:55:53 2020
                                                                                                                  104fb9ed
          

          Reassign Roles to Switches

          1. Open the LCM dashboard.

          2. On the Switches card, click Manage.

          3. Select the switches with the incorrect role from the list.

          4. Click Assign Role.

          5. Select the correct role. (Note that you can select No Role here as well to remove the role from the switches.)

          6. Click Assign.

          You use the same command to assign a role as you use to change the role.

          For a single switch, run:

          netq lcm add role exit switches border01
          

          To assign multiple switches to the same role, separate the hostnames with commas (no spaces). For example:

          cumulus@switch:~$ netq lcm add role exit switches border01,border02
          

          Export a List of Switches

          1. Open the LCM dashboard.

          2. On the Switches card, click Manage.

          3. Select one or more switches.

          4. Click .

          5. Choose the export file type and click Export:

          dialog prompting user to export data as a CSV or in JSON format

          Use the json option with the netq lcm show switches command to output a list of all switches in the LCM repository. Alternately, output only switches running a particular network OS version by including the version option.

          cumulus@switch:~$ netq lcm show switches json
          
          cumulus@switch:~$ netq lcm show switches version 3.7.11 json
          

          Upgrade NetQ Agent Using LCM

          Lifecycle management lets you upgrade to the latest agent version on switches with an existing NetQ Agent. You can upgrade only the NetQ Agent or both the NetQ Agent and NetQ CLI simultaneously. You can run up to five jobs at the same time; however, a given switch can only appear in one running job at a time.

          Upgrades can be performed with LCM for NetQ Agents versions 2.4.0 and later. For earlier versions, perform a new installation.

          Prepare for a NetQ Agent Upgrade

          Before you upgrade, make sure you have the appropriate files and credentials:

          1. Click (Upgrade) in the workbench header.

          2. Upload the upgrade images.

          3. (Optional) Specify a default upgrade version.

          4. Verify or add switch access credentials.

          1. Verify or add switch access credentials.

          2. Configure switch roles to determine the order in which the switches get upgraded.

          3. Upload the Cumulus Linux upgrade images.

          Perform a NetQ Agent Upgrade

          After you complete the preparation steps, upgrade the NetQ Agents:

          1. In the Switch Management tab, locate the Switches card and click Manage.

          2. Select the switches you want to upgrade. You can filter by role or sort by column heading to narrow down the list.

          3. Click (Upgrade NetQ) above the table and follow the steps in the UI.

          4. Verify that the number of switches selected for upgrade matches your expectation.

          5. Enter a name for the upgrade job. The name can contain a maximum of 22 characters (including spaces).

          6. Review each switch:

            • Is the NetQ Agent version 2.4.0 or later? If not, this switch can only be upgraded through the switch discovery process.
            • Is the configuration profile the one you want to apply? If not, click Change config, then select an alternate profile to apply to all selected switches.

          You can apply different profiles to switches in a single upgrade job by selecting a subset of switches then choosing a different profile. You can also change the profile on a per-switch basis by clicking the current profile link and selecting an alternate one.

          dialog displaying two profiles that can be applied to both multiple and individual switches

          7. Review the summary indicating the number of switches and the configuration profile to be used. If either is incorrect, click Back and review your selections.

          8. Select the version of NetQ Agent for upgrade. If you have designated a default version, keep the Default selection. Otherwise, select an alternate version by clicking Custom and selecting it from the list.

          By default, the NetQ Agent and CLI are upgraded on the selected switches. If you do not want to upgrade the NetQ CLI, click Advanced and change the selection to No.

          9. NetQ performs several checks to eliminate preventable problems during the upgrade process. When all of the pre-checks pass, click Upgrade to initiate the upgrade.

          To upgrade the NetQ Agent on one or more switches, run:

          netq lcm upgrade netq-image job-name <text-job-name> [netq-version <text-netq-version>] [upgrade-cli True | upgrade-cli False] hostnames <text-switch-hostnames> [config_profile <text-config-profile>]
          

          The following example creates a NetQ Agent upgrade job called upgrade-cl430-nq330 that upgrades the spine01 and spine02 switches to NetQ Agent version 4.1.0.

          cumulus@switch:~$ netq lcm upgrade netq-image job-name upgrade-cl430-nq330 netq-version 4.1.0 hostnames spine01,spine02
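
          To upgrade only the NetQ Agent and keep the current NetQ CLI, set the upgrade-cli option from the syntax above to False; a minimal sketch (job name, version, and hostnames are illustrative):

          cumulus@switch:~$ netq lcm upgrade netq-image job-name agent-only-upgrade netq-version 4.1.0 upgrade-cli False hostnames leaf01,leaf02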
          

          Analyze the NetQ Agent Upgrade Results

          After starting the upgrade, you can monitor the progress in the NetQ UI. Successful upgrades are indicated in green; failed upgrades display error messages indicating the cause of failure.

          To view the progress of upgrade jobs using the CLI, run:

          netq lcm show upgrade-jobs netq-image [json]
          netq lcm show status <text-lcm-job-id> [json]
          
          
          Example show upgrade-jobs command

          You can view the progress of one upgrade job at a time. This requires the job identifier.

          The following example shows all upgrade jobs that are currently running or have completed, and then shows the status of the job with a job identifier of job_netq_install_7152a03a8c63c906631c3fb340d8f51e70c3ab508d69f3fdf5032eebad118cc7.

          cumulus@switch:~$ netq lcm show upgrade-jobs netq-image json
          [
              {
                  "jobId": "job_netq_install_7152a03a8c63c906631c3fb340d8f51e70c3ab508d69f3fdf5032eebad118cc7",
                  "name": "Leaf01-02 to NetQ330",
                  "netqVersion": "4.1.0",
                  "overallStatus": "FAILED",
                  "pre-checkStatus": "COMPLETED",
                  "warnings": [],
                  "errors": [],
                  "startTime": 1611863290557.0
              }
          ]
          
          cumulus@switch:~$ netq lcm show status netq-image job_netq_install_7152a03a8c63c906631c3fb340d8f51e70c3ab508d69f3fdf5032eebad118cc7
          NetQ Upgrade FAILED
          
          Upgrade Summary
          ---------------
          Start Time: 2021-01-28 19:48:10.557000
          End Time: 2021-01-28 19:48:17.972000
          Upgrade CLI: True
          NetQ Version: 4.1.0
          Pre Check Status COMPLETED
          Precheck Task switch_precheck COMPLETED
          	Warnings: []
          	Errors: []
          Precheck Task version_precheck COMPLETED
          	Warnings: []
          	Errors: []
          Precheck Task config_precheck COMPLETED
          	Warnings: []
          	Errors: []
          
          
          Hostname          CL Version  NetQ Version  Prev NetQ Ver Config Profile               Status           Warnings         Errors       Start Time
                                                      sion
          ----------------- ----------- ------------- ------------- ---------------------------- ---------------- ---------------- ------------ --------------------------
          leaf01            4.2.1       4.1.0         3.2.1         ['NetQ default config']      FAILED           []               ["Unreachabl Thu Jan 28 19:48:10 2021
                                                                                                                                   e at Invalid
                                                                                                                                   /incorrect u
                                                                                                                                   sername/pass
                                                                                                                                   word. Skippi
                                                                                                                                   ng remaining
                                                                                                                                   10 retries t
                                                                                                                                   o prevent ac
                                                                                                                                   count lockou
                                                                                                                                   t: Warning:
                                                                                                                                   Permanently
                                                                                                                                   added '192.1
                                                                                                                                   68.200.11' (
                                                                                                                                   ECDSA) to th
                                                                                                                                   e list of kn
                                                                                                                                   own hosts.\r
                                                                                                                                   \nPermission
                                                                                                                                   denied,
                                                                                                                                   please try a
                                                                                                                                   gain."]
          leaf02            4.2.1       4.1.0         3.2.1         ['NetQ default config']      FAILED           []               ["Unreachabl Thu Jan 28 19:48:10 2021
                                                                                                                                   e at Invalid
                                                                                                                                   /incorrect u
                                                                                                                                   sername/pass
                                                                                                                                   word. Skippi
                                                                                                                                   ng remaining
                                                                                                                                   10 retries t
                                                                                                                                   o prevent ac
                                                                                                                                   count lockou
                                                                                                                                   t: Warning:
                                                                                                                                   Permanently
                                                                                                                                   added '192.1
                                                                                                                                   68.200.12' (
                                                                                                                                   ECDSA) to th
                                                                                                                                   e list of kn
                                                                                                                                   own hosts.\r
                                                                                                                                   \nPermission
                                                                                                                                   denied,
                                                                                                                                   please try a
                                                                                                                                   gain."]
          

          Reasons for NetQ Agent Upgrade Failure

          Upgrades can fail at any stage of the process. The following table lists common reasons for upgrade failures:

          Reason: Switch is not reachable via SSH
          Error message: Data could not be sent to remote host “192.168.0.15.” Make sure this host can be reached over ssh: ssh: connect to host 192.168.0.15 port 22: No route to host

          Reason: Switch is reachable, but user-provided credentials are invalid
          Error message: Invalid/incorrect username/password. Skipping remaining 2 retries to prevent account lockout: Warning: Permanently added ‘<hostname-ipaddr>’ to the list of known hosts. Permission denied, please try again.

          Reason: Upgrade task could not be run
          Error message: Failure message depends on why the task could not be run. For example: /etc/network/interfaces: No such file or directory

          Reason: Upgrade task failed
          Error message: Failed at- <task that failed>. For example: Failed at- MLAG check for the peerLink interface status

          Reason: Retry failed after five attempts
          Error message: FAILED In all retries to process the LCM Job

          Upgrade Cumulus Linux Using LCM

          LCM lets you upgrade Cumulus Linux on one or more switches in your network through the NetQ UI or the NetQ CLI. You can run up to five upgrade jobs simultaneously; however, a given switch can only appear in one running job at a time.

          You can upgrade Cumulus Linux along the following supported paths:

          • Cumulus Linux 3.7.12 to later versions of Cumulus Linux 3
          • Cumulus Linux 3.7.12 or later to Cumulus Linux 4.2.0 or later
          • Cumulus Linux 4.0 to later versions of Cumulus Linux 4
          • Cumulus Linux 4.4.0 or later to Cumulus Linux 5.0 releases
          • Cumulus Linux 5.0.0 or later to Cumulus Linux 5.1 releases

          When upgrading to Cumulus Linux 5.0 or later, LCM backs up and restores flat file configurations in Cumulus Linux. After you upgrade to Cumulus Linux 5, running NVUE configuration commands replaces any configuration restored by NetQ LCM. See Upgrading Cumulus Linux for additional information.

          LCM does not support Cumulus Linux upgrades when NVUE is enabled.

          How to Upgrade Cumulus Linux Using LCM

          If the NetQ Agent is already installed on the switches you’d like to upgrade, follow the steps below.

          If the NetQ Agent is not installed on the switches you’d like to upgrade, run a switch discovery, then proceed with the upgrade.

          Upgrade Cumulus Linux on Switches With NetQ Agent Installed

          Prepare for a Cumulus Linux Upgrade

          Before you upgrade, make sure you have the appropriate files and credentials:

          1. Click Devices in the workbench header, then click Manage switches.

          2. Upload the Cumulus Linux upgrade images.

          3. (Optional) Specify a default upgrade version.

          4. Verify or add switch access credentials.

          5. (Optional) Assign a role to each switch.

          Your LCM dashboard should look similar to this:

          LCM dashboard displaying uploaded Cumulus Linux images with a specified default version
          1. Create a discovery job to locate Cumulus Linux switches on the network. Use the netq lcm discover command, specifying a single IP address, a range of IP addresses where your switches are located, or a CSV file containing the IP address and, optionally, the hostname and port for each switch. The CSV columns can be in any order, as long as the data matches the order given in the file's header. If the port is blank, NetQ uses switch port 22 by default.

            cumulus@switch:~$ netq lcm discover ip-range 10.0.1.12 
            NetQ Discovery Started with job id: job_scan_4f3873b0-5526-11eb-97a2-5b3ed2e556db
            
          2. Upload the Cumulus Linux upgrade images.

          3. Verify or add switch access credentials.

          4. (Optional) Assign a role to each switch.

          Perform a Cumulus Linux Upgrade

          After you complete the preparation steps, upgrade Cumulus Linux:

          1. Click Devices in any workbench header, then select Manage switches.

          2. Locate the Switches card and click Manage.

          3. Select the switches you want to upgrade. You can filter by role or sort by column heading to narrow down the list.

          4. Click (Upgrade OS) above the table.

            From this point forward, the software walks you through the upgrade process, beginning with a review of the switches that you selected for upgrade.

          screen displaying 4 switches selected for upgrading
          5. Verify that the switches you selected are included, and that they have the correct IP address and roles assigned.

          If you accidentally included a switch that you do NOT want to upgrade, hover over the switch information card and click to remove it from the upgrade job.

          switch assigned a spine roll with dropdown to change role

          If the role is incorrect or missing, click , then select a role for that switch from the dropdown. Click to discard a role change:

          6. When you are satisfied that the list of switches is accurate for the job, click Next.

          7. Verify that you want to use the default Cumulus Linux or NetQ version for this upgrade job. If not, click Custom and select an alternate image from the list.

          Default CL Version Selected

          Custom CL Version Selected

          8. Note that the switch access authentication method, Using global access credentials, indicates you have chosen either basic authentication with a username and password or SSH key-based authentication for all of your switches. Authentication on a per switch basis is not currently available.

          9. Click Next.

          10. Verify the upgrade job options.

            By default, NetQ takes a network snapshot before the upgrade and another after the upgrade completes. It also rolls back any switch that fails to upgrade to its original Cumulus Linux version.

            You can exclude selected services and protocols from the snapshots. Nodes and services are always included, but you can deselect any of the other items. Click an item to remove it from the snapshot; click it again to include it. This is helpful when you are not running a particular protocol, or when you are concerned about how long the snapshot will take. Note that removing services or protocols from the job might produce non-equivalent results compared with prior snapshots.

            While these options provide a smoother upgrade process and are highly recommended, you can disable either one by clicking No next to it.

          11. Click Next.

          12. After the pre-checks have completed successfully, click Preview. If there are failures, refer to Pre-check Failures.

            These checks verify the following:

            • Selected switches are not currently scheduled for, or in the middle of, a Cumulus Linux or NetQ Agent upgrade
            • Selected versions of Cumulus Linux and NetQ Agent are valid upgrade paths
            • All mandatory parameters have valid values, including MLAG configurations
            • All switches are reachable
            • The order in which to upgrade the switches, based on roles and configurations
          13. Review the job preview.

            When all of your switches have roles assigned, this view displays the chosen job options (top center), the pre-checks status (top right and left in Pre-Upgrade Tasks), the order in which the switches are planned for upgrade (center; upgrade starts from the left), and the post-upgrade tasks status (right).

          Roles assigned

          When none of your switches have roles assigned or they are all of the same role, this view displays the chosen job options (top center), the pre-checks status (top right and left in Pre-Upgrade Tasks), a list of switches planned for upgrade (center), and the post-upgrade tasks status (right).
          All roles the same

          When some of your switches have roles assigned, any switches without roles get upgraded last and get grouped under the label Stage1.
          Some roles assigned

          14. When you are happy with the job specifications, click Start Upgrade.

          15. Click Yes to confirm that you want to continue with the upgrade, or click Cancel to discard the upgrade job.

          Perform the upgrade using the netq lcm upgrade cl-image command, providing a name for the upgrade job, the Cumulus Linux and NetQ version, and a comma-separated list of the hostname(s) to be upgraded:

          cumulus@switch:~$ netq lcm upgrade cl-image name upgrade-cl430 cl-version 4.3.0 netq-version 4.4.0 hostnames spine01,spine02
          

          Network Snapshot Creation

          You can also generate a network snapshot before and after the upgrade by adding the run-snapshot-before-after option to the command:

          cumulus@switch:~$ netq lcm upgrade cl-image name upgrade-430 cl-version 4.3.0 netq-version 4.4.0 hostnames spine01,spine02,leaf01,leaf02 order spine,leaf run-snapshot-before-after
          

          Restore on an Upgrade Failure

          You can have LCM restore the previous version of Cumulus Linux if the upgrade job fails by adding the run-restore-on-failure option to the command. This is highly recommended.

          cumulus@switch:~$ netq lcm upgrade cl-image name upgrade-430 cl-version 4.3.0 netq-version 4.4.0 hostnames spine01,spine02,leaf01,leaf02 order spine,leaf run-restore-on-failure
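
          These options can also be combined in a single job; a minimal sketch, assuming both options are accepted together (job name and hostnames are illustrative):

          cumulus@switch:~$ netq lcm upgrade cl-image name upgrade-430 cl-version 4.3.0 netq-version 4.4.0 hostnames spine01,spine02,leaf01,leaf02 order spine,leaf run-snapshot-before-after run-restore-on-failure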
          

          Pre-check Failures

          If one or more of the pre-checks fail, resolve the related issue and start the upgrade again. In the NetQ UI, these failures appear on the Upgrade Preview page. In the NetQ CLI, they appear as error messages in the netq lcm show upgrade-jobs cl-image command output.

          Pre-check failure messages

          (1) Switch Order

          Message: <hostname1> switch cannot be upgraded without isolating <hostname2>, <hostname3> which are connected neighbors. Unable to upgrade
          Type: Warning
          Description: Switches hostname2 and hostname3 get isolated during an upgrade, making them unreachable. These switches are skipped if you continue with the upgrade.
          Corrective action: Reconfigure hostname2 and hostname3 to have redundant connections, or continue with upgrade knowing that connectivity is lost with these switches during the upgrade process.

          (2) Version Compatibility

          Message: Unable to upgrade <hostname> with CL version <#> to <#>
          Type: Error
          Description: LCM only supports the following Cumulus Linux upgrades:
          • 3.7.12 to later versions of Cumulus Linux 3
          • 3.7.12 or later to 4.2.0 or later versions of Cumulus Linux 4
          • 4.0 to later versions of Cumulus Linux 4
          • 4.4.0 or later to Cumulus Linux 5.0 releases
          • 5.0.0 or later to Cumulus Linux 5.1 releases
          Corrective action: Perform a fresh install of CL.

          Message: Image not uploaded for the combination: CL Version - <x.y.z>, Asic Vendor - <NVIDIA | Broadcom>, CPU Arch - <x86 | ARM >
          Type: Error
          Description: The specified Cumulus Linux image is not available in the LCM repository.
          Corrective action: Upload missing image. Refer to Upload Images.

          Message: Restoration image not uploaded for the combination: CL Version - <x.y.z>, Asic Vendor - <Mellanox | Broadcom>, CPU Arch - <x86 | ARM >
          Type: Error
          Description: The specified Cumulus Linux image needed to restore the switch back to its original version if the upgrade fails is not available in the LCM repository. This applies only when the “Roll back on upgrade failure” job option is selected.
          Corrective action: Upload missing image. Refer to Upload Images.

          Message: NetQ Agent and NetQ CLI Debian packages are not present for combination: CL Version - <x.y.z>, CPU Arch - <x86 | ARM >
          Type: Error
          Description: The specified NetQ packages are not installed on the switch.
          Corrective action: Upload missing packages. Refer to Install NetQ Agents and Install NetQ CLI.

          Message: Restoration NetQ Agent and NetQ CLI Debian packages are not present for combination: CL Version - <x.y.z>, CPU Arch - <x86 | ARM >
          Type: Error
          Description: The specified NetQ packages are not installed on the switch.

          Message: CL version to be upgraded to and current version on switch <hostname> are the same.
          Type: Warning
          Description: Switch is already operating the desired upgrade CL version. No upgrade is required.
          Corrective action: Choose an alternate CL version for upgrade or remove switch from upgrade job.

          (3) Switch Connectivity

          Message: Global credentials are not specified
          Type: Error
          Description: Switch access credentials are required to perform a CL upgrade, and they have not been specified.
          Corrective action: Specify access credentials. Refer to Specify Switch Credentials.

          Message: Switch is not in NetQ inventory: <hostname>
          Type: Error
          Description: LCM cannot upgrade a switch that is not in its inventory.
          Corrective action: Verify you have the correct hostname or IP address for the switch. Verify the switch has NetQ Agent 4.1.0 or later installed: click Main Menu, then click Agents in the Network section, view Version column. Upgrade NetQ Agents if needed. Refer to Upgrade NetQ Agents.

          Message: Switch <hostname> is rotten. Cannot select for upgrade.
          Type: Error
          Description: LCM must be able to communicate with the switch to upgrade it.
          Corrective action: Troubleshoot the connectivity issue and retry upgrade when the switch is fresh.

          Message: Total number of jobs <running jobs count> exceeded Max jobs supported 50
          Type: Error
          Description: LCM can support a total of 50 upgrade jobs running simultaneously.
          Corrective action: Wait for the total number of simultaneous upgrade jobs to drop below 50.

          Message: Switch <hostname> is already being upgraded. Cannot initiate another upgrade.
          Type: Error
          Description: Switch is already a part of another running upgrade job.
          Corrective action: Remove switch from current job or wait until the competing job has completed.

          Message: Backup failed in previous upgrade attempt for switch <hostname>.
          Type: Warning
          Description: LCM was unable to back up switch during a previously failed upgrade attempt.
          Corrective action: You could back up the switch manually prior to upgrade if you want to restore the switch after upgrade. Refer to Back Up and Restore NetQ.

          Message: Restore failed in previous upgrade attempt for switch <hostname>.
          Type: Warning
          Description: LCM was unable to restore switch after a previously failed upgrade attempt.
          Corrective action: You might need to restore the switch manually after upgrade. Refer to Back Up and Restore NetQ.

          Message: Upgrade failed in previous attempt for switch <hostname>.
          Type: Warning
          Description: LCM was unable to upgrade switch during last attempt.

          (4) MLAG Configuration

          Message: hostname:<hostname>, reason:<MLAG error message>
          Type: Error
          Description: An error in an MLAG configuration has been detected. For example: Backup IP 10.10.10.1 does not belong to peer.
          Corrective action: Review the MLAG configuration on the identified switch. Refer to Multi-Chassis Link Aggregation - MLAG. Make any needed changes.

          Message: MLAG configuration checks timed out
          Type: Error
          Description: One or more switches stopped responding to the MLAG checks.

          Message: MLAG configuration checks failed
          Type: Error
          Description: One or more switches failed the MLAG checks.

          Message: For switch <hostname>, the MLAG switch with Role: secondary and ClagSysmac: <MAC address> does not exist.
          Type: Error
          Description: Identified switch is the primary in an MLAG pair, but the defined secondary switch is not in NetQ inventory.
          Corrective action: Verify the switch has NetQ Agent 4.1.0 or later installed: click Main Menu, then click Agents in the Network section, view Version column. Upgrade NetQ Agent if needed. Refer to Upgrade NetQ Agents. Add the missing peer switch to NetQ inventory.

          Analyze Results

          After starting the upgrade you can monitor the progress of your upgrade job and the final results. While the views are different, essentially the same information is available from either the NetQ UI or the NetQ CLI.

          You can track the progress of your upgrade job from the Preview page or the Upgrade History page of the NetQ UI.

          If you get disconnected while the job is in progress, it might appear as if nothing is happening. Try closing (click ) and reopening your view (click ), or refreshing the page.

          Several viewing options are available for monitoring the upgrade job.

          • Monitor the job with full details open on the Preview page:
          Single role

          Multiple roles and some without roles

          Each switch goes through a number of steps. To view these steps, click Details and scroll down as needed. Click to collapse the step detail. Click to close the detail popup.
          • Monitor the job with summary information only in the CL Upgrade History page. Open this view by clicking in the full details view:
          This view is refreshed automatically. Click to view what stage the job is in.
          Click to view the detailed view.
          • Monitor the job through the CL Upgrade History card in the Job History tab. Click twice to return to the LCM dashboard. As you perform more upgrades the graph displays the success and failure of each job.
          Click View to return to the Upgrade History page as needed.

          Sample Successful Upgrade

          On successful completion, you can:

          • Compare the network snapshots taken before and after the upgrade.
          Click Compare Snapshots in the detail view.
          Refer to Interpreting the Comparison Data for information about analyzing these results.
          • Download details about the upgrade in the form of a JSON-formatted file, by clicking Download Report.

          • View the changes on the Switches card of the LCM dashboard.

            Click the Main Menu, then Upgrade Switches.

          In our example, all switches have been upgraded to Cumulus Linux 3.7.12.

          Sample Failed Upgrade

          If an upgrade job fails for any reason, you can view the associated error(s):

          1. From the CL Upgrade History dashboard, find the job of interest.

          2. Click .

          3. Click .

          Note in this example, all of the pre-upgrade tasks were successful, but backup failed on the spine switches.

          4. To view what step in the upgrade process failed, click and scroll down. Click to close the step list.

          5. To view details about the errors, either double-click the failed step or click Details and scroll down as needed. Click to collapse the step detail. Click to close the detail popup.

          To see the progress of current upgrade jobs and the history of previous upgrade jobs, run netq lcm show upgrade-jobs cl-image:

          cumulus@switch:~$ netq lcm show upgrade-jobs cl-image
          Job ID       Name            CL Version           Pre-Check Status                 Warnings         Errors       Start Time
          ------------ --------------- -------------------- -------------------------------- ---------------- ------------ --------------------
          job_cl_upgra Leafs upgr to C 4.2.0                COMPLETED                                                      Fri Sep 25 17:16:10
          de_ff9c35bc4 L410                                                                                                2020
          950e92cf49ac
          bb7eb4fc6e3b
          7feca7d82960
          570548454c50
          cd05802
          job_cl_upgra Spines to 4.2.0 4.2.0                COMPLETED                                                      Fri Sep 25 16:37:08
          de_9b60d3a1f                                                                                                     2020
          dd3987f787c7
          69fd92f2eef1
          c33f56707f65
          4a5dfc82e633
          dc3b860
          job_upgrade_ 3.7.12 Upgrade  3.7.12               WARNING                                                        Fri Apr 24 20:27:47
          fda24660-866                                                                                                     2020
          9-11ea-bda5-
          ad48ae2cfafb
          job_upgrade_ DataCenter      3.7.12               WARNING                                                        Mon Apr 27 17:44:36
          81749650-88a                                                                                                     2020
          e-11ea-bda5-
          ad48ae2cfafb
          job_upgrade_ Upgrade to CL3. 3.7.12               COMPLETED                                                      Fri Apr 24 17:56:59
          4564c160-865 7.12                                                                                                2020
          3-11ea-bda5-
          ad48ae2cfafb
          

          To see details of a particular upgrade job, run netq lcm show status job-ID:

          cumulus@switch:~$ netq lcm show status job_upgrade_fda24660-8669-11ea-bda5-ad48ae2cfafb
          Hostname    CL Version    Backup Status    Backup Start Time         Restore Status    Restore Start Time        Upgrade Status    Upgrade Start Time
          ----------  ------------  ---------------  ------------------------  ----------------  ------------------------  ----------------  ------------------------
          spine02     4.1.0         FAILED           Fri Sep 25 16:37:40 2020  SKIPPED_ON_FAILURE  N/A                   SKIPPED_ON_FAILURE  N/A
          spine03     4.1.0         FAILED           Fri Sep 25 16:37:40 2020  SKIPPED_ON_FAILURE  N/A                   SKIPPED_ON_FAILURE  N/A
          spine04     4.1.0         FAILED           Fri Sep 25 16:37:40 2020  SKIPPED_ON_FAILURE  N/A                   SKIPPED_ON_FAILURE  N/A
          spine01     4.1.0         FAILED           Fri Sep 25 16:40:26 2020  SKIPPED_ON_FAILURE  N/A                   SKIPPED_ON_FAILURE  N/A
          

          To see only Cumulus Linux upgrade jobs, run netq lcm show status cl-image job-ID.
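
          For example, using the Cumulus Linux upgrade job identifier from the listing above:

          cumulus@switch:~$ netq lcm show status cl-image job_upgrade_fda24660-8669-11ea-bda5-ad48ae2cfafb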

          Post-check Failures

          A successful upgrade can still have post-check warnings. For example, you updated the OS, but not all services are fully up and running after the upgrade. If one or more of the post-checks fail, warning messages appear in the Post-Upgrade Tasks section of the preview. Click the warning category to view the detailed messages.

          Post-check failure messages

          Post-check: Health of Services
          Message: Service <service-name> is missing on Host <hostname> for <VRF default|VRF mgmt>.
          Type: Warning
          Description: A given service is not yet running on the upgraded host. For example: Service ntp is missing on Host Leaf01 for VRF default.
          Corrective action: Wait for up to x more minutes to see if the specified services come up.

          Post-check: Switch Connectivity
          Message: Service <service-name> is missing on Host <hostname> for <VRF default|VRF mgmt>.
          Type: Warning
          Description: A given service is not yet running on the upgraded host. For example: Service ntp is missing on Host Leaf01 for VRF default.
          Corrective action: Wait for up to x more minutes to see if the specified services come up.

          Reasons for Upgrade Job Failure

          Upgrades can fail at any of the stages of the process. The following table lists common reasons for upgrade failures:

          Reason: Switch is not reachable via SSH
          Error message: Data could not be sent to remote host “192.168.0.15.” Make sure this host can be reached over ssh: ssh: connect to host 192.168.0.15 port 22: No route to host

          Reason: Switch is reachable, but user-provided credentials are invalid
          Error message: Invalid/incorrect username/password. Skipping remaining 2 retries to prevent account lockout: Warning: Permanently added ‘<hostname-ipaddr>’ to the list of known hosts. Permission denied, please try again.

          Reason: Upgrade task could not be run
          Error message: Failure message depends on why the task could not be run. For example: /etc/network/interfaces: No such file or directory

          Reason: Upgrade task failed
          Error message: Failed at- <task that failed>. For example: Failed at- MLAG check for the peerLink interface status

          Reason: Retry failed after five attempts
          Error message: FAILED In all retries to process the LCM Job

          Upgrade Cumulus Linux on Switches Without NetQ Agent Installed

          When you want to update Cumulus Linux on switches without NetQ installed, use the switch discovery feature. The feature browses your network to find all Cumulus Linux switches (with and without NetQ currently installed) and determines the versions of Cumulus Linux and NetQ installed. These results are then used to install or upgrade Cumulus Linux and NetQ on all discovered switches in a single procedure rather than in two steps. You can run up to five jobs simultaneously; however, a given switch can only appear in one running job at a time.

          To discover switches running Cumulus Linux and upgrade Cumulus Linux and NetQ on them:

          1. Click Devices in the workbench header, then click Manage switches.

          2. On the Switches card, click Discover.

          3. Enter a name for the scan.

          4. Choose whether you want to look for switches by entering IP address ranges OR import switches using a comma-separated values (CSV) file.

          If you do not have a switch listing, then you can manually add the address ranges where your switches are located in the network. This has the advantage of catching switches that might have been missed in a file.

          A maximum of 50 addresses can be included in an address range. If necessary, break the range into smaller ranges.

          To discover switches using address ranges:

          1. Enter an IP address range in the IP Range field.

            Ranges can be contiguous, for example 192.168.0.24-64, or non-contiguous, for example 192.168.0.24-64,128-190,235, but they must be contained within a single subnet.

          2. Optionally, enter another IP address range (in a different subnet) by clicking .

            For example, 198.51.100.0-128 or 198.51.100.0-128,190,200-253.

          3. Add additional ranges as needed. Click to remove a range if needed.

          If you decide to use a CSV file instead, the ranges you entered will remain if you return to using IP ranges again.

          To import switches through a CSV file:

          1. Click Browse.

          2. Select the CSV file containing the list of switches.

            The CSV file must include a header containing hostname, ip, and port. The columns can be in any order you like, but the data must match the order given in the header.

          You must have an IP address in your file, but the hostname is optional. If the port is blank, NetQ uses switch port 22 by default.
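
          For example, a minimal CSV file could look like this (hostnames, addresses, and ports are illustrative):

          hostname,ip,port
          leaf01,192.168.200.11,22
          leaf02,192.168.200.12,
          ,192.168.200.13,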

          Click Remove if you decide to use a different file or want to use IP address ranges instead. If you entered ranges before selecting the CSV file option, they remain.

          5. Note that you can use the switch access credentials defined in Switch Credentials to access these switches. If you have issues accessing the switches, you might need to update your credentials.

          6. Click Next.

            When the network discovery is complete, NetQ presents the number of Cumulus Linux switches it found. Each switch can be in one of the following categories:

            • Discovered without NetQ: Switches found without NetQ installed
            • Discovered with NetQ: Switches found with some version of NetQ installed
            • Discovered but Rotten: Switches found that are unreachable
            • Incorrect Credentials: Switches found that are unreachable because the provided access credentials do not match those for the switches
            • OS not Supported: Switches found that are running a Cumulus Linux version not supported by the LCM upgrade feature
            • Not Discovered: IP addresses which did not have an associated Cumulus Linux switch

            If the discovery process does not find any switches for a particular category, then it does not display that category.

          7. Select which switches you want to upgrade from each category by clicking the checkbox on each switch card.

          8. Click Next.

          9. Verify that the number of switches identified for upgrade and the configuration profile to be applied are correct.

          10. Accept the default NetQ version or click Custom and select an alternate version.

          11. By default, the NetQ Agent and CLI are upgraded on the selected switches. If you do not want to upgrade the NetQ CLI, click Advanced and change the selection to No.

          12. Click Next.

          13. Several checks are performed to eliminate preventable problems during the install process.

          These checks verify the following:

          • Selected switches are not currently scheduled for, or in the middle of, a Cumulus Linux or NetQ Agent upgrade
          • Selected versions of Cumulus Linux and NetQ Agent are valid upgrade paths
          • All mandatory parameters have valid values, including MLAG configurations
          • All switches are reachable
          • The order in which to upgrade the switches, based on roles and configurations

          If any of the pre-checks fail, review the error messages and take appropriate action.

          If all of the pre-checks pass, click Install to initiate the job.

          14. Monitor the job progress.

            After starting the upgrade you can monitor the progress from the preview page or the Upgrade History page.

            From the preview page, a green circle with rotating arrows is shown on each switch as it is working. Alternately, you can close the detail of the job and see a summary of all current and past upgrade jobs on the NetQ Install and Upgrade History page. The job started most recently is shown at the top, and the data is refreshed periodically.

          If you are disconnected while the job is in progress, it might appear as if nothing is happening. Try closing (click ) and reopening your view (click ), or refreshing the page.

          Several viewing options are available for monitoring the upgrade job.

          • Monitor the job with full details open:
          • Monitor the job with only summary information in the NetQ Install and Upgrade History page. Open this view by clicking in the full details view; useful when you have multiple jobs running simultaneously
          • Monitor the job through the NetQ Install and Upgrade History card on the LCM dashboard. Click twice to return to the LCM dashboard.
          15. Investigate any failures and create new jobs to reattempt the upgrade.

          If you previously ran a discovery job, you can show the results of that job by running the netq lcm show discovery-job command.

          cumulus@switch:~$ netq lcm show discovery-job job_scan_921f0a40-5440-11eb-97a2-5b3ed2e556db
          Scan COMPLETED
          
          Summary
          -------
          Start Time: 2021-01-11 19:09:47.441000
          End Time: 2021-01-11 19:09:59.890000
          Total IPs: 1
          Completed IPs: 1
          Discovered without NetQ: 0
          Discovered with NetQ: 0
          Incorrect Credentials: 0
          OS Not Supported: 0
          Not Discovered: 1
          
          
          Hostname          IP Address                MAC Address        CPU      CL Version  NetQ Version  Config Profile               Discovery Status Upgrade Status
          ----------------- ------------------------- ------------------ -------- ----------- ------------- ---------------------------- ---------------- --------------
          N/A               10.0.1.12                 N/A                N/A      N/A         N/A           []                           NOT_FOUND        NOT_UPGRADING
          cumulus@switch:~$ 
          

          When the network discovery is complete, NetQ presents the number of Cumulus Linux switches it has found. The output displays their discovery status, which can be one of the following:

          • Discovered without NetQ: Switches found without NetQ installed
          • Discovered with NetQ: Switches found with some version of NetQ installed
          • Discovered but Rotten: Switches found that are unreachable
          • Incorrect Credentials: Switches found that are unreachable because the provided access credentials do not match those for the switches
          • OS not Supported: Switches found that are running a Cumulus Linux version not supported by the LCM upgrade feature
          • NOT_FOUND: IP addresses which did not have an associated Cumulus Linux switch

          After you determine which switches you need to upgrade, run the upgrade process as described above.

          Network Snapshots

          Snapshots capture a network’s state—including the services running on the network—at a particular point in time. Comparing snapshots lets you check what (if anything) changed in the network, which can be helpful when upgrading a switch or modifying its configuration. This section outlines how to create, compare, and interpret snapshots.

          Create a Network Snapshot

          To create a snapshot:

          1. From the workbench header, select snapshot, then Create Snapshot:

            modal prompting user to create, compare, view, or delete snapshots
          2. Next, enter the snapshot’s name, time frame, and the elements you’d like included in the snapshot:

            modal prompting user to add name, time frame, and options while creating a snapshot

            To capture the network’s current state, click Now. To capture the network’s state at a previous date and time, click Past, then in the Start Time field, select the calendar icon.

            The Choose options field includes all the elements and services that may run on the network. All are selected by default. Click any element to remove it from the snapshot. Nodes and services are included in all snapshots.

            The Notes field is optional. You can add a note to remind you of the snapshot’s purpose.

          3. Select Finish. The card now appears on your workbench.

          4. When you are finished viewing the snapshot, click Dismiss to remove it from your workbench. You can add it back by selecting snapshot in the header and navigating to the option to view snapshots.

          Compare Network Snapshots

          You can compare the state of your network before and after an upgrade or other configuration change to help avoid unwanted changes to your network’s state.

          To compare network snapshots:

          1. From the workbench header, click snapshot.

          2. Select Compare Snapshots, then select the two snapshots you want to compare.

          3. Click Finish.

          If the snapshot cards are already on your workbench, place the cards side-by-side for a high-level comparison. For a more detailed comparison, click Compare on one of the cards and select a snapshot for comparison from the list.

          Interpreting the Comparison Data

          For each network element with changes, a visualization displays the differences between the two snapshots. Green represents additions, red represents subtractions, and orange represents updates.

          In the following example, Snapshot 3 and Snapshot 4 are being compared. Snapshot 3 has a BGP count of 212 and Snapshot 4 has a BGP count of 186. The comparison also shows 98 BGP updates.

          comparison data displayed for two snapshots

          From this view, you can dismiss the snapshots or select View Details for additional information and to filter and export the data as a JSON file.

          The following table describes the information provided for each element type when changes are present:

          BGP
          • Hostname: Name of the host running the BGP session
          • VRF: Virtual route forwarding interface if used
          • BGP Session: Session that was removed or added
          • ASN: Autonomous system number
          CLAG
          • Hostname: Name of the host running the CLAG session
          • CLAG Sysmac: MAC address for a bond interface pair that was removed or added
          Interface
          • Hostname: Name of the host where the interface resides
          • IF Name: Name of the interface that was removed or added
          IP Address
          • Hostname: Name of the host where address was removed or added
          • Prefix: IP address prefix
          • Mask: IP address mask
          • IF Name: Name of the interface that owns the address
          Links
          • Hostname: Name of the host where the link was removed or added
          • IF Name: Name of the link
          • Kind: Bond, bridge, eth, loopback, macvlan, swp, vlan, vrf, or vxlan
          LLDP
          • Hostname: Name of the discovered host that was removed or added
          • IF Name: Name of the interface
          MAC Address
          • Hostname: Name of the host where MAC address resides
          • MAC address: MAC address that was removed or added
          • VLAN: VLAN associated with the MAC address
          Neighbor
          • Hostname: Name of the neighbor peer that was removed or added
          • VRF: Virtual route forwarding interface if used
          • IF Name: Name of the neighbor interface
          • IP address: Neighbor IP address
          Node
          • Hostname: Name of the network node that was removed or added
          OSPF
          • Hostname: Name of the host running the OSPF session
          • IF Name: Name of the associated interface that was removed or added
          • Area: Routing domain for this host device
          • Peer ID: Network subnet address of router with access to the peer device
          Route
          • Hostname: Name of the host running the route that was removed or added
          • VRF: Virtual route forwarding interface associated with route
          • Prefix: IP address prefix
          Sensors
          • Hostname: Name of the host where sensor resides
          • Kind: Power supply unit, fan, or temperature
          • Name: Name of the sensor that was removed or added
          Services
          • Hostname: Name of the host where service is running
          • Name: Name of the service that was removed or added
          • VRF: Virtual route forwarding interface associated with service

          Decommission Switches

          You can decommission a switch or host at any time. You might need to do this when you:

          Decommissioning the switch or host removes information about the switch or host from the NetQ database. When the NetQ Agent restarts at a later date, it sends a connection request back to the database, so NetQ can monitor the switch or host again.

          Decommission from the CLI

          To decommission a switch or host:

          1. On the given switch or host, stop and disable the NetQ Agent service:

            cumulus@switch:~$ sudo systemctl stop netq-agent
            cumulus@switch:~$ sudo systemctl disable netq-agent
            
          2. On the NetQ On-premises or Cloud Appliance or VM, decommission the switch or host:

            cumulus@netq-appliance:~$ netq decommission <hostname-to-decommission>
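
            For example, to decommission a switch with the hostname leaf01 (a placeholder name used only for illustration), run:

            cumulus@netq-appliance:~$ netq decommission leaf01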
            

          Decommission from the NetQ UI

          You can decommission a switch or host from the NetQ UI using the Inventory/Devices card. This stops and disables the NetQ Agent service on the device, and decommissions it from the NetQ database.

          1. Expand the Inventory/Devices card to list the devices in the current inventory:
          inventory card displaying 12 hosts and 12 switches
          2. Select the devices to decommission, then select the decommission icon above the table:
          expanded inventory card with one device selected
          3. Confirm the devices to decommission:
          confirmation dialog with a list of devices
          4. Wait for the decommission process to complete, then select Done.

          Manage NetQ Agents

          Run the following commands to view the status of an agent, disable an agent, manage logging, and configure the events the agent collects.

          View NetQ Agent Status

          To view NetQ Agent status, run:

          netq [<hostname>] show agents [fresh | dead | rotten | opta] [around <text-time>] [json]
          

          You can view the status for a given switch, host, or NetQ appliance or virtual machine. You can also filter by status and view the status at a point in the past.
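
          For example, to check what the agent status looked like 24 hours ago, you can add the around option. This is a sketch that assumes a relative time value such as 5m, 24h, or 7d; the output follows the same format as the listings below:

          cumulus@switch:~$ netq show agents around 24h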

          To view the current status of all NetQ Agents:

          cumulus@switch~:$ netq show agents
          Matching agents records:
          Hostname          Status           NTP Sync Version                              Sys Uptime                Agent Uptime              Reinitialize Time          Last Changed
          ----------------- ---------------- -------- ------------------------------------ ------------------------- ------------------------- -------------------------- -------------------------
          border01          Fresh            yes      3.2.0-cl4u30~1601410518.104fb9ed     Mon Sep 21 17:04:54 2020  Tue Sep 29 21:24:58 2020  Tue Sep 29 21:24:58 2020   Thu Oct  1 16:07:38 2020
          border02          Fresh            yes      3.2.0-cl4u30~1601410518.104fb9ed     Mon Sep 21 17:04:57 2020  Tue Sep 29 21:24:58 2020  Tue Sep 29 21:24:58 2020   Thu Oct  1 16:07:33 2020
          fw1               Fresh            yes      3.2.0-cl4u30~1601410518.104fb9ed     Mon Sep 21 17:04:44 2020  Tue Sep 29 21:24:48 2020  Tue Sep 29 21:24:48 2020   Thu Oct  1 16:07:26 2020
          fw2               Fresh            yes      3.2.0-cl4u30~1601410518.104fb9ed     Mon Sep 21 17:04:42 2020  Tue Sep 29 21:24:48 2020  Tue Sep 29 21:24:48 2020   Thu Oct  1 16:07:22 2020
          leaf01            Fresh            yes      3.2.0-cl4u30~1601410518.104fb9ed     Mon Sep 21 16:49:04 2020  Tue Sep 29 21:24:49 2020  Tue Sep 29 21:24:49 2020   Thu Oct  1 16:07:10 2020
          leaf02            Fresh            yes      3.2.0-cl4u30~1601410518.104fb9ed     Mon Sep 21 17:03:14 2020  Tue Sep 29 21:24:49 2020  Tue Sep 29 21:24:49 2020   Thu Oct  1 16:07:30 2020
          leaf03            Fresh            yes      3.2.0-cl4u30~1601410518.104fb9ed     Mon Sep 21 17:03:37 2020  Tue Sep 29 21:24:49 2020  Tue Sep 29 21:24:49 2020   Thu Oct  1 16:07:24 2020
          leaf04            Fresh            yes      3.2.0-cl4u30~1601410518.104fb9ed     Mon Sep 21 17:03:35 2020  Tue Sep 29 21:24:58 2020  Tue Sep 29 21:24:58 2020   Thu Oct  1 16:07:13 2020
          oob-mgmt-server   Fresh            yes      3.1.1-ub18.04u29~1599111022.78b9e43  Mon Sep 21 16:43:58 2020  Mon Sep 21 17:55:00 2020  Mon Sep 21 17:55:00 2020   Thu Oct  1 16:07:31 2020
          server01          Fresh            yes      3.2.0-ub18.04u30~1601393774.104fb9e  Mon Sep 21 17:19:57 2020  Tue Sep 29 21:13:07 2020  Tue Sep 29 21:13:07 2020   Thu Oct  1 16:07:16 2020
          server02          Fresh            yes      3.2.0-ub18.04u30~1601393774.104fb9e  Mon Sep 21 17:19:57 2020  Tue Sep 29 21:13:07 2020  Tue Sep 29 21:13:07 2020   Thu Oct  1 16:07:24 2020
          server03          Fresh            yes      3.2.0-ub18.04u30~1601393774.104fb9e  Mon Sep 21 17:19:56 2020  Tue Sep 29 21:13:07 2020  Tue Sep 29 21:13:07 2020   Thu Oct  1 16:07:12 2020
          server04          Fresh            yes      3.2.0-ub18.04u30~1601393774.104fb9e  Mon Sep 21 17:19:57 2020  Tue Sep 29 21:13:07 2020  Tue Sep 29 21:13:07 2020   Thu Oct  1 16:07:17 2020
          server05          Fresh            yes      3.2.0-ub18.04u30~1601393774.104fb9e  Mon Sep 21 17:19:57 2020  Tue Sep 29 21:13:10 2020  Tue Sep 29 21:13:10 2020   Thu Oct  1 16:07:25 2020
          server06          Fresh            yes      3.2.0-ub18.04u30~1601393774.104fb9e  Mon Sep 21 17:19:57 2020  Tue Sep 29 21:13:10 2020  Tue Sep 29 21:13:10 2020   Thu Oct  1 16:07:21 2020
          server07          Fresh            yes      3.2.0-ub18.04u30~1601393774.104fb9e  Mon Sep 21 17:06:48 2020  Tue Sep 29 21:13:10 2020  Tue Sep 29 21:13:10 2020   Thu Oct  1 16:07:28 2020
          server08          Fresh            yes      3.2.0-ub18.04u30~1601393774.104fb9e  Mon Sep 21 17:06:45 2020  Tue Sep 29 21:13:10 2020  Tue Sep 29 21:13:10 2020   Thu Oct  1 16:07:31 2020
          spine01           Fresh            yes      3.2.0-cl4u30~1601410518.104fb9ed     Mon Sep 21 17:03:34 2020  Tue Sep 29 21:24:58 2020  Tue Sep 29 21:24:58 2020   Thu Oct  1 16:07:20 2020
          spine02           Fresh            yes      3.2.0-cl4u30~1601410518.104fb9ed     Mon Sep 21 17:03:33 2020  Tue Sep 29 21:24:58 2020  Tue Sep 29 21:24:58 2020   Thu Oct  1 16:07:16 2020
          spine03           Fresh            yes      3.2.0-cl4u30~1601410518.104fb9ed     Mon Sep 21 17:03:34 2020  Tue Sep 29 21:25:07 2020  Tue Sep 29 21:25:07 2020   Thu Oct  1 16:07:20 2020
          spine04           Fresh            yes      3.2.0-cl4u30~1601410518.104fb9ed     Mon Sep 21 17:03:32 2020  Tue Sep 29 21:25:07 2020  Tue Sep 29 21:25:07 2020   Thu Oct  1 16:07:33 2020
          
          

          To view NetQ Agents that are not communicating, run:

          cumulus@switch~:$ netq show agents rotten
          No matching agents records found
          

          To view NetQ Agent status on the NetQ appliance or VM, run:

          cumulus@switch~:$ netq show agents opta
          Matching agents records:
          Hostname          Status           NTP Sync Version                              Sys Uptime                Agent Uptime              Reinitialize Time          Last Changed
          ----------------- ---------------- -------- ------------------------------------ ------------------------- ------------------------- -------------------------- -------------------------
          netq-ts           Fresh            yes      3.2.0-ub18.04u30~1601393774.104fb9e  Mon Sep 21 16:46:53 2020  Tue Sep 29 21:13:07 2020  Tue Sep 29 21:13:07 2020   Thu Oct  1 16:29:51 2020
          

          View NetQ Agent Configuration

          You can view the current configuration of a NetQ Agent to determine what data it collects and where it sends that data. To view this configuration, run:

          netq config show agent [kubernetes-monitor|loglevel|stats|sensors|frr-monitor|wjh|wjh-threshold|cpu-limit] [json]
          

          This example shows a NetQ Agent in an on-premises deployment, talking to an appliance or VM at 127.0.0.1 using the default ports and VRF. There is no special configuration to monitor Kubernetes, FRR, interface statistics, sensors, or WJH, and there are no limits on CPU usage or change to the default logging level.

          cumulus@switch:~$ netq config show agent
          netq-agent             value      default
          ---------------------  ---------  ---------
          exhibitport
          exhibiturl
          server                 127.0.0.1  127.0.0.1
          cpu-limit              100        100
          agenturl
          enable-opta-discovery  True       True
          agentport              8981       8981
          port                   31980      31980
          vrf                    default    default
          ()
          

          To view the configuration of a particular aspect of a NetQ Agent, use one of the keyword options shown in the command syntax above.

          This example shows a NetQ Agent configured with a CPU limit of 60%.

          cumulus@switch:~$ netq config show agent cpu-limit
          CPU Quota
          -----------
          60%
          ()
          

          Modify the Configuration of the NetQ Agent on a Node

          The agent configuration commands let you:

          • Add and remove a NetQ Agent
          • Disable and reenable a NetQ Agent
          • Limit switch CPU usage by the NetQ Agent
          • Enable or disable data collection for selected services
          • Send data to a server cluster

          Commands apply to one agent at a time, and you run them on the switch or host where the NetQ Agent resides.

          Add and Remove a NetQ Agent

          Adding or removing a NetQ Agent adds or removes from the NetQ configuration file (/etc/netq/netq.yml) the IP address, and optionally the port and VRF, of the appliance or VM where the agent sends the data it collects.

          To use the NetQ CLI to add or remove a NetQ Agent on a switch or host, run:

          netq config add agent server <text-opta-ip> [port <text-opta-port>] [vrf <text-vrf-name>]
          netq config del agent server
          

          If you want to use a specific port on the appliance or VM, use the port option. If you want the data sent over a particular VRF, use the vrf option.

          This example shows how to add a NetQ Agent and tell it to send the data it collects to the NetQ Appliance or VM at the IPv4 address of 10.0.0.23 using the default port (on-premises = 31980; cloud = 443) and vrf (default).

          cumulus@switch~:$ netq config add agent server 10.0.0.23
          cumulus@switch~:$ netq config restart agent
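
          If you need to specify both a non-default port and a VRF, include both keywords. The following sketch assumes a management VRF named mgmt; the IP address and port are placeholders:

          cumulus@switch:~$ netq config add agent server 10.0.0.23 port 31980 vrf mgmt
          cumulus@switch:~$ netq config restart agent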
          

          Disable and Reenable a NetQ Agent

          You can temporarily disable the NetQ Agent on a node. Disabling the NetQ Agent maintains the data already collected in the NetQ database, but stops the NetQ Agent from collecting new data until you reenable it.

          To disable a NetQ Agent, run:

          cumulus@switch:~$ netq config stop agent
          

          To reenable a NetQ Agent, run:

          cumulus@switch:~$ netq config restart agent
          

          Configure a NetQ Agent to Limit Switch CPU Usage

          While CPU usage by the NetQ Agent is not typically an issue, you can limit the agent to a configurable percentage of the switch CPU resources. This setting requires Cumulus Linux 3.6.x, 3.7.x, or 4.1.0 and later on the switch.

          For more detail about this feature, refer to this Knowledge Base article.

          This example limits a NetQ Agent from consuming more than 40% of the CPU resources on a Cumulus Linux switch.

          cumulus@switch:~$ netq config add agent cpu-limit 40
          cumulus@switch:~$ netq config restart agent
          

          To remove the limit, run:

          cumulus@switch:~$ netq config del agent cpu-limit
          cumulus@switch:~$ netq config restart agent
          

          Configure a NetQ Agent to Collect Data from Selected Services

          You can enable and disable data collection about FRR (FRRouting), Kubernetes, sensors, and WJH (What Just Happened).

          To configure the agent to start or stop collecting FRR data, run:

          cumulus@switch:~$ netq config add agent frr-monitor
          cumulus@switch:~$ netq config restart agent

          cumulus@switch:~$ netq config del agent frr-monitor
          cumulus@switch:~$ netq config restart agent
          

          To configure the agent to start or stop collecting Kubernetes data, run:

          cumulus@switch:~$ netq config add agent kubernetes-monitor
          cumulus@switch:~$ netq config restart agent
          
          cumulus@switch:~$ netq config del agent kubernetes-monitor
          cumulus@switch:~$ netq config restart agent
          

          To configure the agent to start or stop collecting chassis sensor data, run:

          cumulus@chassis:~$ netq config add agent sensors
          cumulus@chassis:~$ netq config restart agent

          cumulus@chassis:~$ netq config del agent sensors
          cumulus@chassis:~$ netq config restart agent
          

          The sensors command is only valid when run on a chassis, not a switch.

          To configure the agent to start or stop collecting WJH data, run:

          cumulus@switch:~$ netq config add agent wjh
          cumulus@switch:~$ netq config restart agent

          cumulus@switch:~$ netq config del agent wjh
          cumulus@switch:~$ netq config restart agent
          

          Configure a NetQ Agent to Send Data to a Server Cluster

          If you have a server cluster arrangement for NetQ, you should configure the NetQ Agent to send the data it collects to every server in the cluster.

          To configure the agent to send data to the servers in your cluster, run:

          netq config add agent cluster-servers <text-opta-ip-list> [port <text-opta-port>] [vrf <text-vrf-name>]
          

          You must separate the list of IP addresses by commas (not spaces). You can optionally specify a port or VRF.

          This example configures the NetQ Agent on a switch to send the data to three servers located at 10.0.0.21, 10.0.0.22, and 10.0.0.23 using the rocket VRF.

          cumulus@switch:~$ netq config add agent cluster-servers 10.0.0.21,10.0.0.22,10.0.0.23 vrf rocket
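
          If your cluster listens on a non-default port, you can add the port keyword as well. This is a sketch; the IP addresses, port, and VRF name are placeholders:

          cumulus@switch:~$ netq config add agent cluster-servers 10.0.0.21,10.0.0.22,10.0.0.23 port 31980 vrf rocket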
          

          To stop a NetQ Agent from sending data to a server cluster, run:

          cumulus@switch:~$ netq config del agent cluster-servers
          

          Configure Logging to Troubleshoot a NetQ Agent

          The logging level used for a NetQ Agent determines what types of events get logged about the NetQ Agent on the switch or host.

          First, you need to decide what level of logging you want to configure. You can configure the logging level to be the same for every NetQ Agent, or selectively increase or decrease the logging level for a NetQ Agent on a problematic node.

          Logging Level | Description
          debug         | Sends notifications for all debug, info, warning, and error messages.
          info          | Sends notifications for info, warning, and error messages (default).
          warning       | Sends notifications for warning and error messages.
          error         | Sends notifications for error messages.

          You can view the NetQ Agent log directly. Messages have the following structure:

          <timestamp> <node> <service>[PID]: <level>: <message>

          Element       | Description
          timestamp     | Date and time the event occurred, in UTC format
          node          | Hostname of the network node where the event occurred
          service [PID] | Service and process identifier (PID) that generated the event
          level         | Logging level assigned to the event: debug, error, info, or warning
          message       | Text description of the event, including the node where the event occurred

          For example:

          logging message anatomy, including timestamp, node, service, level, and message
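
          The following line is a purely illustrative sketch of that structure; the timestamp, hostname, process ID, and message text are assumptions and will differ on your system:

          2020-08-19T18:58:54.437279+00:00 leaf01 netq-agent[8600]: INFO: Agent started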

          To configure a logging level, follow these steps. This example sets the logging level to debug:

          1. Set the logging level:

            cumulus@switch:~$ netq config add agent loglevel debug
            
          2. Restart the NetQ Agent:

            cumulus@switch:~$ netq config restart agent
            
          3. (Optional) Verify connection to the NetQ appliance or VM by viewing the netq-agent.log messages.
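
            For example, you might tail the agent log on the switch; the path shown is the typical location and is an assumption, so adjust it for your system:

            cumulus@switch:~$ sudo tail -n 20 /var/log/netq-agent.log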

          Disable Agent Logging

          If you set the logging level to debug for troubleshooting, NVIDIA recommends that you either change the logging level to a less verbose mode or disable agent logging when you finish troubleshooting.

          To change the logging level from debug to another level, run:

          cumulus@switch:~$ netq config add agent loglevel [info|warning|error]
          cumulus@switch:~$ netq config restart agent
          

          To disable all logging:

          cumulus@switch:~$ netq config del agent loglevel
          cumulus@switch:~$ netq config restart agent
          

          Change NetQ Agent Polling Data and Frequency

          The NetQ Agent contains a pre-configured set of modular commands that run periodically and send event and resource data to the NetQ appliance or VM. Using the NetQ CLI, you can fine-tune which commands the agent runs and how frequently it runs them.

          For example, if your network is not running OSPF, you can disable the commands that poll for OSPF events. You can also adjust a command's polling interval, such as changing the LLDP interval from its default of 120 seconds. By not polling for selected data, or by polling less frequently, you can reduce the switch CPU usage attributable to the NetQ Agent.

          Depending on the switch platform, the NetQ Agent might not execute some supported protocol commands. For example, if a switch has no VXLAN capability, then the agent skips all VXLAN-related commands.

          Supported Commands

          To see the list of supported modular commands, run:

          cumulus@switch:~$ netq config show agent commands
           Service Key               Period  Active       Command
          -----------------------  --------  --------  ---------------------------------------------------------------------
          bgp-neighbors                  60  yes       ['/usr/bin/vtysh', '-c', 'show ip bgp vrf all neighbors json']
          evpn-vni                       60  yes       ['/usr/bin/vtysh', '-c', 'show bgp l2vpn evpn vni json']
          lldp-json                     120  yes       /usr/sbin/lldpctl -f json
          clagctl-json                   60  yes       /usr/bin/clagctl -j
          dpkg-query                  21600  yes       dpkg-query --show -f ${Package},${Version},${Status}\n
          ptmctl-json                   120  yes       ptmctl
          mstpctl-bridge-json            60  yes       /sbin/mstpctl showall json
          ports                        3600  yes       Netq Predefined Command
          proc-net-dev                   30  yes       Netq Predefined Command
          agent_stats                   300  yes       Netq Predefined Command
          agent_util_stats               30  yes       Netq Predefined Command
          tcam-resource-json            120  yes       /usr/cumulus/bin/cl-resource-query -j
          btrfs-json                   1800  yes       /sbin/btrfs fi usage -b /
          config-mon-json               120  yes       Netq Predefined Command
          running-config-mon-json        30  yes       Netq Predefined Command
          cl-support-json               180  yes       Netq Predefined Command
          resource-util-json            120  yes       findmnt / -n -o FS-OPTIONS
          smonctl-json                   30  yes       /usr/sbin/smonctl -j
          sensors-json                   30  yes       sensors -u
          ssd-util-json               86400  yes       sudo /usr/sbin/smartctl -a /dev/sda
          ospf-neighbor-json             60  yes       ['/usr/bin/vtysh', '-c', 'show ip ospf vrf all neighbor detail json']
          ospf-interface-json            60  yes       ['/usr/bin/vtysh', '-c', 'show ip ospf vrf all interface json']
          

          The NetQ predefined commands include:

          Modify the Polling Frequency

          You can change the polling frequency (in seconds) of a modular command. For example, to change the polling frequency of the lldp-json command to 60 seconds from its default of 120 seconds, run:

          cumulus@switch:~$ netq config add agent command service-key lldp-json poll-period 60
          Successfully added/modified Command service lldpd command /usr/sbin/lldpctl -f json
          
          cumulus@switch:~$ netq config show agent commands
           Service Key               Period  Active       Command
          -----------------------  --------  --------  ---------------------------------------------------------------------
          bgp-neighbors                  60  yes       ['/usr/bin/vtysh', '-c', 'show ip bgp vrf all neighbors json']
          evpn-vni                       60  yes       ['/usr/bin/vtysh', '-c', 'show bgp l2vpn evpn vni json']
          lldp-json                      60  yes       /usr/sbin/lldpctl -f json
          clagctl-json                   60  yes       /usr/bin/clagctl -j
          dpkg-query                  21600  yes       dpkg-query --show -f ${Package},${Version},${Status}\n
          ptmctl-json                   120  yes       /usr/bin/ptmctl -d -j
          mstpctl-bridge-json            60  yes       /sbin/mstpctl showall json
          ports                        3600  yes       Netq Predefined Command
          proc-net-dev                   30  yes       Netq Predefined Command
          agent_stats                   300  yes       Netq Predefined Command
          agent_util_stats               30  yes       Netq Predefined Command
          tcam-resource-json            120  yes       /usr/cumulus/bin/cl-resource-query -j
          btrfs-json                   1800  yes       /sbin/btrfs fi usage -b /
          config-mon-json               120  yes       Netq Predefined Command
          running-config-mon-json        30  yes       Netq Predefined Command
          cl-support-json               180  yes       Netq Predefined Command
          resource-util-json            120  yes       findmnt / -n -o FS-OPTIONS
          smonctl-json                   30  yes       /usr/sbin/smonctl -j
          sensors-json                   30  yes       sensors -u
          ssd-util-json               86400  yes       sudo /usr/sbin/smartctl -a /dev/sda
          ospf-neighbor-json             60  no        ['/usr/bin/vtysh', '-c', 'show ip ospf vrf all neighbor detail json']
          ospf-interface-json            60  no        ['/usr/bin/vtysh', '-c', 'show ip ospf vrf all interface json']
          

          Disable a Command

          You can disable unnecessary commands. This can help reduce the compute resources the NetQ Agent consumes on the switch. For example, if your network does not run OSPF, you can disable the two OSPF commands:

          cumulus@switch:~$ netq config add agent command service-key ospf-neighbor-json enable False
          Command Service ospf-neighbor-json is disabled
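
          To disable the second OSPF command as well, the same syntax applies (a sketch; the confirmation output is omitted):

          cumulus@switch:~$ netq config add agent command service-key ospf-interface-json enable False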
          
          cumulus@switch:~$ netq config show agent commands
           Service Key               Period  Active       Command
          -----------------------  --------  --------  ---------------------------------------------------------------------
          bgp-neighbors                  60  yes       ['/usr/bin/vtysh', '-c', 'show ip bgp vrf all neighbors json']
          evpn-vni                       60  yes       ['/usr/bin/vtysh', '-c', 'show bgp l2vpn evpn vni json']
          lldp-json                      60  yes       /usr/sbin/lldpctl -f json
          clagctl-json                   60  yes       /usr/bin/clagctl -j
          dpkg-query                  21600  yes       dpkg-query --show -f ${Package},${Version},${Status}\n
          ptmctl-json                   120  yes       /usr/bin/ptmctl -d -j
          mstpctl-bridge-json            60  yes       /sbin/mstpctl showall json
          ports                        3600  yes       Netq Predefined Command
          proc-net-dev                   30  yes       Netq Predefined Command
          agent_stats                   300  yes       Netq Predefined Command
          agent_util_stats               30  yes       Netq Predefined Command
          tcam-resource-json            120  yes       /usr/cumulus/bin/cl-resource-query -j
          btrfs-json                   1800  yes       /sbin/btrfs fi usage -b /
          config-mon-json               120  yes       Netq Predefined Command
          running-config-mon-json        30  yes       Netq Predefined Command
          cl-support-json               180  yes       Netq Predefined Command
          resource-util-json            120  yes       findmnt / -n -o FS-OPTIONS
          smonctl-json                   30  yes       /usr/sbin/smonctl -j
          sensors-json                   30  yes       sensors -u
          ssd-util-json               86400  yes       sudo /usr/sbin/smartctl -a /dev/sda
          ospf-neighbor-json             60  no        ['/usr/bin/vtysh', '-c', 'show ip ospf vrf all neighbor detail json']
          ospf-interface-json            60  no        ['/usr/bin/vtysh', '-c', 'show ip ospf vrf all interface json']
          

          Reset to Default

          To revert to the original command settings, run:

          cumulus@switch:~$ netq config agent factory-reset commands
          Netq Command factory reset successful
          

          Inventory Management

          This section describes how to use the NetQ UI and CLI to monitor your inventory from networkwide and device-specific perspectives.

          You can monitor all hardware and software components installed and running on the switches and hosts across the entire network. This is useful for understanding dependencies on various vendors and versions and can help when planning upgrades.

          Networkwide Inventory

          With the NetQ UI and CLI, you can monitor the inventory on a networkwide basis for all switches, hosts, and DPUs. Inventory includes such items as the number of each device and its operating system. Additional details are available about the hardware and software components on individual switches, such as the motherboard, ASIC, microprocessor, disk, memory, fan, and power supply information. The commands and cards available to obtain this type of information help you answer questions such as:

          To monitor the inventory of a given switch or DPU, refer to Switch Inventory or DPU Inventory.

          Access Networkwide Inventory Data

          The Inventory/Devices card displays networkwide inventory information for all switches, hosts, and DPUs.

          The NetQ CLI displays detailed network inventory information with netq show inventory.

          View Networkwide Inventory Summary

          View the Number of Each Device Type in Your Network

          To view the quantity of devices in your network, open the Inventory/Devices card. The medium-sized card displays operating system distribution across the network in addition to the device count. Hover over items in the chart’s outer circle to view operating system distribution, and hover over items in the chart’s inner circle to view device counts.

          small inventory card displaying 13 switches and 10 hosts
          medium inventory card displaying 15 switches and 10 hosts as a chart

          View All Switches, Hosts, and DPUs

          You can view all stored attributes for all switches, hosts, and DPUs in your network in the full-screen Inventory/Devices card:

          full-screen inventory/devices card displaying a list of switches

          To view a list of devices in your network, run:

          netq show inventory brief [json]
          

          This example shows that there are four spine switches, three leaf switches, two border switches, two firewall switches, seven hosts (servers), and an out-of-band management server in this network. Each entry displays the type of switch, operating system, CPU, and ASIC.

          cumulus@switch:~$ netq show inventory brief
          Matching inventory records:
          Hostname          Switch               OS              CPU      ASIC            Ports
          ----------------- -------------------- --------------- -------- --------------- -----------------------------------
          border01          VX                   CL              x86_64   VX              N/A
          border02          VX                   CL              x86_64   VX              N/A
          fw1               VX                   CL              x86_64   VX              N/A
          fw2               VX                   CL              x86_64   VX              N/A
          leaf01            VX                   CL              x86_64   VX              N/A
          leaf02            VX                   CL              x86_64   VX              N/A
          leaf03            VX                   CL              x86_64   VX              N/A
          oob-mgmt-server   N/A                  Ubuntu          x86_64   N/A             N/A
          server01          N/A                  Ubuntu          x86_64   N/A             N/A
          server02          N/A                  Ubuntu          x86_64   N/A             N/A
          server03          N/A                  Ubuntu          x86_64   N/A             N/A
          server04          N/A                  Ubuntu          x86_64   N/A             N/A
          server05          N/A                  Ubuntu          x86_64   N/A             N/A
          server06          N/A                  Ubuntu          x86_64   N/A             N/A
          server07          N/A                  Ubuntu          x86_64   N/A             N/A
          spine01           VX                   CL              x86_64   VX              N/A
          spine02           VX                   CL              x86_64   VX              N/A
          spine03           VX                   CL              x86_64   VX              N/A
          spine04           VX                   CL              x86_64   VX              N/A
          

          View Networkwide Hardware Inventory

          You can view hardware components deployed on all switches and hosts, or on all switches in your network.

          View Components Summary

          It can be useful to know the quantity and ratio of the components deployed in your network to determine the scope of upgrade tasks, balance vendor reliance, or perform detailed troubleshooting.

          1. Locate the Inventory/Devices card on your workbench.

          2. Hover over the card, and change to the large size card using the size picker.

            By default, the Switches tab shows the total number of switches, ASIC vendors, OS versions, NetQ Agent versions, and specific platforms deployed across all your switches.

            You can hover over any of the segments in a component distribution chart to highlight a specific type of the given component. When you hover, a tooltip appears displaying:

            • Name or value of the component type, such as the version number or status
            • Total number of switches with that type of component deployed compared to the total number of switches
            • Percentage of this type as compared to all component types

          To view switch components, run:

          netq show inventory brief [json]
          

          This example shows the operating systems (Cumulus Linux and Ubuntu), CPU architecture (all x86_64), ASIC (virtual), and ports (N/A because Cumulus VX is virtual) for each device in the network.

          cumulus@switch:~$ netq show inventory brief
          Matching inventory records:
          Hostname          Switch               OS              CPU      ASIC            Ports
          ----------------- -------------------- --------------- -------- --------------- -----------------------------------
          border01          VX                   CL              x86_64   VX              N/A
          border02          VX                   CL              x86_64   VX              N/A
          fw1               VX                   CL              x86_64   VX              N/A
          fw2               VX                   CL              x86_64   VX              N/A
          leaf01            VX                   CL              x86_64   VX              N/A
          leaf02            VX                   CL              x86_64   VX              N/A
          leaf03            VX                   CL              x86_64   VX              N/A
          oob-mgmt-server   N/A                  Ubuntu          x86_64   N/A             N/A
          server01          N/A                  Ubuntu          x86_64   N/A             N/A
          server02          N/A                  Ubuntu          x86_64   N/A             N/A
          server03          N/A                  Ubuntu          x86_64   N/A             N/A
          server04          N/A                  Ubuntu          x86_64   N/A             N/A
          server05          N/A                  Ubuntu          x86_64   N/A             N/A
          server06          N/A                  Ubuntu          x86_64   N/A             N/A
          server07          N/A                  Ubuntu          x86_64   N/A             N/A
          spine01           VX                   CL              x86_64   VX              N/A
          spine02           VX                   CL              x86_64   VX              N/A
          spine03           VX                   CL              x86_64   VX              N/A
          spine04           VX                   CL              x86_64   VX              N/A
          

          View ASIC Information

          1. Locate the medium Inventory/Devices card on your workbench.

          2. Hover over the card, and change to the large size card using the size picker.

          3. Click a segment of the ASIC graph in the component distribution charts, then select Filter ASIC.

          4. Alternatively, expand the card to full screen to view ASIC information as a column in a table.

          To view information about the ASIC installed on your devices, run:

          netq show inventory asic [vendor <asic-vendor>|model <asic-model>|model-id <asic-model-id>] [json]
          

          If you are running NetQ on a Cumulus VX setup, there is no physical hardware to query and thus no ASIC information to display.

          This example shows the ASIC information for all devices in your network:

          cumulus@switch:~$ netq show inventory asic
          Matching inventory records:
          Hostname          Vendor               Model                          Model ID                  Core BW        Ports
          ----------------- -------------------- ------------------------------ ------------------------- -------------- -----------------------------------
          dell-z9100-05     Broadcom             Tomahawk                       BCM56960                  2.0T           32 x 100G-QSFP28
          mlx-2100-05       Mellanox             Spectrum                       MT52132                   N/A            16 x 100G-QSFP28
          mlx-2410a1-05     Mellanox             Spectrum                       MT52132                   N/A            48 x 25G-SFP28 & 8 x 100G-QSFP28
          mlx-2700-11       Mellanox             Spectrum                       MT52132                   N/A            32 x 100G-QSFP28
          qct-ix1-08        Broadcom             Tomahawk                       BCM56960                  2.0T           32 x 100G-QSFP28
          qct-ix7-04        Broadcom             Trident3                       BCM56870                  N/A            32 x 100G-QSFP28
          st1-l1            Broadcom             Trident2                       BCM56854                  720G           48 x 10G-SFP+ & 6 x 40G-QSFP+
          st1-l2            Broadcom             Trident2                       BCM56854                  720G           48 x 10G-SFP+ & 6 x 40G-QSFP+
          st1-l3            Broadcom             Trident2                       BCM56854                  720G           48 x 10G-SFP+ & 6 x 40G-QSFP+
          st1-s1            Broadcom             Trident2                       BCM56850                  960G           32 x 40G-QSFP+
          st1-s2            Broadcom             Trident2                       BCM56850                  960G           32 x 40G-QSFP+
          

          You can filter the results of the command to view devices with a particular vendor, model, or model ID. This example shows ASIC information for all devices with a vendor of NVIDIA.

          cumulus@switch:~$ netq show inventory asic vendor NVIDIA
          Matching inventory records:
          Hostname          Vendor               Model                          Model ID                  Core BW        Ports
          ----------------- -------------------- ------------------------------ ------------------------- -------------- -----------------------------------
          mlx-2100-05       NVIDIA               Spectrum                       MT52132                   N/A            16 x 100G-QSFP28
          mlx-2410a1-05     NVIDIA               Spectrum                       MT52132                   N/A            48 x 25G-SFP28 & 8 x 100G-QSFP28
          mlx-2700-11       NVIDIA               Spectrum                       MT52132                   N/A            32 x 100G-QSFP28
          

          View Motherboard/Platform Information

          1. Locate the Inventory/Devices card on your workbench.

          2. Hover over the card, and change to the full-screen card using the size picker.

          3. The All Switches tab is active by default. Scroll to the right to view the various Platform parameters for your switches. Optionally, drag and drop the relevant columns next to each other.

          4. Click All Hosts.

          5. Scroll to the right to view the various Platform parameters for your hosts. Optionally, drag and drop the relevant columns next to each other.

          To view a list of motherboards installed in your switches and hosts, run:

          netq show inventory board [vendor <board-vendor>|model <board-model>] [json]
          

          This example shows all motherboard data for all devices.

          cumulus@switch:~$ netq show inventory board
          Matching inventory records:
          Hostname          Vendor               Model                          Base MAC           Serial No                 Part No          Rev    Mfg Date
          ----------------- -------------------- ------------------------------ ------------------ ------------------------- ---------------- ------ ----------
          dell-z9100-05     DELL                 Z9100-ON                       4C:76:25:E7:42:C0  CN03GT5N779315C20001      03GT5N           A00    12/04/2015
          mlx-2100-05       Penguin              Arctica 1600cs                 7C:FE:90:F5:61:C0  MT1623X10078              MSN2100-CB2FO    N/A    06/09/2016
          mlx-2410a1-05     Mellanox             SN2410                         EC:0D:9A:4E:55:C0  MT1734X00067              MSN2410-CB2F_QP3 N/A    08/24/2017
          mlx-2700-11       Penguin              Arctica 3200cs                 44:38:39:00:AB:80  MT1604X21036              MSN2700-CS2FO    N/A    01/31/2016
          qct-ix1-08        QCT                  QuantaMesh BMS T7032-IX1       54:AB:3A:78:69:51  QTFCO7623002C             1IX1UZZ0ST6      H3B    05/30/2016
          qct-ix7-04        QCT                  IX7                            D8:C4:97:62:37:65  QTFCUW821000A             1IX7UZZ0ST5      B3D    05/07/2018
          qct-ix7-04        QCT                  T7032-IX7                      D8:C4:97:62:37:65  QTFCUW821000A             1IX7UZZ0ST5      B3D    05/07/2018
          st1-l1            CELESTICA            Arctica 4806xp                 00:E0:EC:27:71:37  D2060B2F044919GD000011    R0854-F1004-01   Redsto 09/20/2014
                                                                                                                                              ne-XP
          st1-l2            CELESTICA            Arctica 4806xp                 00:E0:EC:27:6B:3A  D2060B2F044919GD000060    R0854-F1004-01   Redsto 09/20/2014
                                                                                                                                              ne-XP
          st1-l3            Penguin              Arctica 4806xp                 44:38:39:00:70:49  N/A                       N/A              N/A    N/A
          st1-s1            Dell                 S6000-ON                       44:38:39:00:80:00  N/A                       N/A              N/A    N/A
          st1-s2            Dell                 S6000-ON                       44:38:39:00:80:81  N/A                       N/A              N/A    N/A
          

          You can filter the results of the command to capture only those devices with a particular motherboard vendor or model. This example shows only the devices with a Celestica motherboard.

          cumulus@switch:~$ netq show inventory board vendor celestica
          Matching inventory records:
          Hostname          Vendor               Model                          Base MAC           Serial No                 Part No          Rev    Mfg Date
          ----------------- -------------------- ------------------------------ ------------------ ------------------------- ---------------- ------ ----------
          st1-l1            CELESTICA            Arctica 4806xp                 00:E0:EC:27:71:37  D2060B2F044919GD000011    R0854-F1004-01   Redsto 09/20/2014
                                                                                                                                              ne-XP
          st1-l2            CELESTICA            Arctica 4806xp                 00:E0:EC:27:6B:3A  D2060B2F044919GD000060    R0854-F1004-01   Redsto 09/20/2014
                                                                                                                                              ne-XP
          

          View CPU Information

          1. Locate the Inventory/Devices card on your workbench.

          2. Hover over the card, and change to the full-screen card using the size picker.

          3. The All Switches tab is active by default. Scroll to the right to view the various CPU parameters. Optionally, drag and drop relevant columns next to each other.

          4. Click All Hosts to view the CPU information for your host servers.

          To view CPU information for all devices in your network, run:

          netq show inventory cpu [arch <cpu-arch>] [json]
          

          This example shows the CPU information for all devices.

          cumulus@switch:~$ netq show inventory cpu
          Matching inventory records:
          Hostname          Arch     Model                          Freq       Cores
          ----------------- -------- ------------------------------ ---------- -----
          dell-z9100-05     x86_64   Intel(R) Atom(TM) C2538        2.40GHz    4
          mlx-2100-05       x86_64   Intel(R) Atom(TM) C2558        2.40GHz    4
          mlx-2410a1-05     x86_64   Intel(R) Celeron(R)  1047UE    1.40GHz    2
          mlx-2700-11       x86_64   Intel(R) Celeron(R)  1047UE    1.40GHz    2
          qct-ix1-08        x86_64   Intel(R) Atom(TM) C2558        2.40GHz    4
          qct-ix7-04        x86_64   Intel(R) Atom(TM) C2558        2.40GHz    4
          st1-l1            x86_64   Intel(R) Atom(TM) C2538        2.41GHz    4
          st1-l2            x86_64   Intel(R) Atom(TM) C2538        2.41GHz    4
          st1-l3            x86_64   Intel(R) Atom(TM) C2538        2.40GHz    4
          st1-s1            x86_64   Intel(R) Atom(TM)  S1220       1.60GHz    4
          st1-s2            x86_64   Intel(R) Atom(TM)  S1220       1.60GHz    4
          

          You can filter the results of the command to view which switches employ a particular CPU architecture using the arch keyword. This example shows how to determine all the currently deployed architectures in your network, and then shows all devices with an x86_64 architecture.

          cumulus@switch:~$ netq show inventory cpu arch
              x86_64  :  CPU Architecture
              
          cumulus@switch:~$ netq show inventory cpu arch x86_64
          Matching inventory records:
          Hostname          Arch     Model                          Freq       Cores
          ----------------- -------- ------------------------------ ---------- -----
          leaf01            x86_64   Intel Core i7 9xx (Nehalem Cla N/A        1
                                      ss Core i7)
          leaf02            x86_64   Intel Core i7 9xx (Nehalem Cla N/A        1
                                      ss Core i7)
          leaf03            x86_64   Intel Core i7 9xx (Nehalem Cla N/A        1
                                      ss Core i7)
          leaf04            x86_64   Intel Core i7 9xx (Nehalem Cla N/A        1
                                      ss Core i7)
          oob-mgmt-server   x86_64   Intel Core i7 9xx (Nehalem Cla N/A        1
                                      ss Core i7)
          server01          x86_64   Intel Core i7 9xx (Nehalem Cla N/A        1
                                      ss Core i7)
          server02          x86_64   Intel Core i7 9xx (Nehalem Cla N/A        1
                                      ss Core i7)
          server03          x86_64   Intel Core i7 9xx (Nehalem Cla N/A        1
                                      ss Core i7)
          server04          x86_64   Intel Core i7 9xx (Nehalem Cla N/A        1
                                      ss Core i7)
          spine01           x86_64   Intel Core i7 9xx (Nehalem Cla N/A        1
                                      ss Core i7)
          spine02           x86_64   Intel Core i7 9xx (Nehalem Cla N/A        1
                                      ss Core i7)
          

          View Disk Information

          1. Locate the Inventory/Devices card on your workbench.

          2. Hover over the card, and change to the full-screen card using the size picker.

          3. The All Switches tab is selected by default. Locate the Disk Total Size column.

          4. Click All Hosts to view the total disk size of all host servers.

          To view disk information for your switches, run:

          netq show inventory disk [name <disk-name>|transport <disk-transport>|vendor <disk-vendor>] [json]
          

          This example shows the disk information for all devices.

          cumulus@switch:~$ netq show inventory disk
          Matching inventory records:
          Hostname          Name            Type             Transport          Size       Vendor               Model
          ----------------- --------------- ---------------- ------------------ ---------- -------------------- ------------------------------
          leaf01            vda             disk             N/A                6G         0x1af4               N/A
          leaf02            vda             disk             N/A                6G         0x1af4               N/A
          leaf03            vda             disk             N/A                6G         0x1af4               N/A
          leaf04            vda             disk             N/A                6G         0x1af4               N/A
          oob-mgmt-server   vda             disk             N/A                256G       0x1af4               N/A
          server01          vda             disk             N/A                301G       0x1af4               N/A
          server02          vda             disk             N/A                301G       0x1af4               N/A
          server03          vda             disk             N/A                301G       0x1af4               N/A
          server04          vda             disk             N/A                301G       0x1af4               N/A
          spine01           vda             disk             N/A                6G         0x1af4               N/A
          spine02           vda             disk             N/A                6G         0x1af4               N/A
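
          You can filter the results of the command by disk name, transport, or vendor. For example, to list only the vda disks (a sketch; the disk name is taken from the output above, and the output format matches it):

          cumulus@switch:~$ netq show inventory disk name vda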
          

          View Memory Information

          1. Locate the Inventory/Devices card on your workbench.

          2. Hover over the card, and change to the full-screen card using the size picker.

          3. The All Switches tab is selected by default. Locate the Memory Size column.

          4. Click All Hosts to view the memory size for all host servers.

          To view memory information for your switches and host servers, run:

          netq show inventory memory [type <memory-type>|vendor <memory-vendor>] [json]
          

          This example shows all memory characteristics for all devices.

          cumulus@switch:~$ netq show inventory memory
          Matching inventory records:
          Hostname          Name            Type             Size       Speed      Vendor               Serial No
          ----------------- --------------- ---------------- ---------- ---------- -------------------- -------------------------
          dell-z9100-05     DIMM0 BANK 0    DDR3             8192 MB    1600 MHz   Hynix                14391421
          mlx-2100-05       DIMM0 BANK 0    DDR3             8192 MB    1600 MHz   InnoDisk Corporation 00000000
          mlx-2410a1-05     ChannelA-DIMM0  DDR3             8192 MB    1600 MHz   017A                 87416232
                              BANK 0
          mlx-2700-11       ChannelA-DIMM0  DDR3             8192 MB    1600 MHz   017A                 73215444
                              BANK 0
          mlx-2700-11       ChannelB-DIMM0  DDR3             8192 MB    1600 MHz   017A                 73215444
                              BANK 2
          qct-ix1-08        N/A             N/A              7907.45MB  N/A        N/A                  N/A
          qct-ix7-04        DIMM0 BANK 0    DDR3             8192 MB    1600 MHz   Transcend            00211415
          st1-l1            DIMM0 BANK 0    DDR3             4096 MB    1333 MHz   N/A                  N/A
          st1-l2            DIMM0 BANK 0    DDR3             4096 MB    1333 MHz   N/A                  N/A
          st1-l3            DIMM0 BANK 0    DDR3             4096 MB    1600 MHz   N/A                  N/A
          st1-s1            A1_DIMM0 A1_BAN DDR3             8192 MB    1333 MHz   A1_Manufacturer0     A1_SerNum0
                              K0
          st1-s2            A1_DIMM0 A1_BAN DDR3             8192 MB    1333 MHz   A1_Manufacturer0     A1_SerNum0
                              K0
          

          You can filter the results of the command to view devices with a particular memory type or vendor. This example shows all the devices with memory from QEMU.

          cumulus@switch:~$ netq show inventory memory vendor QEMU
          Matching inventory records:
          Hostname          Name            Type             Size       Speed      Vendor               Serial No
          ----------------- --------------- ---------------- ---------- ---------- -------------------- -------------------------
          leaf01            DIMM 0          RAM              1024 MB    Unknown    QEMU                 Not Specified
          leaf02            DIMM 0          RAM              1024 MB    Unknown    QEMU                 Not Specified
          leaf03            DIMM 0          RAM              1024 MB    Unknown    QEMU                 Not Specified
          leaf04            DIMM 0          RAM              1024 MB    Unknown    QEMU                 Not Specified
          oob-mgmt-server   DIMM 0          RAM              4096 MB    Unknown    QEMU                 Not Specified
          server01          DIMM 0          RAM              512 MB     Unknown    QEMU                 Not Specified
          server02          DIMM 0          RAM              512 MB     Unknown    QEMU                 Not Specified
          server03          DIMM 0          RAM              512 MB     Unknown    QEMU                 Not Specified
          server04          DIMM 0          RAM              512 MB     Unknown    QEMU                 Not Specified
          spine01           DIMM 0          RAM              1024 MB    Unknown    QEMU                 Not Specified
          spine02           DIMM 0          RAM              1024 MB    Unknown    QEMU                 Not Specified
          

          View Sensor Information

          Fan, power supply unit (PSU), and temperature sensors provide additional data about the operation of your switches.

          Power Supply Unit Information

          1. Click Menu, then click Sensors.

          2. The PSU tab is displayed by default.

          PSU Parameter  | Description
          Hostname       | Name of the switch or host where the power supply is installed
          Timestamp      | Date and time the data was captured
          Message Type   | Type of sensor message; always PSU in this table
          PIn(W)         | Input power (watts) for the PSU on the switch or host
          POut(W)        | Output power (watts) for the PSU on the switch or host
          Sensor Name    | User-defined name for the PSU
          Previous State | State of the PSU when data was captured in the previous window
          State          | State of the PSU when data was last captured
          VIn(V)         | Input voltage (volts) for the PSU on the switch or host
          VOut(V)        | Output voltage (volts) for the PSU on the switch or host

          Fan Information

          1. Click Menu, then click Sensors in the Network heading.

          2. Click Fan.

          Fan Parameter  | Description
          Hostname       | Name of the switch or host where the fan is installed
          Timestamp      | Date and time the data was captured
          Message Type   | Type of sensor message; always Fan in this table
          Description    | User-specified description of the fan
          Speed (RPM)    | Revolution rate of the fan (revolutions per minute)
          Max            | Maximum speed (RPM)
          Min            | Minimum speed (RPM)
          Message        | Message
          Sensor Name    | User-defined name for the fan
          Previous State | State of the fan when data was captured in the previous window
          State          | State of the fan when data was last captured

          Temperature Information

          1. Click Menu, then click Sensors in the Network heading.

          2. Click Temperature.

          Temperature Parameter | Description
          Hostname              | Name of the switch or host where the temperature sensor is installed
          Timestamp             | Date and time the data was captured
          Message Type          | Type of sensor message; always Temp in this table
          Critical              | Current critical maximum temperature (°C) threshold setting
          Description           | User-specified description of the temperature sensor
          Lower Critical        | Current critical minimum temperature (°C) threshold setting
          Max                   | Maximum temperature threshold setting
          Min                   | Minimum temperature threshold setting
          Message               | Message
          Sensor Name           | User-defined name for the temperature sensor
          Previous State        | State of the temperature sensor when data was captured in the previous window
          State                 | State of the temperature sensor when data was last captured
          Temperature(Celsius)  | Current temperature (°C) measured by the sensor

          View All Sensor Information

          To view information for power supplies, fans, and temperature sensors on all switches and host servers, run:

          netq show sensors all [around <text-time>] [json]
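
          To retrieve the same information in machine-readable form, you can append the json keyword (a sketch; output omitted):

          cumulus@switch:~$ netq show sensors all json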
          

          View Only Power Supply Sensors

          To view information from all PSU sensors or PSU sensors with a given name on your switches and host servers, run:

          netq show sensors psu [<psu-name>] [around <text-time>] [json]
          

          Use the psu-name option to view all PSU sensors with a particular name.

          Use tab completion to determine the names of the PSUs in your switches:

          cumulus@switch:~$ netq show sensors psu <press tab>
          around  :  Go back in time to around ...
          json    :  Provide output in JSON
          psu1    :  Power Supply
          psu2    :  Power Supply
          <ENTER>
          

          This example shows all PSUs with the name psu2.

          cumulus@switch:~$ netq show sensors psu psu2
          Matching sensors records:
          Hostname          Name            State      Message                             Last Changed
          ----------------- --------------- ---------- ----------------------------------- -------------------------
          exit01            psu2            ok                                             Fri Apr 19 16:01:17 2019
          exit02            psu2            ok                                             Fri Apr 19 16:01:33 2019
          leaf01            psu2            ok                                             Sun Apr 21 20:07:12 2019
          leaf02            psu2            ok                                             Fri Apr 19 16:01:41 2019
          leaf03            psu2            ok                                             Fri Apr 19 16:01:44 2019
          leaf04            psu2            ok                                             Fri Apr 19 16:01:36 2019
          spine01           psu2            ok                                             Fri Apr 19 16:01:52 2019
          spine02           psu2            ok                                             Fri Apr 19 16:01:08 2019
          

          View Only Fan Sensors

          To view information from all fan sensors or fan sensors with a given name on your switches and host servers, run:

          netq show sensors fan [<fan-name>] [around <text-time>] [json]
          

          Use tab completion to determine the names of the fans in your switches:

          cumulus@switch:~$ netq show sensors fan <press tab>
             around : Go back in time to around ...
             fan1 : Fan Name
             fan2 : Fan Name
             fan3 : Fan Name
             fan4 : Fan Name
             fan5 : Fan Name
             fan6 : Fan Name
             json : Provide output in JSON
             psu1fan1 : Fan Name
             psu2fan1 : Fan Name
             <ENTER>
          

          This example shows the state of all fans with the name fan1.

          cumulus@switch:~$ netq show sensors fan fan1
          Matching sensors records:
          Hostname          Name            Description                         State      Speed      Max      Min      Message                             Last Changed
          ----------------- --------------- ----------------------------------- ---------- ---------- -------- -------- ----------------------------------- -------------------------
          border01          fan1            fan tray 1, fan 1                   ok         2500       29000    2500                                         Tue Aug 25 21:45:21 2020
          border02          fan1            fan tray 1, fan 1                   ok         2500       29000    2500                                         Tue Aug 25 21:39:36 2020
          fw1               fan1            fan tray 1, fan 1                   ok         2500       29000    2500                                         Wed Aug 26 00:08:01 2020
          fw2               fan1            fan tray 1, fan 1                   ok         2500       29000    2500                                         Wed Aug 26 00:02:13 2020
          leaf01            fan1            fan tray 1, fan 1                   ok         2500       29000    2500                                         Tue Aug 25 18:30:07 2020
          leaf02            fan1            fan tray 1, fan 1                   ok         2500       29000    2500                                         Tue Aug 25 18:08:38 2020
          leaf03            fan1            fan tray 1, fan 1                   ok         2500       29000    2500                                         Tue Aug 25 21:20:34 2020
          leaf04            fan1            fan tray 1, fan 1                   ok         2500       29000    2500                                         Wed Aug 26 14:20:22 2020
          spine01           fan1            fan tray 1, fan 1                   ok         2500       29000    2500                                         Wed Aug 26 10:53:17 2020
          spine02           fan1            fan tray 1, fan 1                   ok         2500       29000    2500                                         Wed Aug 26 10:54:07 2020
          spine03           fan1            fan tray 1, fan 1                   ok         2500       29000    2500                                         Wed Aug 26 11:00:44 2020
          spine04           fan1            fan tray 1, fan 1                   ok         2500       29000    2500                                         Wed Aug 26 10:52:00 2020
          
          

          View Only Temperature Sensors

          To view information from all temperature sensors or temperature sensors with a given name on your switches and host servers, run:

          netq show sensors temp [<temp-name>] [around <text-time>] [json]
          

          Use tab completion to determine the names of the temperature sensors on your devices:

          cumulus@switch:~$ netq show sensors temp <press tab>
              around     :  Go back in time to around ...
              json       :  Provide output in JSON
              psu1temp1  :  Temp Name
              psu2temp1  :  Temp Name
              temp1      :  Temp Name
              temp2      :  Temp Name
              temp3      :  Temp Name
              temp4      :  Temp Name
              temp5      :  Temp Name
              <ENTER>
          

          This example shows the state of all temperature sensors with the name psu2temp1.

          cumulus@switch:~$ netq show sensors temp psu2temp1
          Matching sensors records:
          Hostname          Name            Description                         State      Temp     Critical Max      Min      Message                             Last Changed
          ----------------- --------------- ----------------------------------- ---------- -------- -------- -------- -------- ----------------------------------- -------------------------
          border01          psu2temp1       psu2 temp sensor                    ok         25       85       80       5                                            Tue Aug 25 21:45:21 2020
          border02          psu2temp1       psu2 temp sensor                    ok         25       85       80       5                                            Tue Aug 25 21:39:36 2020
          fw1               psu2temp1       psu2 temp sensor                    ok         25       85       80       5                                            Wed Aug 26 00:08:01 2020
          fw2               psu2temp1       psu2 temp sensor                    ok         25       85       80       5                                            Wed Aug 26 00:02:13 2020
          leaf01            psu2temp1       psu2 temp sensor                    ok         25       85       80       5                                            Tue Aug 25 18:30:07 2020
          leaf02            psu2temp1       psu2 temp sensor                    ok         25       85       80       5                                            Tue Aug 25 18:08:38 2020
          leaf03            psu2temp1       psu2 temp sensor                    ok         25       85       80       5                                            Tue Aug 25 21:20:34 2020
          leaf04            psu2temp1       psu2 temp sensor                    ok         25       85       80       5                                            Wed Aug 26 14:20:22 2020
          spine01           psu2temp1       psu2 temp sensor                    ok         25       85       80       5                                            Wed Aug 26 10:53:17 2020
          spine02           psu2temp1       psu2 temp sensor                    ok         25       85       80       5                                            Wed Aug 26 10:54:07 2020
          spine03           psu2temp1       psu2 temp sensor                    ok         25       85       80       5                                            Wed Aug 26 11:00:44 2020
          spine04           psu2temp1       psu2 temp sensor                    ok         25       85       80       5                                            Wed Aug 26 10:52:00 2020
          

          View Digital Optics Information

          Use the filter option to view laser power and bias current for a given interface and channel on a switch, and temperature and voltage for a given module. Select the relevant tab to view the data.

          1. Click Menu, then click Digital Optics.

          2. The Laser Rx Power tab is displayed by default.

          Laser Parameter | Description
          Hostname | Name of the switch or host where the digital optics module resides
          Timestamp | Date and time the data was captured
          If Name | Name of interface where the digital optics module is installed
          Units | Measurement unit for the power (mW) or current (mA)
          Channel 1–8 | Value of the power or current on each channel where the digital optics module is transmitting

          Module Parameter | Description
          Hostname | Name of the switch or host where the digital optics module resides
          Timestamp | Date and time the data was captured
          If Name | Name of interface where the digital optics module is installed
          Degree C | Current module temperature, measured in degrees Celsius
          Degree F | Current module temperature, measured in degrees Fahrenheit
          Units | Measurement unit for module voltage; Volts
          Value | Current module voltage

          3. Click each of the other Laser or Module tabs to view that information for all devices.

          To view digital optics information for your switches and host servers, run one of the following:

          netq show dom type (laser_rx_power|laser_output_power|laser_bias_current) [interface <text-dom-port-anchor>] [channel_id <text-channel-id>] [around <text-time>] [json]
          netq show dom type (module_temperature|module_voltage) [interface <text-dom-port-anchor>] [around <text-time>] [json]
          

          This example shows module temperature information for all devices.

          cumulus@switch:~$ netq show dom type module_temperature
          Matching dom records:
          Hostname          Interface  type                 high_alarm_threshold low_alarm_threshold  high_warning_thresho low_warning_threshol value                Last Updated
                                                                                                      ld                   d
          ----------------- ---------- -------------------- -------------------- -------------------- -------------------- -------------------- -------------------- ------------------------
          ...
          spine01           swp53s0    module_temperature   {'degree_c': 85,     {'degree_c': -10,    {'degree_c': 70,     {'degree_c': 0,      {'degree_c': 32,     Wed Jul  1 15:25:56 2020
                                                             'degree_f': 185}     'degree_f': 14}      'degree_f': 158}     'degree_f': 32}      'degree_f': 89.6}
          spine01           swp35      module_temperature   {'degree_c': 75,     {'degree_c': -5,     {'degree_c': 70,     {'degree_c': 0,      {'degree_c': 27.82,  Wed Jul  1 15:25:56 2020
                                                             'degree_f': 167}     'degree_f': 23}      'degree_f': 158}     'degree_f': 32}      'degree_f': 82.08}
          spine01           swp55      module_temperature   {'degree_c': 75,     {'degree_c': -5,     {'degree_c': 70,     {'degree_c': 0,      {'degree_c': 26.29,  Wed Jul  1 15:25:56 2020
                                                             'degree_f': 167}     'degree_f': 23}      'degree_f': 158}     'degree_f': 32}      'degree_f': 79.32}
          spine01           swp9       module_temperature   {'degree_c': 78,     {'degree_c': -13,    {'degree_c': 73,     {'degree_c': -8,     {'degree_c': 25.57,  Wed Jul  1 15:25:56 2020
                                                             'degree_f': 172.4}   'degree_f': 8.6}     'degree_f': 163.4}   'degree_f': 17.6}    'degree_f': 78.02}
          spine01           swp56      module_temperature   {'degree_c': 78,     {'degree_c': -10,    {'degree_c': 75,     {'degree_c': -5,     {'degree_c': 29.43,  Wed Jul  1 15:25:56 2020
                                                             'degree_f': 172.4}   'degree_f': 14}      'degree_f': 167}     'degree_f': 23}      'degree_f': 84.97}
          ...
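
          To focus on a single interface or channel, you can add the interface and channel_id options from the syntax above. The following invocation is illustrative only (the interface and channel values are examples); output is omitted.

          cumulus@switch:~$ netq show dom type laser_rx_power interface swp53s0 channel_id 1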
          

          View Software Inventory across the Network

          You can view the software components deployed on all switches and hosts in your network, or on only the switches.

          View the Operating Systems Information

          1. Locate the medium Inventory/Devices card on your workbench.

          2. Hover over the pie charts to view the total number of devices with a given operating system installed.

          3. Change to the large card using the size picker.

          4. Hover over a segment in the OS distribution chart to view the total number of devices with a given operating system installed.

            Note that sympathetic highlighting (in blue) shows which versions of the other switch components are associated with this OS.

          5. Click a segment in the OS distribution chart.

          6. Click Filter OS at the top of the popup.

          7. The card updates to show only the components associated with switches running the selected OS. To return to all OSs, click X in the OS tag to remove the filter.

          8. Change to the full-screen card using the size picker.

          9. The All Switches tab is selected by default. Scroll to the right to locate all of the OS parameter data.

          10. Click All Hosts to view the OS parameters for all host servers.

          To view OS information for your switches and host servers, run:

          netq show inventory os [version <os-version>|name <os-name>] [json]
          

          You can filter the results of the command to view only devices with a particular operating system or version. This can be especially helpful when you suspect that a particular device upgrade did not work as expected.

          This example shows all devices with the Cumulus Linux version 3.7.12 installed.

          cumulus@switch:~$ netq show inventory os version 3.7.12
          
          Matching inventory records:
          Hostname          Name            Version                              Last Changed
          ----------------- --------------- ------------------------------------ -------------------------
          spine01           CL              3.7.12                               Mon Aug 10 19:55:06 2020
          spine02           CL              3.7.12                               Mon Aug 10 19:55:07 2020
          spine03           CL              3.7.12                               Mon Aug 10 19:55:09 2020
          spine04           CL              3.7.12                               Mon Aug 10 19:55:08 2020
          

          View the Supported Cumulus Linux Packages

          When you are troubleshooting an issue with a switch, you might want to know which versions of the Cumulus Linux operating system are supported for that switch, and compare them with the versions available on a switch that is not experiencing the same issue.

          To view package information for your switches, run:

          netq show cl-manifest [json]
          

          This example shows the OS packages supported for all switches.

          cumulus@switch:~$ netq show cl-manifest
          
          Matching manifest records:
          Hostname          ASIC Vendor          CPU Arch             Manifest Version
          ----------------- -------------------- -------------------- --------------------
          border01          vx                   x86_64               3.7.6.1
          border01          vx                   x86_64               3.7.10
          border01          vx                   x86_64               3.7.11
          border01          vx                   x86_64               3.6.2.1
          ...
          fw1               vx                   x86_64               3.7.6.1
          fw1               vx                   x86_64               3.7.10
          fw1               vx                   x86_64               3.7.11
          fw1               vx                   x86_64               3.6.2.1
          ...
          leaf01            vx                   x86_64               4.1.0
          leaf01            vx                   x86_64               4.0.0
          leaf01            vx                   x86_64               3.6.2
          leaf01            vx                   x86_64               3.7.2
          ...
          leaf02            vx                   x86_64               3.7.6.1
          leaf02            vx                   x86_64               3.7.10
          leaf02            vx                   x86_64               3.7.11
          leaf02            vx                   x86_64               3.6.2.1
          ...
          

          View All Software Packages Installed

          If you are having an issue with several switches, verify which packages are installed on them and compare them against the recommended packages for the Cumulus Linux release in question.

          To view installed package information for your switches, run:

          netq show cl-pkg-info [<text-package-name>] [around <text-time>] [json]
          

          Use the text-package-name option to narrow the results to a particular package.

          This example shows the installed switchd package version on the spine01 switch.

          cumulus@switch:~$ netq spine01 show cl-pkg-info switchd
          
          Matching package_info records:
          Hostname          Package Name             Version              CL Version           Package Status       Last Changed
          ----------------- ------------------------ -------------------- -------------------- -------------------- -------------------------
          spine01           switchd                  1.0-cl3u40           Cumulus Linux 3.7.12 installed            Thu Aug 27 01:58:47 2020
          
          

          You can determine whether any of your switches are using a software package other than the default package associated with the Cumulus Linux release that is running on the switches. Use this list to determine which packages to install/upgrade on all devices. Additionally, you can determine if a software package is missing.

          To view recommended package information for your switches, run:

          netq show recommended-pkg-version [release-id <text-release-id>] [package-name <text-package-name>] [json]
          

          The output can be rather lengthy if you run this command for all releases and packages. If desired, run the command using the release-id and/or package-name options to shorten the output.
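
          For example, to list the recommended versions for every package in a single release, you can supply only the release-id option. This invocation is illustrative (based on the syntax above); output is omitted.

          cumulus@switch:~$ netq show recommended-pkg-version release-id 3.7.1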

          This example looks for switches running Cumulus Linux 3.7.1 and switchd. The result is a single switch, leaf12, that has older software and should get an update.

          cumulus@switch:~$ netq show recommended-pkg-version release-id 3.7.1 package-name switchd
          Matching manifest records:
          Hostname          Release ID           ASIC Vendor          CPU Arch             Package Name         Version              Last Changed
          ----------------- -------------------- -------------------- -------------------- -------------------- -------------------- -------------------------
          leaf12            3.7.1                vx                   x86_64               switchd              1.0-cl3u30           Wed Feb  5 04:36:30 2020
          

          This example looks for switches running Cumulus Linux 3.7.1 and ptmd. The result is a single switch, server01, that has older software and should get an update.

          cumulus@switch:~$ netq show recommended-pkg-version release-id 3.7.1 package-name ptmd
          Matching manifest records:
          Hostname          Release ID           ASIC Vendor          CPU Arch             Package Name         Version              Last Changed
          ----------------- -------------------- -------------------- -------------------- -------------------- -------------------- -------------------------
          server01            3.7.1                vx                   x86_64               ptmd                 3.0-2-cl3u8          Wed Feb  5 04:36:30 2020
          

          This example looks for switches running Cumulus Linux 3.7.1 and lldpd. The result is a single switch, server01, that has older software and should get an update.

          cumulus@switch:~$ netq show recommended-pkg-version release-id 3.7.1 package-name lldpd
          Matching manifest records:
          Hostname          Release ID           ASIC Vendor          CPU Arch             Package Name         Version              Last Changed
          ----------------- -------------------- -------------------- -------------------- -------------------- -------------------- -------------------------
          server01            3.7.1                vx                   x86_64               lldpd                0.9.8-0-cl3u11       Wed Feb  5 04:36:30 2020
          

          This example looks for switches running Cumulus Linux 3.6.2 and switchd. The result is a single switch, leaf04, that has older software and should get an update.

          cumulus@noc-pr:~$ netq show recommended-pkg-version release-id 3.6.2 package-name switchd
          Matching manifest records:
          Hostname          Release ID           ASIC Vendor          CPU Arch             Package Name         Version              Last Changed
          ----------------- -------------------- -------------------- -------------------- -------------------- -------------------- -------------------------
          leaf04            3.6.2                vx                   x86_64               switchd              1.0-cl3u27           Wed Feb  5 04:36:30 2020
          

          View ACL Resources

          Using the NetQ CLI, you can monitor the incoming and outgoing access control lists (ACLs) configured on all switches, currently or at a time in the past.

          To view ACL resources for all your switches, run:

          netq show cl-resource acl [ingress | egress] [around <text-time>] [json]
          

          Use the egress or ingress options to show only the outgoing or incoming ACLs.
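
          For example, to display only the outgoing ACL resources, add the egress keyword. This invocation is illustrative (based on the syntax above); output is omitted.

          cumulus@switch:~$ netq show cl-resource acl egress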

          This example shows the ACL resources for all configured switches:

          cumulus@switch:~$ netq show cl-resource acl
          Matching cl_resource records:
          Hostname          In IPv4 filter       In IPv4 Mangle       In IPv6 filter       In IPv6 Mangle       In 8021x filter      In Mirror            In PBR IPv4 filter   In PBR IPv6 filter   Eg IPv4 filter       Eg IPv4 Mangle       Eg IPv6 filter       Eg IPv6 Mangle       ACL Regions          18B Rules Key        32B Rules Key        54B Rules Key        L4 Port range Checke Last Updated
                                                                                                                                                                                                                                                                                                                                                                            rs
          ----------------- -------------------- -------------------- -------------------- -------------------- -------------------- -------------------- -------------------- -------------------- -------------------- -------------------- -------------------- -------------------- -------------------- -------------------- -------------------- -------------------- -------------------- ------------------------
          act-5712-09       40,512(7%)           0,0(0%)              30,768(3%)           0,0(0%)              0,0(0%)              0,0(0%)              0,0(0%)              0,0(0%)              32,256(12%)          0,0(0%)              0,0(0%)              0,0(0%)              0,0(0%)              0,0(0%)              0,0(0%)              0,0(0%)              2,24(8%)             Tue Aug 18 20:20:39 2020
          mlx-2700-04       0,0(0%)              0,0(0%)              0,0(0%)              0,0(0%)              0,0(0%)              0,0(0%)              0,0(0%)              0,0(0%)              0,0(0%)              0,0(0%)              0,0(0%)              0,0(0%)              4,400(1%)            2,2256(0%)           0,1024(0%)           2,1024(0%)           0,0(0%)              Tue Aug 18 20:19:08 2020
          

          View Forwarding Resources

          To view forwarding resources for all your switches, run:

          netq show cl-resource forwarding [around <text-time>] [json]
          

          This example shows forwarding resources for all configured switches:

          cumulus@noc-pr:~$ netq show cl-resource forwarding
          Matching cl_resource records:
          Hostname          IPv4 host entries    IPv6 host entries    IPv4 route entries   IPv6 route entries   ECMP nexthops        MAC entries          Total Mcast Routes   Last Updated
          ----------------- -------------------- -------------------- -------------------- -------------------- -------------------- -------------------- -------------------- ------------------------
          act-5712-09       0,16384(0%)          0,0(0%)              0,131072(0%)         23,20480(0%)         0,16330(0%)          0,32768(0%)          0,8192(0%)           Tue Aug 18 20:20:39 2020
          mlx-2700-04       0,32768(0%)          0,16384(0%)          0,65536(0%)          4,28672(0%)          0,4101(0%)           0,40960(0%)          0,1000(0%)           Tue Aug 18 20:19:08 2020
          

          View NetQ Agents

          To view the NetQ Agents on all switches and hosts:

          1. Click Menu.

          2. Select Agents from the Network column.

          3. View the Version column to determine which release of the NetQ Agent is running on your devices. Ideally, this version matches the NetQ release you are running and is the same across all your devices.

          Parameter | Description
          Hostname | Name of the switch or host
          Timestamp | Date and time the data was captured
          Last Reinit | Date and time that the switch or host was reinitialized
          Last Update Time | Date and time that the switch or host was updated
          Lastboot | Date and time that the switch or host was last booted up
          NTP State | Status of NTP synchronization on the switch or host; yes = in synchronization, no = out of synchronization
          Sys Uptime | Amount of time the switch or host has been continuously up and running
          Version | NetQ version running on the switch or host

          To view the NetQ Agents on all switches and hosts, run:

          netq show agents [fresh | rotten ] [around <text-time>] [json]
          

          Use the fresh keyword to view only the NetQ Agents that are in current communication with the NetQ Platform or NetQ Collector. Use the rotten keyword to view those that are not.
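
          For example, to list only the agents that are no longer communicating, add the rotten keyword. This invocation is illustrative; output is omitted.

          cumulus@switch:~$ netq show agents rotten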

          Switch Inventory

          With the NetQ UI and NetQ CLI, you can monitor your inventory of switches across the network or individually. You can view operating system, motherboard, ASIC, microprocessor, disk, memory, fan, and power supply information. This information is useful for upgrades, compliance checks, and other planning tasks.

          For switch performance information, refer to Monitor Switches.

          Access Switch Inventory Data

          Add the Inventory/Switches card to your workbench to monitor the hardware and software component inventory on switches running NetQ in your network. Select the dropdown to view additional inventory information.

              

          The CLI provides detailed switch inventory information through its netq <hostname> show inventory command.
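
          For example, a sample invocation for a single switch follows (the hostname leaf01 is illustrative; output is omitted).

          cumulus@switch:~$ netq leaf01 show inventory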

          View Switch Inventory Summary

          View the Number of Types of Any Component Deployed

          For each of the components monitored on a switch, NetQ displays a unique count.

          To view this count for all of the components on the switch:

          1. Open the large Switch Inventory card.

          2. Note the number in the Unique column for each component.

          By default, the card displays data for fresh switches. Select Rotten switches from the dropdown to display information for switches that are in a down state. Hover over any of the segments in the distribution chart to highlight a specific component.

          When you hover, a tooltip appears displaying:

          • Name or value of the component type, such as the version number or status
          • Total number of switches with that type of component deployed compared to the total number of switches
          • Percentage of this type with respect to all component types

          To view the hardware and software components for a switch, run:

          netq <hostname> show inventory brief
          

          This example shows the type of switch (Cumulus VX), operating system (Cumulus Linux), CPU (x86_64), and ASIC (virtual) for the spine01 switch.

          cumulus@switch:~$ netq spine01 show inventory brief
          Matching inventory records:
          Hostname          Switch               OS              CPU      ASIC            Ports
          ----------------- -------------------- --------------- -------- --------------- -----------------------------------
          spine01           VX                   CL              x86_64   VX              N/A
          

          This example shows the components on the NetQ On-premises or Cloud Appliance.

          cumulus@switch:~$ netq show inventory brief opta
          Matching inventory records:
          Hostname          Switch               OS              CPU      ASIC            Ports
          ----------------- -------------------- --------------- -------- --------------- -----------------------------------
          netq-ts           N/A                  Ubuntu          x86_64   N/A             N/A
          

          View Switch Hardware Inventory

          You can view hardware components deployed on each switch in your network.

          View ASIC Information for a Switch

          You can view the ASIC information for a switch from either the NetQ CLI or NetQ UI.

          1. Locate the medium Inventory/Switches card on your workbench.

          2. Change to the full-screen card and click ASIC.

            Note that if you are running CumulusVX switches, no detailed ASIC information is available because the hardware is virtualized.

          3. Click to quickly locate a switch that does not appear on the first page of the switch list.

          4. Select hostname from the Field dropdown.

          5. Enter the hostname of the switch you want to view, and click Apply.

          To view information about the ASIC on a switch, run:

          netq [<hostname>] show inventory asic [opta] [json]
          

          This example shows the ASIC information for the leaf02 switch.

          cumulus@switch:~$ netq leaf02 show inventory asic
          Matching inventory records:
          Hostname          Vendor               Model                          Model ID                  Core BW        Ports
          ----------------- -------------------- ------------------------------ ------------------------- -------------- -----------------------------------
          leaf02            Mellanox             Spectrum                       MT52132                   N/A            32 x 100G-QSFP28
          

          This example shows the ASIC information for the NetQ On-premises or Cloud Appliance.

          cumulus@switch:~$ netq show inventory asic opta
          Matching inventory records:
          Hostname          Vendor               Model                          Model ID                  Core BW        Ports
          ----------------- -------------------- ------------------------------ ------------------------- -------------- -----------------------------------
          netq-ts            Mellanox             Spectrum                       MT52132                   N/A            32 x 100G-QSFP28
          

          View Motherboard Information for a Switch

          1. Locate the medium Inventory/Switches card on your workbench.

          2. Hover over the card, and change to the full-screen card using the size picker.

          3. Click Platform.

            Note that if you are running CumulusVX switches, no detailed platform information is available because the hardware is virtualized.

          4. Click to quickly locate a switch that does not appear on the first page of the switch list.

          5. Select hostname from the Field dropdown.

          6. Enter the hostname of the switch you want to view, and click Apply.

          To view a list of motherboards installed in a switch, run:

          netq [<hostname>] show inventory board [opta] [json]
          

          This example shows all motherboard data for the spine01 switch.

          cumulus@switch:~$ netq spine01 show inventory board
          Matching inventory records:
          Hostname          Vendor               Model                          Base MAC           Serial No                 Part No          Rev    Mfg Date
          ----------------- -------------------- ------------------------------ ------------------ ------------------------- ---------------- ------ ----------
          spine01           Dell                 S6000-ON                       44:38:39:00:80:00  N/A                       N/A              N/A    N/A
          

          Use the opta option without the hostname option to view the motherboard data for the NetQ On-premises or Cloud Appliance. No motherboard data is available for NetQ On-premises or Cloud VMs.
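
          For example (illustrative invocation; output is omitted):

          cumulus@switch:~$ netq show inventory board opta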

          View CPU Information for a Switch

          1. Locate the Inventory/Switches card on your workbench.

          2. Hover over the card, and change to the full-screen card using the size picker.

          3. Click CPU.

          4. Click to quickly locate a switch that does not appear on the first page of the switch list.

          5. Select hostname from the Field dropdown. Then enter the hostname of the switch you want to view.

          To view CPU information for a switch in your network, run:

          netq [<hostname>] show inventory cpu [arch <cpu-arch>] [opta] [json]
          

          This example shows CPU information for the server02 host.

          cumulus@switch:~$ netq server02 show inventory cpu
          Matching inventory records:
          Hostname          Arch     Model                          Freq       Cores
          ----------------- -------- ------------------------------ ---------- -----
          server02          x86_64   Intel Core i7 9xx (Nehalem Cla N/A        1
                                      ss Core i7)
          

          This example shows the CPU information for the NetQ On-premises or Cloud Appliance.

          cumulus@switch:~$ netq show inventory cpu opta
          Matching inventory records:
          Hostname          Arch     Model                          Freq       Cores
          ----------------- -------- ------------------------------ ---------- -----
          netq-ts           x86_64   Intel Xeon Processor (Skylake, N/A        8
                                     IBRS)
          

          View Disk Information for a Switch

          1. Locate the Inventory/Switches card on your workbench.

          2. Hover over the card, and change to the full-screen card using the size picker.

          3. Click Disk.

            Note that if you are running CumulusVX switches, no detailed disk information is available because the hardware is virtualized.

          4. Click to quickly locate a switch that does not appear on the first page of the switch list.

          5. Select hostname from the Field dropdown. Then enter the hostname of the switch you want to view.

          To view disk information for a switch in your network, run:

          netq [<hostname>] show inventory disk [opta] [json]
          

          This example shows the disk information for the leaf03 switch.

          cumulus@switch:~$ netq leaf03 show inventory disk
          Matching inventory records:
          Hostname          Name            Type             Transport          Size       Vendor               Model
          ----------------- --------------- ---------------- ------------------ ---------- -------------------- ------------------------------
          leaf03            vda             disk             N/A                6G         0x1af4               N/A
          

          This example shows the disk information for the NetQ On-premises or Cloud Appliance.

          cumulus@switch:~$ netq show inventory disk opta
          
          Matching inventory records:
          Hostname          Name            Type             Transport          Size       Vendor               Model
          ----------------- --------------- ---------------- ------------------ ---------- -------------------- ------------------------------
          netq-ts           vda             disk             N/A                265G       0x1af4               N/A
          

          View Memory Information for a Switch

          Memory information is available from the NetQ UI and NetQ CLI.

          1. Locate the medium Inventory/Switches card on your workbench.

          2. Hover over the card, and change to the full-screen card using the size picker.

          3. Click Memory.

          4. Click to quickly locate a switch that does not appear on the first page of the switch list.

          5. Select hostname from the Field dropdown. Then enter the hostname of the switch you want to view.

          To view memory information for your switches and host servers, run:

          netq [<hostname>] show inventory memory [opta] [json]
          

          This example shows all the memory characteristics for the leaf01 switch.

          cumulus@switch:~$ netq leaf01 show inventory memory
          Matching inventory records:
          Hostname          Name            Type             Size       Speed      Vendor               Serial No
          ----------------- --------------- ---------------- ---------- ---------- -------------------- -------------------------
          leaf01            DIMM 0          RAM              768 MB     Unknown    QEMU                 Not Specified
          
          

          This example shows the memory information for the NetQ On-premises or Cloud Appliance.

          cumulus@switch:~$ netq show inventory memory opta
          Matching inventory records:
          Hostname          Name            Type             Size       Speed      Vendor               Serial No
          ----------------- --------------- ---------------- ---------- ---------- -------------------- -------------------------
          netq-ts           DIMM 0          RAM              16384 MB   Unknown    QEMU                 Not Specified
          netq-ts           DIMM 1          RAM              16384 MB   Unknown    QEMU                 Not Specified
          netq-ts           DIMM 2          RAM              16384 MB   Unknown    QEMU                 Not Specified
          netq-ts           DIMM 3          RAM              16384 MB   Unknown    QEMU                 Not Specified
          

          View Switch Software Inventory

          View Operating System Information for a Switch

          1. Locate the Inventory/Switches card on your workbench.

          2. Hover over the card, and change to the full-screen card using the size picker.

          3. Click OS.

          4. Click to quickly locate a switch that does not appear on the first page of the switch list.

          5. Enter a hostname, then click Apply.

          To view OS information for a switch, run:

          netq [<hostname>] show inventory os [opta] [json]
          

          This example shows the OS information for the leaf02 switch.

          cumulus@switch:~$ netq leaf02 show inventory os
          Matching inventory records:
          Hostname          Name            Version                              Last Changed
          ----------------- --------------- ------------------------------------ -------------------------
          leaf02            CL              3.7.5                                Fri Apr 19 16:01:46 2019
          

          This example shows the OS information for the NetQ On-premises or Cloud Appliance.

          cumulus@switch:~$ netq show inventory os opta
          
          Matching inventory records:
          Hostname          Name            Version                              Last Changed
          ----------------- --------------- ------------------------------------ -------------------------
          netq-ts           Ubuntu          18.04                                Tue Jul 14 19:27:39 2020
          

          View the Cumulus Linux Packages on a Switch

          When you are troubleshooting an issue with a switch, you might want to know which supported versions of the Cumulus Linux operating system are available for that switch, and compare them with the versions available on a switch that is not experiencing the same issue.

          To view package information for your switches, run:

          netq <hostname> show cl-manifest [json]
          

          This example shows the Cumulus Linux OS versions supported for the leaf01 switch, which reports the vx (virtual, simulated) ASIC vendor and the x86_64 CPU architecture.

          cumulus@switch:~$ netq leaf01 show cl-manifest
          
          Matching manifest records:
          Hostname          ASIC Vendor          CPU Arch             Manifest Version
          ----------------- -------------------- -------------------- --------------------
          leaf01            vx                   x86_64               3.7.6.1
          leaf01            vx                   x86_64               3.7.10
          leaf01            vx                   x86_64               3.6.2.1
          leaf01            vx                   x86_64               3.7.4
          leaf01            vx                   x86_64               3.7.2.5
          leaf01            vx                   x86_64               3.7.1
          leaf01            vx                   x86_64               3.6.0
          leaf01            vx                   x86_64               3.7.0
          leaf01            vx                   x86_64               3.4.1
          leaf01            vx                   x86_64               3.7.3
          leaf01            vx                   x86_64               3.2.0
          ...
          

          View All Software Packages Installed on Switches

          If you are having an issue with a particular switch, you should verify all the installed software and whether it needs updating.

          To view package information for a switch, run:

          netq <hostname> show cl-pkg-info [<text-package-name>] [around <text-time>] [json]
          

          Use the text-package-name option to narrow the results to a particular package or the around option to narrow the output to a particular time range.
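
          For example, you might combine a package name with the around option to see how a package looked in the recent past. This invocation is illustrative (the time value is an example); output is omitted.

          cumulus@switch:~$ netq spine01 show cl-pkg-info ntp around 1d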

          This example shows all installed software packages for spine01.

          cumulus@switch:~$ netq spine01 show cl-pkg-info
          Matching package_info records:
          Hostname          Package Name             Version              CL Version           Package Status       Last Changed
          ----------------- ------------------------ -------------------- -------------------- -------------------- -------------------------
          spine01           libfile-fnmatch-perl     0.02-2+b1            Cumulus Linux 3.7.12 installed            Wed Aug 26 19:58:45 2020
          spine01           screen                   4.2.1-3+deb8u1       Cumulus Linux 3.7.12 installed            Wed Aug 26 19:58:45 2020
          spine01           libudev1                 215-17+deb8u13       Cumulus Linux 3.7.12 installed            Wed Aug 26 19:58:45 2020
          spine01           libjson-c2               0.11-4               Cumulus Linux 3.7.12 installed            Wed Aug 26 19:58:45 2020
          spine01           atftp                    0.7.git20120829-1+de Cumulus Linux 3.7.12 installed            Wed Aug 26 19:58:45 2020
                                                     b8u1
          spine01           isc-dhcp-relay           4.3.1-6-cl3u14       Cumulus Linux 3.7.12 installed            Wed Aug 26 19:58:45 2020
          spine01           iputils-ping             3:20121221-5+b2      Cumulus Linux 3.7.12 installed            Wed Aug 26 19:58:45 2020
          spine01           base-files               8+deb8u11            Cumulus Linux 3.7.12 installed            Wed Aug 26 19:58:45 2020
          spine01           libx11-data              2:1.6.2-3+deb8u2     Cumulus Linux 3.7.12 installed            Wed Aug 26 19:58:45 2020
          spine01           onie-tools               3.2-cl3u6            Cumulus Linux 3.7.12 installed            Wed Aug 26 19:58:45 2020
          spine01           python-cumulus-restapi   0.1-cl3u10           Cumulus Linux 3.7.12 installed            Wed Aug 26 19:58:45 2020
          spine01           tasksel                  3.31+deb8u1          Cumulus Linux 3.7.12 installed            Wed Aug 26 19:58:45 2020
          spine01           ncurses-base             5.9+20140913-1+deb8u Cumulus Linux 3.7.12 installed            Wed Aug 26 19:58:45 2020
                                                     3
          spine01           libmnl0                  1.0.3-5-cl3u2        Cumulus Linux 3.7.12 installed            Wed Aug 26 19:58:45 2020
          spine01           xz-utils                 5.1.1alpha+20120614- Cumulus Linux 3.7.12 installed            Wed Aug 26 19:58:45 2020
          ...
          

          This example shows the ntp package on the spine01 switch.

          cumulus@switch:~$ netq spine01 show cl-pkg-info ntp
          Matching package_info records:
          Hostname          Package Name             Version              CL Version           Package Status       Last Changed
          ----------------- ------------------------ -------------------- -------------------- -------------------- -------------------------
          spine01           ntp                      1:4.2.8p10-cl3u2     Cumulus Linux 3.7.12 installed            Wed Aug 26 19:58:45 2020
          

          If you have a software manifest, you can determine the recommended packages and versions for a particular Cumulus Linux release. You can then compare that to the software already installed on your switch(es) to determine if it differs from the manifest. Such a difference might occur if you upgraded one or more packages separately from the Cumulus Linux software itself.

          To view recommended package information for a switch, run:

          netq <hostname> show recommended-pkg-version [release-id <text-release-id>] [package-name <text-package-name>] [json]
          

          This example shows the recommended packages for upgrading the leaf12 switch, namely switchd.

          cumulus@switch:~$ netq leaf12 show recommended-pkg-version
          Matching manifest records:
          Hostname          Release ID           ASIC Vendor          CPU Arch             Package Name         Version              Last Changed
          ----------------- -------------------- -------------------- -------------------- -------------------- -------------------- -------------------------
          leaf12            3.7.1                vx                   x86_64               switchd              1.0-cl3u30           Wed Feb  5 04:36:30 2020
          

          This example shows the recommended packages for upgrading the server01 switch, namely lldpd.

          cumulus@switch:~$ netq server01 show recommended-pkg-version
          Matching manifest records:
          Hostname          Release ID           ASIC Vendor          CPU Arch             Package Name         Version              Last Changed
          ----------------- -------------------- -------------------- -------------------- -------------------- -------------------- -------------------------
          server01            3.7.1                vx                   x86_64               lldpd                0.9.8-0-cl3u11       Wed Feb  5 04:36:30 2020
          

          This example shows the recommended version of the switchd package for use with Cumulus Linux 3.7.2.

          cumulus@switch:~$ netq act-5712-09 show recommended-pkg-version release-id 3.7.2 package-name switchd
          Matching manifest records:
          Hostname          Release ID           ASIC Vendor          CPU Arch             Package Name         Version              Last Changed
          ----------------- -------------------- -------------------- -------------------- -------------------- -------------------- -------------------------
          act-5712-09       3.7.2                bcm                  x86_64               switchd              1.0-cl3u31           Wed Feb  5 04:36:30 2020
          

          This example shows the recommended version of the switchd package for use with Cumulus Linux 3.1.0. Note the version difference from the example for Cumulus Linux 3.7.2.

          cumulus@noc-pr:~$ netq act-5712-09 show recommended-pkg-version release-id 3.1.0 package-name switchd
          Matching manifest records:
          Hostname          Release ID           ASIC Vendor          CPU Arch             Package Name         Version              Last Changed
          ----------------- -------------------- -------------------- -------------------- -------------------- -------------------- -------------------------
          act-5712-09       3.1.0                bcm                  x86_64               switchd              1.0-cl3u4            Wed Feb  5 04:36:30 2020
          

          Validate NetQ Agents are Running

          You can confirm that NetQ Agents are running on switches and hosts (if installed) using the netq show agents command. The Status indicates whether the agent is up and current, labelled Fresh, or down and stale, labelled Rotten. Additional information includes the agent status — whether it is time synchronized, how long it has been up, and the last time its state changed.

          This example shows NetQ Agent state on all devices.

          cumulus@switch:~$ netq show agents
          Matching agents records:
          Hostname          Status           NTP Sync Version                              Sys Uptime                Agent Uptime              Reinitialize Time          Last Changed
          ----------------- ---------------- -------- ------------------------------------ ------------------------- ------------------------- -------------------------- -------------------------
          border01          Fresh            yes      3.2.0-cl4u30~1601410518.104fb9ed     Mon Sep 21 17:04:54 2020  Tue Sep 29 21:24:58 2020  Tue Sep 29 21:24:58 2020   Thu Oct  1 16:07:38 2020
          border02          Fresh            yes      3.2.0-cl4u30~1601410518.104fb9ed     Mon Sep 21 17:04:57 2020  Tue Sep 29 21:24:58 2020  Tue Sep 29 21:24:58 2020   Thu Oct  1 16:07:33 2020
          fw1               Fresh            yes      3.2.0-cl4u30~1601410518.104fb9ed     Mon Sep 21 17:04:44 2020  Tue Sep 29 21:24:48 2020  Tue Sep 29 21:24:48 2020   Thu Oct  1 16:07:26 2020
          fw2               Fresh            yes      3.2.0-cl4u30~1601410518.104fb9ed     Mon Sep 21 17:04:42 2020  Tue Sep 29 21:24:48 2020  Tue Sep 29 21:24:48 2020   Thu Oct  1 16:07:22 2020
          leaf01            Fresh            yes      3.2.0-cl4u30~1601410518.104fb9ed     Mon Sep 21 16:49:04 2020  Tue Sep 29 21:24:49 2020  Tue Sep 29 21:24:49 2020   Thu Oct  1 16:07:10 2020
          leaf02            Fresh            yes      3.2.0-cl4u30~1601410518.104fb9ed     Mon Sep 21 17:03:14 2020  Tue Sep 29 21:24:49 2020  Tue Sep 29 21:24:49 2020   Thu Oct  1 16:07:30 2020
          leaf03            Fresh            yes      3.2.0-cl4u30~1601410518.104fb9ed     Mon Sep 21 17:03:37 2020  Tue Sep 29 21:24:49 2020  Tue Sep 29 21:24:49 2020   Thu Oct  1 16:07:24 2020
          leaf04            Fresh            yes      3.2.0-cl4u30~1601410518.104fb9ed     Mon Sep 21 17:03:35 2020  Tue Sep 29 21:24:58 2020  Tue Sep 29 21:24:58 2020   Thu Oct  1 16:07:13 2020
          oob-mgmt-server   Fresh            yes      3.1.1-ub18.04u29~1599111022.78b9e43  Mon Sep 21 16:43:58 2020  Mon Sep 21 17:55:00 2020  Mon Sep 21 17:55:00 2020   Thu Oct  1 16:07:31 2020
          server01          Fresh            yes      3.2.0-ub18.04u30~1601393774.104fb9e  Mon Sep 21 17:19:57 2020  Tue Sep 29 21:13:07 2020  Tue Sep 29 21:13:07 2020   Thu Oct  1 16:07:16 2020
          server02          Fresh            yes      3.2.0-ub18.04u30~1601393774.104fb9e  Mon Sep 21 17:19:57 2020  Tue Sep 29 21:13:07 2020  Tue Sep 29 21:13:07 2020   Thu Oct  1 16:07:24 2020
          server03          Fresh            yes      3.2.0-ub18.04u30~1601393774.104fb9e  Mon Sep 21 17:19:56 2020  Tue Sep 29 21:13:07 2020  Tue Sep 29 21:13:07 2020   Thu Oct  1 16:07:12 2020
          server04          Fresh            yes      3.2.0-ub18.04u30~1601393774.104fb9e  Mon Sep 21 17:19:57 2020  Tue Sep 29 21:13:07 2020  Tue Sep 29 21:13:07 2020   Thu Oct  1 16:07:17 2020
          server05          Fresh            yes      3.2.0-ub18.04u30~1601393774.104fb9e  Mon Sep 21 17:19:57 2020  Tue Sep 29 21:13:10 2020  Tue Sep 29 21:13:10 2020   Thu Oct  1 16:07:25 2020
          server06          Fresh            yes      3.2.0-ub18.04u30~1601393774.104fb9e  Mon Sep 21 17:19:57 2020  Tue Sep 29 21:13:10 2020  Tue Sep 29 21:13:10 2020   Thu Oct  1 16:07:21 2020
          server07          Fresh            yes      3.2.0-ub18.04u30~1601393774.104fb9e  Mon Sep 21 17:06:48 2020  Tue Sep 29 21:13:10 2020  Tue Sep 29 21:13:10 2020   Thu Oct  1 16:07:28 2020
          server08          Fresh            yes      3.2.0-ub18.04u30~1601393774.104fb9e  Mon Sep 21 17:06:45 2020  Tue Sep 29 21:13:10 2020  Tue Sep 29 21:13:10 2020   Thu Oct  1 16:07:31 2020
          spine01           Fresh            yes      3.2.0-cl4u30~1601410518.104fb9ed     Mon Sep 21 17:03:34 2020  Tue Sep 29 21:24:58 2020  Tue Sep 29 21:24:58 2020   Thu Oct  1 16:07:20 2020
          spine02           Fresh            yes      3.2.0-cl4u30~1601410518.104fb9ed     Mon Sep 21 17:03:33 2020  Tue Sep 29 21:24:58 2020  Tue Sep 29 21:24:58 2020   Thu Oct  1 16:07:16 2020
          spine03           Fresh            yes      3.2.0-cl4u30~1601410518.104fb9ed     Mon Sep 21 17:03:34 2020  Tue Sep 29 21:25:07 2020  Tue Sep 29 21:25:07 2020   Thu Oct  1 16:07:20 2020
          spine04           Fresh            yes      3.2.0-cl4u30~1601410518.104fb9ed     Mon Sep 21 17:03:32 2020  Tue Sep 29 21:25:07 2020  Tue Sep 29 21:25:07 2020   Thu Oct  1 16:07:33 2020
          

          You can narrow your focus in several ways, for example by scoping the command to a single device, filtering for only fresh or rotten agents, or using the around option to view agent state at an earlier time.
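
          A couple of illustrative invocations follow (the hostname is an example, and scoping by hostname follows the general netq <hostname> show pattern used elsewhere in this guide); output is omitted.

          cumulus@switch:~$ netq leaf01 show agents
          cumulus@switch:~$ netq show agents fresh json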

          Monitor Software Services

          Cumulus Linux, SONiC, and NetQ run many services to deliver the various features of these products. You can monitor their status using the netq show services command. This section describes services related to system-level operation; for services related to other features, such as routing, see the relevant topics. NetQ automatically monitors a number of system services; the example output below shows which services are monitored on each device.

          The CLI syntax for viewing the status of services is:

          netq [<hostname>] show services [<service-name>] [vrf <vrf>] [active|monitored] [around <text-time>] [json]
          netq [<hostname>] show services [<service-name>] [vrf <vrf>] status (ok|warning|error|fail) [around <text-time>] [json]
          netq [<hostname>] show events [severity info | severity error ] message_type services [between <text-time> and <text-endtime>] [json]
          

          View All Services on All Devices

          This example shows all available services on each device and whether each is enabled, active, and monitored, along with how long the service has been running and the last time it changed.

          It is useful to have colored output for this show command. To configure colored output, run the netq config add color command.
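
          For example, to turn on colored output (a quick sketch; the command takes no additional arguments):

          cumulus@switch:~$ netq config add color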

          cumulus@switch:~$ netq show services
          Hostname          Service              PID   VRF             Enabled Active Monitored Status           Uptime                    Last Changed
          ----------------- -------------------- ----- --------------- ------- ------ --------- ---------------- ------------------------- -------------------------
          leaf01            bgpd                 2872  default         yes     yes    yes       ok               1d:6h:43m:59s             Fri Feb 15 17:28:24 2019
          leaf01            clagd                n/a   default         yes     no     yes       n/a              1d:6h:43m:35s             Fri Feb 15 17:28:48 2019
          leaf01            ledmgrd              1850  default         yes     yes    no        ok               1d:6h:43m:59s             Fri Feb 15 17:28:24 2019
          leaf01            lldpd                2651  default         yes     yes    yes       ok               1d:6h:43m:27s             Fri Feb 15 17:28:56 2019
          leaf01            mstpd                1746  default         yes     yes    yes       ok               1d:6h:43m:35s             Fri Feb 15 17:28:48 2019
          leaf01            neighmgrd            1986  default         yes     yes    no        ok               1d:6h:43m:59s             Fri Feb 15 17:28:24 2019
          leaf01            netq-agent           8654  mgmt            yes     yes    yes       ok               1d:6h:43m:29s             Fri Feb 15 17:28:54 2019
          leaf01            netqd                8848  mgmt            yes     yes    yes       ok               1d:6h:43m:29s             Fri Feb 15 17:28:54 2019
          leaf01            ntp                  8478  mgmt            yes     yes    yes       ok               1d:6h:43m:29s             Fri Feb 15 17:28:54 2019
          leaf01            ptmd                 2743  default         yes     yes    no        ok               1d:6h:43m:59s             Fri Feb 15 17:28:24 2019
          leaf01            pwmd                 1852  default         yes     yes    no        ok               1d:6h:43m:59s             Fri Feb 15 17:28:24 2019
          leaf01            smond                1826  default         yes     yes    yes       ok               1d:6h:43m:27s             Fri Feb 15 17:28:56 2019
          leaf01            ssh                  2106  default         yes     yes    no        ok               1d:6h:43m:59s             Fri Feb 15 17:28:24 2019
          leaf01            syslog               8254  default         yes     yes    no        ok               1d:6h:43m:59s             Fri Feb 15 17:28:24 2019
          leaf01            zebra                2856  default         yes     yes    yes       ok               1d:6h:43m:59s             Fri Feb 15 17:28:24 2019
          leaf02            bgpd                 2867  default         yes     yes    yes       ok               1d:6h:43m:55s             Fri Feb 15 17:28:28 2019
          leaf02            clagd                n/a   default         yes     no     yes       n/a              1d:6h:43m:31s             Fri Feb 15 17:28:53 2019
          leaf02            ledmgrd              1856  default         yes     yes    no        ok               1d:6h:43m:55s             Fri Feb 15 17:28:28 2019
          leaf02            lldpd                2646  default         yes     yes    yes       ok               1d:6h:43m:30s             Fri Feb 15 17:28:53 2019
          ...
          

          If you want to view the service information for a given device, use the hostname option when running the command.
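
          For example, to see services for just one device, specify its hostname. This sketch uses leaf01 as a stand-in for any hostname in your network; the rows shown are taken from the output above:

          cumulus@switch:~$ netq leaf01 show services
          Hostname          Service              PID   VRF             Enabled Active Monitored Status           Uptime                    Last Changed
          ----------------- -------------------- ----- --------------- ------- ------ --------- ---------------- ------------------------- -------------------------
          leaf01            bgpd                 2872  default         yes     yes    yes       ok               1d:6h:43m:59s             Fri Feb 15 17:28:24 2019
          leaf01            clagd                n/a   default         yes     no     yes       n/a              1d:6h:43m:35s             Fri Feb 15 17:28:48 2019
          leaf01            netq-agent           8654  mgmt            yes     yes    yes       ok               1d:6h:43m:29s             Fri Feb 15 17:28:54 2019
          ...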

          View Information about a Given Service on All Devices

          You can view the status of a given service at the current time, at a prior point in time, or view the changes that have occurred for the service during a specified timeframe.

          This example shows how to view the status of the NTP service across the network. In this case, the VRF configuration has the NTP service running in both the default and management VRFs. You can run the same command for other services, such as bgpd, lldpd, and clagd.

          cumulus@switch:~$ netq show services ntp
          Matching services records:
          Hostname          Service              PID   VRF             Enabled Active Monitored Status           Uptime                    Last Changed
          ----------------- -------------------- ----- --------------- ------- ------ --------- ---------------- ------------------------- -------------------------
          exit01            ntp                  8478  mgmt            yes     yes    yes       ok               1d:6h:52m:41s             Fri Feb 15 17:28:54 2019
          exit02            ntp                  8497  mgmt            yes     yes    yes       ok               1d:6h:52m:36s             Fri Feb 15 17:28:59 2019
          firewall01        ntp                  n/a   default         yes     yes    yes       ok               1d:6h:53m:4s              Fri Feb 15 17:28:31 2019
          hostd-11          ntp                  n/a   default         yes     yes    yes       ok               1d:6h:52m:46s             Fri Feb 15 17:28:49 2019
          hostd-21          ntp                  n/a   default         yes     yes    yes       ok               1d:6h:52m:37s             Fri Feb 15 17:28:58 2019
          hosts-11          ntp                  n/a   default         yes     yes    yes       ok               1d:6h:52m:28s             Fri Feb 15 17:29:07 2019
          hosts-13          ntp                  n/a   default         yes     yes    yes       ok               1d:6h:52m:19s             Fri Feb 15 17:29:16 2019
          hosts-21          ntp                  n/a   default         yes     yes    yes       ok               1d:6h:52m:14s             Fri Feb 15 17:29:21 2019
          hosts-23          ntp                  n/a   default         yes     yes    yes       ok               1d:6h:52m:4s              Fri Feb 15 17:29:31 2019
          noc-pr            ntp                  2148  default         yes     yes    yes       ok               1d:6h:53m:43s             Fri Feb 15 17:27:52 2019
          noc-se            ntp                  2148  default         yes     yes    yes       ok               1d:6h:53m:38s             Fri Feb 15 17:27:57 2019
          spine01           ntp                  8414  mgmt            yes     yes    yes       ok               1d:6h:53m:30s             Fri Feb 15 17:28:05 2019
          spine02           ntp                  8419  mgmt            yes     yes    yes       ok               1d:6h:53m:27s             Fri Feb 15 17:28:08 2019
          spine03           ntp                  8443  mgmt            yes     yes    yes       ok               1d:6h:53m:22s             Fri Feb 15 17:28:13 2019
          leaf01             ntp                  8765  mgmt            yes     yes    yes       ok               1d:6h:52m:52s             Fri Feb 15 17:28:43 2019
          leaf02             ntp                  8737  mgmt            yes     yes    yes       ok               1d:6h:52m:46s             Fri Feb 15 17:28:49 2019
          leaf11            ntp                  9305  mgmt            yes     yes    yes       ok               1d:6h:49m:22s             Fri Feb 15 17:32:13 2019
          leaf12            ntp                  9339  mgmt            yes     yes    yes       ok               1d:6h:49m:9s              Fri Feb 15 17:32:26 2019
          leaf21            ntp                  9367  mgmt            yes     yes    yes       ok               1d:6h:49m:5s              Fri Feb 15 17:32:30 2019
          leaf22            ntp                  9403  mgmt            yes     yes    yes       ok               1d:6h:52m:57s             Fri Feb 15 17:28:38 2019
          
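
          To see the state of a service as it was at an earlier point in time, add the around option. This is a sketch; substitute the relative or absolute time you care about (24h here means 24 hours ago). The output uses the same columns as above:

          cumulus@switch:~$ netq show services ntp around 24h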

          To view changes over a given time period, use the netq show events command. For more detailed information about events, refer to Events and Notifications.

          This example shows changes to the bgpd service in the last 48 hours.

          cumulus@switch:/$ netq show events message_type bgp between now and 48h
          Matching events records:
          Hostname          Message Type Severity Message                             Timestamp
          ----------------- ------------ -------- ----------------------------------- -------------------------
          leaf01            bgp          info     BGP session with peer spine-1 swp3. 1d:6h:55m:37s
                                                  3 vrf DataVrf1081 state changed fro
                                                  m failed to Established
          leaf01            bgp          info     BGP session with peer spine-2 swp4. 1d:6h:55m:37s
                                                  3 vrf DataVrf1081 state changed fro
                                                  m failed to Established
          leaf01            bgp          info     BGP session with peer spine-3 swp5. 1d:6h:55m:37s
                                                  3 vrf DataVrf1081 state changed fro
                                                  m failed to Established
          leaf01            bgp          info     BGP session with peer spine-1 swp3. 1d:6h:55m:37s
                                                  2 vrf DataVrf1080 state changed fro
                                                  m failed to Established
          leaf01            bgp          info     BGP session with peer spine-3 swp5. 1d:6h:55m:37s
                                                  2 vrf DataVrf1080 state changed fro
                                                  m failed to Established
          leaf01            bgp          info     BGP session with peer spine-2 swp4. 1d:6h:55m:37s
                                                  2 vrf DataVrf1080 state changed fro
                                                  m failed to Established
          leaf01            bgp          info     BGP session with peer spine-3 swp5. 1d:6h:55m:37s
                                                  4 vrf DataVrf1082 state changed fro
                                                  m failed to Established
          

          Host Inventory

          In the UI, you can view your inventory of hosts across the network or individually, including a host’s operating system, ASIC, CPU model, disk, platform, and memory information. This information can help with upgrades, compliance, and other planning tasks.

          Access and View Host Inventory Data

          The Inventory/Hosts card monitors the hardware- and software-component inventory on hosts running NetQ in your network. Access this card from the NetQ Workbench, or add it to your own workbench by clicking (Add card) > Inventory > Inventory/Hosts card > Open Cards.

          host inventory card with chart

          Hover over the chart in the default card view to view component details. To view the distribution of components, hover over the card header and increase the card’s size. Select the corresponding icon to view a detailed chart for ASIC, platform, or software components:

          medium host inventory card displaying component distribution

          To display detailed information as a table, expand the card to its largest size:

          fully expanded host inventory card displaying a table with data

          To monitor host hardware resource utilization, see Monitor Hosts.

          DPU Inventory

          DPU monitoring is an early access feature.

          In the UI, you can view your DPU inventory across the network or individually, including a DPU’s operating system, ASIC, CPU model, disk, platform, and memory information. This information can help with upgrades, compliance, and other planning tasks.

          Access and View DPU Inventory Data

          The Inventory/DPU card displays the hardware- and software-component inventory on DPUs running NetQ in your network.

          DPU inventory card with chart

          Hover over the chart in the default card view to view component details. To view the distribution of components, hover over the card header and increase the card’s size. Select the corresponding icon to view a detailed chart for ASIC, platform, or software components:

          medium DPU inventory card displaying component distribution

          To display detailed information as a table, expand the card to its largest size:

          fully expanded DPU inventory card displaying a table with data

          To monitor DPU hardware resource utilization, see Monitor DPUs.

          To read more about NVIDIA BlueField DPUs and the DOCA Telemetry Service, refer to the DOCA SDK Documentation.

          Device Groups

          Device groups allow you to create a label for a subset of devices in the inventory. You can configure validation checks to run on select devices by referencing group names.

          Create a Device Group

          To create a device group, add the Device Groups card to your workbench. Click (Add card), navigate to the Device Groups section, select the Device groups card, and then click Open Cards:

          The Device groups card will now be displayed on your workbench. Click Create New Group to create a new device group:

          The Create New Group wizard will be displayed. To finish creating a new group:

          1. Set the name of the group of devices.

          2. Declare a hostname-based rule to define which devices in the inventory should be added to the group.

          3. Confirm the expected matched devices appear in the inventory, and click Create device group.

          The following example shows a group name of “exit group” matching any device in the inventory with “exit” in the hostname:

          Updating a Device Group

          When new devices that match an existing group rule are added to the inventory, NetQ flags them for review so that you can add them to the group. The following example shows the switch “exit-2” being detected in the inventory after the group was already configured:

          To add the new device to the group inventory, click and then click Update device group.

          Removing a Device Group

          To delete a device group:

          1. Expand the Device Groups card:
          2. Click on the desired group and select Delete.

          Events and Notifications

          Events provide information about how a network and its devices are operating during a given time period. Event notifications are available through Slack, PagerDuty, syslog, and email channels to aid troubleshooting and help resolve network problems before they become critical.

          NetQ captures three types of events:

          You can track events in the NetQ UI with the Events and WJH cards:

          The NetQ CLI provides the netq show events command to view system and TCA events for a given time frame. The netq show wjh-drop command lists all WJH events or those with a selected drop type.
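
          For example, to list the events from the last 24 hours (the output depends entirely on what has occurred in your network):

          cumulus@switch:~$ netq show events between now and 24h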

          Configure System Event Notifications

          To receive the event messages generated and processed by NetQ, you must integrate a third-party event notification application into your workflow. You can integrate NetQ with Syslog, PagerDuty, Slack, and/or email. Alternately, you can send notifications to other third-party applications via a generic webhook channel.

          In an on-premises deployment, the NetQ On-premises Appliance or VM receives the raw data stream from the NetQ Agents, processes the data, then stores and delivers events to the Notification function. The Notification function filters and sends messages to any configured notification applications. In a cloud deployment, the NetQ Cloud Appliance or VM passes the raw data stream to the NetQ Cloud service for processing and delivery.

          You can implement a proxy server (that sits between the NetQ Appliance or VM and the integration channels) that receives, processes, and distributes the notifications rather than having them sent directly to the integration channel. If you use such a proxy, you must configure NetQ with the proxy information.

          Notifications are generated for the following types of events:

          Network Protocol Validations
          • BGP status and session state
          • MLAG (CLAG) status and session state
          • EVPN status and session state
          • LLDP status
          • OSPF status and session state
          • VLAN status and session state
          • VXLAN status and session state
          Interfaces
          • Link status
          • Ports and cables status
          • MTU status
          Services
          • NetQ Agent status
          • PTM
          • SSH *
          • NTP status
          Traces
          • On-demand trace status
          • Scheduled trace status
          Sensors
          • Fan status
          • PSU (power supply unit) status
          • Temperature status
          System Software
          • Configuration File changes
          • Running Configuration File changes
          • Cumulus Linux Support status
          • Software Package status
          • Operating System version
          • Lifecycle Management status
          System Hardware
          • Physical resources status
          • BTRFS status
          • SSD utilization status

          * CLI only

          Event filters are based on rules you create. You must have at least one rule per filter. A select set of events can be triggered by a user-configured threshold. Refer to the System Event Messages Reference for descriptions and examples of these events.

          Event Message Format

          Messages have the following structure: <message-type><timestamp><opid><hostname><severity><message>

          • message type: Category of event; agent, bgp, clag, clsupport, configdiff, evpn, link, lldp, mtu, node, ntp, ospf, packageinfo, ptm, resource, runningconfigdiff, sensor, services, ssdutil, tca, trace, version, vlan, or vxlan
          • timestamp: Date and time the event occurred
          • opid: Identifier of the service or process that generated the event
          • hostname: Hostname of the network device where the event occurred
          • severity: Severity level in which the given event is classified: error or info
          • message: Text description of the event

          For example:
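
          A hypothetical message, shown with placeholder values for each field (the opid and exact formatting vary by deployment and delivery channel):

          bgp  Fri Feb 15 17:28:24 2019  5038  leaf01  info  BGP session with peer spine01 swp3 vrf default state changed from failed to Established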

          To set up the integrations, you must configure NetQ with at least one channel, one rule, and one filter. To refine what messages you want to view and where to send them, you can add additional rules and filters and set thresholds on supported event types. You can also configure a proxy server to receive, process, and forward the messages. This is accomplished using the NetQ UI and NetQ CLI in the following order:

          Configure Basic NetQ Event Notifications

          The simplest configuration you can create is one that sends all events generated by all interfaces to a single notification application. This is described here. For more granular configurations and examples, refer to Configure Advanced NetQ Event Notifications.

          A notification configuration must contain one channel, one rule, and one filter. Creation of the configuration follows this same path:

          1. Add a channel.
          2. Add a rule that accepts a selected set of events.
          3. Add a filter that associates this rule with the newly created channel.

          Create a Channel

          The first step is to create a PagerDuty, Slack, syslog, or email channel to receive the notifications.

          You can use the NetQ UI or the NetQ CLI to create a Slack channel.

          1. Expand the Menu and select Notification Channels.

          2. The Slack tab is displayed by default.

          3. Add a channel.

            • When no channels have been specified, click Add Slack Channel.
            • When at least one channel has been specified, click above the table.
          4. Provide a unique name for the channel. Note that spaces are not allowed. Use dashes or camelCase instead.

          5. Create an incoming webhook as described in the Slack documentation. Then copy and paste the webhook URL in the Webhook URL field.

          6. Click Add.

          7. (Optional) To verify the channel configuration, click Test.

          To create and verify a Slack channel, run:

          netq add notification channel slack <text-channel-name> webhook <text-webhook-url> [severity info|severity error] [tag <text-slack-tag>]
          netq show notification channel [json]
          

          This example shows the creation of a slk-netq-events channel and verifies the configuration.

          1. Create an incoming webhook as described in the documentation for your version of Slack.

          2. Create the channel.

            cumulus@switch:~$ netq add notification channel slack slk-netq-events webhook https://hooks.slack.com/services/text/moretext/evenmoretext
            Successfully added/updated channel slk-netq-events
            
          3. Verify the configuration.

            cumulus@switch:~$ netq show notification channel
            Matching config_notify records:
            Name            Type             Severity Channel Info
            --------------- ---------------- -------- ----------------------
            slk-netq-events slack            info     webhook:https://hooks.s
                                                        lack.com/services/text/
                                                        moretext/evenmoretext
            

          You can use the NetQ UI or the NetQ CLI to create a PagerDuty channel.

          1. Expand the Menu and select Notification Channels.

          2. Click PagerDuty.

          3. Add a channel.

            • When no channels have been specified, click Add PagerDuty Channel.
            • When at least one channel has been specified, click above the table.
          4. Provide a unique name for the channel. Note that spaces are not allowed. Use dashes or camelCase instead.

          5. Obtain and enter an integration key (also called a service key or routing key).

          6. Click Add.

          7. (Optional) To verify the channel configuration, click Test.

          To create and verify a PagerDuty channel, run:

          netq add notification channel pagerduty <text-channel-name> integration-key <text-integration-key> [severity info|severity error]
          netq show notification channel [json]
          

          This example shows the creation of a pd-netq-events channel and verifies the configuration.

          1. Obtain an integration key as described in this PagerDuty support page.

          2. Create the channel.

            cumulus@switch:~$ netq add notification channel pagerduty pd-netq-events integration-key c6d666e210a8425298ef7abde0d1998
            Successfully added/updated channel pd-netq-events
            
          3. Verify the configuration.

            cumulus@switch:~$ netq show notification channel
            Matching config_notify records:
            Name            Type             Severity         Channel Info
            --------------- ---------------- ---------------- ------------------------
            pd-netq-events  pagerduty        info             integration-key: c6d666e
                                                            210a8425298ef7abde0d1998
            

          You can use the NetQ UI or the NetQ CLI to create a syslog channel.

          1. Expand the Menu and select Notification Channels.

          2. Click Syslog.

          3. Add a channel.

            • When no channels have been specified, click Add Syslog Channel.
            • When at least one channel has been specified, click above the table.
          4. Provide a unique name for the channel. Note that spaces are not allowed. Use dashes or camelCase instead.

          5. Enter the IP address and port of the syslog server.

          6. Click Add.

          7. (Optional) To verify the channel configuration, click Test.

          To create and verify a syslog channel, run:

          netq add notification channel syslog <text-channel-name> hostname <text-syslog-hostname> port <text-syslog-port> [severity info | severity error ]
          netq show notification channel [json]
          

          This example shows the creation of a syslog-netq-events channel and verifies the configuration.

          1. Obtain the syslog server hostname (or IP address) and port.

          2. Create the channel.

            cumulus@switch:~$ netq add notification channel syslog syslog-netq-events hostname syslog-server port 514
            Successfully added/updated channel syslog-netq-events
            
          3. Verify the configuration.

            cumulus@switch:~$ netq show notification channel
            Matching config_notify records:
            Name            Type             Severity Channel Info
            --------------- ---------------- -------- ----------------------
            syslog-netq-eve syslog            info     host:syslog-server
            nts                                        port: 514
            

          You can use the NetQ UI or the NetQ CLI to create an email channel.

          1. Expand the Menu and select Notification Channels.

          2. Click Email.

          3. Add a channel.

            • When no channels have been specified, click Add Email Channel.
            • When at least one channel has been specified, click above the table.
          4. Provide a unique name for the channel. Note that spaces are not allowed. Use dashes or camelCase instead.

          5. Enter a list of email addresses for the people who should receive notifications from this channel.

            Enter the addresses separated by commas, without spaces. For example: user1@domain.com,user2@domain.com,user3@domain.com

          6. The first time you configure an email channel, you must also specify the SMTP server information:

            • Host: hostname or IP address of the SMTP server
            • Port: port of the SMTP server (typically 587)
            • User ID/Password: your administrative credentials
            • From: email address that indicates who sent the event messages

            After the first time, any additional email channels you create can use this configuration by clicking Existing.

          7. Click Add.

          8. (Optional) To verify the channel configuration, click Test.

          To create and verify the specification of an email channel, run:

          netq add notification channel email <text-channel-name> to <text-email-toids> [smtpserver <text-email-hostname>] [smtpport <text-email-port>] [login <text-email-id>] [password <text-email-password>] [severity info | severity error ]
          netq add notification channel email <text-channel-name> to <text-email-toids>
          netq show notification channel [json]
          

          The configuration is different depending on whether you are using the on-premises or cloud version of NetQ. Do not configure SMTP for cloud deployments as the NetQ cloud service uses the NetQ SMTP server to push email notifications.

          For an on-premises deployment:

          1. Set up an SMTP server. The server can be internal or public.

          2. Create a user account (login and password) on the SMTP server. NetQ sends notifications to this address.

          3. Create the notification channel using this form of the CLI command:

            netq add notification channel email <text-channel-name> to <text-email-toids>  [smtpserver <text-email-hostname>] [smtpport <text-email-port>] [login <text-email-id>] [password <text-email-password>] [severity info | severity error ]
            
          For example:
          cumulus@switch:~$ netq add notification channel email onprem-email to netq-notifications@domain.com smtpserver smtp.domain.com smtpport 587 login smtphostlogin@domain.com password MyPassword123
          Successfully added/updated channel onprem-email
          
          4. Verify the configuration.

            cumulus@switch:~$ netq show notification channel
            Matching config_notify records:
            Name            Type             Severity         Channel Info
            --------------- ---------------- ---------------- ------------------------
            onprem-email    email            info             password: MyPassword123,
                                                              port: 587,
                                                              isEncrypted: True,
                                                              host: smtp.domain.com,
                                                              from: smtphostlogin@doma
                                                              in.com,
                                                              id: smtphostlogin@domain
                                                              .com,
                                                              to: netq-notifications@d
                                                              omain.com
            

          For a cloud deployment:

          1. Create the notification channel using this form of the CLI command:

            netq add notification channel email <text-channel-name> to <text-email-toids>
            
          For example:
          cumulus@switch:~$ netq add notification channel email cloud-email to netq-cloud-notifications@domain.com
          Successfully added/updated channel cloud-email
          
          2. Verify the configuration.

            cumulus@switch:~$ netq show notification channel
            Matching config_notify records:
            Name            Type             Severity         Channel Info
            --------------- ---------------- ---------------- ------------------------
            cloud-email    email            info             password: TEiO98BOwlekUP
                                                             TrFev2/Q==, port: 587,
                                                             isEncrypted: True,
                                                             host: netqsmtp.domain.com,
                                                             from: netqsmtphostlogin@doma
                                                             in.com,
                                                             id: smtphostlogin@domain
                                                             .com,
                                                             to: netq-notifications@d
                                                             omain.com
            

          You can use the NetQ UI or the NetQ CLI to create a generic channel.

          1. Click Menu, then click Notification Channels.

          2. Click Generic.

          3. Add a channel.

            • When no channels have been specified, click Add generic channel.
            • When at least one channel has been specified, click above the table.
          4. Provide a unique name for the channel. Note that spaces are not allowed. Use dashes or camelCase instead.

          5. Enter the webhook URL of the endpoint that should receive notifications from this channel.

          6. Set the desired notification severity, SSL, and authentication parameters for this channel.

          7. Click Add.

          8. (Optional) To verify the channel configuration, click Test.

          To create and verify a generic channel, run:

          netq add notification channel generic <text-channel-name> webhook <text-webhook-url> [severity info | severity error ] [use-ssl True | use-ssl False] [auth-type basic-auth generic-username <text-generic-username> generic-password <text-generic-password> | auth-type api-key key-name <text-api-key-name> key-value <text-api-key-value>]
          netq show notification channel [json]
          

          Create a Rule

          The second step is to create and verify a rule that accepts a set of events. You create rules for system events using the NetQ CLI.

          To create and verify a rule, run:

          netq add notification rule <text-rule-name> key <text-rule-key> value <text-rule-value>
          netq show notification rule [json]
          

          Refer to Configure System Event Notifications for a list of available keys and values.

          This example creates a rule named all-interfaces, using the key ifname and the value ALL, which sends all events from all interfaces to any channel with this rule.

          cumulus@switch:~$ netq add notification rule all-interfaces key ifname value ALL
          Successfully added/updated rule all-interfaces
          
          cumulus@switch:~$ netq show notification rule
          Matching config_notify records:
          Name            Rule Key         Rule Value
          --------------- ---------------- --------------------
          all-interfaces  ifname           ALL
          

          Refer to Advanced Configuration to create rules based on thresholds.

          Create a Filter

          The final step is to create a filter to tie the rule to the channel. You create filters for system events using the NetQ CLI.

          To create and verify a filter, run:

          netq add notification filter <text-filter-name> rule <text-rule-name-anchor> channel <text-channel-name-anchor>
          netq show notification filter [json]
          

          These examples use the channels created in Create a Channel and the rule created in Create a Rule.

          cumulus@switch:~$ netq add notification filter notify-all-ifs rule all-interfaces channel pd-netq-events
          Successfully added/updated filter notify-all-ifs
          
          cumulus@switch:~$ netq show notification filter
          Matching config_notify records:
          Name            Order      Severity         Channels         Rules
          --------------- ---------- ---------------- ---------------- ----------
          notify-all-ifs  1          info             pd-netq-events   all-interfaces
          
          cumulus@switch:~$ netq add notification filter notify-all-ifs rule all-interfaces channel slk-netq-events
          Successfully added/updated filter notify-all-ifs
          
          cumulus@switch:~$ netq show notification filter
          Matching config_notify records:
          Name            Order      Severity         Channels         Rules
          --------------- ---------- ---------------- ---------------- ----------
          notify-all-ifs  1          info             slk-netq-events   all-interfaces
          
          cumulus@switch:~$ netq add notification filter notify-all-ifs rule all-interfaces channel syslog-netq-events
          Successfully added/updated filter notify-all-ifs
          
          cumulus@switch:~$ netq show notification filter
          Matching config_notify records:
          Name            Order      Severity         Channels         Rules
          --------------- ---------- ---------------- ---------------- ----------
          notify-all-ifs  1          info             syslog-netq-events all-interfaces
          
          cumulus@switch:~$ netq add notification filter notify-all-ifs rule all-interfaces channel onprem-email
          Successfully added/updated filter notify-all-ifs
          
          cumulus@switch:~$ netq show notification filter
          Matching config_notify records:
          Name            Order      Severity         Channels         Rules
          --------------- ---------- ---------------- ---------------- ----------
          notify-all-ifs  1          info             onprem-email     all-interfaces
          

          NetQ is now configured to send all interface events to your selected channel.

          Refer to Advanced Configuration to create filters for threshold-based events.

          Configure Advanced NetQ Event Notifications

          If you want to create more granular notifications based on such items as selected devices, characteristics of devices, or protocols, or you want to use a proxy server, you need more than the basic notification configuration. The following section includes details for creating these more complex notification configurations.

          Configure a Proxy Server

          To send notification messages through a proxy server instead of directly to a notification channel, you configure NetQ with the hostname and optionally a port of a proxy server. If you do not specify a port, NetQ defaults to port 80. Only one proxy server is currently supported. To simplify deployment, configure your proxy server before configuring channels, rules, or filters.

          To configure and verify the proxy server, run:

          netq add notification proxy <text-proxy-hostname> [port <text-proxy-port>]
          netq show notification proxy
          

          This example configures and verifies the proxy4 server on port 80 to act as a proxy for event notifications.

          cumulus@switch:~$ netq add notification proxy proxy4
          Successfully configured notifier proxy proxy4:80
          
          cumulus@switch:~$ netq show notification proxy
          Matching config_notify records:
          Proxy URL          Slack Enabled              PagerDuty Enabled
          ------------------ -------------------------- ----------------------------------
          proxy4:80          yes                        yes
          

          Create Channels

          Create one or more PagerDuty, Slack, syslog, email, or generic channels to receive notifications.

          NetQ sends notifications to PagerDuty as PagerDuty events.


          To create and verify a PagerDuty channel, run:

          netq add notification channel pagerduty <text-channel-name> integration-key <text-integration-key> [severity info | severity error]
          netq show notification channel [json]
          

          where:

          • <text-channel-name>: User-specified PagerDuty channel name
          • integration-key <text-integration-key>: The integration key is also called the service_key or routing_key. The default is an empty string ("").
          • severity <level>: (Optional) The log level, either info or error. The severity defaults to info if unspecified.

          This example shows the creation of a pd-netq-events channel and verifies the configuration.

          1. Obtain an integration key as described in this PagerDuty support page.

          2. Create the channel.

            cumulus@switch:~$ netq add notification channel pagerduty pd-netq-events integration-key c6d666e210a8425298ef7abde0d1998
            Successfully added/updated channel pd-netq-events
            
          3. Verify the configuration.

            cumulus@switch:~$ netq show notification channel
            Matching config_notify records:
            Name            Type             Severity         Channel Info
            --------------- ---------------- ---------------- ------------------------
            pd-netq-events  pagerduty        info             integration-key: c6d666e
                                                            210a8425298ef7abde0d1998
            

          NetQ Notifier sends notifications to Slack as incoming webhooks for a Slack channel you configure.


          To create and verify a Slack channel, run:

          netq add notification channel slack <text-channel-name> webhook <text-webhook-url> [severity info|severity error] [tag <text-slack-tag>]
          netq show notification channel [json]
          

          where:

          • <text-channel-name>: User-specified Slack channel name
          • webhook <text-webhook-url>: WebHook URL for the desired channel. For example: https://hooks.slack.com/services/text/moretext/evenmoretext
          • severity <level>: The log level, either info or error. The severity defaults to info if unspecified.
          • tag <text-slack-tag>: Optional tag appended to the Slack notification to highlight particular channels or people. An @ sign must precede the tag value. For example, @netq-info.

          This example shows the creation of a slk-netq-events channel and verifies the configuration.

          1. Create an incoming webhook as described in the documentation for your version of Slack.

          2. Create the channel.

            cumulus@switch:~$ netq add notification channel slack slk-netq-events webhook https://hooks.slack.com/services/text/moretext/evenmoretext severity error tag @netq-ops
            Successfully added/updated channel slk-netq-events
            
          3. Verify the configuration.

            cumulus@switch:~$ netq show notification channel
            Matching config_notify records:
            Name            Type             Severity         Channel Info
            --------------- ---------------- ---------------- ------------------------
            slk-netq-events slack            error            tag: @netq-ops,
                                                              webhook: https://hooks.s
                                                              lack.com/services/text/m
                                                              oretext/evenmoretext
            

          To create and verify a syslog channel, run:

          netq add notification channel syslog <text-channel-name> hostname <text-syslog-hostname> port <text-syslog-port> [severity info | severity error ]
          netq show notification channel [json]
          

          where:

          • <text-channel-name>: User-specified syslog channel name
          • hostname <text-syslog-hostname>: Hostname or IP address of the syslog server to receive notifications
          • port <text-syslog-port>: Port on the syslog server to receive notifications
          • severity <level>: The log level, either info or error. The severity defaults to info if unspecified.

          This example shows the creation of a syslog-netq-events channel and verifies the configuration.

          1. Obtain the syslog server hostname (or IP address) and port.

          2. Create the channel.

            cumulus@switch:~$ netq add notification channel syslog syslog-netq-events hostname syslog-server port 514 severity error
            Successfully added/updated channel syslog-netq-events
            
          3. Verify the configuration.

            cumulus@switch:~$ netq show notification channel
            Matching config_notify records:
            Name            Type             Severity Channel Info
            --------------- ---------------- -------- ----------------------
            syslog-netq-eve syslog           error     host:syslog-server
            nts                                        port: 514
            

          The configuration is different depending on whether you are using the on-premises or cloud version of NetQ.

          To create an email notification channel for an on-premises deployment, run:

          netq add notification channel email <text-channel-name> to <text-email-toids>  [smtpserver <text-email-hostname>] [smtpport <text-email-port>] [login <text-email-id>] [password <text-email-password>] [severity info | severity error ]
          

          This example creates an email channel named onprem-email that uses the SMTP server smtp.domain.com on port 587 to send messages to the people with access to the smtphostlogin account.

          1. Set up an SMTP server. The server can be internal or public.

          2. Create a user account (login and password) on the SMTP server. NetQ sends notifications to this address.

          3. Create the notification channel.

            cumulus@switch:~$ netq add notification channel email onprem-email to netq-notifications@domain.com smtpserver smtp.domain.com smtpport 587 login smtphostlogin@domain.com password MyPassword123 severity error
            Successfully added/updated channel onprem-email
            
          4. Verify the configuration.

            cumulus@switch:~$ netq show notification channel
            Matching config_notify records:
            Name            Type             Severity         Channel Info
            --------------- ---------------- ---------------- ------------------------
            onprem-email    email            error            password: MyPassword123,
                                                              port: 587,
                                                              isEncrypted: True,
                                                              host: smtp.domain.com,
                                                              from: smtphostlogin@doma
                                                              in.com,
                                                              id: smtphostlogin@domain
                                                              .com,
                                                              to: netq-notifications@d
                                                              omain.com
            

          In cloud deployments, the NetQ cloud service uses the NetQ SMTP server to push email notifications.

          To create an email notification channel for a cloud deployment, run:

          netq add notification channel email <text-channel-name> to <text-email-toids> [severity info | severity error]
          netq show notification channel [json]
          

          This example creates an email channel named cloud-email that uses the NetQ SMTP server to send messages to those persons with access to the netq-cloud-notifications account.

          1. Create the channel.

            cumulus@switch:~$ netq add notification channel email cloud-email to netq-cloud-notifications@domain.com severity error
            Successfully added/updated channel cloud-email
            
          2. Verify the configuration.

            cumulus@switch:~$ netq show notification channel
            Matching config_notify records:
            Name            Type             Severity         Channel Info
            --------------- ---------------- ---------------- ------------------------
            cloud-email    email            error            password: TEiO98BOwlekUP
                                                             TrFev2/Q==, port: 587,
                                                             isEncrypted: True,
                                                             host: netqsmtp.domain.com,
                                                             from: netqsmtphostlogin@doma
                                                             in.com,
                                                             id: smtphostlogin@domain
                                                             .com,
                                                             to: netq-notifications@d
                                                             omain.com
            

          To create and verify a generic channel, run:

          netq add notification channel generic <text-channel-name> webhook <text-webhook-url> [severity info | severity error ] [use-ssl True | use-ssl False] [auth-type basic-auth generic-username <text-generic-username> generic-password <text-generic-password> | auth-type api-key key-name <text-api-key-name> key-value <text-api-key-value>]
          netq show notification channel [json]
          

          where:

          • <text-channel-name>: User-specified generic channel name
          • webhook <text-webhook-url>: URL of the remote application to receive notifications
          • severity <level>: The log level, either info or error. The severity defaults to info if unspecified.
          • use-ssl [True | False]: Enable or disable SSL
          • auth-type [basic-auth | api-key]: Set authentication parameters. Either basic-auth with generic-username and generic-password, or api-key with a key-name and key-value
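
          This example sketches the creation of a generic channel that uses basic authentication. The webhook URL, username, and password are placeholders; substitute the values for your receiving application. The confirmation message follows the same pattern as the other channel types.

          cumulus@switch:~$ netq add notification channel generic generic-netq-events webhook https://events.example.com/netq severity error use-ssl True auth-type basic-auth generic-username netq-user generic-password MyPassword123
          Successfully added/updated channel generic-netq-events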

          Create Rules

          Each rule is defined by a single key-value pair. The key-value pair indicates which messages to include in or drop from the event information sent to a notification channel. You can create more than one rule for a single filter; using multiple rules makes a filter more specific. For example, you can specify rules around hostnames or interface names, enabling you to filter messages specific to those hosts or interfaces. You can only create rules after you have set up your notification channels.

          NetQ includes a predefined fixed set of valid rule keys. You enter values as regular expressions, which vary according to your deployment.

          Rule Keys and Values

          BGP
          • message_type: Network protocol or service identifier. Example values: bgp
          • hostname: User-defined, text-based name for a switch or host. Example values: server02, leaf11, exit01, spine-4
          • peer: User-defined, text-based name for a peer switch or host. Example values: server4, leaf-3, exit02, spine06
          • desc: Text description
          • vrf: Name of VRF interface. Example values: mgmt, default
          • old_state: Previous state of the BGP service. Example values: Established, Failed
          • new_state: Current state of the BGP service. Example values: Established, Failed
          • old_last_reset_time: Previous time that the BGP service was reset. Example value: Apr 3, 2019, 4:17 PM
          • new_last_reset_time: Most recent time that the BGP service was reset. Example value: Apr 8, 2019, 11:38 AM

          ConfigDiff
          • message_type: Network protocol or service identifier. Example values: configdiff
          • hostname: User-defined, text-based name for a switch or host. Example values: server02, leaf11, exit01, spine-4
          • vni: Virtual Network Instance identifier. Example values: 12, 23
          • old_state: Previous state of the configuration file. Example values: created, modified
          • new_state: Current state of the configuration file. Example values: created, modified

          EVPN
          • message_type: Network protocol or service identifier. Example values: evpn
          • hostname: User-defined, text-based name for a switch or host. Example values: server02, leaf-9, exit01, spine04
          • vni: Virtual Network Instance identifier. Example values: 12, 23
          • old_in_kernel_state: Previous VNI state, in kernel or not. Example values: true, false
          • new_in_kernel_state: Current VNI state, in kernel or not. Example values: true, false
          • old_adv_all_vni_state: Previous VNI advertising state, advertising all or not. Example values: true, false
          • new_adv_all_vni_state: Current VNI advertising state, advertising all or not. Example values: true, false

          LCM
          • message_type: Network protocol or service identifier. Example values: clag
          • hostname: User-defined, text-based name for a switch or host. Example values: server02, leaf-9, exit01, spine04
          • old_conflicted_bonds: Previous pair of interfaces in a conflicted bond. Example values: swp7 swp8, swp3 swp4
          • new_conflicted_bonds: Current pair of interfaces in a conflicted bond. Example values: swp11 swp12, swp23 swp24
          • old_state_protodownbond: Previous state of the bond. Example values: protodown, up
          • new_state_protodownbond: Current state of the bond. Example values: protodown, up

          Link
          • message_type: Network protocol or service identifier. Example values: link
          • hostname: User-defined, text-based name for a switch or host. Example values: server02, leaf-6, exit01, spine7
          • ifname: Software interface name. Example values: eth0, swp53

          LLDP
          • message_type: Network protocol or service identifier. Example values: lldp
          • hostname: User-defined, text-based name for a switch or host. Example values: server02, leaf41, exit01, spine-5, tor-36
          • ifname: Software interface name. Example values: eth1, swp12
          • old_peer_ifname: Previous software interface name. Example values: eth1, swp12, swp27
          • new_peer_ifname: Current software interface name. Example values: eth1, swp12, swp27
          • old_peer_hostname: Previous user-defined, text-based name for a peer switch or host. Example values: server02, leaf41, exit01, spine-5, tor-36
          • new_peer_hostname: Current user-defined, text-based name for a peer switch or host. Example values: server02, leaf41, exit01, spine-5, tor-36

          MLAG (CLAG)
          • message_type: Network protocol or service identifier. Example values: clag
          • hostname: User-defined, text-based name for a switch or host. Example values: server02, leaf-9, exit01, spine04
          • old_conflicted_bonds: Previous pair of interfaces in a conflicted bond. Example values: swp7 swp8, swp3 swp4
          • new_conflicted_bonds: Current pair of interfaces in a conflicted bond. Example values: swp11 swp12, swp23 swp24
          • old_state_protodownbond: Previous state of the bond. Example values: protodown, up
          • new_state_protodownbond: Current state of the bond. Example values: protodown, up

          Node
          • message_type: Network protocol or service identifier. Example values: node
          • hostname: User-defined, text-based name for a switch or host. Example values: server02, leaf41, exit01, spine-5, tor-36
          • ntp_state: Current state of the NTP service. Example values: in sync, not sync
          • db_state: Current state of the DB. Example values: Add, Update, Del, Dead

          NTP
          • message_type: Network protocol or service identifier. Example values: ntp
          • hostname: User-defined, text-based name for a switch or host. Example values: server02, leaf-9, exit01, spine04
          • old_state: Previous state of the service. Example values: in sync, not sync
          • new_state: Current state of the service. Example values: in sync, not sync

          Port
          • message_type: Network protocol or service identifier. Example values: port
          • hostname: User-defined, text-based name for a switch or host. Example values: server02, leaf13, exit01, spine-8, tor-36
          • ifname: Interface name. Example values: eth0, swp14
          • old_speed: Previous speed rating of the port. Example values: 10 G, 25 G, 40 G, unknown
          • old_transreceiver: Previous transceiver. Example values: 40G Base-CR4, 25G Base-CR
          • old_vendor_name: Previous vendor name of the installed port module. Example values: Amphenol, OEM, NVIDIA, Fiberstore, Finisar
          • old_serial_number: Previous serial number of the installed port module. Example values: MT1507VS05177, AVE1823402U, PTN1VH2
          • old_supported_fec: Previous forward error correction (FEC) support status. Example values: none, Base R, RS
          • old_advertised_fec: Previous FEC advertising state. Example values: true, false, not reported
          • old_fec: Previous FEC capability. Example value: none
          • old_autoneg: Previous activation state of auto-negotiation. Example values: on, off
          • new_speed: Current speed rating of the port. Example values: 10 G, 25 G, 40 G
          • new_transreceiver: Current transceiver. Example values: 40G Base-CR4, 25G Base-CR
          • new_vendor_name: Current vendor name of the installed port module. Example values: Amphenol, OEM, NVIDIA, Fiberstore, Finisar
          • new_part_number: Current part number of the installed port module. Example values: SFP-H10GB-CU1M, MC3309130-001, 603020003
          • new_serial_number: Current serial number of the installed port module. Example values: MT1507VS05177, AVE1823402U, PTN1VH2
          • new_supported_fec: Current FEC support status. Example values: none, Base R, RS
          • new_advertised_fec: Current FEC advertising state. Example values: true, false
          • new_fec: Current FEC capability. Example value: none
          • new_autoneg: Current activation state of auto-negotiation. Example values: on, off

          Sensors
          • sensor: Network protocol or service identifier. Example values: fan1, fan-2 (fan); psu1, psu2 (power supply unit); psu1temp1, temp2 (temperature)
          • hostname: User-defined, text-based name for a switch or host. Example values: server02, leaf-26, exit01, spine2-4
          • old_state: Previous state of a fan, power supply unit, or thermal sensor. Example values: ok, absent, bad (fan and PSU); ok, busted, bad, critical (temperature)
          • new_state: Current state of a fan, power supply unit, or thermal sensor. Example values: ok, absent, bad (fan and PSU); ok, busted, bad, critical (temperature)
          • old_s_state: Previous state of a fan or power supply unit. Example values: up, down
          • new_s_state: Current state of a fan or power supply unit. Example values: up, down
          • new_s_max: Current maximum temperature threshold value. Example value: 110
          • new_s_crit: Current critical high temperature threshold value. Example value: 85
          • new_s_lcrit: Current critical low temperature threshold value. Example value: -25
          • new_s_min: Current minimum temperature threshold value. Example value: -50

          Services
          • message_type: Network protocol or service identifier. Example values: services
          • hostname: User-defined, text-based name for a switch or host. Example values: server02, leaf03, exit01, spine-8
          • name: Name of the service. Example values: clagd, lldpd, ssh, ntp, netqd, netq-agent
          • old_pid: Previous process or service identifier. Example values: 12323, 52941
          • new_pid: Current process or service identifier. Example values: 12323, 52941
          • old_status: Previous status of the service. Example values: up, down
          • new_status: Current status of the service. Example values: up, down

          Rule names are case sensitive, and you cannot use wildcards. Rule names can contain spaces, but you must enclose them with single quotes in commands. For better readability, use dashes in place of spaces or use mixed case. For example, use bgpSessionChanges, BGP-session-changes, or BGPsessions instead of 'BGP Session Changes'. Use tab completion to view the command options syntax.

          Example Rules

          Create a BGP rule based on hostname:

          cumulus@switch:~$ netq add notification rule bgpHostname key hostname value spine-01
          Successfully added/updated rule bgpHostname 
          

          Create a rule based on a configuration file state change:

          cumulus@switch:~$ netq add notification rule sysconf key configdiff value updated
          Successfully added/updated rule sysconf
          

          Create an EVPN rule based on a VNI:

          cumulus@switch:~$ netq add notification rule evpnVni key vni value 42
          Successfully added/updated rule evpnVni
          

          Create an interface rule based on FEC support:

          cumulus@switch:~$ netq add notification rule fecSupport key new_supported_fec value supported
          Successfully added/updated rule fecSupport
          

          Create a service rule based on a status change:

          cumulus@switch:~$ netq add notification rule svcStatus key new_status value down
          Successfully added/updated rule svcStatus
          

          Create a sensor rule based on a threshold:

          cumulus@switch:~$ netq add notification rule overTemp key new_s_crit value 24
          Successfully added/updated rule overTemp
          

          Create an interface rule based on port:

          cumulus@switch:~$ netq add notification rule swp52 key port value swp52
          Successfully added/updated rule swp52 
          

          View Rule Configurations

          Use the netq show notification rule command to view the rules configured on your platform.
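
          For example, after you add the rules from the examples above, the output lists each rule's name, key, and value. The following is an abbreviated sketch that assumes several of those example rules are configured; your output reflects your own rules:

          cumulus@switch:~$ netq show notification rule
          Matching config_notify records:
          Name            Rule Key         Rule Value
          --------------- ---------------- --------------------
          bgpHostname     hostname         spine-01
          evpnVni         vni              42
          overTemp        new_s_crit       24
          svcStatus       new_status       down
          swp52           port             swp52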

          Create Filters

          You can limit or direct event messages using filters. You create filters based on the rules you define, and each filter contains one or more rules. When a message matches a rule, NetQ sends it to the indicated destination. Before you can create filters, you must define rules and configure channels.

          As you create filters, they are added to the bottom of a list of filters. By default, NetQ processes event messages against filters starting at the top of the filter list and works its way down until it finds a match. NetQ applies the first filter that matches an event message, ignoring the other filters. Then it moves to the next event message and reruns the process, starting at the top of the list of filters. NetQ ignores events that do not match any filter.

          You might have to change the order of filters in the list to ensure you capture the events you want and drop the events you do not want. You can do this with the before or after keywords, which ensure one filter is processed before or after another.

          This diagram shows an example with four defined filters and sample output results.

          Filter names are case sensitive. They can contain spaces, but you must enclose them with single quotes in commands. For better readability, use dashes or mixed case instead of spaces. For example, use bgpSessionChanges, BGP-session-changes, or BGPsessions instead of 'BGP Session Changes'.

          Example Filters

          Create a filter for BGP events on a particular device:

          cumulus@switch:~$ netq add notification filter bgpSpine rule bgpHostname channel pd-netq-events
          Successfully added/updated filter bgpSpine
          

          Create a filter for a given VNI in your EVPN overlay:

          cumulus@switch:~$ netq add notification filter vni42 severity warning rule evpnVni channel pd-netq-events
          Successfully added/updated filter vni42
          

          Create a filter for when a configuration file is updated:

          cumulus@switch:~$ netq add notification filter configChange severity info rule sysconf channel slk-netq-events
          Successfully added/updated filter configChange
          

          Create a filter to monitor ports with FEC support:

          cumulus@switch:~$ netq add notification filter newFEC rule fecSupport channel slk-netq-events
          Successfully added/updated filter newFEC
          

          Create a filter to monitor for services that change to a down state:

          cumulus@switch:~$ netq add notification filter svcDown severity error rule svcStatus channel slk-netq-events
          Successfully added/updated filter svcDown
          

          Create a filter to monitor overheating platforms:

          cumulus@switch:~$ netq add notification filter critTemp severity error rule overTemp channel onprem-email
          Successfully added/updated filter critTemp
          

          Create a filter to drop messages from a given interface, and match against this filter before any other filters. To create a drop-style filter, do not specify a channel. To list the filter first, use the before option.

          cumulus@switch:~$ netq add notification filter swp52Drop severity error rule swp52 before bgpSpine
          Successfully added/updated filter swp52Drop
          

          View the Filter Configurations

          Use the netq show notification filter command to view the filters configured on your platform.
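
          For example, after you add the filters from the examples above, the output shows each filter's processing order, severity, channel, and associated rule. The following is an abbreviated sketch; the order, severities, and channels reflect your own configuration:

          cumulus@switch:~$ netq show notification filter
          Matching config_notify records:
          Name            Order      Severity         Channels         Rules
          --------------- ---------- ---------------- ---------------- ----------
          bgpSpine        1          info             pd-netq-events   bgpHostnam
                                                                       e
          vni42           2          warning          pd-netq-events   evpnVni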

          Reorder Filters

          To reorder event filters, use the before and after options. For example, to place two critical event filters just below a drop filter:

          cumulus@switch:~$ netq add notification filter critTemp after swp52Drop
          Successfully added/updated filter critTemp
          cumulus@switch:~$ netq add notification filter svcDown before bgpSpine
          Successfully added/updated filter svcDown
          

          You do not need to reenter all the severity, channel, and rule information for existing filters if you only want to change their processing order.

          Run the netq show notification filter command again to verify the changes.

          Suppress Events

          Suppressing events reduces the number of event notifications NetQ displays. You can create rules to suppress events attributable to known issues or false alarms. In addition to the rules you create to suppress events, NetQ suppresses some events by default.

          You can suppress events for the following types of messages:

          NetQ suppresses BGP, EVPN, link, and sensor-related events with a severity level of "info" by default in the UI. You can disable this rule if you'd prefer to receive these event notifications.

          Create an Event Suppression Configuration

          To suppress events using the NetQ UI:

          1. Click Menu, then Events.
          2. In the top-right corner, select Show suppression rules.
          3. Select Add rule. You can configure individual suppression rules or you can create a group rule that suppresses events for all message types.
          4. Enter the suppression rule parameters and click Create.

          When you add a new configuration using the CLI, you can specify a scope, which limits the suppression in the following order:

          1. Hostname.
          2. Severity.
          3. Message type-specific filters. For example, the target VNI for EVPN messages, or the interface name for a link message.

          NetQ has a predefined set of filter conditions. To see these conditions, run netq show events-config show-filter-conditions:

          cumulus@switch:~$ netq show events-config show-filter-conditions
          Matching config_events records:
          Message Name             Filter Condition Name                      Filter Condition Hierarchy                           Filter Condition Description
          ------------------------ ------------------------------------------ ---------------------------------------------------- --------------------------------------------------------
          evpn                     vni                                        3                                                    Target VNI
          evpn                     severity                                   2                                                    Severity error/info
          evpn                     hostname                                   1                                                    Target Hostname
          clsupport                fileAbsName                                3                                                    Target File Absolute Name
          clsupport                severity                                   2                                                    Severity error/info
          clsupport                hostname                                   1                                                    Target Hostname
          link                     new_state                                  4                                                    up / down
          link                     ifname                                     3                                                    Target Ifname
          link                     severity                                   2                                                    Severity error/info
          link                     hostname                                   1                                                    Target Hostname
          ospf                     ifname                                     3                                                    Target Ifname
          ospf                     severity                                   2                                                    Severity error/info
          ospf                     hostname                                   1                                                    Target Hostname
          sensor                   new_s_state                                4                                                    New Sensor State Eg. ok
          sensor                   sensor                                     3                                                    Target Sensor Name Eg. Fan, Temp
          sensor                   severity                                   2                                                    Severity error/info
          sensor                   hostname                                   1                                                    Target Hostname
          configdiff               old_state                                  5                                                    Old State
          configdiff               new_state                                  4                                                    New State
          configdiff               type                                       3                                                    File Name
          configdiff               severity                                   2                                                    Severity error/info
          configdiff               hostname                                   1                                                    Target Hostname
          ssdutil                  info                                       3                                                    low health / significant health drop
          ssdutil                  severity                                   2                                                    Severity error/info
          ssdutil                  hostname                                   1                                                    Target Hostname
          agent                    db_state                                   3                                                    Database State
          agent                    severity                                   2                                                    Severity error/info
          agent                    hostname                                   1                                                    Target Hostname
          ntp                      new_state                                  3                                                    yes / no
          ntp                      severity                                   2                                                    Severity error/info
          ntp                      hostname                                   1                                                    Target Hostname
          bgp                      vrf                                        4                                                    Target VRF
          bgp                      peer                                       3                                                    Target Peer
          bgp                      severity                                   2                                                    Severity error/info
          bgp                      hostname                                   1                                                    Target Hostname
          services                 new_status                                 4                                                    active / inactive
          services                 name                                       3                                                    Target Service Name Eg.netqd, mstpd, zebra
          services                 severity                                   2                                                    Severity error/info
          services                 hostname                                   1                                                    Target Hostname
          btrfsinfo                info                                       3                                                    high btrfs allocation space / data storage efficiency
          btrfsinfo                severity                                   2                                                    Severity error/info
          btrfsinfo                hostname                                   1                                                    Target Hostname
          clag                     severity                                   2                                                    Severity error/info
          clag                     hostname                                   1                                                    Target Hostname
          

          For example, to create a configuration called mybtrfs that suppresses OSPF-related events on leaf01 for the next 10 minutes, run:

          netq add events-config events_config_name mybtrfs message_type ospf scope '[{"scope_name":"hostname","scope_value":"leaf01"},{"scope_name":"severity","scope_value":"*"}]' suppress_until 600
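
          Following the same pattern, this sketch adds a message type-specific scope filter (the ifname condition listed for link events in the table above) to suppress link events on a single interface for 30 minutes. The configuration name linkFlapMute, hostname leaf01, and interface swp1 are placeholders for illustration; suppress_until is expressed in seconds:

          netq add events-config events_config_name linkFlapMute message_type link scope '[{"scope_name":"hostname","scope_value":"leaf01"},{"scope_name":"severity","scope_value":"*"},{"scope_name":"ifname","scope_value":"swp1"}]' suppress_until 1800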
          

          Delete or Disable an Event Suppression Rule

          You can delete or disable suppression rules. After you delete a rule, event notifications will resume. Disabling suppression rules pauses those rules, allowing you to receive event notifications temporarily.

          To remove suppressed event configurations:

          1. Click Menu, then Events.
          2. Select Show suppression rules at the top of the page.
          3. Toggle between the Single and All tabs to view the suppression rules. Navigate to the rule you would like to delete or disable.
          4. Click the three-dot menu and select Delete. If you’d like to pause the rule instead of deleting it, click Disable.

          To remove an event suppression configuration, run netq del events-config events_config_id <text-events-config-id-anchor>.

          cumulus@switch:~$ netq del events-config events_config_id eventsconfig_10
          Successfully deleted Events Config eventsconfig_10
          

          Show Event Suppression Rules

          To view suppressed events:

          1. Click Menu, then Events.
          2. Select Show suppression rules at the top of the page.
          3. Toggle between the Single and All tabs to view individual and group rules, respectively.

          You can view all event suppression configurations, or you can filter by a specific configuration or message type.

          cumulus@switch:~$ netq show events-config events_config_id eventsconfig_1
          Matching config_events records:
          Events Config ID     Events Config Name   Message Type         Scope                                                        Active Suppress Until
          -------------------- -------------------- -------------------- ------------------------------------------------------------ ------ --------------------
          eventsconfig_1       job_cl_upgrade_2d89c agent                {"db_state":"*","hostname":"spine02","severity":"*"}         True   Tue Jul  7 16:16:20
                               21b3effd79796e585c35                                                                                          2020
                               096d5fc6cef32b463e37
                               cca88d8ee862ae104d5_
                               spine02
          eventsconfig_1       job_cl_upgrade_2d89c bgp                  {"vrf":"*","peer":"*","hostname":"spine04","severity":"*"}   True   Tue Jul  7 16:16:20
                               21b3effd79796e585c35                                                                                          2020
                               096d5fc6cef32b463e37
                               cca88d8ee862ae104d5_
                               spine04
          eventsconfig_1       job_cl_upgrade_2d89c btrfsinfo            {"hostname":"spine04","info":"*","severity":"*"}             True   Tue Jul  7 16:16:20
                               21b3effd79796e585c35                                                                                          2020
                               096d5fc6cef32b463e37
                               cca88d8ee862ae104d5_
                               spine04
          eventsconfig_1       job_cl_upgrade_2d89c clag                 {"hostname":"spine04","severity":"*"}                        True   Tue Jul  7 16:16:20
                               21b3effd79796e585c35                                                                                          2020
                               096d5fc6cef32b463e37
                               cca88d8ee862ae104d5_
                               spine04
          eventsconfig_1       job_cl_upgrade_2d89c clsupport            {"fileAbsName":"*","hostname":"spine04","severity":"*"}      True   Tue Jul  7 16:16:20
                               21b3effd79796e585c35                                                                                          2020
                               096d5fc6cef32b463e37
                               cca88d8ee862ae104d5_
                               spine04
          eventsconfig_1       job_cl_upgrade_2d89c configdiff           {"new_state":"*","old_state":"*","type":"*","hostname":"spin True   Tue Jul  7 16:16:20
                               21b3effd79796e585c35                      e04","severity":"*"}                                                2020
                               096d5fc6cef32b463e37
                               cca88d8ee862ae104d5_
                               spine04
          eventsconfig_1       job_cl_upgrade_2d89c evpn                 {"hostname":"spine04","vni":"*","severity":"*"}              True   Tue Jul  7 16:16:20
                               21b3effd79796e585c35                                                                                          2020
                               096d5fc6cef32b463e37
                               cca88d8ee862ae104d5_
                               spine04
          eventsconfig_1       job_cl_upgrade_2d89c link                 {"ifname":"*","new_state":"*","hostname":"spine04","severity True   Tue Jul  7 16:16:20
                               21b3effd79796e585c35                      ":"*"}                                                              2020
                               096d5fc6cef32b463e37
                               cca88d8ee862ae104d5_
                               spine04
          eventsconfig_1       job_cl_upgrade_2d89c ntp                  {"new_state":"*","hostname":"spine04","severity":"*"}        True   Tue Jul  7 16:16:20
                               21b3effd79796e585c35                                                                                          2020
                               096d5fc6cef32b463e37
                               cca88d8ee862ae104d5_
                               spine04
          eventsconfig_1       job_cl_upgrade_2d89c ospf                 {"ifname":"*","hostname":"spine04","severity":"*"}           True   Tue Jul  7 16:16:20
                               21b3effd79796e585c35                                                                                          2020
                               096d5fc6cef32b463e37
                               cca88d8ee862ae104d5_
                               spine04
          eventsconfig_1       job_cl_upgrade_2d89c sensor               {"sensor":"*","new_s_state":"*","hostname":"spine04","severi True   Tue Jul  7 16:16:20
                               21b3effd79796e585c35                      ty":"*"}                                                            2020
                               096d5fc6cef32b463e37
                               cca88d8ee862ae104d5_
                               spine04
          eventsconfig_1       job_cl_upgrade_2d89c services             {"new_status":"*","name":"*","hostname":"spine04","severity" True   Tue Jul  7 16:16:20
                               21b3effd79796e585c35                      :"*"}                                                               2020
                               096d5fc6cef32b463e37
                               cca88d8ee862ae104d5_
                               spine04
          eventsconfig_1       job_cl_upgrade_2d89c ssdutil              {"hostname":"spine04","info":"*","severity":"*"}             True   Tue Jul  7 16:16:20
                               21b3effd79796e585c35                                                                                          2020
                               096d5fc6cef32b463e37
                               cca88d8ee862ae104d5_
                               spine04
          eventsconfig_10      job_cl_upgrade_2d89c btrfsinfo            {"hostname":"fw2","info":"*","severity":"*"}                 True   Tue Jul  7 16:16:22
                               21b3effd79796e585c35                                                                                          2020
                               096d5fc6cef32b463e37
                               cca88d8ee862ae104d5_
                               fw2
          eventsconfig_10      job_cl_upgrade_2d89c clag                 {"hostname":"fw2","severity":"*"}                            True   Tue Jul  7 16:16:22
                               21b3effd79796e585c35                                                                                          2020
                               096d5fc6cef32b463e37
                               cca88d8ee862ae104d5_
                               fw2
          eventsconfig_10      job_cl_upgrade_2d89c clsupport            {"fileAbsName":"*","hostname":"fw2","severity":"*"}          True   Tue Jul  7 16:16:22
                               21b3effd79796e585c35                                                                                          2020
                               096d5fc6cef32b463e37
                               cca88d8ee862ae104d5_
                               fw2
          eventsconfig_10      job_cl_upgrade_2d89c link                 {"ifname":"*","new_state":"*","hostname":"fw2","severity":"* True   Tue Jul  7 16:16:22
                               21b3effd79796e585c35                      "}                                                                  2020
                               096d5fc6cef32b463e37
                               cca88d8ee862ae104d5_
                               fw2
          eventsconfig_10      job_cl_upgrade_2d89c ospf                 {"ifname":"*","hostname":"fw2","severity":"*"}               True   Tue Jul  7 16:16:22
                               21b3effd79796e585c35                                                                                          2020
                               096d5fc6cef32b463e37
                               cca88d8ee862ae104d5_
                               fw2
          eventsconfig_10      job_cl_upgrade_2d89c sensor               {"sensor":"*","new_s_state":"*","hostname":"fw2","severity": True   Tue Jul  7 16:16:22
                               21b3effd79796e585c35                      "*"}                                                                2020
                               096d5fc6cef32b463e37
                               cca88d8ee862ae104d5_
                               fw2
          

          When you filter for a message type, you must include the show-filter-conditions keyword to show the conditions associated with that message type and the hierarchy in which they get processed.

          cumulus@switch:~$ netq show events-config message_type evpn show-filter-conditions
          Matching config_events records:
          Message Name             Filter Condition Name                      Filter Condition Hierarchy                           Filter Condition Description
          ------------------------ ------------------------------------------ ---------------------------------------------------- --------------------------------------------------------
          evpn                     vni                                        3                                                    Target VNI
          evpn                     severity                                   2                                                    Severity error/info
          evpn                     hostname                                   1                                                    Target Hostname
          

          Examples of Advanced Notification Configurations

          The following section lists examples of advanced notification configurations.

          Create a Notification for BGP Events from a Selected Switch

          This example creates a notification integration with a PagerDuty channel called pd-netq-events. It then creates a rule bgpHostname and a filter called bgpSpine for any notifications from spine-01. The result is that any info severity event messages from spine-01 are filtered to the pd-netq-events channel.

          cumulus@switch:~$ netq add notification channel pagerduty pd-netq-events integration-key 1234567890
          Successfully added/updated channel pd-netq-events
          cumulus@switch:~$ netq add notification rule bgpHostname key hostname value spine-01
          Successfully added/updated rule bgpHostname
           
          cumulus@switch:~$ netq add notification filter bgpSpine rule bgpHostname channel pd-netq-events
          Successfully added/updated filter bgpSpine
          cumulus@switch:~$ netq show notification channel
          Matching config_notify records:
          Name            Type             Severity         Channel Info
          --------------- ---------------- ---------------- ------------------------
          pd-netq-events  pagerduty        info             integration-key: 1234567
                                                            890   
          
          cumulus@switch:~$ netq show notification rule
          Matching config_notify records:
          Name            Rule Key         Rule Value
          --------------- ---------------- --------------------
          bgpHostname     hostname         spine-01
           
          cumulus@switch:~$ netq show notification filter
          Matching config_notify records:
          Name            Order      Severity         Channels         Rules
          --------------- ---------- ---------------- ---------------- ----------
          bgpSpine        1          info             pd-netq-events   bgpHostnam
                                                                       e
          

          Create a Notification for Errors on a Given EVPN VNI

          This example creates a notification integration with a PagerDuty channel called pd-netq-events. It then creates a rule evpnVni and a filter called vni42 for any error messages from VNI 42 on the EVPN overlay network. The result is that any event messages from VNI 42 with a severity level of ‘error’ are filtered to the pd-netq-events channel.

          cumulus@switch:~$ netq add notification channel pagerduty pd-netq-events integration-key 1234567890
          Successfully added/updated channel pd-netq-events
           
          cumulus@switch:~$ netq add notification rule evpnVni key vni value 42
          Successfully added/updated rule evpnVni
           
          cumulus@switch:~$ netq add notification filter vni42 severity error rule evpnVni channel pd-netq-events
          Successfully added/updated filter vni42
           
          cumulus@switch:~$ netq show notification channel
          Matching config_notify records:
          Name            Type             Severity         Channel Info
          --------------- ---------------- ---------------- ------------------------
          pd-netq-events  pagerduty        info             integration-key: 1234567
                                                            890   
          
          cumulus@switch:~$ netq show notification rule
          Matching config_notify records:
          Name            Rule Key         Rule Value
          --------------- ---------------- --------------------
          bgpHostname     hostname         spine-01
          evpnVni         vni              42
           
          cumulus@switch:~$ netq show notification filter
          Matching config_notify records:
          Name            Order      Severity         Channels         Rules
          --------------- ---------- ---------------- ---------------- ----------
          bgpSpine        1          info             pd-netq-events   bgpHostnam
                                                                       e
          vni42           2          error            pd-netq-events   evpnVni
          

          Create a Notification for Configuration File Changes

          This example creates a notification integration with a Slack channel called slk-netq-events. It then creates a rule sysconf and a filter called configChange for any configuration file update messages. The result is that any configuration update messages are filtered to the slk-netq-events channel.

          cumulus@switch:~$ netq add notification channel slack slk-netq-events webhook https://hooks.slack.com/services/text/moretext/evenmoretext
          Successfully added/updated channel slk-netq-events
           
          cumulus@switch:~$ netq add notification rule sysconf key message_type value configdiff
          Successfully added/updated rule sysconf
           
          cumulus@switch:~$ netq add notification filter configChange severity info rule sysconf channel slk-netq-events
          Successfully added/updated filter configChange
           
          cumulus@switch:~$ netq show notification channel
          Matching config_notify records:
          Name            Type             Severity Channel Info
          --------------- ---------------- -------- ----------------------
          slk-netq-events slack            info     webhook:https://hooks.s
                                                    lack.com/services/text/
                                                    moretext/evenmoretext     
           
          cumulus@switch:~$ netq show notification rule
          Matching config_notify records:
          Name            Rule Key         Rule Value
          --------------- ---------------- --------------------
          bgpHostname     hostname         spine-01
          evpnVni         vni              42
          sysconf         message_type     configdiff 
          
          cumulus@switch:~$ netq show notification filter
          Matching config_notify records:
          Name            Order      Severity         Channels         Rules
          --------------- ---------- ---------------- ---------------- ----------
          bgpSpine        1          info             pd-netq-events   bgpHostnam
                                                                       e
          vni42           2          error            pd-netq-events   evpnVni
          configChange    3          info             slk-netq-events  sysconf
          

          Create a Notification for When a Service Goes Down

          This example creates a notification integration with a Slack channel called slk-netq-events. It then creates a rule svcStatus and a filter called svcDown for any service state messages indicating a service is no longer operational. The result is that any service down messages are filtered to the slk-netq-events channel.

          cumulus@switch:~$ netq add notification channel slack slk-netq-events webhook https://hooks.slack.com/services/text/moretext/evenmoretext
          Successfully added/updated channel slk-netq-events
           
          cumulus@switch:~$ netq add notification rule svcStatus key new_status value down
          Successfully added/updated rule svcStatus
           
          cumulus@switch:~$ netq add notification filter svcDown severity error rule svcStatus channel slk-netq-events
          Successfully added/updated filter svcDown
           
          cumulus@switch:~$ netq show notification channel
          Matching config_notify records:
          Name            Type             Severity Channel Info
          --------------- ---------------- -------- ----------------------
          slk-netq-events slack            info     webhook:https://hooks.s
                                                    lack.com/services/text/
                                                    moretext/evenmoretext     
           
          cumulus@switch:~$ netq show notification rule
          Matching config_notify records:
          Name            Rule Key         Rule Value
          --------------- ---------------- --------------------
          bgpHostname     hostname         spine-01
          evpnVni         vni              42
          svcStatus       new_status       down
          sysconf         configdiff       updated
          
          cumulus@switch:~$ netq show notification filter
          Matching config_notify records:
          Name            Order      Severity         Channels         Rules
          --------------- ---------- ---------------- ---------------- ----------
          bgpSpine        1          info             pd-netq-events   bgpHostnam
                                                                       e
          vni42           2          error            pd-netq-events   evpnVni
          configChange    3          info             slk-netq-events  sysconf
          svcDown         4          error            slk-netq-events  svcStatus
          

          Create a Filter to Drop Notifications from a Given Interface

          This example creates a notification integration with a Slack channel called slk-netq-events. It then creates a rule swp52 and a filter called swp52Drop that drops all notifications for events from interface swp52.

          cumulus@switch:~$ netq add notification channel slack slk-netq-events webhook https://hooks.slack.com/services/text/moretext/evenmoretext
          Successfully added/updated channel slk-netq-events
           
          cumulus@switch:~$ netq add notification rule swp52 key port value swp52
          Successfully added/updated rule swp52
           
          cumulus@switch:~$ netq add notification filter swp52Drop severity error rule swp52 before bgpSpine
          Successfully added/updated filter swp52Drop
           
          cumulus@switch:~$ netq show notification channel
          Matching config_notify records:
          Name            Type             Severity Channel Info
          --------------- ---------------- -------- ----------------------
          slk-netq-events slack            info     webhook:https://hooks.s
                                                    lack.com/services/text/
                                                    moretext/evenmoretext     
           
          cumulus@switch:~$ netq show notification rule
          Matching config_notify records:
          Name            Rule Key         Rule Value
          --------------- ---------------- --------------------
          bgpHostname     hostname         spine-01
          evpnVni         vni              42
          svcStatus       new_status       down
          swp52           port             swp52
          sysconf         configdiff       updated
          
          cumulus@switch:~$ netq show notification filter
          Matching config_notify records:
          Name            Order      Severity         Channels         Rules
          --------------- ---------- ---------------- ---------------- ----------
          swp52Drop       1          error            NetqDefaultChann swp52
                                                      el
          bgpSpine        2          info             pd-netq-events   bgpHostnam
                                                                       e
          vni42           3          error            pd-netq-events   evpnVni
          configChange    4          info             slk-netq-events  sysconf
          svcDown         5          error            slk-netq-events  svcStatus
          

          Create a Notification for a Given Device that Has a Tendency to Overheat (Using Multiple Rules)

          This example creates a notification when switch leaf04 exceeds its high temperature threshold. Two rules are necessary to create this notification: one to identify the specific device and one to identify the temperature trigger. NetQ then sends the message to the pd-netq-events channel.

          cumulus@switch:~$ netq add notification channel pagerduty pd-netq-events integration-key 1234567890
          Successfully added/updated channel pd-netq-events
           
          cumulus@switch:~$ netq add notification rule switchLeaf04 key hostname value leaf04
          Successfully added/updated rule switchLeaf04
          cumulus@switch:~$ netq add notification rule overTemp key new_s_crit value 24
          Successfully added/updated rule overTemp
           
          cumulus@switch:~$ netq add notification filter critTemp rule switchLeaf04 channel pd-netq-events
          Successfully added/updated filter critTemp
          cumulus@switch:~$ netq add notification filter critTemp severity critical rule overTemp channel pd-netq-events
          Successfully added/updated filter critTemp
           
          cumulus@switch:~$ netq show notification channel
          Matching config_notify records:
          Name            Type             Severity         Channel Info
          --------------- ---------------- ---------------- ------------------------
          pd-netq-events  pagerduty        info             integration-key: 1234567
                                                            890
          
          cumulus@switch:~$ netq show notification rule
          Matching config_notify records:
          Name            Rule Key         Rule Value
          --------------- ---------------- --------------------
          bgpHostname     hostname         spine-01
          evpnVni         vni              42
          overTemp        new_s_crit       24
          svcStatus       new_status       down
          switchLeaf04    hostname         leaf04
          swp52           port             swp52
          sysconf         configdiff       updated
          
          cumulus@switch:~$ netq show notification filter
          Matching config_notify records:
          Name            Order      Severity         Channels         Rules
          --------------- ---------- ---------------- ---------------- ----------
          swp52Drop       1          error            NetqDefaultChann swp52
                                                      el
          bgpSpine        2          info             pd-netq-events   bgpHostnam
                                                                       e
          vni42           3          error            pd-netq-events   evpnVni
          configChange    4          info             slk-netq-events  sysconf
          svcDown         5          error            slk-netq-events  svcStatus
          critTemp        6          error            pd-netq-events   switchLeaf
                                                                       04
                                                                       overTemp
          

          Manage NetQ Event Notification Integrations

          You might need to modify event notification configurations at some point in the lifecycle of your deployment. You can add channels, rules, filters, and a proxy at any time. You can remove channels, rules, and filters if they are not part of an existing notification configuration.

          For integrations with threshold-based event notifications, refer to Configure System Event Notifications.

          Remove an Event Notification Channel

          You can remove channels if they are not part of an existing notification configuration.

          To remove notification channels:

          1. Expand the Menu and select Notification Channels.

          2. Select the tab for the type of channel you want to remove (Slack, PagerDuty, Syslog, Email).

          3. Select one or more channels.

          4. Click the Delete icon.

          To remove notification channels, run:

          netq del notification channel <text-channel-name-anchor>
          

          This example removes a Slack integration and verifies it is no longer in the configuration:

          cumulus@switch:~$ netq del notification channel slk-netq-events
          
          cumulus@switch:~$ netq show notification channel
          Matching config_notify records:
          Name            Type             Severity         Channel Info
          --------------- ---------------- ---------------- ------------------------
          pd-netq-events  pagerduty        info             integration-key: 1234567
                                                              890
          

          Delete an Event Notification Rule

          After some experience with a given rule, you might want to edit or remove it to better meet your needs. Using the NetQ CLI, you can remove rules if they are not part of an existing notification configuration.

          To remove notification rules, run:

          netq del notification rule <text-rule-name-anchor>
          

          This example removes a rule named swp52 and verifies it is no longer in the configuration:

          cumulus@switch:~$ netq del notification rule swp52
          
          cumulus@switch:~$ netq show notification rule
          Matching config_notify records:
          Name            Rule Key         Rule Value
          --------------- ---------------- --------------------
          bgpHostname     hostname         spine-01
          evpnVni         vni              42
          overTemp        new_s_crit       24
          svcStatus       new_status       down
          switchLeaf04    hostname         leaf04
          sysconf         configdiff       updated
          

          Delete an Event Notification Filter

          To delete notification filters, run:

          netq del notification filter <text-filter-name-anchor>
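
          For example, to remove the swp52Drop filter created earlier and confirm that it is no longer listed:

          cumulus@switch:~$ netq del notification filter swp52Drop
          cumulus@switch:~$ netq show notification filter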
          

          Delete an Event Notification Proxy

          You can remove the proxy server by running the netq del notification proxy command. After you remove the proxy, NetQ sends events directly to the notification channels.
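
          For example:

          cumulus@switch:~$ netq del notification proxy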

          Monitor Container Environments Using Kubernetes API Server

          The NetQ Agent monitors many aspects of containers on your network by integrating with the Kubernetes API server. In particular, the NetQ Agent tracks:

          This topic assumes a reasonable familiarity with Kubernetes terminology and architecture.

          Use NetQ with Kubernetes Clusters

          The NetQ Agent interfaces with the Kubernetes API server and listens to Kubernetes events. The NetQ Agent monitors network identity and physical network connectivity of Kubernetes resources like pods, daemon sets, services, and so forth. NetQ works with any container network interface (CNI), such as Calico or Flannel.

          The NetQ Kubernetes integration enables network administrators to:

          NetQ also helps network administrators identify changes within a Kubernetes cluster and determine if such changes had an adverse effect on the network performance (caused by a noisy neighbor for example). Additionally, NetQ helps the infrastructure administrator determine the distribution of Kubernetes workloads within a network.

          Requirements

          The NetQ Agent supports Kubernetes version 1.9.2 or later.

          Command Summary

          A large set of commands is available to monitor Kubernetes configurations, including commands to monitor clusters, nodes, daemon sets, deployments, pods, replica sets, replication controllers, and services. Run netq show kubernetes help to see all the possible commands.

          netq [<hostname>] show kubernetes cluster [name <kube-cluster-name>] [around <text-time>] [json]
          netq [<hostname>] show kubernetes node [components] [name <kube-node-name>] [cluster <kube-cluster-name> ] [label <kube-node-label>] [around <text-time>] [json]
          netq [<hostname>] show kubernetes daemon-set [name <kube-ds-name>] [cluster <kube-cluster-name>] [namespace <namespace>] [label <kube-ds-label>] [around <text-time>] [json]
          netq [<hostname>] show kubernetes daemon-set [name <kube-ds-name>] [cluster <kube-cluster-name>] [namespace <namespace>] [label <kube-ds-label>] connectivity [around <text-time>] [json]
          netq [<hostname>] show kubernetes deployment [name <kube-deployment-name>] [cluster <kube-cluster-name>] [namespace <namespace>] [label <kube-deployment-label>] [around <text-time>] [json]
          netq [<hostname>] show kubernetes deployment [name <kube-deployment-name>] [cluster <kube-cluster-name>] [namespace <namespace>] [label <kube-deployment-label>] connectivity [around <text-time>] [json]
          netq [<hostname>] show kubernetes pod [name <kube-pod-name>] [cluster <kube-cluster-name> ] [namespace <namespace>] [label <kube-pod-label>] [pod-ip <kube-pod-ipaddress>] [node <kube-node-name>] [around <text-time>] [json]
          netq [<hostname>] show kubernetes replication-controller [name <kube-rc-name>] [cluster <kube-cluster-name>] [namespace <namespace>] [label <kube-rc-label>] [around <text-time>] [json]
          netq [<hostname>] show kubernetes replica-set [name <kube-rs-name>] [cluster <kube-cluster-name>] [namespace <namespace>] [label <kube-rs-label>] [around <text-time>] [json]
          netq [<hostname>] show kubernetes replica-set [name <kube-rs-name>] [cluster <kube-cluster-name>] [namespace <namespace>] [label <kube-rs-label>] connectivity [around <text-time>] [json]
          netq [<hostname>] show kubernetes service [name <kube-service-name>] [cluster <kube-cluster-name>] [namespace <namespace>] [label <kube-service-label>] [service-cluster-ip <kube-service-cluster-ip>] [service-external-ip <kube-service-external-ip>] [around <text-time>] [json]
          netq [<hostname>] show kubernetes service [name <kube-service-name>] [cluster <kube-cluster-name>] [namespace <namespace>] [label <kube-service-label>] [service-cluster-ip <kube-service-cluster-ip>] [service-external-ip <kube-service-external-ip>] connectivity [around <text-time>] [json]
          netq <hostname> show impact kubernetes service [master <kube-master-node>] [name <kube-service-name>] [cluster <kube-cluster-name>] [namespace <namespace>] [label <kube-service-label>] [service-cluster-ip <kube-service-cluster-ip>] [service-external-ip <kube-service-external-ip>] [around <text-time>] [json]
          netq <hostname> show impact kubernetes replica-set [master <kube-master-node>] [name <kube-rs-name>] [cluster <kube-cluster-name>] [namespace <namespace>] [label <kube-rs-label>] [around <text-time>] [json]
          netq <hostname> show impact kubernetes deployment [master <kube-master-node>] [name <kube-deployment-name>] [cluster <kube-cluster-name>] [namespace <namespace>] [label <kube-deployment-label>] [around <text-time>] [json]
          netq config add agent kubernetes-monitor [poll-period <text-duration-period>]
          netq config del agent kubernetes-monitor
          netq config show agent kubernetes-monitor [json]
          

          Enable Kubernetes Monitoring

          For Kubernetes monitoring, the NetQ Agent must be installed, running, and enabled on the hosts providing the Kubernetes service.

          To enable NetQ Agent monitoring of the containers using the Kubernetes API, you must configure the following on the Kubernetes master node:

          1. Install and configure the NetQ Agent and CLI on the master node.

            Follow the steps outlined in Install NetQ Agents and Install NetQ CLI.

          2. Enable Kubernetes monitoring by the NetQ Agent on the master node.

            You can specify a polling period between 10 and 120 seconds; 15 seconds is the default.

            cumulus@host:~$ netq config add agent kubernetes-monitor poll-period 20
            Successfully added kubernetes monitor. Please restart netq-agent.
            
          3. Restart the NetQ agent.

            cumulus@host:~$ netq config restart agent
            
          4. After waiting for a minute, run the show command to view the cluster.

            cumulus@host:~$ netq show kubernetes cluster
            
          5. Next, you must enable the NetQ Agent on every worker node for complete insight into your container network. Repeat steps 2 and 3 on each worker node.
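
          To confirm the monitor configuration on a node, you can also run the netq config show agent kubernetes-monitor command from the command summary above:

          cumulus@host:~$ netq config show agent kubernetes-monitor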

          View Status of Kubernetes Clusters

          Run the netq show kubernetes cluster command to view the status of all Kubernetes clusters in the fabric. The following example shows two clusters: one with server11 as the master server and the other with server12 as the master server. Both are healthy and both list their associated worker nodes.

          cumulus@host:~$ netq show kubernetes cluster
          Matching kube_cluster records:
          Master                   Cluster Name     Controller Status    Scheduler Status Nodes
          ------------------------ ---------------- -------------------- ---------------- --------------------
          server11:3.0.0.68        default          Healthy              Healthy          server11 server13 se
                                                                                          rver22 server11 serv
                                                                                          er12 server23 server
                                                                                          24
          server12:3.0.0.69        default          Healthy              Healthy          server12 server21 se
                                                                                          rver23 server13 serv
                                                                                          er14 server21 server
                                                                                          22
          

          For deployments with multiple clusters, you can use the hostname option to filter the output. This example shows filtering of the list by server11:

          cumulus@host:~$ netq server11 show kubernetes cluster
          Matching kube_cluster records:
          Master                   Cluster Name     Controller Status    Scheduler Status Nodes
          ------------------------ ---------------- -------------------- ---------------- --------------------
          server11:3.0.0.68        default          Healthy              Healthy          server11 server13 se
                                                                                          rver22 server11 serv
                                                                                          er12 server23 server
                                                                                          24
          

          Optionally, use the json option to present the results in JSON format.

          cumulus@host:~$ netq show kubernetes cluster json
          {
              "kube_cluster":[
                  {
                      "clusterName":"default",
                      "schedulerStatus":"Healthy",
                      "master":"server12:3.0.0.69",
                      "nodes":"server12 server21 server23 server13 server14 server21 server22",
                      "controllerStatus":"Healthy"
                  },
                  {
                      "clusterName":"default",
                      "schedulerStatus":"Healthy",
                      "master":"server11:3.0.0.68",
                      "nodes":"server11 server13 server22 server11 server12 server23 server24",
                      "controllerStatus":"Healthy"
              }
              ],
              "truncatedResult":false
          }
          

          View Changes to a Cluster

          If data collection from the NetQ Agents is not occurring as it did previously, use the around option to verify that no changes have been made to the Kubernetes cluster configuration. Be sure to include the unit of measure with the around value. Valid units include:

          This example shows changes made to the cluster in the last hour, including the addition of the two master nodes and the various worker nodes for each cluster.

          cumulus@host:~$ netq show kubernetes cluster around 1h
          Matching kube_cluster records:
          Master                   Cluster Name     Controller Status    Scheduler Status Nodes                                    DBState  Last changed
          ------------------------ ---------------- -------------------- ---------------- ---------------------------------------- -------- -------------------------
          server11:3.0.0.68        default          Healthy              Healthy          server11 server13 server22 server11 serv Add      Fri Feb  8 01:50:50 2019
                                                                                          er12 server23 server24
          server12:3.0.0.69        default          Healthy              Healthy          server12 server21 server23 server13 serv Add      Fri Feb  8 01:50:50 2019
                                                                                          er14 server21 server22
          server12:3.0.0.69        default          Healthy              Healthy          server12 server21 server23 server13      Add      Fri Feb  8 01:50:50 2019
          server11:3.0.0.68        default          Healthy              Healthy          server11                                 Add      Fri Feb  8 01:50:50 2019
          server12:3.0.0.69        default          Healthy              Healthy          server12                                 Add      Fri Feb  8 01:50:50 2019
          

          View Kubernetes Pod Information

          You can show configuration and status of the pods in a cluster, including the names, labels, addresses, associated cluster and containers, and whether the pod is running. This example shows pods for FRR, nginx, Calico, and various Kubernetes components sorted by master node.

          cumulus@host:~$ netq show kubernetes pod
          Matching kube_pod records:
          Master                   Namespace    Name                 IP               Node         Labels               Status   Containers               Last Changed
          ------------------------ ------------ -------------------- ---------------- ------------ -------------------- -------- ------------------------ ----------------
          server11:3.0.0.68        default      cumulus-frr-8vssx    3.0.0.70         server13     pod-template-generat Running  cumulus-frr:f8cac70bb217 Fri Feb  8 01:50:50 2019
                                                                                                   ion:1 name:cumulus-f
                                                                                                   rr controller-revisi
                                                                                                   on-hash:3710533951
          server11:3.0.0.68        default      cumulus-frr-dkkgp    3.0.5.135        server24     pod-template-generat Running  cumulus-frr:577a60d5f40c Fri Feb  8 01:50:50 2019
                                                                                                   ion:1 name:cumulus-f
                                                                                                   rr controller-revisi
                                                                                                   on-hash:3710533951
          server11:3.0.0.68        default      cumulus-frr-f4bgx    3.0.3.196        server11     pod-template-generat Running  cumulus-frr:1bc73154a9f5 Fri Feb  8 01:50:50 2019
                                                                                                   ion:1 name:cumulus-f
                                                                                                   rr controller-revisi
                                                                                                   on-hash:3710533951
          server11:3.0.0.68        default      cumulus-frr-gqqxn    3.0.2.5          server22     pod-template-generat Running  cumulus-frr:3ee0396d126a Fri Feb  8 01:50:50 2019
                                                                                                   ion:1 name:cumulus-f
                                                                                                   rr controller-revisi
                                                                                                   on-hash:3710533951
          server11:3.0.0.68        default      cumulus-frr-kdh9f    3.0.3.197        server12     pod-template-generat Running  cumulus-frr:94b6329ecb50 Fri Feb  8 01:50:50 2019
                                                                                                   ion:1 name:cumulus-f
                                                                                                   rr controller-revisi
                                                                                                   on-hash:3710533951
          server11:3.0.0.68        default      cumulus-frr-mvv8m    3.0.5.134        server23     pod-template-generat Running  cumulus-frr:b5845299ce3c Fri Feb  8 01:50:50 2019
                                                                                                   ion:1 name:cumulus-f
                                                                                                   rr controller-revisi
                                                                                                   on-hash:3710533951
          server11:3.0.0.68        default      httpd-5456469bfd-bq9 10.244.49.65     server22     app:httpd            Running  httpd:79b7f532be2d       Fri Feb  8 01:50:50 2019
                                                zm
          server11:3.0.0.68        default      influxdb-6cdb566dd-8 10.244.162.128   server13     app:influx           Running  influxdb:15dce703cdec    Fri Feb  8 01:50:50 2019
                                                9lwn
          server11:3.0.0.68        default      nginx-8586cf59-26pj5 10.244.9.193     server24     run:nginx            Running  nginx:6e2b65070c86       Fri Feb  8 01:50:50 2019
          server11:3.0.0.68        default      nginx-8586cf59-c82ns 10.244.40.128    server12     run:nginx            Running  nginx:01b017c26725       Fri Feb  8 01:50:50 2019
          server11:3.0.0.68        default      nginx-8586cf59-wjwgp 10.244.49.64     server22     run:nginx            Running  nginx:ed2b4254e328       Fri Feb  8 01:50:50 2019
          server11:3.0.0.68        kube-system  calico-etcd-pfg9r    3.0.0.68         server11     k8s-app:calico-etcd  Running  calico-etcd:f95f44b745a7 Fri Feb  8 01:50:50 2019
                                                                                                   pod-template-generat
                                                                                                   ion:1 controller-rev
                                                                                                   ision-hash:142071906
                                                                                                   5
          server11:3.0.0.68        kube-system  calico-kube-controll 3.0.2.5          server22     k8s-app:calico-kube- Running  calico-kube-controllers: Fri Feb  8 01:50:50 2019
                                                ers-d669cc78f-4r5t2                                controllers                   3688b0c5e9c5
          server11:3.0.0.68        kube-system  calico-node-4px69    3.0.2.5          server22     k8s-app:calico-node  Running  calico-node:1d01648ebba4 Fri Feb  8 01:50:50 2019
                                                                                                   pod-template-generat          install-cni:da350802a3d2
                                                                                                   ion:1 controller-rev
                                                                                                   ision-hash:324404111
                                                                                                   9
          server11:3.0.0.68        kube-system  calico-node-bt8w6    3.0.3.196        server11     k8s-app:calico-node  Running  calico-node:9b3358a07e5e Fri Feb  8 01:50:50 2019
                                                                                                   pod-template-generat          install-cni:d38713e6fdd8
                                                                                                   ion:1 controller-rev
                                                                                                   ision-hash:324404111
                                                                                                   9
          server11:3.0.0.68        kube-system  calico-node-gtmkv    3.0.3.197        server12     k8s-app:calico-node  Running  calico-node:48fcc6c40a6b Fri Feb  8 01:50:50 2019
                                                                                                   pod-template-generat          install-cni:f0838a313eff
                                                                                                   ion:1 controller-rev
                                                                                                   ision-hash:324404111
                                                                                                   9
          server11:3.0.0.68        kube-system  calico-node-mvslq    3.0.5.134        server23     k8s-app:calico-node  Running  calico-node:7b361aece76c Fri Feb  8 01:50:50 2019
                                                                                                   pod-template-generat          install-cni:f2da6bc36bf8
                                                                                                   ion:1 controller-rev
                                                                                                   ision-hash:324404111
                                                                                                   9
          server11:3.0.0.68        kube-system  calico-node-sjj2s    3.0.5.135        server24     k8s-app:calico-node  Running  calico-node:6e13b2b73031 Fri Feb  8 01:50:50 2019
                                                                                                   pod-template-generat          install-cni:fa4b2b17fba9
                                                                                                   ion:1 controller-rev
                                                                                                   ision-hash:324404111
                                                                                                   9
          server11:3.0.0.68        kube-system  calico-node-vdkk5    3.0.0.70         server13     k8s-app:calico-node  Running  calico-node:fb3ec9429281 Fri Feb  8 01:50:50 2019
                                                                                                   pod-template-generat          install-cni:b56980da7294
                                                                                                   ion:1 controller-rev
                                                                                                   ision-hash:324404111
                                                                                                   9
          server11:3.0.0.68        kube-system  calico-node-zzfkr    3.0.0.68         server11     k8s-app:calico-node  Running  calico-node:c1ac399dd862 Fri Feb  8 01:50:50 2019
                                                                                                   pod-template-generat          install-cni:60a779fdc47a
                                                                                                   ion:1 controller-rev
                                                                                                   ision-hash:324404111
                                                                                                   9
          server11:3.0.0.68        kube-system  etcd-server11        3.0.0.68         server11     tier:control-plane c Running  etcd:dde63d44a2f5        Fri Feb  8 01:50:50 2019
                                                                                                   omponent:etcd
          server11:3.0.0.68        kube-system  kube-apiserver-hostd 3.0.0.68         server11     tier:control-plane c Running  kube-apiserver:0cd557bbf Fri Feb  8 01:50:50 2019
                                                -11                                                omponent:kube-apiser          2fe
                                                                                                   ver
          server11:3.0.0.68        kube-system  kube-controller-mana 3.0.0.68         server11     tier:control-plane c Running  kube-controller-manager: Fri Feb  8 01:50:50 2019
                                                ger-server11                                       omponent:kube-contro          89b2323d09b2
                                                                                                   ller-manager
          server11:3.0.0.68        kube-system  kube-dns-6f4fd4bdf-p 10.244.34.64     server23     k8s-app:kube-dns     Running  dnsmasq:284d9d363999 kub Fri Feb  8 01:50:50 2019
                                                lv7p                                                                             edns:bd8bdc49b950 sideca
                                                                                                                                 r:fe10820ffb19
          server11:3.0.0.68        kube-system  kube-proxy-4cx2t     3.0.3.197        server12     k8s-app:kube-proxy p Running  kube-proxy:49b0936a4212  Fri Feb  8 01:50:50 2019
                                                                                                   od-template-generati
                                                                                                   on:1 controller-revi
                                                                                                   sion-hash:3953509896
          server11:3.0.0.68        kube-system  kube-proxy-7674k     3.0.3.196        server11     k8s-app:kube-proxy p Running  kube-proxy:5dc2f5fe0fad  Fri Feb  8 01:50:50 2019
                                                                                                   od-template-generati
                                                                                                   on:1 controller-revi
                                                                                                   sion-hash:3953509896
          server11:3.0.0.68        kube-system  kube-proxy-ck5cn     3.0.2.5          server22     k8s-app:kube-proxy p Running  kube-proxy:6944f7ff8c18  Fri Feb  8 01:50:50 2019
                                                                                                   od-template-generati
                                                                                                   on:1 controller-revi
                                                                                                   sion-hash:3953509896
          server11:3.0.0.68        kube-system  kube-proxy-f9dt8     3.0.0.68         server11     k8s-app:kube-proxy p Running  kube-proxy:032cc82ef3f8  Fri Feb  8 01:50:50 2019
                                                                                                   od-template-generati
                                                                                                   on:1 controller-revi
                                                                                                   sion-hash:3953509896
          server11:3.0.0.68        kube-system  kube-proxy-j6qw6     3.0.5.135        server24     k8s-app:kube-proxy p Running  kube-proxy:10544e43212e  Fri Feb  8 01:50:50 2019
                                                                                                   od-template-generati
                                                                                                   on:1 controller-revi
                                                                                                   sion-hash:3953509896
          server11:3.0.0.68        kube-system  kube-proxy-lq8zz     3.0.5.134        server23     k8s-app:kube-proxy p Running  kube-proxy:1bcfa09bb186  Fri Feb  8 01:50:50 2019
                                                                                                   od-template-generati
                                                                                                   on:1 controller-revi
                                                                                                   sion-hash:3953509896
          server11:3.0.0.68        kube-system  kube-proxy-vg7kj     3.0.0.70         server13     k8s-app:kube-proxy p Running  kube-proxy:8fed384b68e5  Fri Feb  8 01:50:50 2019
                                                                                                   od-template-generati
                                                                                                   on:1 controller-revi
                                                                                                   sion-hash:3953509896
          server11:3.0.0.68        kube-system  kube-scheduler-hostd 3.0.0.68         server11     tier:control-plane c Running  kube-scheduler:c262a8071 Fri Feb  8 01:50:50 2019
                                                -11                                                omponent:kube-schedu          3cb
                                                                                                   ler
          server12:3.0.0.69        default      cumulus-frr-2gkdv    3.0.2.4          server21     pod-template-generat Running  cumulus-frr:25d1109f8898 Fri Feb  8 01:50:50 2019
                                                                                                   ion:1 name:cumulus-f
                                                                                                   rr controller-revisi
                                                                                                   on-hash:3710533951
          server12:3.0.0.69        default      cumulus-frr-b9dm5    3.0.3.199        server14     pod-template-generat Running  cumulus-frr:45063f9a095f Fri Feb  8 01:50:50 2019
                                                                                                   ion:1 name:cumulus-f
                                                                                                   rr controller-revisi
                                                                                                   on-hash:3710533951
          server12:3.0.0.69        default      cumulus-frr-rtqhv    3.0.2.6          server23     pod-template-generat Running  cumulus-frr:63e802a52ea2 Fri Feb  8 01:50:50 2019
                                                                                                   ion:1 name:cumulus-f
                                                                                                   rr controller-revisi
                                                                                                   on-hash:3710533951
          server12:3.0.0.69        default      cumulus-frr-tddrg    3.0.5.133        server22     pod-template-generat Running  cumulus-frr:52dd54e4ac9f Fri Feb  8 01:50:50 2019
                                                                                                   ion:1 name:cumulus-f
                                                                                                   rr controller-revisi
                                                                                                   on-hash:3710533951
          server12:3.0.0.69        default      cumulus-frr-vx7jp    3.0.5.132        server21     pod-template-generat Running  cumulus-frr:1c20addfcbd3 Fri Feb  8 01:50:50 2019
                                                                                                   ion:1 name:cumulus-f
                                                                                                   rr controller-revisi
                                                                                                   on-hash:3710533951
          server12:3.0.0.69        default      cumulus-frr-x7ft5    3.0.3.198        server13     pod-template-generat Running  cumulus-frr:b0f63792732e Fri Feb  8 01:50:50 2019
                                                                                                   ion:1 name:cumulus-f
                                                                                                   rr controller-revisi
                                                                                                   on-hash:3710533951
          server12:3.0.0.69        kube-system  calico-etcd-btqgt    3.0.0.69         server12     k8s-app:calico-etcd  Running  calico-etcd:72b1a16968fb Fri Feb  8 01:50:50 2019
                                                                                                   pod-template-generat
                                                                                                   ion:1 controller-rev
                                                                                                   ision-hash:142071906
                                                                                                   5
          server12:3.0.0.69        kube-system  calico-kube-controll 3.0.5.132        server21     k8s-app:calico-kube- Running  calico-kube-controllers: Fri Feb  8 01:50:50 2019
                                                ers-d669cc78f-bdnzk                                controllers                   6821bf04696f
          server12:3.0.0.69        kube-system  calico-node-4g6vd    3.0.3.198        server13     k8s-app:calico-node  Running  calico-node:1046b559a50c Fri Feb  8 01:50:50 2019
                                                                                                   pod-template-generat          install-cni:0a136851da17
                                                                                                   ion:1 controller-rev
                                                                                                   ision-hash:490828062
          server12:3.0.0.69        kube-system  calico-node-4hg6l    3.0.0.69         server12     k8s-app:calico-node  Running  calico-node:4e7acc83f8e8 Fri Feb  8 01:50:50 2019
                                                                                                   pod-template-generat          install-cni:a26e76de289e
                                                                                                   ion:1 controller-rev
                                                                                                   ision-hash:490828062
          server12:3.0.0.69        kube-system  calico-node-4p66v    3.0.2.6          server23     k8s-app:calico-node  Running  calico-node:a7a44072e4e2 Fri Feb  8 01:50:50 2019
                                                                                                   pod-template-generat          install-cni:9a19da2b2308
                                                                                                   ion:1 controller-rev
                                                                                                   ision-hash:490828062
          server12:3.0.0.69        kube-system  calico-node-5z7k4    3.0.5.133        server22     k8s-app:calico-node  Running  calico-node:9878b0606158 Fri Feb  8 01:50:50 2019
                                                                                                   pod-template-generat          install-cni:489f8f326cf9
                                                                                                   ion:1 controller-rev
                                                                                                   ision-hash:490828062
          ...
          

          You can filter this information to focus on pods on a particular node:

          cumulus@host:~$ netq show kubernetes pod node server11
          Matching kube_pod records:
          Master                   Namespace    Name                 IP               Node         Labels               Status   Containers               Last Changed
          ------------------------ ------------ -------------------- ---------------- ------------ -------------------- -------- ------------------------ ----------------
          server11:3.0.0.68        kube-system  calico-etcd-pfg9r    3.0.0.68         server11     k8s-app:calico-etcd  Running  calico-etcd:f95f44b745a7 2d:14h:0m:59s
                                                                                                   pod-template-generat
                                                                                                   ion:1 controller-rev
                                                                                                   ision-hash:142071906
                                                                                                   5
          server11:3.0.0.68        kube-system  calico-node-zzfkr    3.0.0.68         server11     k8s-app:calico-node  Running  calico-node:c1ac399dd862 2d:14h:0m:59s
                                                                                                   pod-template-generat          install-cni:60a779fdc47a
                                                                                                   ion:1 controller-rev
                                                                                                   ision-hash:324404111
                                                                                                   9
          server11:3.0.0.68        kube-system  etcd-server11        3.0.0.68         server11     tier:control-plane c Running  etcd:dde63d44a2f5        2d:14h:1m:44s
                                                                                                   omponent:etcd
          server11:3.0.0.68        kube-system  kube-apiserver-serve 3.0.0.68         server11     tier:control-plane c Running  kube-apiserver:0cd557bbf 2d:14h:1m:44s
                                                r11                                                omponent:kube-apiser          2fe
                                                                                                   ver
          server11:3.0.0.68        kube-system  kube-controller-mana 3.0.0.68         server11     tier:control-plane c Running  kube-controller-manager: 2d:14h:1m:44s
                                                ger-server11                                       omponent:kube-contro          89b2323d09b2
                                                                                                   ller-manager
          server11:3.0.0.68        kube-system  kube-proxy-f9dt8     3.0.0.68         server11     k8s-app:kube-proxy p Running  kube-proxy:032cc82ef3f8  2d:14h:0m:59s
                                                                                                   od-template-generati
                                                                                                   on:1 controller-revi
                                                                                                   sion-hash:3953509896
          server11:3.0.0.68        kube-system  kube-scheduler-serve 3.0.0.68         server11     tier:control-plane c Running  kube-scheduler:c262a8071 2d:14h:1m:44s
                                                r11                                                omponent:kube-schedu          3cb
                                                                                                   ler
          

          View Kubernetes Node Information

          You can view detailed information about a node, including its role in the cluster, pod CIDR, and kubelet status. This example shows all the nodes in the cluster with server11 as the master. Note that server11 also acts as a worker node, along with the other nodes in the cluster: server12, server13, server22, server23, and server24.

          cumulus@host:~$ netq server11 show kubernetes node
          Matching kube_cluster records:
          Master                   Cluster Name     Node Name            Role       Status           Labels               Pod CIDR                 Last Changed
          ------------------------ ---------------- -------------------- ---------- ---------------- -------------------- ------------------------ ----------------
          server11:3.0.0.68        default          server11             master     KubeletReady     node-role.kubernetes 10.224.0.0/24            14h:23m:46s
                                                                                                     .io/master: kubernet
                                                                                                     es.io/hostname:hostd
                                                                                                     -11 beta.kubernetes.
                                                                                                     io/arch:amd64 beta.k
                                                                                                     ubernetes.io/os:linu
                                                                                                     x
          server11:3.0.0.68        default          server13             worker     KubeletReady     kubernetes.io/hostna 10.224.3.0/24            14h:19m:56s
                                                                                                     me:server13 beta.kub
                                                                                                     ernetes.io/arch:amd6
                                                                                                     4 beta.kubernetes.io
                                                                                                     /os:linux
          server11:3.0.0.68        default          server22             worker     KubeletReady     kubernetes.io/hostna 10.224.1.0/24            14h:24m:31s
                                                                                                     me:server22 beta.kub
                                                                                                     ernetes.io/arch:amd6
                                                                                                     4 beta.kubernetes.io
                                                                                                     /os:linux
          server11:3.0.0.68        default          server11             worker     KubeletReady     kubernetes.io/hostna 10.224.2.0/24            14h:24m:16s
                                                                                                     me:server11 beta.kub
                                                                                                     ernetes.io/arch:amd6
                                                                                                     4 beta.kubernetes.io
                                                                                                     /os:linux
          server11:3.0.0.68        default          server12             worker     KubeletReady     kubernetes.io/hostna 10.224.4.0/24            14h:24m:16s
                                                                                                     me:server12 beta.kub
                                                                                                     ernetes.io/arch:amd6
                                                                                                     4 beta.kubernetes.io
                                                                                                     /os:linux
          server11:3.0.0.68        default          server23             worker     KubeletReady     kubernetes.io/hostna 10.224.5.0/24            14h:24m:16s
                                                                                                     me:server23 beta.kub
                                                                                                     ernetes.io/arch:amd6
                                                                                                     4 beta.kubernetes.io
                                                                                                     /os:linux
          server11:3.0.0.68        default          server24             worker     KubeletReady     kubernetes.io/hostna 10.224.6.0/24            14h:24m:1s
                                                                                                     me:server24 beta.kub
                                                                                                     ernetes.io/arch:amd6
                                                                                                     4 beta.kubernetes.io
                                                                                                     /os:linux
          

          To display the kubelet or Docker version, use the components option with the show command. This example lists the kubelet version, a proxy address if used, and the container status for the server11 master and worker nodes.

          cumulus@host:~$ netq server11 show kubernetes node components
          Matching kube_cluster records:
                                   Master           Cluster Name         Node Name    Kubelet      KubeProxy         Container Runt
                                                                                                                     ime
          ------------------------ ---------------- -------------------- ------------ ------------ ----------------- --------------
          server11:3.0.0.68        default          server11             v1.9.2       v1.9.2       docker://17.3.2   KubeletReady
          server11:3.0.0.68        default          server13             v1.9.2       v1.9.2       docker://17.3.2   KubeletReady
          server11:3.0.0.68        default          server22             v1.9.2       v1.9.2       docker://17.3.2   KubeletReady
          server11:3.0.0.68        default          server11             v1.9.2       v1.9.2       docker://17.3.2   KubeletReady
          server11:3.0.0.68        default          server12             v1.9.2       v1.9.2       docker://17.3.2   KubeletReady
          server11:3.0.0.68        default          server23             v1.9.2       v1.9.2       docker://17.3.2   KubeletReady
          server11:3.0.0.68        default          server24             v1.9.2       v1.9.2       docker://17.3.2   KubeletReady
          

          To view only the details for a selected node, use the name option with the hostname of that node following the components option:

          cumulus@host:~$ netq server11 show kubernetes node components name server13
          Matching kube_cluster records:
                                   Master           Cluster Name         Node Name    Kubelet      KubeProxy         Container Runt
                                                                                                                     ime
          ------------------------ ---------------- -------------------- ------------ ------------ ----------------- --------------
          server11:3.0.0.68        default          server13             v1.9.2       v1.9.2       docker://17.3.2   KubeletReady
          

          View Kubernetes Replica Set on a Node

          You can view information about the replica set, including the name, labels, and number of replicas present for each application. This example shows the number of replicas for each application in the server11 cluster:

          cumulus@host:~$ netq server11 show kubernetes replica-set
          Matching kube_replica records:
          Master                   Cluster Name Namespace        Replication Name               Labels               Replicas                           Ready Replicas Last Changed
          ------------------------ ------------ ---------------- ------------------------------ -------------------- ---------------------------------- -------------- ----------------
          server11:3.0.0.68        default      default          influxdb-6cdb566dd             app:influx           1                                  1              14h:19m:28s
          server11:3.0.0.68        default      default          nginx-8586cf59                 run:nginx            3                                  3              14h:24m:39s
          server11:3.0.0.68        default      default          httpd-5456469bfd               app:httpd            1                                  1              14h:19m:28s
          server11:3.0.0.68        default      kube-system      kube-dns-6f4fd4bdf             k8s-app:kube-dns     1                                  1              14h:27m:9s
          server11:3.0.0.68        default      kube-system      calico-kube-controllers-d669cc k8s-app:calico-kube- 1                                  1              14h:27m:9s
                                                                 78f                            controllers
          

          View the Daemon-sets on a Node

          You can view information about the daemon set running on the node. This example shows that six copies of the cumulus-frr daemon are running on the server11 node:

          cumulus@host:~$ netq server11 show kubernetes daemon-set namespace default
          Matching kube_daemonset records:
          Master                   Cluster Name Namespace        Daemon Set Name                Labels               Desired Count Ready Count Last Changed
          ------------------------ ------------ ---------------- ------------------------------ -------------------- ------------- ----------- ----------------
          server11:3.0.0.68        default      default          cumulus-frr                    k8s-app:cumulus-frr  6             6           14h:25m:37s
          

          View Pods on a Node

          You can view information about the pods on the node. The first example shows all pods running nginx in the default namespace for the server11 cluster. The second example shows all pods running any application in the default namespace for the server11 cluster.

          cumulus@host:~$ netq server11 show kubernetes pod namespace default label nginx
          Matching kube_pod records:
          Master                   Namespace    Name                 IP               Node         Labels               Status   Containers               Last Changed
          ------------------------ ------------ -------------------- ---------------- ------------ -------------------- -------- ------------------------ ----------------
          server11:3.0.0.68        default      nginx-8586cf59-26pj5 10.244.9.193     server24     run:nginx            Running  nginx:6e2b65070c86       14h:25m:24s
          server11:3.0.0.68        default      nginx-8586cf59-c82ns 10.244.40.128    server12     run:nginx            Running  nginx:01b017c26725       14h:25m:24s
          server11:3.0.0.68        default      nginx-8586cf59-wjwgp 10.244.49.64     server22     run:nginx            Running  nginx:ed2b4254e328       14h:25m:24s
           
          cumulus@host:~$ netq server11 show kubernetes pod namespace default label app
          Matching kube_pod records:
          Master                   Namespace    Name                 IP               Node         Labels               Status   Containers               Last Changed
          ------------------------ ------------ -------------------- ---------------- ------------ -------------------- -------- ------------------------ ----------------
          server11:3.0.0.68        default      httpd-5456469bfd-bq9 10.244.49.65     server22     app:httpd            Running  httpd:79b7f532be2d       14h:20m:34s
                                                zm
          server11:3.0.0.68        default      influxdb-6cdb566dd-8 10.244.162.128   server13     app:influx           Running  influxdb:15dce703cdec    14h:20m:34s
                                                9lwn
          

          View Status of the Replication Controller on a Node

          After you create the replicas, you can then view information about the replication controller:

          cumulus@host:~$ netq server11 show kubernetes replication-controller
          No matching kube_replica records found
          

          View Kubernetes Deployment Information

          For each deployment, you can view the number of replicas associated with an application. This example shows information for a deployment of the nginx application:

          cumulus@host:~$ netq server11 show kubernetes deployment name nginx
          Matching kube_deployment records:
          Master                   Namespace       Name                 Replicas                           Ready Replicas Labels                         Last Changed
          ------------------------ --------------- -------------------- ---------------------------------- -------------- ------------------------------ ----------------
          server11:3.0.0.68        default         nginx                3                                  3              run:nginx                      14h:27m:20s
          

          Search Using Labels

          You can search for information about your Kubernetes clusters using labels. A label search is similar to a “contains” regular expression search. The following example looks for all nodes that contain kube in the replication set name or label:

          cumulus@host:~$ netq server11 show kubernetes replica-set label kube
          Matching kube_replica records:
          Master                   Cluster Name Namespace        Replication Name               Labels               Replicas                           Ready Replicas Last Changed
          ------------------------ ------------ ---------------- ------------------------------ -------------------- ---------------------------------- -------------- ----------------
          server11:3.0.0.68        default      kube-system      kube-dns-6f4fd4bdf             k8s-app:kube-dns     1                                  1              14h:30m:41s
          server11:3.0.0.68        default      kube-system      calico-kube-controllers-d669cc k8s-app:calico-kube- 1                                  1              14h:30m:41s
                                                                 78f                            controllers
          

          View Container Connectivity

          You can view the connectivity graph of a Kubernetes pod at the replica set, deployment, or service level. The connectivity graph starts with the server where you deployed the pod, and shows the peer for each server interface. This data appears in a similar manner as the netq trace command, showing the interface name, the outbound port on that interface, and the inbound port on the peer.

          This example shows connectivity at the deployment level, where the nginx-8586cf59-wjwgp replica is in a pod on the server22 node. It has four possible communication paths, through interfaces swp1-4, out varying ports to peer interfaces swp7 and swp20 on the torc-21, torc-22, edge01, and edge02 nodes. Similarly, it shows the connections for two additional nginx replicas.

          cumulus@host:~$ netq server11 show kubernetes deployment name nginx connectivity
          nginx -- nginx-8586cf59-wjwgp -- server22:swp1:torbond1 -- swp7:hostbond3:torc-21
                                        -- server22:swp2:torbond1 -- swp7:hostbond3:torc-22
                                        -- server22:swp3:NetQBond-2 -- swp20:NetQBond-20:edge01
                                        -- server22:swp4:NetQBond-2 -- swp20:NetQBond-20:edge02
                -- nginx-8586cf59-c82ns -- server12:swp2:NetQBond-1 -- swp23:NetQBond-23:edge01
                                        -- server12:swp3:NetQBond-1 -- swp23:NetQBond-23:edge02
                                        -- server12:swp1:swp1 -- swp6:VlanA-1:tor-1
                -- nginx-8586cf59-26pj5 -- server24:swp2:NetQBond-1 -- swp29:NetQBond-29:edge01
                                        -- server24:swp3:NetQBond-1 -- swp29:NetQBond-29:edge02
                                        -- server24:swp1:swp1 -- swp8:VlanA-1:tor-2
          

          View Kubernetes Services Information

          You can show details about the Kubernetes services in a cluster, including service name, labels associated with the service, type of service, associated IP address, an external address if a public service, and ports used. This example shows the services available in the Kubernetes cluster:

          cumulus@host:~$ netq show kubernetes service
          Matching kube_service records:
          Master                   Namespace        Service Name         Labels       Type       Cluster IP       External IP      Ports                               Last Changed
          ------------------------ ---------------- -------------------- ------------ ---------- ---------------- ---------------- ----------------------------------- ----------------
          server11:3.0.0.68        default          kubernetes                        ClusterIP  10.96.0.1                         TCP:443                             2d:13h:45m:30s
          server11:3.0.0.68        kube-system      calico-etcd          k8s-app:cali ClusterIP  10.96.232.136                     TCP:6666                            2d:13h:45m:27s
                                                                         co-etcd
          server11:3.0.0.68        kube-system      kube-dns             k8s-app:kube ClusterIP  10.96.0.10                        UDP:53 TCP:53                       2d:13h:45m:28s
                                                                         -dns
          server12:3.0.0.69        default          kubernetes                        ClusterIP  10.96.0.1                         TCP:443                             2d:13h:46m:24s
          server12:3.0.0.69        kube-system      calico-etcd          k8s-app:cali ClusterIP  10.96.232.136                     TCP:6666                            2d:13h:46m:20s
                                                                         co-etcd
          server12:3.0.0.69        kube-system      kube-dns             k8s-app:kube ClusterIP  10.96.0.10                        UDP:53 TCP:53                       2d:13h:46m:20s
                                                                         -dns
          

          You can filter the list to view details about a particular Kubernetes service using the name option, as shown here:

          cumulus@host:~$ netq show kubernetes service name calico-etcd
          Matching kube_service records:
          Master                   Namespace        Service Name         Labels       Type       Cluster IP       External IP      Ports                               Last Changed
          ------------------------ ---------------- -------------------- ------------ ---------- ---------------- ---------------- ----------------------------------- ----------------
          server11:3.0.0.68        kube-system      calico-etcd          k8s-app:cali ClusterIP  10.96.232.136                     TCP:6666                            2d:13h:48m:10s
                                                                         co-etcd
          server12:3.0.0.69        kube-system      calico-etcd          k8s-app:cali ClusterIP  10.96.232.136                     TCP:6666                            2d:13h:49m:3s
                                                                         co-etcd
          

          View Kubernetes Service Connectivity

          To see the connectivity of a given Kubernetes service, include the connectivity option. This example shows the connectivity of the calico-etcd service:

          cumulus@host:~$ netq show kubernetes service name calico-etcd connectivity
          calico-etcd -- calico-etcd-pfg9r -- server11:swp1:torbond1 -- swp6:hostbond2:torc-11
                                           -- server11:swp2:torbond1 -- swp6:hostbond2:torc-12
                                           -- server11:swp3:NetQBond-2 -- swp16:NetQBond-16:edge01
                                           -- server11:swp4:NetQBond-2 -- swp16:NetQBond-16:edge02
          calico-etcd -- calico-etcd-btqgt -- server12:swp1:torbond1 -- swp7:hostbond3:torc-11
                                           -- server12:swp2:torbond1 -- swp7:hostbond3:torc-12
                                           -- server12:swp3:NetQBond-2 -- swp17:NetQBond-17:edge01
                                           -- server12:swp4:NetQBond-2 -- swp17:NetQBond-17:edge02
          

          View the Impact of Connectivity Loss for a Service

          You can preview the impact on service availability based on the loss of a particular node using the impact option. The output is color coded (not shown in the example below) so you can clearly see the impact: green shows no impact, yellow shows partial impact, and red shows full impact.

          cumulus@host:~$ netq server11 show impact kubernetes service name calico-etcd
          calico-etcd -- calico-etcd-pfg9r -- server11:swp1:torbond1 -- swp6:hostbond2:torc-11
                                           -- server11:swp2:torbond1 -- swp6:hostbond2:torc-12
                                           -- server11:swp3:NetQBond-2 -- swp16:NetQBond-16:edge01
                                           -- server11:swp4:NetQBond-2 -- swp16:NetQBond-16:edge02
          

          View Kubernetes Cluster Configuration in the Past

          You can use the around option to go back in time to check the network status and identify any changes that occurred on the network.

          This example shows the current state of the network. Notice there is a node named server23; it appears because the node server22 went down and Kubernetes spun up a third replica on a different host to satisfy the deployment requirement.

          cumulus@host:~$ netq server11 show kubernetes deployment name nginx connectivity
          nginx -- nginx-8586cf59-fqtnj -- server12:swp2:NetQBond-1 -- swp23:NetQBond-23:edge01
                                        -- server12:swp3:NetQBond-1 -- swp23:NetQBond-23:edge02
                                        -- server12:swp1:swp1 -- swp6:VlanA-1:tor-1
                -- nginx-8586cf59-8g487 -- server24:swp2:NetQBond-1 -- swp29:NetQBond-29:edge01
                                        -- server24:swp3:NetQBond-1 -- swp29:NetQBond-29:edge02
                                        -- server24:swp1:swp1 -- swp8:VlanA-1:tor-2
                -- nginx-8586cf59-2hb8t -- server23:swp1:swp1 -- swp7:VlanA-1:tor-2
                                        -- server23:swp2:NetQBond-1 -- swp28:NetQBond-28:edge01
                                        -- server23:swp3:NetQBond-1 -- swp28:NetQBond-28:edge02
          

          You can see this by going back in time 10 minutes. server23 was not present, whereas server22 was present:

          cumulus@host:~$ netq server11 show kubernetes deployment name nginx connectivity around 10m
          nginx -- nginx-8586cf59-fqtnj -- server12:swp2:NetQBond-1 -- swp23:NetQBond-23:edge01
                                        -- server12:swp3:NetQBond-1 -- swp23:NetQBond-23:edge02
                                        -- server12:swp1:swp1 -- swp6:VlanA-1:tor-1
                -- nginx-8586cf59-2xxs4 -- server22:swp1:torbond1 -- swp7:hostbond3:torc-21
                                        -- server22:swp2:torbond1 -- swp7:hostbond3:torc-22
                                        -- server22:swp3:NetQBond-2 -- swp20:NetQBond-20:edge01
                                        -- server22:swp4:NetQBond-2 -- swp20:NetQBond-20:edge02
                -- nginx-8586cf59-8g487 -- server24:swp2:NetQBond-1 -- swp29:NetQBond-29:edge01
                                        -- server24:swp3:NetQBond-1 -- swp29:NetQBond-29:edge02
                                        -- server24:swp1:swp1 -- swp8:VlanA-1:tor-2
          

          View the Impact of Connectivity Loss for a Deployment

          You can determine the impact on the Kubernetes deployment in the event a host or switch goes down. The output is color coded (not shown in the example below) so you can clearly see the impact: green shows no impact, yellow shows partial impact, and red shows full impact.

          cumulus@host:~$ netq torc-21 show impact kubernetes deployment name nginx
          nginx -- nginx-8586cf59-wjwgp -- server22:swp1:torbond1 -- swp7:hostbond3:torc-21
                                        -- server22:swp2:torbond1 -- swp7:hostbond3:torc-22
                                        -- server22:swp3:NetQBond-2 -- swp20:NetQBond-20:edge01
                                        -- server22:swp4:NetQBond-2 -- swp20:NetQBond-20:edge02
                -- nginx-8586cf59-c82ns -- server12:swp2:NetQBond-1 -- swp23:NetQBond-23:edge01
                                        -- server12:swp3:NetQBond-1 -- swp23:NetQBond-23:edge02
                                        -- server12:swp1:swp1 -- swp6:VlanA-1:tor-1
                -- nginx-8586cf59-26pj5 -- server24:swp2:NetQBond-1 -- swp29:NetQBond-29:edge01
                                        -- server24:swp3:NetQBond-1 -- swp29:NetQBond-29:edge02
                                        -- server24:swp1:swp1 -- swp8:VlanA-1:tor-2
          cumulus@server11:~$ netq server12 show impact kubernetes deployment name nginx
          nginx -- nginx-8586cf59-wjwgp -- server22:swp1:torbond1 -- swp7:hostbond3:torc-21
                                        -- server22:swp2:torbond1 -- swp7:hostbond3:torc-22
                                        -- server22:swp3:NetQBond-2 -- swp20:NetQBond-20:edge01
                                        -- server22:swp4:NetQBond-2 -- swp20:NetQBond-20:edge02
                -- nginx-8586cf59-c82ns -- server12:swp2:NetQBond-1 -- swp23:NetQBond-23:edge01
                                        -- server12:swp3:NetQBond-1 -- swp23:NetQBond-23:edge02
                                        -- server12:swp1:swp1 -- swp6:VlanA-1:tor-1
                -- nginx-8586cf59-26pj5 -- server24:swp2:NetQBond-1 -- swp29:NetQBond-29:edge01
                                        -- server24:swp3:NetQBond-1 -- swp29:NetQBond-29:edge02
          

          Kubernetes Cluster Maintenance

          If you need to perform maintenance on the Kubernetes cluster itself, use the following commands to bring the cluster down and then back up.

          1. Display the list of all the nodes in the Kubernetes cluster:

            cumulus@host:~$ kubectl get nodes 
            
          2. Tell Kubernetes to drain the node so that the pods running on it are gracefully scheduled elsewhere:

            cumulus@host:~$ kubectl drain <node name> 
            
          3. After the maintenance window is over, put the node back into the cluster so that Kubernetes can start scheduling pods on it again:

            cumulus@host:~$ kubectl uncordon <node name>
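
          Putting these steps together, a rough sketch for a single worker node might look like the following. The node name server13 is only an example, and the --ignore-daemonsets flag is typically required because DaemonSet pods (such as the cumulus-frr pods shown earlier) cannot be evicted by a plain drain:

          cumulus@host:~$ kubectl get nodes
          cumulus@host:~$ kubectl drain server13 --ignore-daemonsets
          cumulus@host:~$ # ...perform the maintenance on server13...
          cumulus@host:~$ kubectl uncordon server13
          cumulus@host:~$ kubectl get nodes     # confirm server13 reports Ready again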
            

          Configure Threshold-Based Event Notifications

          NetQ supports TCA events, a set of events that trigger when a user-defined threshold is crossed. These events help you detect and prevent network failures for selected ACL resources, digital optics, forwarding resources, interface errors and statistics, link flaps, resource utilization, and sensors. You can find a complete list in the TCA Event Messages Reference.

          A notification configuration must contain one rule. Each rule must contain a scope and a threshold. If you want to deliver events to one or more notification channels (for example, email or Slack), create them by following the instructions in Create a Channel, and then return here to define your rule.

          If a rule is not associated with a channel, the event information is only reachable from the database.
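
          As a preview of how a complete rule might look in the NetQ CLI, the following sketch creates a sensor temperature threshold rule and sends matching events to a channel. The threshold value and the channel name my-slack-channel are placeholders, and the available options can vary by release, so check netq add tca help on your system for the exact syntax:

          cumulus@host:~$ netq add tca event_id TCA_SENSOR_TEMPERATURE_UPPER scope '*' threshold 32 channel my-slack-channel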

          Define a Scope

          Scope parameters filter the events generated by a given rule. You can filter all rules by hostname; some rules can also be filtered by interface or other event-specific parameters.

          Select Scope Parameters

          For each event type, you can filter rules according to the following parameters:

          Event ID                                     Scope Parameters
          -------------------------------------------- ------------------------------
          TCA_TCAM_IN_ACL_V4_FILTER_UPPER              Hostname
          TCA_TCAM_EG_ACL_V4_FILTER_UPPER              Hostname
          TCA_TCAM_IN_ACL_V4_MANGLE_UPPER              Hostname
          TCA_TCAM_EG_ACL_V4_MANGLE_UPPER              Hostname
          TCA_TCAM_IN_ACL_V6_FILTER_UPPER              Hostname
          TCA_TCAM_EG_ACL_V6_FILTER_UPPER              Hostname
          TCA_TCAM_IN_ACL_V6_MANGLE_UPPER              Hostname
          TCA_TCAM_EG_ACL_V6_MANGLE_UPPER              Hostname
          TCA_TCAM_IN_ACL_8021x_FILTER_UPPER           Hostname
          TCA_TCAM_ACL_L4_PORT_CHECKERS_UPPER          Hostname
          TCA_TCAM_ACL_REGIONS_UPPER                   Hostname
          TCA_TCAM_IN_ACL_MIRROR_UPPER                 Hostname
          TCA_TCAM_ACL_18B_RULES_UPPER                 Hostname
          TCA_TCAM_ACL_32B_RULES_UPPER                 Hostname
          TCA_TCAM_ACL_54B_RULES_UPPER                 Hostname
          TCA_TCAM_IN_PBR_V4_FILTER_UPPER              Hostname
          TCA_TCAM_IN_PBR_V6_FILTER_UPPER              Hostname

          Event ID                                     Scope Parameters
          -------------------------------------------- ------------------------------
          TCA_DOM_RX_POWER_ALARM_UPPER                 Hostname, Interface
          TCA_DOM_RX_POWER_ALARM_LOWER                 Hostname, Interface
          TCA_DOM_RX_POWER_WARNING_UPPER               Hostname, Interface
          TCA_DOM_RX_POWER_WARNING_LOWER               Hostname, Interface
          TCA_DOM_BIAS_CURRENT_ALARM_UPPER             Hostname, Interface
          TCA_DOM_BIAS_CURRENT_ALARM_LOWER             Hostname, Interface
          TCA_DOM_BIAS_CURRENT_WARNING_UPPER           Hostname, Interface
          TCA_DOM_BIAS_CURRENT_WARNING_LOWER           Hostname, Interface
          TCA_DOM_OUTPUT_POWER_ALARM_UPPER             Hostname, Interface
          TCA_DOM_OUTPUT_POWER_ALARM_LOWER             Hostname, Interface
          TCA_DOM_OUTPUT_POWER_WARNING_UPPER           Hostname, Interface
          TCA_DOM_OUTPUT_POWER_WARNING_LOWER           Hostname, Interface
          TCA_DOM_MODULE_TEMPERATURE_ALARM_UPPER       Hostname, Interface
          TCA_DOM_MODULE_TEMPERATURE_ALARM_LOWER       Hostname, Interface
          TCA_DOM_MODULE_TEMPERATURE_WARNING_UPPER     Hostname, Interface
          TCA_DOM_MODULE_TEMPERATURE_WARNING_LOWER     Hostname, Interface
          TCA_DOM_MODULE_VOLTAGE_ALARM_UPPER           Hostname, Interface
          TCA_DOM_MODULE_VOLTAGE_ALARM_LOWER           Hostname, Interface
          TCA_DOM_MODULE_VOLTAGE_WARNING_UPPER         Hostname, Interface
          TCA_DOM_MODULE_VOLTAGE_WARNING_LOWER         Hostname, Interface

          Event ID                                     Scope Parameters
          -------------------------------------------- ------------------------------
          TCA_TCAM_TOTAL_ROUTE_ENTRIES_UPPER           Hostname
          TCA_TCAM_TOTAL_MCAST_ROUTES_UPPER            Hostname
          TCA_TCAM_MAC_ENTRIES_UPPER                   Hostname
          TCA_TCAM_ECMP_NEXTHOPS_UPPER                 Hostname
          TCA_TCAM_IPV4_ROUTE_UPPER                    Hostname
          TCA_TCAM_IPV4_HOST_UPPER                     Hostname
          TCA_TCAM_IPV6_ROUTE_UPPER                    Hostname
          TCA_TCAM_IPV6_HOST_UPPER                     Hostname

          Event ID                                     Scope Parameters
          -------------------------------------------- ------------------------------
          TCA_HW_IF_OVERSIZE_ERRORS                    Hostname, Interface
          TCA_HW_IF_UNDERSIZE_ERRORS                   Hostname, Interface
          TCA_HW_IF_ALIGNMENT_ERRORS                   Hostname, Interface
          TCA_HW_IF_JABBER_ERRORS                      Hostname, Interface
          TCA_HW_IF_SYMBOL_ERRORS                      Hostname, Interface

          Event ID                                     Scope Parameters
          -------------------------------------------- ------------------------------
          TCA_RXBROADCAST_UPPER                        Hostname, Interface
          TCA_RXBYTES_UPPER                            Hostname, Interface
          TCA_RXMULTICAST_UPPER                        Hostname, Interface
          TCA_TXBROADCAST_UPPER                        Hostname, Interface
          TCA_TXBYTES_UPPER                            Hostname, Interface
          TCA_TXMULTICAST_UPPER                        Hostname, Interface

          Event ID                                     Scope Parameters
          -------------------------------------------- ------------------------------
          TCA_LINK                                     Hostname, Interface

          Event ID                                     Scope Parameters
          -------------------------------------------- ------------------------------
          TCA_CPU_UTILIZATION_UPPER                    Hostname
          TCA_DISK_UTILIZATION_UPPER                   Hostname
          TCA_MEMORY_UTILIZATION_UPPER                 Hostname

          Event ID                                     Scope Parameters
          -------------------------------------------- ------------------------------
          Tx CNP Unicast No Buffer Discard             Hostname, Interface
          Rx RoCE PFC Pause Duration                   Hostname
          Rx RoCE PG Usage Cells                       Hostname, Interface
          Tx RoCE TC Usage Cells                       Hostname, Interface
          Rx RoCE No Buffer Discard                    Hostname, Interface
          Tx RoCE PFC Pause Duration                   Hostname, Interface
          Tx CNP Buffer Usage Cells                    Hostname, Interface
          Tx ECN Marked Packets                        Hostname, Interface
          Tx RoCE PFC Pause Packets                    Hostname, Interface
          Rx CNP No Buffer Discard                     Hostname, Interface
          Rx CNP PG Usage Cells                        Hostname, Interface
          Tx CNP TC Usage Cells                        Hostname, Interface
          Rx RoCE Buffer Usage Cells                   Hostname, Interface
          Tx RoCE Unicast No Buffer Discard            Hostname, Interface
          Rx CNP Buffer Usage Cells                    Hostname, Interface
          Rx RoCE PFC Pause Packets                    Hostname, Interface
          Tx RoCE Buffer Usage Cells                   Hostname, Interface

          Event ID                                     Scope Parameters
          -------------------------------------------- ------------------------------
          TCA_SENSOR_FAN_UPPER                         Hostname, Sensor Name
          TCA_SENSOR_POWER_UPPER                       Hostname, Sensor Name
          TCA_SENSOR_TEMPERATURE_UPPER                 Hostname, Sensor Name
          TCA_SENSOR_VOLTAGE_UPPER                     Hostname, Sensor Name

          Event ID                                     Scope Parameters
          -------------------------------------------- ------------------------------
          TCA_WJH_DROP_AGG_UPPER                       Hostname, Reason
          TCA_WJH_ACL_DROP_AGG_UPPER                   Hostname, Reason, Ingress port
          TCA_WJH_BUFFER_DROP_AGG_UPPER                Hostname, Reason
          TCA_WJH_SYMBOL_ERROR_UPPER                   Hostname, Port down reason
          TCA_WJH_CRC_ERROR_UPPER                      Hostname, Port down reason

          Specify the Scope

          A rule’s scope can include all monitored devices or a subset. You define scopes as regular expressions, which is how they appear in NetQ. Each event has a set of attributes you can use to apply the rule to a subset of all devices. The definition and display are slightly different between the NetQ UI and the NetQ CLI, but the results are the same.

          You define the scope in the Choose Attributes step when creating a TCA event rule. You can choose to apply the rule to all devices or narrow the scope using attributes. If you choose to narrow the scope, but then do not enter any values for the available attributes, the result is all devices and attributes.

          Scopes appear in TCA rule cards using the following format: Attribute, Operation, Value.

          For one or more of the available attributes, select the operation (equals or starts with) and enter a value. For drop reasons, click in the value field to open a list of reasons, and select one from the list.

          Note that you should leave the drop type attribute blank.

          Create rule to show events from a …        Attribute                     Operation      Value
          Single device                              hostname                      Equals         <hostname> such as spine01
          Single interface                           ifname                        Equals         <interface-name> such as swp6
          Single sensor                              s_name                        Equals         <sensor-name> such as fan2
          Single WJH drop reason                     reason or port_down_reason    Equals         <drop-reason> such as WRED
          Single WJH ingress port                    ingress_port                  Equals         <port-name> such as 47
          Set of devices                             hostname                      Starts with    <partial-hostname> such as leaf
          Set of interfaces                          ifname                        Starts with    <partial-interface-name> such as swp or eth
          Set of sensors                             s_name                        Starts with    <partial-sensor-name> such as fan, temp, or psu

          Refer to WJH Event Messages Reference for WJH drop types and reasons. Leaving an attribute value blank defaults to all: all hostnames, interfaces, sensors, forwarding resources, ACL resources, and so forth.

          Each attribute is displayed on the rule card as a regular expression equivalent to your choices above:

          • Equals is displayed as an equals sign (=)
          • Starts with is displayed as a caret (^)
          • Blank (all) is displayed as an asterisk (*)

          Scopes are defined with regular expressions. When more than one scoping parameter is available, they must be separated by a comma (without spaces), and all parameters must be defined in order. When an asterisk (*) is used alone, it must be entered inside either single or double quotes. Single quotes are used here.
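
          For example, the following command (a sketch that reuses the pre-configured tca_slack_ifstats Slack channel from the examples below) applies a receive-bytes rule to every interface on all switches whose hostnames start with leaf, quoting the standalone asterisk:

          cumulus@switch:~$ netq add tca event_id TCA_RXBYTES_UPPER scope leaf*,'*' channel tca_slack_ifstats threshold 20000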

          The single hostname scope parameter is used by the ACL resources, forwarding resources, and resource utilization events.

          Scope Value            Example    Result
          <hostname>             leaf01     Deliver events for the specified device
          <partial-hostname>*    leaf*      Deliver events for devices with hostnames starting with specified text (leaf)

          The hostname and interface scope parameters are used by the digital optics, interface errors, interface statistics, and link flaps events.

          Scope Value                        Example        Result
          <hostname>,<interface>             leaf01,swp9    Deliver events for the specified interface (swp9) on the specified device (leaf01)
          <hostname>,'*'                     leaf01,'*'     Deliver events for all interfaces on the specified device (leaf01)
          '*',<interface>                    '*',swp9       Deliver events for the specified interface (swp9) on all devices
          <partial-hostname>*,<interface>    leaf*,swp9     Deliver events for the specified interface (swp9) on all devices with hostnames starting with the specified text (leaf)
          <hostname>,<partial-interface>*    leaf01,swp*    Deliver events for all interfaces with names starting with the specified text (swp) on the specified device (leaf01)

          The hostname and sensor name scope parameters are used by the sensor events.

          Scope Value                         Example        Result
          <hostname>,<sensorname>             leaf01,fan1    Deliver events for the specified sensor (fan1) on the specified device (leaf01)
          '*',<sensorname>                    '*',fan1       Deliver events for the specified sensor (fan1) on all devices
          <hostname>,'*'                      leaf01,'*'     Deliver events for all sensors on the specified device (leaf01)
          <partial-hostname>*,<sensorname>    leaf*,fan1     Deliver events for the specified sensor (fan1) on all devices with hostnames starting with the specified text (leaf)
          <hostname>,<partial-sensorname>*    leaf01,fan*    Deliver events for all sensors with names starting with the specified text (fan) on the specified device (leaf01)

          The hostname, reason/port down reason, ingress port, and drop type scope parameters are used by the What Just Happened events.

          Scope Value                                           Example                            Result
          <hostname>,<reason>,<ingress_port>,<drop_type>        leaf01,ingress-port-acl,'*','*'    Deliver WJH events for all ports on the specified device (leaf01) with the specified reason triggered (ingress-port-acl exceeded the threshold)
          '*',<reason>,'*'                                      '*',tail-drop,'*'                  Deliver WJH events for the specified reason (tail-drop) for all devices
          <partial-hostname>*,<port_down_reason>,<drop_type>    leaf*,calibration-failure,'*'      Deliver WJH events for the specified reason (calibration-failure) on all devices with hostnames starting with the specified text (leaf)
          <hostname>,<partial-reason>*,<drop_type>              leaf01,blackhole,'*'               Deliver WJH events for reasons starting with the specified text (blackhole [route]) on the specified device (leaf01)

          Create a TCA Rule

          To create a TCA rule:

          1. Click Menu and navigate to Threshold Crossing Rules.

          2. Select the event type for the rule you want to create.

          3. Click Create a rule. Enter a name for the rule and assign a severity, then click Next.

          4. Select the attribute you want to monitor. The listed attributes change depending on the type of event you chose in the previous step.

          5. Click Next.

          6. On the Set threshold step, enter a threshold value.

            For digital optics, you can choose to use the thresholds defined by the optics vendor (default) or specify your own.

          7. Define the scope of the rule.

            • If you want to restrict the rule based on a particular parameter, enter values for one or more of the available attributes. For What Just Happened rules, select a reason from the available list.

            • If you want the rule to apply across the network, select the Apply rule to entire network toggle.

          8. Click Next.

          9. (Optional) Select a notification channel where you want the events to be sent.

            Only previously created channels are available for selection. If no channel is available or selected, the notifications can only be retrieved from the database. You can add a channel at a later time and then add it to the rule. Refer to Create a Channel and Modify TCA Rules.

          10. Click Finish. The rules may take several minutes to appear in the UI.

          The simplest configuration you can create is one that sends a TCA event generated by all devices and all interfaces to a single notification application. Use the netq add tca command to configure the event. Its syntax is:

          netq add tca [event_id <text-event-id-anchor>] [scope <text-scope-anchor>] [tca_id <text-tca-id-anchor>] [severity info | severity error] [is_active true | is_active false] [suppress_until <text-suppress-ts>] [threshold_type user_set | threshold_type vendor_set] [threshold <text-threshold-value>] [channel <text-channel-name-anchor> | channel drop <text-drop-channel-name>]
          

          Note that the event ID is case sensitive and must be in all uppercase.

          For example, this rule tells NetQ to deliver an event notification to the tca_slack_ifstats pre-configured Slack channel when the CPU utilization exceeds 95% of its capacity on any monitored switch:

          cumulus@switch:~$ netq add tca event_id TCA_CPU_UTILIZATION_UPPER scope '*' channel tca_slack_ifstats threshold 95
          

          This rule tells NetQ to deliver an event notification to the tca_pd_ifstats PagerDuty channel when the number of transmit bytes per second (Bps) on the leaf12 switch exceeds 20,000 Bps on any interface:

          cumulus@switch:~$ netq add tca event_id TCA_TXBYTES_UPPER scope leaf12,'*' channel tca_pd_ifstats threshold 20000
          

          This rule tells NetQ to deliver an event notification to the syslog-netq syslog channel when the temperature on sensor temp1 on the leaf12 switch exceeds 32 degrees Celsius:

          cumulus@switch:~$ netq add tca event_id TCA_SENSOR_TEMPERATURE_UPPER scope leaf12,temp1 channel syslog-netq threshold 32
          

          This rule tells NetQ to deliver an event notification to the tca-slack channel when the total number of ACL drops on the leaf04 switch exceeds 20,000 for any reason, ingress port, or drop type.

          cumulus@switch:~$ netq add tca event_id TCA_WJH_ACL_DROP_AGG_UPPER scope leaf04,'*','*','*' channel tca-slack threshold 20000
          


          Set the Severity of a Threshold-based Event

          In addition to defining a scope for a TCA rule, you can also set a severity of either info or error. To add a severity to a rule, use the severity option.

          For example, if you want to add an error severity to the CPU utilization rule you created earlier:

          cumulus@switch:~$ netq add tca event_id TCA_CPU_UTILIZATION_UPPER scope '*' severity error channel tca_slack_resources threshold 95
          

          Or, if an event is important but not an error, set the severity to info:

          cumulus@switch:~$ netq add tca event_id TCA_TXBYTES_UPPER scope leaf12,'*' severity info channel tca_pd_ifstats threshold 20000
          

          Set the Threshold for Digital Optics Events

          Digital optics have the additional option of applying user- or vendor-defined thresholds, using the threshold_type and threshold options.

          This example shows how to send an alarm event on channel ch1 when the upper threshold for module voltage exceeds the vendor-defined thresholds for interface swp31 on the mlx-2700-04 switch.

          cumulus@switch:~$ netq add tca event_id TCA_DOM_MODULE_VOLTAGE_ALARM_UPPER scope 'mlx-2700-04,swp31' severity error is_active true threshold_type vendor_set channel ch1
          Successfully added/updated tca
          

          This example shows how to send an alarm event on channel ch1 when the upper threshold for module voltage exceeds the user-defined threshold of 3V for interface swp31 on the mlx-2700-04 switch.

          cumulus@switch:~$ netq add tca event_id TCA_DOM_MODULE_VOLTAGE_ALARM_UPPER scope 'mlx-2700-04,swp31' severity error is_active true threshold_type user_set threshold 3 channel ch1
          Successfully added/updated tca
          

          Create Multiple Rules for a TCA Event

          You may want to create more than one rule per event, for example, to apply different thresholds, scopes, or notification channels to different sets of devices.

          To do this in the NetQ UI, create additional rule cards (as shown in the previous section).

          In the NetQ CLI, you can also add multiple rules. The following example shows the creation of three additional rules for the max temperature sensor:

          netq add tca event_id TCA_SENSOR_TEMPERATURE_UPPER scope leaf*,temp1 channel syslog-netq threshold 32
          
          netq add tca event_id TCA_SENSOR_TEMPERATURE_UPPER scope '*',temp1 channel tca_sensors,tca_pd_sensors threshold 32
          
          netq add tca event_id TCA_SENSOR_TEMPERATURE_UPPER scope leaf03,temp1 channel syslog-netq threshold 29
          

          Now you have four rules created (the original one, plus these three new ones) all based on the TCA_SENSOR_TEMPERATURE_UPPER event. To identify the various rules, NetQ automatically generates a TCA name for each rule. As you create each rule, NetQ adds an _# to the event name. The TCA Name for the first rule created is then TCA_SENSOR_TEMPERATURE_UPPER_1, the second rule created for this event is TCA_SENSOR_TEMPERATURE_UPPER_2, and so forth.
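
          You can then reference an individual rule by its generated name. For example, to display only the second temperature rule created above:

          cumulus@switch:~$ netq show tca tca_id TCA_SENSOR_TEMPERATURE_UPPER_2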

          Manage Threshold-based Event Notifications

          View TCA Rules

          1. Click Menu and navigate to Threshold Crossing Rules.

          2. The UI displays a card for each rule.

          After creating a rule, you can use the filters that appear above the rule cards to filter by status, severity, channel, and/or events.

          To view TCA rules, run:

          netq show tca [tca_id <text-tca-id-anchor>] [json]
          

          This example displays all TCA rules:

          cumulus@switch:~$ netq show tca
          Matching config_tca records:
          TCA Name                     Event Name           Scope                      Severity Channel/s          Active Threshold          Unit     Threshold Type Suppress Until
          ---------------------------- -------------------- -------------------------- -------- ------------------ ------ ------------------ -------- -------------- ----------------------------
          TCA_CPU_UTILIZATION_UPPER_1  TCA_CPU_UTILIZATION_ {"hostname":"leaf01"}      info     pd-netq-events,slk True   87                 %        user_set       Fri Oct  9 15:39:35 2020
                                       UPPER                                                    -netq-events
          TCA_CPU_UTILIZATION_UPPER_2  TCA_CPU_UTILIZATION_ {"hostname":"*"}           error    slk-netq-events    True   93                 %        user_set       Fri Oct  9 15:39:56 2020
                                       UPPER
          TCA_DOM_BIAS_CURRENT_ALARM_U TCA_DOM_BIAS_CURRENT {"hostname":"leaf*","ifnam error    slk-netq-events    True   0                  mA       vendor_set     Fri Oct  9 16:02:37 2020
          PPER_1                       _ALARM_UPPER         e":"*"}
          TCA_DOM_RX_POWER_ALARM_UPPER TCA_DOM_RX_POWER_ALA {"hostname":"*","ifname":" info     slk-netq-events    True   0                  mW       vendor_set     Fri Oct  9 15:25:26 2020
          _1                           RM_UPPER             *"}
          TCA_SENSOR_TEMPERATURE_UPPER TCA_SENSOR_TEMPERATU {"hostname":"leaf","s_name error    slk-netq-events    True   32                 degreeC  user_set       Fri Oct  9 15:40:18 2020
          _1                           RE_UPPER             ":"temp1"}
          TCA_TCAM_IPV4_ROUTE_UPPER_1  TCA_TCAM_IPV4_ROUTE_ {"hostname":"*"}           error    pd-netq-events     True   20000              %        user_set       Fri Oct  9 16:13:39 2020
                                       UPPER
          

          This example displays a specific TCA rule:

          cumulus@switch:~$ netq show tca tca_id TCA_TXMULTICAST_UPPER_1
          Matching config_tca records:
          TCA Name                     Event Name           Scope                      Severity         Channel/s          Active Threshold          Suppress Until
          ---------------------------- -------------------- -------------------------- ---------------- ------------------ ------ ------------------ ----------------------------
          TCA_TXMULTICAST_UPPER_1      TCA_TXMULTICAST_UPPE {"ifname":"swp3","hostname info             tca-tx-bytes-slack True   0                  Sun Dec  8 16:40:14 2269
                                       R                    ":"leaf01"}
          

          Change the Threshold on a TCA Rule

          After receiving notifications based on a rule, you might want to increase or decrease the threshold value to limit or increase the number of events you receive.

          To modify the threshold:

          1. Locate the rule you want to modify and hover over the top of the card.

          2. Click Edit.

          3. Enter a new threshold value, then select Update rule.

          To modify the threshold, run:

          netq add tca tca_id <text-tca-id-anchor> threshold <text-threshold-value>
          

          This example changes the threshold for the rule TCA_CPU_UTILIZATION_UPPER_1 to a value of 96 percent. This overwrites the existing threshold value.

          cumulus@switch:~$ netq add tca tca_id TCA_CPU_UTILIZATION_UPPER_1 threshold 96
          

          Change the Scope of a TCA Rule

          After receiving notifications based on a rule, you might find that you want to narrow or widen the scope value to limit or increase the number of events you receive.

          To modify the scope:

          1. Locate the rule you want to modify and hover over the top of the card.

          2. Click Edit.

          3. Change the scope, applying the rule to all devices or broadening or narrowing the scope. Refer to Specify the Scope for details.

          4. Select the toggle or define one or more hosts on which to apply this rule.

          5. Click Update rule.

          To modify the scope, run:

          netq add tca event_id <text-event-id-anchor> scope <text-scope-anchor> threshold <text-threshold-value>
          

          This example changes the scope for the rule TCA_CPU_UTILIZATION_UPPER to apply only to switches with hostnames beginning with leaf. You must also provide a threshold value; this example uses 95 percent. Note that this overwrites the existing scope and threshold values.

          cumulus@switch:~$ netq add tca event_id TCA_CPU_UTILIZATION_UPPER scope hostname^leaf threshold 95
          Successfully added/updated tca
          
          cumulus@switch:~$ netq show tca
          
          Matching config_tca records:
          TCA Name                     Event Name           Scope                      Severity         Channel/s          Active Threshold          Suppress Until
          ---------------------------- -------------------- -------------------------- ---------------- ------------------ ------ ------------------ ----------------------------
          TCA_CPU_UTILIZATION_UPPER_1  TCA_CPU_UTILIZATION_ {"hostname":"*"}           error            onprem-email       True   93                 Mon Aug 31 20:59:57 2020
                                       UPPER
          TCA_CPU_UTILIZATION_UPPER_2  TCA_CPU_UTILIZATION_ {"hostname":"hostname^leaf info                                True   95                 Tue Sep  1 18:47:24 2020
                                       UPPER                "}
          
          

          Change, Add, or Remove the Channels on a TCA Rule

          1. Locate the rule you want to modify and hover over the top of the card.

          2. Click Edit.

          3. Select the Channels tab.

          4. Select one or more channels.

          5. Click Update rule.

          To change a channel association, run:

          netq add tca tca_id <text-tca-id-anchor> channel <text-channel-name-anchor>
          

          This overwrites the existing channel association.

          This example changes the channel for the disk utilization 1 rule to the pd-netq-events PagerDuty channel.

          cumulus@switch:~$ netq add tca tca_id TCA_DISK_UTILIZATION_UPPER_1 channel pd-netq-events
          Successfully added/updated tca TCA_DISK_UTILIZATION_UPPER_1
          

          To remove a channel association (stop sending events to a particular channel), run:

          netq add tca tca_id <text-tca-id-anchor> channel drop <text-drop-channel-name>
          

          This example removes the tca_slack_resources channel from the disk utilization 1 rule.

          cumulus@switch:~$ netq add tca tca_id TCA_DISK_UTILIZATION_UPPER_1 channel drop tca_slack_resources
          Successfully added/updated tca TCA_DISK_UTILIZATION_UPPER_1
          

          Change the Name of a TCA Rule

          You cannot change the name of a TCA rule using the NetQ CLI because the rules do not have names. They receive identifiers (the tca_id) automatically. In the NetQ UI, to change a rule name, you must delete the rule and re-create it with the new name. Refer to Delete a TCA Rule and then Create a TCA Rule.

          Change the Severity of a TCA Rule

          TCA rules are categorized as either informational or error.

          In the NetQ UI, you must delete the rule and re-create it specifying the new severity. Refer to Delete a TCA Rule and then Create a TCA Rule.

          In the NetQ CLI, to change the severity, run:

          netq add tca tca_id <text-tca-id-anchor> (severity info | severity error)
          

          This example changes the severity of the maximum CPU utilization 1 rule from error to info:

          cumulus@switch:~$ netq add tca tca_id TCA_CPU_UTILIZATION_UPPER_1 severity info
          Successfully added/updated tca TCA_CPU_UTILIZATION_UPPER_1
          

          Suppress a TCA Rule

          During troubleshooting or switch maintenance, you might want to suppress a rule to prevent erroneous or excessive event messages.

          The TCA rules have three possible states in the NetQ UI:

          • Active: Rule is operating, delivering events. This is the normal operating state.
          • Suppressed: Rule is disabled until a designated date and time. When that time occurs, the rule is automatically reenabled.
          • Disabled: Rule is disabled until a user manually reenables it. This state is useful when you are not sure when you want to reenable the rule. This is not the same as deleting the rule.

          To suppress a rule for a designated amount of time, you must change the state of the rule:

          1. Locate the rule you want to suppress.

          2. Click Disable.

          3. Click in the Date/Time field to set when you want the rule to be reenabled.

          4. Click Disable.

          Note the changes in the card:
          • The state changes to Snoozed
          • The Suppressed field displays the date and time at which the rule will be reenabled.
          • The Disable button changes to Disable forever.

          The suppress_until option prevents the rule from being applied for a designated amount of time (in seconds). When this time has passed, the rule is automatically reenabled.

          To suppress a rule, run:

          netq add tca tca_id <text-tca-id-anchor> suppress_until <text-suppress-ts>
          

          This example suppresses the maximum cpu utilization event for 24 hours:

          cumulus@switch:~$ netq add tca tca_id TCA_CPU_UTILIZATION_UPPER_2 suppress_until 86400
          Successfully added/updated tca TCA_CPU_UTILIZATION_UPPER_2
          

          Disable a TCA Rule

          Whereas suppression temporarily disables a rule, you can deactivate a rule to disable it indefinitely.

          To disable a rule that is currently active:

          1. Locate the rule you want to disable.

          2. Click Disable.

          3. Leave the Date/Time field blank.

          4. Click Disable.

          Note the changes in the card:
          • The state is now marked as Inactive and is red
          • The rule definition is grayed out
          • The Disable option has changed to Enable to reactivate the rule when you are ready

          To disable a rule that is currently suppressed, click Disable Forever.

          To disable a rule, run:

          netq add tca tca_id <text-tca-id-anchor> is_active false
          

          This example disables the maximum disk utilization 1 rule:

          cumulus@switch:~$ netq add tca tca_id TCA_DISK_UTILIZATION_UPPER_1 is_active false
          Successfully added/updated tca TCA_DISK_UTILIZATION_UPPER_1
          

          To reenable the rule, set the is_active option to true.
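
          For example, to reenable the maximum disk utilization 1 rule disabled above:

          cumulus@switch:~$ netq add tca tca_id TCA_DISK_UTILIZATION_UPPER_1 is_active true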

          Delete a TCA Rule

          You can either disable an event (if you think you might want to receive event messages again) or delete a rule altogether. Refer to Disable a TCA Rule for the first case.

          To delete a rule:

          1. Locate the rule you want to remove and hover over the card.

          2. Click Delete in the card’s top-right corner.

          To remove a rule altogether, run:

          netq del tca tca_id <text-tca-id-anchor>
          

          This example deletes the maximum receive bytes rule:

          cumulus@switch:~$ netq del tca tca_id TCA_RXBYTES_UPPER_1
          Successfully deleted TCA TCA_RXBYTES_UPPER_1
          

          Resolve Scope Conflicts

          There might be occasions when the scopes defined by multiple rules for a given TCA event overlap. In such cases, NetQ generates the event using the TCA rule with the most specific scope that still matches.

          To clarify this, consider an example where three events occur (on leaf01 swp1, leaf01 swp3, and spine01 swp1) and NetQ attempts to match each event, by hostname and interface name, against three TCA rules with different scopes. The following table summarizes which scope applies to each event:

          Input Event     Scope Parameters       TCA Scope 1    TCA Scope 2    TCA Scope 3    Scope Applied
          leaf01,swp1     Hostname, Interface    '*','*'        leaf*,'*'      leaf01,swp1    Scope 3
          leaf01,swp3     Hostname, Interface    '*','*'        leaf*,'*'      leaf01,swp1    Scope 2
          spine01,swp1    Hostname, Interface    '*','*'        leaf*,'*'      leaf01,swp1    Scope 1

          Modify your TCA rules to remove the conflict.

          Monitor Events

          You can monitor both system and threshold-based (TCA) events with the UI or CLI. System events include events associated with network protocols and services operation, hardware and software status, and system services. TCA events include events associated with digital optics, ACL and forwarding resources, interface statistics, resource utilization, and sensors. You can view all events across the entire network or all events on a device, then filter your view of events based on event type, severity, and timeframe.

          Refer to Configure System Event Notifications and Configure Threshold-Based Event Notifications for information about configuring and managing these events.

          Note that in the UI, it can take several minutes for NetQ to process and accurately display network validation events. The delay is caused by events with multiple network dependencies. It takes between 5 and 10 minutes for NetQ to consolidate and display these events.

          Monitor All System and TCA Events Networkwide

          1. Click Menu.

          2. In the side navigation under Network, click Events.

            The dashboard presents a timeline of events alongside the devices that are causing the most events. You can filter events by type, including interface, network services, system, and threshold crossing events. The filter controls are located at the top of the screen.

          Events dashboard with networkwide error and info events.

          If you are receiving too many event notifications, you can create rules to suppress events. Select Show suppression rules in the top-right corner to view rules that prevent NetQ from displaying an event message. Refer to Configure System Event Notifications for information about event suppression.

          Events are also generated when streaming validation checks detect a failure. If an event is generated from a failed validation check, it will be marked resolved automatically the next time the check runs successfully.

          To view all system and all TCA events, run:

          netq show events [between <text-time> and <text-endtime>] [json]
          

          This example shows all system and TCA events between now and an hour ago.

          cumulus@switch:~$ netq show events
          Matching events records:
          Hostname          Message Type             Severity         Message                             Timestamp
          ----------------- ------------------------ ---------------- ----------------------------------- -------------------------
          leaf01            btrfsinfo                error            data storage efficiency : space lef Wed Sep  2 20:04:30 2020
                                                                      t after allocation greater than chu
                                                                      nk size 0.57 GB
          leaf02            btrfsinfo                error            data storage efficiency : space lef Wed Sep  2 19:55:26 2020
                                                                      t after allocation greater than chu
                                                                      nk size 0.57 GB
          leaf01            btrfsinfo                error            data storage efficiency : space lef Wed Sep  2 19:34:29 2020
                                                                      t after allocation greater than chu
                                                                      nk size 0.57 GB
          leaf02            btrfsinfo                error            data storage efficiency : space lef Wed Sep  2 19:25:24 2020
                                                                      t after allocation greater than chu
                                                                      nk size 0.57 GB
          

          This example shows all events between now and 24 hours ago.

          cumulus@switch:~$ netq show events between now and 24hr
          Matching events records:
          Hostname          Message Type             Severity         Message                             Timestamp
          ----------------- ------------------------ ---------------- ----------------------------------- -------------------------
          leaf01            btrfsinfo                error            data storage efficiency : space lef Wed Sep  2 20:04:30 2020
                                                                      t after allocation greater than chu
                                                                      nk size 0.57 GB
          leaf02            btrfsinfo                error            data storage efficiency : space lef Wed Sep  2 19:55:26 2020
                                                                      t after allocation greater than chu
                                                                      nk size 0.57 GB
          leaf01            btrfsinfo                error            data storage efficiency : space lef Wed Sep  2 19:34:29 2020
                                                                      t after allocation greater than chu
                                                                      nk size 0.57 GB
          leaf02            btrfsinfo                error            data storage efficiency : space lef Wed Sep  2 19:25:24 2020
                                                                      t after allocation greater than chu
                                                                      nk size 0.57 GB
          leaf01            btrfsinfo                error            data storage efficiency : space lef Wed Sep  2 19:04:22 2020
                                                                      t after allocation greater than chu
                                                                      nk size 0.57 GB
          leaf02            btrfsinfo                error            data storage efficiency : space lef Wed Sep  2 18:55:17 2020
                                                                      t after allocation greater than chu
                                                                      nk size 0.57 GB
          leaf01            btrfsinfo                error            data storage efficiency : space lef Wed Sep  2 18:34:21 2020
                                                                      t after allocation greater than chu
                                                                      nk size 0.57 GB
          leaf02            btrfsinfo                error            data storage efficiency : space lef Wed Sep  2 18:25:16 2020
                                                                      t after allocation greater than chu
                                                                      nk size 0.57 GB
          leaf01            btrfsinfo                error            data storage efficiency : space lef Wed Sep  2 18:04:19 2020
                                                                      t after allocation greater than chu
                                                                      nk size 0.57 GB
          leaf02            btrfsinfo                error            data storage efficiency : space lef Wed Sep  2 17:55:15 2020
                                                                      t after allocation greater than chu
                                                                      nk size 0.57 GB
          leaf01            btrfsinfo                error            data storage efficiency : space lef Wed Sep  2 17:34:18 2020
                                                                      t after allocation greater than chu
                                                                      nk size 0.57 GB
          ...
          

          Monitor All System and TCA Events on a Device

          1. Click Menu.

          2. In the side navigation under Network, click Events.

          3. At the top of the screen, click the Hostname field and select a device.

          4. Click Apply.

          To view all system and TCA events on a switch, run:

          netq <hostname> show events [between <text-time> and <text-endtime>] [json]
          

          This example shows all system and TCA events that have occurred on the leaf01 switch between now and an hour ago.

          cumulus@switch:~$ netq leaf01 show events
          
          Matching events records:
          Hostname          Message Type             Severity         Message                             Timestamp
          ----------------- ------------------------ ---------------- ----------------------------------- -------------------------
          leaf01            btrfsinfo                error            data storage efficiency : space lef Wed Sep  2 20:34:31 2020
                                                                      t after allocation greater than chu
                                                                      nk size 0.57 GB
          leaf01            btrfsinfo                error            data storage efficiency : space lef Wed Sep  2 20:04:30 2020
                                                                      t after allocation greater than chu
                                                                      nk size 0.57 GB
          

          This example shows that no events have occurred on the spine01 switch in the last hour.

          cumulus@switch:~$ netq spine01 show events
          No matching event records found
          

          Monitor System and TCA Events Networkwide by Type

          1. Click Menu.

          2. In the side navigation under Network, click Events.

          3. At the top of the screen, click the Type field and select a network protocol or service.

          4. Click Apply.

          To view all system events for a given network protocol or service, run:

          netq [<hostname>] show events [severity info | severity error ] [message_type link | message_type interfaces | message_type evpn | message_type bgp | message_type vxlan | message_type vlan | message_type ntp | message_type ospf | message_type lldp | message_type roceconfig | message_type mlag | message_type agent | message_type node | message_type mtu | message_type license | message_type sensor | message_type port | message_type configdiff  | message_type services | message_type clsupport | message_type runningconfigdiff | message_type resource | message_type btrfsinfo  | message_type ssdutil | message_type lcm | message_type ptm | message_type trace | message_type cable | message_type tca_resource | message_type tca_sensors | message_type tca_procdevstats | message_type tca_dom | message_type tca_link | message_type tca_ethtool | message_type tca_wjh | message_type tca_roce | message_type tca_bgp | message_type tca_ecmp ] [between <text-time> and <text-endtime>] [json]
          
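
          For example, this command (the output depends on your network) shows BGP-related error events from the last 24 hours across all devices:

          cumulus@switch:~$ netq show events severity error message_type bgp between now and 24hr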

          Monitor System and TCA Events on a Device by Type

          1. Click Menu.

          2. In the side navigation under Network, click Events.

          3. At the top of the screen, click the Hostname field and select a device.

          4. In the same row, click the Type field and select a network protocol or service.

          5. Click Apply.

          To view all system events for a given network protocol or service on a device, run:

          netq [<hostname>] show events [severity info | severity error ] [message_type link | message_type interfaces | message_type evpn | message_type bgp | message_type vxlan | message_type vlan | message_type ntp | message_type ospf | message_type lldp | message_type roceconfig | message_type mlag | message_type agent | message_type node | message_type mtu | message_type license | message_type sensor | message_type port | message_type configdiff  | message_type services | message_type clsupport | message_type runningconfigdiff | message_type resource | message_type btrfsinfo  | message_type ssdutil | message_type lcm | message_type ptm | message_type trace | message_type cable | message_type tca_resource | message_type tca_sensors | message_type tca_procdevstats | message_type tca_dom | message_type tca_link | message_type tca_ethtool | message_type tca_wjh | message_type tca_roce | message_type tca_bgp | message_type tca_ecmp ] [between <text-time> and <text-endtime>] [json]
          
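
          For example, to view EVPN events on the leaf01 switch (the hostname is illustrative):

          cumulus@switch:~$ netq leaf01 show events message_type evpn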

          Monitor System and TCA Events Networkwide by Severity

          System event severities include info, error, warning, or debug. TCA event severities include info or error.

          1. Click Menu.

          2. In the side navigation under Network, click Events.

          3. At the top of the screen, click the Severity field and select a level.

          4. Click Apply.

          To view all system events of a given severity, run:

          netq show events [severity info | severity error ] [between <text-time> and <text-endtime>] [json]
          
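
          For example, to view only error events from the past 24 hours:

          cumulus@switch:~$ netq show events severity error between now and 24hr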

          Monitor System and TCA Events on a Device by Severity

          1. Click Menu.

          2. In the side navigation under Network, click Events.

          3. At the top of the screen, click the Hostname field and select a device.

          4. In the same row, click the Severity field and select a level.

          5. Click Apply.

          To view all system events for a given severity on a device, run:

          netq <hostname> show events [severity info | severity error ]  [between <text-time> and <text-endtime>] [json]
          

          Monitor System and TCA Events Networkwide by Time

          1. Click Menu.

          2. In the side navigation under Network, click Events.

          3. At the top of the screen, use the first two fields to filter either over a time range or by recent events.

          4. Click Apply.

          The NetQ CLI uses a default of one hour unless otherwise specified. To view all system and all TCA events for a time beyond an hour in the past, run:

          netq show events [between <text-time> and <text-endtime>] [json]
          

          This example shows all system and TCA events between now and 24 hours ago.

          cumulus@switch:~$ netq show events between now and 24hr
          Matching events records:
          Hostname          Message Type             Severity         Message                             Timestamp
          ----------------- ------------------------ ---------------- ----------------------------------- -------------------------
          leaf01            btrfsinfo                error            data storage efficiency : space lef Wed Sep  2 20:04:30 2020
                                                                      t after allocation greater than chu
                                                                      nk size 0.57 GB
          leaf02            btrfsinfo                error            data storage efficiency : space lef Wed Sep  2 19:55:26 2020
                                                                      t after allocation greater than chu
                                                                      nk size 0.57 GB
          leaf01            btrfsinfo                error            data storage efficiency : space lef Wed Sep  2 19:34:29 2020
                                                                      t after allocation greater than chu
                                                                      nk size 0.57 GB
          leaf02            btrfsinfo                error            data storage efficiency : space lef Wed Sep  2 19:25:24 2020
                                                                      t after allocation greater than chu
                                                                      nk size 0.57 GB
          leaf01            btrfsinfo                error            data storage efficiency : space lef Wed Sep  2 19:04:22 2020
                                                                      t after allocation greater than chu
                                                                      nk size 0.57 GB
          leaf02            btrfsinfo                error            data storage efficiency : space lef Wed Sep  2 18:55:17 2020
                                                                      t after allocation greater than chu
                                                                      nk size 0.57 GB
          leaf01            btrfsinfo                error            data storage efficiency : space lef Wed Sep  2 18:34:21 2020
                                                                      t after allocation greater than chu
                                                                      nk size 0.57 GB
          leaf02            btrfsinfo                error            data storage efficiency : space lef Wed Sep  2 18:25:16 2020
                                                                      t after allocation greater than chu
                                                                      nk size 0.57 GB
          leaf01            btrfsinfo                error            data storage efficiency : space lef Wed Sep  2 18:04:19 2020
                                                                      t after allocation greater than chu
                                                                      nk size 0.57 GB
          leaf02            btrfsinfo                error            data storage efficiency : space lef Wed Sep  2 17:55:15 2020
                                                                      t after allocation greater than chu
                                                                      nk size 0.57 GB
          leaf01            btrfsinfo                error            data storage efficiency : space lef Wed Sep  2 17:34:18 2020
                                                                      t after allocation greater than chu
                                                                      nk size 0.57 GB
          ...
          

          This example shows all system and TCA events between one and three days ago.

          cumulus@switch:~$ netq show events between 1d and 3d
          
          Matching events records:
          Hostname          Message Type             Severity         Message                             Timestamp
          ----------------- ------------------------ ---------------- ----------------------------------- -------------------------
          leaf01            btrfsinfo                error            data storage efficiency : space lef Wed Sep  9 16:14:37 2020
                                                                      t after allocation greater than chu
                                                                      nk size 0.57 GB
          leaf02            btrfsinfo                error            data storage efficiency : space lef Wed Sep  9 16:03:31 2020
                                                                      t after allocation greater than chu
                                                                      nk size 0.57 GB
          leaf01            btrfsinfo                error            data storage efficiency : space lef Wed Sep  9 15:44:36 2020
                                                                      t after allocation greater than chu
                                                                      nk size 0.57 GB
          leaf02            btrfsinfo                error            data storage efficiency : space lef Wed Sep  9 15:33:30 2020
                                                                      t after allocation greater than chu
                                                                      nk size 0.57 GB
          leaf01            btrfsinfo                error            data storage efficiency : space lef Wed Sep  9 15:14:35 2020
                                                                      t after allocation greater than chu
                                                                      nk size 0.57 GB
          leaf02            btrfsinfo                error            data storage efficiency : space lef Wed Sep  9 15:03:28 2020
                                                                      t after allocation greater than chu
                                                                      nk size 0.57 GB
          leaf01            btrfsinfo                error            data storage efficiency : space lef Wed Sep  9 14:44:34 2020
                                                                      t after allocation greater than chu
                                                                      nk size 0.57 GB
          leaf02            btrfsinfo                error            data storage efficiency : space lef Wed Sep  9 14:33:21 2020
                                                                      t after allocation greater than chu
                                                                      nk size 0.57 GB
          ...
          

          Configure and Monitor What Just Happened

          The What Just Happened (WJH) feature, available on NVIDIA Spectrum switches, streams detailed and contextual telemetry data for analysis. This provides real-time visibility into problems in the network, such as hardware packet drops due to buffer congestion, incorrect routing, and ACL or layer 1 problems.

          For a list of supported WJH events, refer to the WJH Event Messages Reference.

          To use a gNMI client to export WJH data to a collector, refer to Collect WJH Data with gNMI.

          WJH is only supported on NVIDIA Spectrum switches. WJH latency and congestion monitoring is supported on NVIDIA Spectrum 2 switches and above. WJH requires Cumulus Linux 4.4.0 or later. SONiC only supports collection of WJH data with gNMI.

          Using WJH in combination with NetQ helps you identify losses anywhere in the fabric from a single management console.

          By default, Cumulus Linux 4.4.0 and later provides the NetQ Agent and CLI. Depending on the version of Cumulus Linux running on your NVIDIA switch, you might need to upgrade the NetQ Agent and CLI to the latest release:

          cumulus@<hostname>:~$ sudo apt-get update
          cumulus@<hostname>:~$ sudo apt-get install -y netq-agent
          cumulus@<hostname>:~$ sudo netq config restart agent
          cumulus@<hostname>:~$ sudo apt-get install -y netq-apps
          cumulus@<hostname>:~$ sudo netq config restart cli
          

          Configure the WJH Feature

          WJH is enabled by default on NVIDIA switches running Cumulus Linux 4.4.0 and later and requires no configuration; however, you must enable the NetQ Agent to collect the data.

          To enable WJH in NetQ on any switch or server:

          1. Configure the NetQ Agent on the NVIDIA switch.

            cumulus@switch:~$ sudo netq config add agent wjh
            
          2. Restart the NetQ Agent to start collecting the WJH data.

            cumulus@switch:~$ sudo netq config restart agent
            

          When you finish viewing the WJH metrics, you might want to stop the NetQ Agent from collecting WJH data to reduce network traffic. Use netq config del agent wjh followed by netq config restart agent to disable the WJH feature on the given switch.
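
          For example, to stop collecting WJH data on a switch:

          cumulus@switch:~$ sudo netq config del agent wjh
          cumulus@switch:~$ sudo netq config restart agent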

          Using wjh_dump.py on an NVIDIA platform that is running Cumulus Linux and the NetQ Agent causes the NetQ WJH client to stop receiving packet drop callbacks. To prevent this issue, run wjh_dump.py on a different system than the one where the NetQ Agent has WJH enabled, or disable wjh_dump.py and restart the NetQ Agent (run netq config restart agent).

          Configure Latency and Congestion Thresholds

          WJH latency and congestion metrics depend on threshold settings to trigger the events. WJH measures packet latency as the time spent inside a single system (switch). When specified, WJH triggers events when measured values cross high thresholds and events are suppressed when values are below low thresholds.

          To configure these thresholds, run:

          netq config add agent wjh-threshold (latency|congestion) <text-tc-list> <text-port-list> <text-th-hi> <text-th-lo>
          

          You can specify multiple traffic classes and multiple ports by separating the classes or ports by a comma (no spaces).

          The following example creates latency thresholds for Class 3 traffic on port swp1 where the upper threshold is 10 usecs and the lower threshold is 1 usec:

          cumulus@switch:~$ sudo netq config add agent wjh-threshold latency 3 swp1 10 1
          

          This example creates congestion thresholds for Class 4 traffic on port swp1 where the upper threshold is 200 cells and the lower threshold is 10 cells, where a cell is a unit of 144 bytes:

          cumulus@switch:~$ sudo netq config add agent wjh-threshold congestion 4 swp1 200 10
          
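
          As a sketch, this command applies the same latency thresholds (10 usecs upper, 1 usec lower) to traffic classes 3 and 4 on ports swp1 and swp2, using comma-separated lists:

          cumulus@switch:~$ sudo netq config add agent wjh-threshold latency 3,4 swp1,swp2 10 1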

          Configure Filters

          You can filter WJH events by drop type at the NetQ Agent before the NetQ system processes them. You can further filter each drop type by specifying one or more drop reasons or severities. Filter events by creating a NetQ configuration profile in the NetQ UI or using the netq config add agent wjh-drop-filter command in the NetQ CLI.

          For a complete list of drop types and reasons, refer to the WJH Event Messages Reference.

          To configure the NetQ Agent to filter WJH drops:

          1. Click Upgrade in a workbench header.

          2. Select NetQ Agent Configurations.

          3. On the NetQ Configurations card, click Add Config.

          4. Click Enable to enable WJH, then click Customize:

            modal describing WJH event capture options

          5. By default, WJH includes all drop reasons and severities. Uncheck any drop reasons or severity you do not want to generate WJH events, then click Done.

          6. Click Add to save the configuration profile, or click Close to discard it.

          To configure the NetQ Agent to filter WJH drops, run:

          netq config add agent wjh-drop-filter drop-type <text-wjh-drop-type> [drop-reasons <text-wjh-drop-reasons>] [severity <text-drop-severity-list>]
          

          Use tab complete to view the available drop type, drop reason, and severity values.

          This example configures the NetQ Agent to drop all L1 drops.

          cumulus@switch:~$ sudo netq config add agent wjh-drop-filter drop-type l1
          

          This example configures the NetQ Agent to drop only the L1 drops with bad signal integrity.

          cumulus@switch:~$ sudo netq config add agent wjh-drop-filter drop-type l1 drop-reasons BAD_SIGNAL_INTEGRITY
          

          This example configures the NetQ Agent to drop only router drops with warning severity.

          cumulus@switch:~$ sudo netq config add agent wjh-drop-filter drop-type router severity Warning
          

          This example configures the NetQ Agent to drop only router drops due to blackhole routes.

          cumulus@netq-ts:~$ netq config add agent wjh-drop-filter drop-type router drop-reasons BLACKHOLE_ROUTE
          

          This example configures the NetQ Agent to drop only router drops when the source IP is a class E address.

          cumulus@netq-ts:~$ netq config add agent wjh-drop-filter drop-type router drop-reasons SRC_IP_IS_IN_CLASS_E
          

          View What Just Happened Metrics

          You can view the WJH metrics from the NetQ UI or the NetQ CLI. WJH metrics are visible on the WJH card and the Events card. To view the metrics on the Events card, open the medium-sized card and hover over most-active devices. For a more detailed view, open the WJH card.

          Open the What Just Happened card on your workbench:

          what just happened card displaying errors and warnings

          You can expand the card to see a detailed summary of WJH data:

          expanded what just happened card displaying devices with the most drops

          Expanding the card to its largest size will open the advanced WJH dashboard. You can also access this dashboard by clicking Menu and selecting What Just Happened under the Network column:

          fully expanded what just happened card with detailed drop information

          Hover over the color-coded chart to view and expand individual WJH event categories:

          donut chart displaying types of drops

          Click on a category in the chart for a detailed view:

          donut chart and graph displaying detailed drop information

          Run one of the following commands:

          netq [<hostname>] show wjh-drop <text-drop-type> [ingress-port <text-ingress-port>] [severity <text-severity>] [reason <text-reason>] [src-ip <text-src-ip>] [dst-ip <text-dst-ip>] [proto <text-proto>] [src-port <text-src-port>] [dst-port <text-dst-port>] [src-mac <text-src-mac>] [dst-mac <text-dst-mac>] [egress-port <text-egress-port>] [traffic-class <text-traffic-class>] [rule-id-acl <text-rule-id-acl>] [between <text-time> and <text-endtime>] [around <text-time>] [json]
          netq [<hostname>] show wjh-drop [ingress-port <text-ingress-port>] [severity <text-severity>] [details] [between <text-time> and <text-endtime>] [around <text-time>] [json]
          

          Use the various options to restrict the output accordingly.

          This example uses the first form of the command to show drops on switch leaf03 for the past week.

          cumulus@switch:~$ netq leaf03 show wjh-drop between now and 7d
          Matching wjh records:
          Drop type          Aggregate Count
          ------------------ ------------------------------
          L1                 560
          Buffer             224
          Router             144
          L2                 0
          ACL                0
          Tunnel             0
          

          This example uses the second form of the command to show drops on switch leaf03 for the past week including the drop reasons.

          cumulus@switch:~$ netq leaf03 show wjh-drop details between now and 7d
          
          Matching wjh records:
          Drop type          Aggregate Count                Reason
          ------------------ ------------------------------ ---------------------------------------------
          L1                 556                            None
          Buffer             196                            WRED
          Router             144                            Blackhole route
          Buffer             14                             Packet Latency Threshold Crossed
          Buffer             14                             Port TC Congestion Threshold
          L1                 4                              Oper down
          

          This example shows the drops seen at layer 2 across the network.

          cumulus@mlx-2700-03:mgmt:~$ netq show wjh-drop l2
          Matching wjh records:
          Hostname          Ingress Port             Reason                                        Agg Count          Src Ip           Dst Ip           Proto  Src Port         Dst Port         Src Mac            Dst Mac            First Timestamp                Last Timestamp
          ----------------- ------------------------ --------------------------------------------- ------------------ ---------------- ---------------- ------ ---------------- ---------------- ------------------ ------------------ ------------------------------ ----------------------------
          mlx-2700-03       swp1s2                   Port loopback filter                          10                 27.0.0.19        27.0.0.22        0      0                0                00:02:00:00:00:73  0c:ff:ff:ff:ff:ff  Mon Dec 16 11:54:15 2019       Mon Dec 16 11:54:15 2019
          mlx-2700-03       swp1s2                   Source MAC equals destination MAC             10                 27.0.0.19        27.0.0.22        0      0                0                00:02:00:00:00:73  00:02:00:00:00:73  Mon Dec 16 11:53:17 2019       Mon Dec 16 11:53:17 2019
          mlx-2700-03       swp1s2                   Source MAC equals destination MAC             10                 0.0.0.0          0.0.0.0          0      0                0                00:02:00:00:00:73  00:02:00:00:00:73  Mon Dec 16 11:40:44 2019       Mon Dec 16 11:40:44 2019
          

          The following two examples include the severity of a drop event (error, warning or notice) for ACLs and routers.

          cumulus@switch:~$ netq show wjh-drop acl
          Matching wjh records:
          Hostname          Ingress Port             Reason                                        Severity         Agg Count          Src Ip           Dst Ip           Proto  Src Port         Dst Port         Src Mac            Dst Mac            Acl Rule Id            Acl Bind Point               Acl Name         Acl Rule         First Timestamp                Last Timestamp
          ----------------- ------------------------ --------------------------------------------- ---------------- ------------------ ---------------- ---------------- ------ ---------------- ---------------- ------------------ ------------------ ---------------------- ---------------------------- ---------------- ---------------- ------------------------------ ----------------------------
          leaf01            swp2                     Ingress router ACL                            Error            49                 55.0.0.1         55.0.0.2         17     8492             21423            00:32:10:45:76:89  00:ab:05:d4:1b:13  0x0                    0                                                              Tue Oct  6 15:29:13 2020       Tue Oct  6 15:29:39 2020
          
          cumulus@switch:~$ netq show wjh-drop router
          Matching wjh records:
          Hostname          Ingress Port             Reason                                        Severity         Agg Count          Src Ip           Dst Ip           Proto  Src Port         Dst Port         Src Mac            Dst Mac            First Timestamp                Last Timestamp
          ----------------- ------------------------ --------------------------------------------- ---------------- ------------------ ---------------- ---------------- ------ ---------------- ---------------- ------------------ ------------------ ------------------------------ ----------------------------
          leaf01            swp1                     Blackhole route                               Notice           36                 46.0.1.2         47.0.2.3         6      1235             43523            00:01:02:03:04:05  00:06:07:08:09:0a  Tue Oct  6 15:29:13 2020       Tue Oct  6 15:29:47 2020
          
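
          You can combine the documented options to narrow the output further. For example, the following command (shown without output) would list only buffer drops with Warning severity that entered through swp1 during the last hour; the port name and time window here are illustrative:

          cumulus@switch:~$ netq show wjh-drop buffer ingress-port swp1 severity Warning around 1h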

          gNMI Streaming

          You can use gRPC Network Management Interface (gNMI) to collect system resource, interface, and counter information from Cumulus Linux and export it to your own gNMI client.

          Configure the gNMI Agent

          The gNMI agent is disabled by default. To enable it, run:

           cumulus@switch:~$ netq config add agent gnmi-enable true
          

          The gNMI agent listens on port 9339 by default. You can change the port if another application already uses it. The /etc/netq/netq.yml file stores the configuration.

          Use the following commands to adjust the settings:

          1. Disable the gNMI agent:

            cumulus@switch:~$ netq config add agent gnmi-enable false
            
          2. Change the default port over which the gNMI agent listens:

            cumulus@switch:~$ netq config add agent gnmi-port <gnmi_port>
            
          3. Restart the NetQ agent to incorporate the configuration changes:

            cumulus@switch:~$ netq config restart agent
            
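
          To confirm that the gNMI agent is listening after the restart, you can check for a TCP listener on the configured port with a standard Linux utility; the command below assumes the default port 9339:

          cumulus@switch:~$ sudo ss -tlnp | grep 9339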

          Use the gNMI Agent Only

          NVIDIA recommends collecting data with both the gNMI and NetQ agents. However, if you do not want to collect data with both agents, you can disable the NetQ agent. Data is then sent exclusively to the gNMI agent.

          To disable the NetQ agent, use the following command:

          cumulus@switch:~$ netq config add agent opta-enable false
          
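
          If you later want to resume sending data to the NetQ server, setting opta-enable back to true and restarting the NetQ Agent should re-enable it; this mirrors the true/false pattern used by the gnmi-enable option above:

          cumulus@switch:~$ netq config add agent opta-enable true
          cumulus@switch:~$ netq config restart agent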

          You cannot disable both the NetQ and gNMI agents. If both agents are enabled on Cumulus Linux and a NetQ server is unreachable, the data from the following models are not sent to gNMI:

          • openconfig-interfaces
          • openconfig-if-ethernet
          • openconfig-if-ethernet-ext
          • openconfig-system
          • nvidia-if-ethernet-ext

          WJH, openconfig-platform, and openconfig-lldp data continue streaming to gNMI in this state. If you are only using gNMI and a NetQ telemetry server does not exist, you should disable the NetQ agent by setting opta-enable to false.

          Supported Models

          Cumulus Linux supports the following OpenConfig models:

          Model | Supported Data
          openconfig-interfaces | Name, Operstatus, AdminStatus, IfIndex, MTU, LoopbackMode, Enabled, Counters (InPkts, OutPkts, InOctets, InUnicastPkts, InDiscards, InMulticastPkts, InBroadcastPkts, InErrors, OutOctets, OutUnicastPkts, OutMulticastPkts, OutBroadcastPkts, OutDiscards, OutErrors)
          openconfig-if-ethernet | AutoNegotiate, PortSpeed, MacAddress, NegotiatedPortSpeed, Counters (InJabberFrames, InOversizeFrames, InUndersizeFrames)
          openconfig-if-ethernet-ext | Frame size counters (InFrames_64Octets, InFrames_65_127Octets, InFrames_128_255Octets, InFrames_256_511Octets, InFrames_512_1023Octets, InFrames_1024_1518Octets)
          openconfig-system | Memory, CPU
          openconfig-platform | Platform data (Name, Description, Version)
          openconfig-lldp | LLDP data (PortIdType, PortDescription, LastUpdate, SystemName, SystemDescription, ChassisId, Ttl, Age, ManagementAddress, ManagementAddressType, Capability)

          gNMI clients can also use the following NVIDIA models:

          Model | Supported Data
          nvidia-if-wjh-drop-aggregate | Aggregated WJH drops, including L1, L2, router, ACL, tunnel, and buffer drops
          nvidia-if-ethernet-ext | Extended Ethernet counters (AlignmentError, InAclDrops, InBufferDrops, InDot3FrameErrors, InDot3LengthErrors, InL3Drops, InPfc0Packets, InPfc1Packets, InPfc2Packets, InPfc3Packets, InPfc4Packets, InPfc5Packets, InPfc6Packets, InPfc7Packets, OutNonQDrops, OutPfc0Packets, OutPfc1Packets, OutPfc2Packets, OutPfc3Packets, OutPfc4Packets, OutPfc5Packets, OutPfc6Packets, OutPfc7Packets, OutQ0WredDrops, OutQ1WredDrops, OutQ2WredDrops, OutQ3WredDrops, OutQ4WredDrops, OutQ5WredDrops, OutQ6WredDrops, OutQ7WredDrops, OutQDrops, OutQLength, OutWredDrops, SymbolErrors, OutTxFifoFull)

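          As a point of reference, the xPath comments in the YANG modules below show where this data sits in the tree. For example, based on the augment statement in the nvidia-if-ethernet-counters-ext module, an extended counter such as in-buffer-drops for a hypothetical interface swp1 would be addressed with a path along these lines (verify the exact path against the module you are using):

          /interfaces/interface[name=swp1]/ethernet/state/counters/in-buffer-drops
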
          The client should use the following YANG models as a reference:

          nvidia-if-ethernet-ext
          module nvidia-if-ethernet-counters-ext {
              // xPath --> /interfaces/interface[name=*]/ethernet/counters/state/
          
             namespace "http://nvidia.com/yang/nvidia-ethernet-counters";
             prefix "nvidia-if-ethernet-counters-ext";
          
          
            // import some basic types
            import openconfig-interfaces { prefix oc-if; }
            import openconfig-if-ethernet { prefix oc-eth; }
            import openconfig-yang-types { prefix oc-yang; }
          
          
            revision "2021-10-12" {
              description
                "Initial revision";
              reference "1.0.0.";
            }
          
            grouping ethernet-counters-ext {
          
              leaf alignment-error {
                type oc-yang:counter64;
              }
          
              leaf in-acl-drops {
                type oc-yang:counter64;
              }
          
              leaf in-buffer-drops {
                type oc-yang:counter64;
              }
          
              leaf in-dot3-frame-errors {
                type oc-yang:counter64;
              }
          
              leaf in-dot3-length-errors {
                type oc-yang:counter64;
              }
          
              leaf in-l3-drops {
                type oc-yang:counter64;
              }
          
              leaf in-pfc0-packets {
                type oc-yang:counter64;
              }
          
              leaf in-pfc1-packets {
                type oc-yang:counter64;
              }
          
              leaf in-pfc2-packets {
                type oc-yang:counter64;
              }
          
              leaf in-pfc3-packets {
                type oc-yang:counter64;
              }
          
              leaf in-pfc4-packets {
                type oc-yang:counter64;
              }
          
              leaf in-pfc5-packets {
                type oc-yang:counter64;
              }
          
              leaf in-pfc6-packets {
                type oc-yang:counter64;
              }
          
              leaf in-pfc7-packets {
                type oc-yang:counter64;
              }
          
              leaf out-non-q-drops {
                type oc-yang:counter64;
              }
          
              leaf out-pfc0-packets {
                type oc-yang:counter64;
              }
          
              leaf out-pfc1-packets {
                type oc-yang:counter64;
              }
          
              leaf out-pfc2-packets {
                type oc-yang:counter64;
              }
          
              leaf out-pfc3-packets {
                type oc-yang:counter64;
              }
          
              leaf out-pfc4-packets {
                type oc-yang:counter64;
              }
          
              leaf out-pfc5-packets {
                type oc-yang:counter64;
              }
          
              leaf out-pfc6-packets {
                type oc-yang:counter64;
              }
          
              leaf out-pfc7-packets {
                type oc-yang:counter64;
              }
          
              leaf out-q0-wred-drops {
                type oc-yang:counter64;
              }
          
              leaf out-q1-wred-drops {
                type oc-yang:counter64;
              }
          
              leaf out-q2-wred-drops {
                type oc-yang:counter64;
              }
          
              leaf out-q3-wred-drops {
                type oc-yang:counter64;
              }
          
              leaf out-q4-wred-drops {
                type oc-yang:counter64;
              }
          
              leaf out-q5-wred-drops {
                type oc-yang:counter64;
              }
          
              leaf out-q6-wred-drops {
                type oc-yang:counter64;
              }
          
              leaf out-q7-wred-drops {
                type oc-yang:counter64;
              }
          
              leaf out-q8-wred-drops {
                type oc-yang:counter64;
              }
          
              leaf out-q9-wred-drops {
                type oc-yang:counter64;
              }
          
              leaf out-q-drops {
                type oc-yang:counter64;
              }
          
              leaf out-q-length {
                type oc-yang:counter64;
              }
          
              leaf out-wred-drops {
                type oc-yang:counter64;
              }
          
              leaf symbol-errors {
                type oc-yang:counter64;
              }
          
              leaf out-tx-fifo-full {
                type oc-yang:counter64;
              }
          
            }
          
            augment "/oc-if:interfaces/oc-if:interface/oc-eth:ethernet/" +
              "oc-eth:state/oc-eth:counters" {
                uses ethernet-counters-ext;
            }
          
          }
          
          nvidia-if-wjh-drop-aggregate
          module nvidia-wjh {
              // Entrypoint /oc-if:interfaces/oc-if:interface
              //
              // xPath L1     --> interfaces/interface[name=*]/wjh/aggregate/l1
              // xPath L2     --> /interfaces/interface[name=*]/wjh/aggregate/l2/reasons/reason[id=*][severity=*]
              // xPath Router --> /interfaces/interface[name=*]/wjh/aggregate/router/reasons/reason[id=*][severity=*]
              // xPath Tunnel --> /interfaces/interface[name=*]/wjh/aggregate/tunnel/reasons/reason[id=*][severity=*]
              // xPath Buffer --> /interfaces/interface[name=*]/wjh/aggregate/buffer/reasons/reason[id=*][severity=*]
              // xPath ACL    --> /interfaces/interface[name=*]/wjh/aggregate/acl/reasons/reason[id=*][severity=*]
          
              import openconfig-interfaces { prefix oc-if; }
          
              namespace "http://nvidia.com/yang/what-just-happened-config";
              prefix "nvidia-wjh";
          
              revision "2021-10-12" {
                  description
                      "Initial revision";
                  reference "1.0.0.";
              }
          
              augment "/oc-if:interfaces/oc-if:interface" {
                  uses interfaces-wjh;
              }
          
              grouping interfaces-wjh {
                  description "Top-level grouping for What-just happened data.";
                  container wjh {
                      container aggregate {
                          container l1 {
                              container state {
                                  leaf drop {
                                      type string;
                                      description "Drop list based on wjh-drop-types module encoded in JSON";
                                  }
                              }
                          }
                          container l2 {
                              uses reason-drops;
                          }
                          container router {
                              uses reason-drops;
                          }
                          container tunnel {
                              uses reason-drops;
                          }
                          container acl {
                              uses reason-drops;
                          }
                          container buffer {
                              uses reason-drops;
                          }
                      }
                  }
              }
          
              grouping reason-drops {
                  container reasons {
                      list reason {
                          key "id severity";
                          leaf id {
                              type leafref {
                                  path "../state/id";
                              }
                              description "reason ID";
                          }
                          leaf severity {
                              type leafref {
                                  path "../state/severity";
                              }
                              description "Reason severity";
                          }
                          container state {
                              leaf id {
                                  type uint32;
                                  description "Reason ID";
                              }
                              leaf name {
                                  type string;
                                  description "Reason name";
                              }
                              leaf severity {
                                  type string;
                                  mandatory "true";
                                  description "Reason severity";
                              }
                              leaf drop {
                                  type string;
                                  description "Drop list based on wjh-drop-types module encoded in JSON";
                              }
                          }
                      }
                  }
              }
          }
          
          module wjh-drop-types {
              namespace "http://nvidia.com/yang/what-just-happened-config-types";
              prefix "wjh-drop-types";
          
              container l1-aggregated {
                  uses l1-drops;
              }
              container l2-aggregated {
                  uses l2-drops;
              }
              container router-aggregated {
                  uses router-drops;
              }
              container tunnel-aggregated {
                  uses tunnel-drops;
              }
              container acl-aggregated {
                  uses acl-drops;
              }
              container buffer-aggregated {
                  uses buffer-drops;
              }
          
              grouping reason-key {
                  leaf id {
                      type uint32;
                      mandatory "true";
                      description "reason ID";
                  }
                  leaf severity {
                      type string;
                      mandatory "true";
                      description "Severity";
                  }
              }
          
              grouping reason_info {
                  leaf reason {
                          type string;
                          mandatory "true";
                          description "Reason name";
                  }
                  leaf drop_type {
                      type string;
                      mandatory "true";
                      description "reason drop type";
                  }
                  leaf ingress_port {
                      type string;
                      mandatory "true";
                      description "Ingress port name";
                  }
                  leaf ingress_lag {
                      type string;
                      description "Ingress LAG name";
                  }
                  leaf egress_port {
                      type string;
                      description "Egress port name";
                  }
                  leaf agg_count {
                      type uint64;
                      description "Aggregation count";
                  }
                  leaf severity {
                      type string;
                      description "Severity";
                  }
                  leaf first_timestamp {
                      type uint64;
                      description "First timestamp";
                  }
                  leaf end_timestamp {
                      type uint64;
                      description "End timestamp";
                  }
              }
          
              grouping packet_info {
                  leaf smac {
                      type string;
                      description "Source MAC";
                  }
                  leaf dmac {
                      type string;
                      description "Destination MAC";
                  }
                  leaf sip {
                      type string;
                      description "Source IP";
                  }
                  leaf dip {
                      type string;
                      description "Destination IP";
                  }
                  leaf proto {
                      type uint32;
                      description "Protocol";
                  }
                  leaf sport {
                      type uint32;
                      description "Source port";
                  }
                  leaf dport {
                      type uint32;
                      description "Destination port";
                  }
              }
          
              grouping l1-drops {
                  description "What-just happened drops.";
                  leaf ingress_port {
                      type string;
                      description "Ingress port";
                  }
                  leaf is_port_up {
                      type boolean;
                      description "Is port up";
                  }
                  leaf port_down_reason {
                      type string;
                      description "Port down reason";
                  }
                  leaf description {
                      type string;
                      description "Description";
                  }
                  leaf state_change_count {
                      type uint64;
                      description "State change count";
                  }
                  leaf symbol_error_count {
                      type uint64;
                      description "Symbol error count";
                  }
                  leaf crc_error_count {
                      type uint64;
                      description "CRC error count";
                  }
                  leaf first_timestamp {
                      type uint64;
                      description "First timestamp";
                  }
                  leaf end_timestamp {
                      type uint64;
                      description "End timestamp";
                  }
                  leaf timestamp {
                      type uint64;
                      description "Timestamp";
                  }
              }
              grouping l2-drops {
                  description "What-just happened drops.";
                  uses reason_info;
                  uses packet_info;
              }
          
              grouping router-drops {
                  description "What-just happened drops.";
                  uses reason_info;
                  uses packet_info;
              }
          
              grouping tunnel-drops {
                  description "What-just happened drops.";
                  uses reason_info;
                  uses packet_info;
              }
          
              grouping acl-drops {
                  description "What-just happened drops.";
                  uses reason_info;
                  uses packet_info;
                  leaf acl_rule_id {
                      type uint64;
                      description "ACL rule ID";
                  }
                  leaf acl_bind_point {
                      type uint32;
                      description "ACL bind point";
                  }
                  leaf acl_name {
                      type string;
                      description "ACL name";
                  }
                  leaf acl_rule {
                      type string;
                      description "ACL rule";
                  }
              }
          
              grouping buffer-drops {
                  description "What-just happened drops.";
                  uses reason_info;
                  uses packet_info;
                  leaf traffic_class {
                      type uint32;
                      description "Traffic Class";
                  }
                  leaf original_occupancy {
                      type uint32;
                      description "Original occupancy";
                  }
                  leaf original_latency {
                      type uint64;
                      description "Original latency";
                  }
              }
          }
          
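
          If you want to validate these modules or explore their structure before writing a client, a generic YANG tool such as pyang can render them as a tree. The file names below are hypothetical; save each module to its own file and make the OpenConfig modules it imports available on the search path:

          cumulus@host:~$ pyang -f tree --path ./openconfig nvidia-if-ethernet-counters-ext.yang
          cumulus@host:~$ pyang -f tree --path ./openconfig nvidia-wjh.yang wjh-drop-types.yang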

          Collect WJH Data Using gNMI

          You can export What Just Happened data from the NetQ agent to your own gNMI client. Refer to the previous section for the nvidia-if-wjh-drop-aggregate reference YANG model.

          Supported Features

          WJH Drop Reasons

          The data NetQ sends to the gNMI agent is in the form of WJH drop reasons. The reasons are generated by the SDK and are stored in the /usr/etc/wjh_lib_conf.xml file on the switch. Use this file as a guide to filter for specific reason types (L1, ACL, and so forth), reason IDs, or event severities.
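
          Because the file is XML, you can search it directly on the switch to find the identifiers you need. For example, a simple grep (the reason name is chosen only for illustration) shows the entries related to blackhole routes along with a few lines of surrounding context:

          cumulus@switch:~$ grep -i -B2 -A2 'blackhole' /usr/etc/wjh_lib_conf.xml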

          L1 Drop Reasons

          Reason ID | Reason | Description
          10021 | Port admin down | Validate port configuration
          10022 | Auto-negotiation failure | Set port speed manually, disable auto-negotiation
          10023 | Logical mismatch with peer link | Check cable/transceiver
          10024 | Link training failure | Check cable/transceiver
          10025 | Peer is sending remote faults | Replace cable/transceiver
          10026 | Bad signal integrity | Replace cable/transceiver
          10027 | Cable/transceiver is not supported | Use supported cable/transceiver
          10028 | Cable/transceiver is unplugged | Plug cable/transceiver
          10029 | Calibration failure | Check cable/transceiver
          10030 | Cable/transceiver bad status | Check cable/transceiver
          10031 | Other reason | Other L1 drop reason

          L2 Drop Reasons

          Reason ID | Reason | Severity | Description
          201 | MLAG port isolation | Notice | Expected behavior
          202 | Destination MAC is reserved (DMAC=01-80-C2-00-00-0x) | Error | Bad packet was received from the peer
          203 | VLAN tagging mismatch | Error | Validate the VLAN tag configuration on both ends of the link
          204 | Ingress VLAN filtering | Error | Validate the VLAN membership configuration on both ends of the link
          205 | Ingress spanning tree filter | Notice | Expected behavior
          206 | Unicast MAC table action discard | Error | Validate MAC table for this destination MAC
          207 | Multicast egress port list is empty | Warning | Validate why IGMP join or multicast router port does not exist
          208 | Port loopback filter | Error | Validate MAC table for this destination MAC
          209 | Source MAC is multicast | Error | Bad packet was received from peer
          210 | Source MAC equals destination MAC | Error | Bad packet was received from peer

          Router Drop Reasons

          Reason ID | Reason | Severity | Description
          301 | Non-routable packet | Notice | Expected behavior
          302 | Blackhole route | Warning | Validate routing table for this destination IP
          303 | Unresolved neighbor/next hop | Warning | Validate ARP table for the neighbor/next hop
          304 | Blackhole ARP/neighbor | Warning | Validate ARP table for the next hop
          305 | IPv6 destination in multicast scope FFx0:/16 | Notice | Expected behavior - packet is not routable
          306 | IPv6 destination in multicast scope FFx1:/16 | Notice | Expected behavior - packet is not routable
          307 | Non-IP packet | Notice | Destination MAC is the router, packet is not routable
          308 | Unicast destination IP but multicast destination MAC | Error | Bad packet was received from the peer
          309 | Destination IP is loopback address | Error | Bad packet was received from the peer
          310 | Source IP is multicast | Error | Bad packet was received from the peer
          311 | Source IP is in class E | Error | Bad packet was received from the peer
          312 | Source IP is loopback address | Error | Bad packet was received from the peer
          313 | Source IP is unspecified | Error | Bad packet was received from the peer
          314 | Checksum or IPver or IPv4 IHL too short | Error | Bad cable or bad packet was received from the peer
          315 | Multicast MAC mismatch | Error | Bad packet was received from the peer
          316 | Source IP equals destination IP | Error | Bad packet was received from the peer
          317 | IPv4 source IP is limited broadcast | Error | Bad packet was received from the peer
          318 | IPv4 destination IP is local network (destination=0.0.0.0/8) | Error | Bad packet was received from the peer
          320 | Ingress router interface is disabled | Warning | Validate your configuration
          321 | Egress router interface is disabled | Warning | Validate your configuration
          323 | IPv4 routing table (LPM) unicast miss | Warning | Validate routing table for this destination IP
          324 | IPv6 routing table (LPM) unicast miss | Warning | Validate routing table for this destination IP
          325 | Router interface loopback | Warning | Validate the interface configuration
          326 | Packet size is larger than router interface MTU | Warning | Validate the router interface MTU configuration
          327 | TTL value is too small | Warning | Actual path is longer than the TTL

          Tunnel Drop Reasons

          Reason ID | Reason | Severity | Description
          402 | Overlay switch - Source MAC is multicast | Error | The peer sent a bad packet
          403 | Overlay switch - Source MAC equals destination MAC | Error | The peer sent a bad packet
          404 | Decapsulation error | Error | The peer sent a bad packet

          ACL Drop Reasons

          Reason ID | Reason | Severity | Description
          601 | Ingress port ACL | Notice | Validate ACL configuration
          602 | Ingress router ACL | Notice | Validate ACL configuration
          603 | Egress router ACL | Notice | Validate ACL configuration
          604 | Egress port ACL | Notice | Validate ACL configuration

          Buffer Drop Reasons

          Reason ID | Reason | Severity | Description
          503 | Tail drop | Warning | Monitor network congestion
          504 | WRED | Warning | Monitor network congestion
          505 | Port TC congestion threshold crossed | Notice | Monitor network congestion
          506 | Packet latency threshold crossed | Notice | Monitor network congestion

          gNMI Client Requests

          You can use your gNMI client on a host server to request capabilities and data that the agent is subscribed to.

          The following example shows a gNMI client request for interface speed:

          gnmi_client -target_addr 10.209.37.121:9339 -xpath "/interfaces/interface[name=swp1]/ethernet/state/port-speed" -once
          {
             "Response": {
                "Update": {
                   "update": [
                      {
                         "val": {
                            "Value": {
                               "StringVal": "SPEED_40GB"
                            }
                         },
                         "path": {
                            "elem": [
                               {
                                  "name": "state"
                               },
                               {
                                  "name": "port-speed"
                               }
                            ]
                         }
                      }
                   ],
                   "timestamp": 1636910588085654861,
                   "prefix": {
                      "target": "netq",
                      "elem": [
                         {
                            "name": "interfaces"
                         },
                         {
                            "name": "interface",
                            "key": {
                               "name": "swp1"
                            }
                         },
                         {
                            "name": "ethernet"
                         }
                      ]
                   }
                }
             }
          }
          
          
          

          The following example shows a gNMI client request for WJH drop data:

          gnmi_client -target_addr 10.209.37.121:9339 -xpath "/interfaces/interface[name=swp8]/wjh/aggregate/l2/reasons/reason[id=210]"
          {
             "Response": {
                "Update": {
                   "update": [
                      {
                         "val": {
                            "Value": {
                               "StringVal": "[{
          									  "IngressPort": "swp8",
          									  "DropType": "L2",
          									  "Reason": "Source MAC equals destination MAC",
          									  "Severity": "Error",
          									  "Smac": "00:02:10:00:00:01",
          									  "Dmac": "00:02:10:00:00:01",
          									  "Proto": 6,
          									  "Sport": 15,
          									  "Dport": 16,
          									  "Sip": "1.1.1.1"
          									  "Dip": "2.2.2.2",
          									  "AggCount": 192,
          									  "FirstTimestamp": 1636907412,
          									  "EndTimestamp": 1636907432,
          								   }]"
          
                            }
                         },
                         "path": {
                            "elem": [
                               {
                                  "name": "state"
                               },
                               {
                                  "name": "drop"
                               }
                            ]
                         }
                      }
                   ],
                   "prefix": {
                      "elem": [
                         {
                            "name": "interfaces"
                         },
                         {
                            "key": {
                               "name": "swp8"
                            },
                            "name": "interface"
                         },
                         {
                            "name": "wjh"
                         },
                         {
                            "name": "aggregate"
                         },
                         {
                            "name": "l2"
                         },
                         {
                            "name": "reasons"
                         },
                         {
                            "key" : {
                               "severity": "error",
                               "id": "210"
                            },
                            "name" : "reason"
                         }
                      ],
                      "target": "netq"
                   },
                   "timestamp": 1636907442362981645
                }
             }
          }
          
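
          The examples above use a sample gnmi_client binary. If you prefer a generic client such as the open-source gnmic, an equivalent one-time get for the same port-speed leaf might look like the following; the flags shown are standard gnmic options, and the insecure transport and target name are assumptions based on the sample responses above:

          cumulus@host:~$ gnmic -a 10.209.37.121:9339 --insecure --target netq get --path "/interfaces/interface[name=swp1]/ethernet/state/port-speed"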

          System Event Messages Reference

          The following table lists all system event messages organized by type. You can view these messages through third-party notification applications. For details about configuring notifications for these events, refer to Configure System Event Notifications.

          Agent Events

          Type | Trigger | Severity | Message Format | Example
          agent | NetQ Agent state changed to Rotten (not heard from in over 15 seconds) | Error | Agent state changed to rotten | Agent state changed to rotten
          agent | NetQ Agent rebooted | Error | Netq-agent rebooted at (@last_boot) | Netq-agent rebooted at 1573166417
          agent | Node running NetQ Agent rebooted | Error | Switch rebooted at (@sys_uptime) | Switch rebooted at 1573166131
          agent | NetQ Agent state changed to Fresh | Info | Agent state changed to fresh | Agent state changed to fresh
          agent | NetQ Agent state was reset | Info | Agent state was paused and resumed at (@last_reinit) | Agent state was paused and resumed at 1573166125
          agent | Version of NetQ Agent has changed | Info | Agent version has been changed old_version:@old_version and new_version:@new_version. Agent reset at @sys_uptime | Agent version has been changed old_version:2.1.2 and new_version:2.3.1. Agent reset at 1573079725

          BGP Events

          Type | Trigger | Severity | Message Format | Example
          bgp | BGP Session state changed | Error | BGP session with peer @peer @neighbor vrf @vrf state changed from @old_state to @new_state | BGP session with peer leaf03 leaf04 vrf mgmt state changed from Established to Failed
          bgp | BGP Session state changed from Failed to Established | Info | BGP session with peer @peer @peerhost @neighbor vrf @vrf session state changed from Failed to Established | BGP session with peer swp5 spine02 spine03 vrf default session state changed from Failed to Established
          bgp | BGP Session state changed from Established to Failed | Info | BGP session with peer @peer @neighbor vrf @vrf state changed from established to failed | BGP session with peer leaf03 leaf04 vrf mgmt state changed from down to up
          bgp | The reset time for a BGP session changed | Info | BGP session with peer @peer @neighbor vrf @vrf reset time changed from @old_last_reset_time to @new_last_reset_time | BGP session with peer spine03 swp9 vrf vrf2 reset time changed from 1559427694 to 1559837484

          BTRFS Events

          Type | Trigger | Severity | Message Format | Example
          btrfsinfo | Disk space available after BTRFS allocation is less than 80% of partition size or only 2 GB remain | Error | @info : @details | high btrfs allocation space : greater than 80% of partition size, 61708420
          btrfsinfo | Indicates if a rebalance operation can free up space on the disk | Error | @info : @details | data storage efficiency : space left after allocation greater than chunk size 6170849.2

          Cable Events

          Type | Trigger | Severity | Message Format | Example
          cable | Link speed is not the same on both ends of the link | Error | @ifname speed @speed, mismatched with peer @peer @peer_if speed @peer_speed | swp2 speed 10, mismatched with peer server02 swp8 speed 40
          cable | The speed setting for a given port changed | Info | @ifname speed changed from @old_speed to @new_speed | swp9 speed changed from 10 to 40
          cable | The transceiver status for a given port changed | Info | @ifname transceiver changed from @old_transceiver to @new_transceiver | swp4 transceiver changed from disabled to enabled
          cable | The vendor of a given transceiver changed | Info | @ifname vendor name changed from @old_vendor_name to @new_vendor_name | swp23 vendor name changed from Broadcom to NVIDIA
          cable | The part number of a given transceiver changed | Info | @ifname part number changed from @old_part_number to @new_part_number | swp7 part number changed from FP1ZZ5654002A to MSN2700-CS2F0
          cable | The serial number of a given transceiver changed | Info | @ifname serial number changed from @old_serial_number to @new_serial_number | swp4 serial number changed from 571254X1507020 to MT1552X12041
          cable | The status of forward error correction (FEC) support for a given port changed | Info | @ifname supported fec changed from @old_supported_fec to @new_supported_fec | swp12 supported fec changed from supported to unsupported; swp12 supported fec changed from unsupported to supported
          cable | The advertised support for FEC for a given port changed | Info | @ifname supported fec changed from @old_advertised_fec to @new_advertised_fec | swp24 supported FEC changed from advertised to not advertised
          cable | The FEC status for a given port changed | Info | @ifname fec changed from @old_fec to @new_fec | swp15 fec changed from disabled to enabled

          CLAG/MLAG Events

          Type | Trigger | Severity | Message Format | Example
          clag | CLAG remote peer state changed from up to down | Error | Peer state changed to down | Peer state changed to down
          clag | Local CLAG host MTU does not match its remote peer MTU | Error | SVI @svi1 on vlan @vlan mtu @mtu1 mismatched with peer mtu @mtu2 | SVI svi7 on vlan 4 mtu 1592 mismatched with peer mtu 1680
          clag | CLAG SVI on VLAN is missing from remote peer state | Warning | SVI on vlan @vlan is missing from peer | SVI on vlan vlan4 is missing from peer
          clag | CLAG peerlink is not operating at full capacity. At least one link is down. | Warning | Clag peerlink not at full redundancy, member link @slave is down | Clag peerlink not at full redundancy, member link swp40 is down
          clag | CLAG remote peer state changed from down to up | Info | Peer state changed to up | Peer state changed to up
          clag | Local CLAG host state changed from down to up | Info | Clag state changed from down to up | Clag state changed from down to up
          clag | CLAG bond in Conflicted state updated with new bonds | Info | Clag conflicted bond changed from @old_conflicted_bonds to @new_conflicted_bonds | Clag conflicted bond changed from swp7 swp8 to swp9 swp10
          clag | CLAG bond changed state from protodown to up state | Info | Clag conflicted bond changed from @old_state_protodownbond to @new_state_protodownbond | Clag conflicted bond changed from protodown to up

          CL Support Events

          Type | Trigger | Severity | Message Format | Example
          clsupport | A new CL Support file has been created for the given node | Error | HostName @hostname has new CL SUPPORT file | HostName leaf01 has new CL SUPPORT file

          Config Diff Events

          Type | Trigger | Severity | Message Format | Example
          configdiff | Configuration file deleted on a device | Error | @hostname config file @type was deleted | spine03 config file /etc/frr/frr.conf was deleted
          configdiff | Configuration file has been created | Info | @hostname config file @type was created | leaf12 config file /etc/lldp.d/README.conf was created
          configdiff | Configuration file has been modified | Info | @hostname config file @type was modified | spine03 config file /etc/frr/frr.conf was modified

          EVPN Events

          Type | Trigger | Severity | Message Format | Example
          evpn | A VNI was configured and moved from the up state to the down state | Error | VNI @vni state changed from up to down | VNI 36 state changed from up to down
          evpn | A VNI was configured and moved from the down state to the up state | Info | VNI @vni state changed from down to up | VNI 36 state changed from down to up
          evpn | The kernel state changed on a VNI | Info | VNI @vni kernel state changed from @old_in_kernel_state to @new_in_kernel_state | VNI 3 kernel state changed from down to up
          evpn | A VNI state changed from not advertising all VNIs to advertising all VNIs | Info | VNI @vni vni state changed from @old_adv_all_vni_state to @new_adv_all_vni_state | VNI 11 vni state changed from false to true

          Lifecycle Management Events

          Type | Trigger | Severity | Message Format | Example
          lcm | Cumulus Linux backup started for a switch or host | Info | CL configuration backup started for hostname @hostname | CL configuration backup started for hostname spine01
          lcm | Cumulus Linux backup completed for a switch or host | Info | CL configuration backup completed for hostname @hostname | CL configuration backup completed for hostname spine01
          lcm | Cumulus Linux backup failed for a switch or host | Error | CL configuration backup failed for hostname @hostname | CL configuration backup failed for hostname spine01
          lcm | Cumulus Linux upgrade from one version to a newer version has started for a switch or host | Error | CL Image upgrade from version @old_cl_version to version @new_cl_version started for hostname @hostname | CL Image upgrade from version 4.1.0 to version 4.2.1 started for hostname server01
          lcm | Cumulus Linux upgrade from one version to a newer version has completed successfully for a switch or host | Info | CL Image upgrade from version @old_cl_version to version @new_cl_version completed for hostname @hostname | CL Image upgrade from version 4.1.0 to version 4.2.1 completed for hostname server01
          lcm | Cumulus Linux upgrade from one version to a newer version has failed for a switch or host | Error | CL Image upgrade from version @old_cl_version to version @new_cl_version failed for hostname @hostname | CL Image upgrade from version 4.1.0 to version 4.2.1 failed for hostname server01
          lcm | Restoration of a Cumulus Linux configuration started for a switch or host | Info | CL configuration restore started for hostname @hostname | CL configuration restore started for hostname leaf01
          lcm | Restoration of a Cumulus Linux configuration completed successfully for a switch or host | Info | CL configuration restore completed for hostname @hostname | CL configuration restore completed for hostname leaf01
          lcm | Restoration of a Cumulus Linux configuration failed for a switch or host | Error | CL configuration restore failed for hostname @hostname | CL configuration restore failed for hostname leaf01
          lcm | Rollback of a Cumulus Linux image has started for a switch or host | Error | CL Image rollback from version @old_cl_version to version @new_cl_version started for hostname @hostname | CL Image rollback from version 4.2.1 to version 4.1.0 started for hostname leaf01
          lcm | Rollback of a Cumulus Linux image has completed successfully for a switch or host | Info | CL Image rollback from version @old_cl_version to version @new_cl_version completed for hostname @hostname | CL Image rollback from version 4.2.1 to version 4.1.0 completed for hostname leaf01
          lcm | Rollback of a Cumulus Linux image has failed for a switch or host | Error | CL Image rollback from version @old_cl_version to version @new_cl_version failed for hostname @hostname | CL Image rollback from version 4.2.1 to version 4.1.0 failed for hostname leaf01
          lcm | Installation of a NetQ image has started for a switch or host | Info | NetQ Image version @netq_version installation started for hostname @hostname | NetQ Image version 3.2.0 installation started for hostname spine02
          lcm | Installation of a NetQ image has completed successfully for a switch or host | Info | NetQ Image version @netq_version installation completed for hostname @hostname | NetQ Image version 3.2.0 installation completed for hostname spine02
          lcm | Installation of a NetQ image has failed for a switch or host | Error | NetQ Image version @netq_version installation failed for hostname @hostname | NetQ Image version 3.2.0 installation failed for hostname spine02
          lcm | Upgrade of a NetQ image has started for a switch or host | Info | NetQ Image upgrade from version @old_netq_version to version @netq_version started for hostname @hostname | NetQ Image upgrade from version 3.1.0 to version 3.2.0 started for hostname spine02
          lcm | Upgrade of a NetQ image has completed successfully for a switch or host | Info | NetQ Image upgrade from version @old_netq_version to version @netq_version completed for hostname @hostname | NetQ Image upgrade from version 3.1.0 to version 3.2.0 completed for hostname spine02
          lcm | Upgrade of a NetQ image has failed for a switch or host | Error | NetQ Image upgrade from version @old_netq_version to version @netq_version failed for hostname @hostname | NetQ Image upgrade from version 3.1.0 to version 3.2.0 failed for hostname spine02

          Link Events

          Type | Trigger | Severity | Message Format | Example
          link | Link operational state changed from up to down | Error | HostName @hostname changed state from @old_state to @new_state Interface:@ifname | HostName leaf01 changed state from up to down Interface:swp34
          link | Link operational state changed from down to up | Info | HostName @hostname changed state from @old_state to @new_state Interface:@ifname | HostName leaf04 changed state from down to up Interface:swp11

          LLDP Events

          Type | Trigger | Severity | Message Format | Example
          lldp | Local LLDP host has new neighbor information | Info | LLDP Session with host @hostname and @ifname modified fields @changed_fields | LLDP Session with host leaf02 swp6 modified fields leaf06 swp21
          lldp | Local LLDP host has new peer interface name | Info | LLDP Session with host @hostname and @ifname @old_peer_ifname changed to @new_peer_ifname | LLDP Session with host spine01 and swp5 swp12 changed to port12
          lldp | Local LLDP host has new peer hostname | Info | LLDP Session with host @hostname and @ifname @old_peer_hostname changed to @new_peer_hostname | LLDP Session with host leaf03 and swp2 leaf07 changed to exit01

          MTU Events

          Type | Trigger | Severity | Message Format | Example
          mtu | VLAN interface link MTU is smaller than that of its parent MTU | Warning | vlan interface @link mtu @mtu is smaller than parent @parent mtu @parent_mtu | vlan interface swp3 mtu 1500 is smaller than parent peerlink-1 mtu 1690
          mtu | Bridge interface MTU is smaller than the member interface with the smallest MTU | Warning | bridge @link mtu @mtu is smaller than least of member interface mtu @min | bridge swp0 mtu 1280 is smaller than least of member interface mtu 1500

          NTP Events

          Type | Trigger | Severity | Message Format | Example
          ntp | NTP sync state changed from in sync to not in sync | Error | Sync state changed from @old_state to @new_state for @hostname | Sync state changed from in sync to not sync for leaf06
          ntp | NTP sync state changed from not in sync to in sync | Info | Sync state changed from @old_state to @new_state for @hostname | Sync state changed from not sync to in sync for leaf06

          OSPF Events

          Type | Trigger | Severity | Message Format | Example
          ospf | OSPF session state on a given interface changed from Full to a down state | Error | OSPF session @ifname with @peer_address changed from Full to @down_state | OSPF session swp7 with 27.0.0.18 state changed from Full to Fail; OSPF session swp7 with 27.0.0.18 state changed from Full to ExStart
          ospf | OSPF session state on a given interface changed from a down state to full | Info | OSPF session @ifname with @peer_address changed from @down_state to Full | OSPF session swp7 with 27.0.0.18 state changed from Down to Full; OSPF session swp7 with 27.0.0.18 state changed from Init to Full; OSPF session swp7 with 27.0.0.18 state changed from Fail to Full

          Package Information Events

          Type | Trigger | Severity | Message Format | Example
          packageinfo | Package version on device does not match the version identified in the existing manifest | Error | @package_name manifest version mismatch | netq-apps manifest version mismatch

          PTM Events

          Type | Trigger | Severity | Message Format | Example
          ptm | Physical interface cabling does not match configuration specified in topology.dot file | Error | PTM cable status failed | PTM cable status failed
          ptm | Physical interface cabling matches configuration specified in topology.dot file | Error | PTM cable status passed | PTM cable status passed

          Resource Events

          Type | Trigger | Severity | Message Format | Example
          resource | A physical resource has been deleted from a device | Error | Resource Utils deleted for @hostname | Resource Utils deleted for spine02
          resource | Root file system access on a device has changed from Read/Write to Read Only | Error | @hostname root file system access mode set to Read Only | server03 root file system access mode set to Read Only
          resource | Root file system access on a device has changed from Read Only to Read/Write | Info | @hostname root file system access mode set to Read/Write | leaf11 root file system access mode set to Read/Write
          resource | A physical resource has been added to a device | Info | Resource Utils added for @hostname | Resource Utils added for spine04

          Running Config Diff Events

          Type | Trigger | Severity | Message Format | Example
          runningconfigdiff | Running configuration file has been modified | Info | @commandname config result was modified | @commandname config result was modified

          Sensor Events

          Type | Trigger | Severity | Message Format | Example
          sensor | A fan or power supply unit sensor has changed state | Error | Sensor @sensor state changed from @old_s_state to @new_s_state | Sensor fan state changed from up to down
          sensor | A temperature sensor has crossed the maximum threshold for that sensor | Error | Sensor @sensor max value @new_s_max exceeds threshold @new_s_crit | Sensor temp max value 110 exceeds the threshold 95
          sensor | A temperature sensor has crossed the minimum threshold for that sensor | Error | Sensor @sensor min value @new_s_lcrit fall behind threshold @new_s_min | Sensor psu min value 10 fell below threshold 25
          sensor | A temperature, fan, or power supply sensor state changed | Info | Sensor @sensor state changed from @old_state to @new_state | Sensor temperature state changed from Error to ok; Sensor fan state changed from absent to ok; Sensor psu state changed from bad to ok
          sensor | A fan or power supply sensor state changed | Info | Sensor @sensor state changed from @old_s_state to @new_s_state | Sensor fan state changed from down to up; Sensor psu state changed from down to up

          Services Events

          Type | Trigger | Severity | Message Format | Example
          services | A service status changed from down to up | Error | Service @name status changed from @old_status to @new_status | Service bgp status changed from down to up
          services | A service status changed from up to down | Error | Service @name status changed from @old_status to @new_status | Service lldp status changed from up to down
          services | A service changed state from inactive to active | Info | Service @name changed state from inactive to active | Service bgp changed state from inactive to active; Service lldp changed state from inactive to active

          SSD Utilization Events

          Type | Trigger | Severity | Message Format | Example
          ssdutil | 3ME3 disk health has dropped below 10% | Error | @info: @details | low health : 5.0%
          ssdutil | A dip in 3ME3 disk health of more than 2% has occurred within the last 24 hours | Error | @info: @details | significant health drop : 3.0%

          Version Events

          Type | Trigger | Severity | Message Format | Example
          version | An unknown version of the operating system was detected | Error | unexpected os version @my_ver | unexpected os version cl3.2
          version | Desired version of the operating system is not available | Error | os version @ver | os version cl3.7.9
          version | An unknown version of a software package was detected | Error | expected release version @ver | expected release version cl3.6.2
          version | Desired version of a software package is not available | Error | different from version @ver | different from version cl4.0

          VXLAN Events

          Type | Trigger | Severity | Message Format | Example
          vxlan | Replication list contains an inconsistent set of nodes | Error | VNI @vni replication list inconsistent with @conflicts diff:@diff | VNI 14 replication list inconsistent with ["leaf03","leaf04"] diff:+:["leaf03","leaf04"] -:["leaf07","leaf08"]

          TCA Event Messages Reference

          This reference lists the threshold-based events that NetQ supports for ACL resources, digital optics, forwarding resources, interface errors and statistics, link flaps, resource utilization, sensors, and What Just Happened. You can view these messages through third-party notification applications. For details about configuring notifications for these events, refer to Configure Threshold-Based Event Notifications.
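
          For orientation, the following is a minimal sketch of a threshold rule created with the NetQ CLI, assuming a notification channel named my-netq-channel has already been created; the channel name and threshold value are placeholders, and the full command syntax and options are described in Configure Threshold-Based Event Notifications:

              netq add tca event_id TCA_TCAM_IPV4_ROUTE_UPPER scope '*' threshold 20000 channel my-netq-channel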

          ACL Resources

          NetQ UI Name | NetQ CLI Event ID | Description
          Ingress ACL IPv4 % | TCA_TCAM_IN_ACL_V4_FILTER_UPPER | Number of ingress ACL filters for IPv4 addresses on a given switch or host exceeded user-defined maximum threshold
          Egress ACL IPv4 % | TCA_TCAM_EG_ACL_V4_FILTER_UPPER | Number of egress ACL filters for IPv4 addresses on a given switch or host exceeded user-defined maximum threshold
          Ingress ACL IPv4 Mangle % | TCA_TCAM_IN_ACL_V4_MANGLE_UPPER | Number of ingress ACL mangles for IPv4 addresses on a given switch or host exceeded user-defined maximum threshold
          Egress ACL IPv4 Mangle % | TCA_TCAM_EG_ACL_V4_MANGLE_UPPER | Number of egress ACL mangles for IPv4 addresses on a given switch or host exceeded user-defined maximum threshold
          Ingress ACL IPv6 % | TCA_TCAM_IN_ACL_V6_FILTER_UPPER | Number of ingress ACL filters for IPv6 addresses on a given switch or host exceeded user-defined maximum threshold
          Egress ACL IPv6 % | TCA_TCAM_EG_ACL_V6_FILTER_UPPER | Number of egress ACL filters for IPv6 addresses on a given switch or host exceeded user-defined maximum threshold
          Ingress ACL IPv6 Mangle % | TCA_TCAM_IN_ACL_V6_MANGLE_UPPER | Number of ingress ACL mangles for IPv6 addresses on a given switch or host exceeded user-defined maximum threshold
          Egress ACL IPv6 Mangle % | TCA_TCAM_EG_ACL_V6_MANGLE_UPPER | Number of egress ACL mangles for IPv6 addresses on a given switch or host exceeded user-defined maximum threshold
          Ingress ACL 8021x % | TCA_TCAM_IN_ACL_8021x_FILTER_UPPER | Number of ingress ACL 802.1 filters on a given switch or host exceeded user-defined maximum threshold
          ACL L4 port % | TCA_TCAM_ACL_L4_PORT_CHECKERS_UPPER | Number of ACL port range checkers on a given switch or host exceeded user-defined maximum threshold
          ACL Regions % | TCA_TCAM_ACL_REGIONS_UPPER | Number of ACL regions on a given switch or host exceeded user-defined maximum threshold
          Ingress ACL Mirror % | TCA_TCAM_IN_ACL_MIRROR_UPPER | Number of ingress ACL mirrors on a given switch or host exceeded user-defined maximum threshold
          ACL 18B Rules % | TCA_TCAM_ACL_18B_RULES_UPPER | Number of ACL 18B rules on a given switch or host exceeded user-defined maximum threshold
          ACL 32B % | TCA_TCAM_ACL_32B_RULES_UPPER | Number of ACL 32B rules on a given switch or host exceeded user-defined maximum threshold
          ACL 54B % | TCA_TCAM_ACL_54B_RULES_UPPER | Number of ACL 54B rules on a given switch or host exceeded user-defined maximum threshold
          Ingress PBR IPv4 % | TCA_TCAM_IN_PBR_V4_FILTER_UPPER | Number of ingress policy-based routing (PBR) filters for IPv4 addresses on a given switch or host exceeded user-defined maximum threshold
          Ingress PBR IPv6 % | TCA_TCAM_IN_PBR_V6_FILTER_UPPER | Number of ingress policy-based routing (PBR) filters for IPv6 addresses on a given switch or host exceeded user-defined maximum threshold

          Digital Optics

          Some of the event IDs have changed. If you have TCA rules configured for digital optics for a NetQ 3.1.0 deployment or earlier, verify that they are using the correct event IDs. You might need to remove and recreate some of the events.

          NetQ UI Name | NetQ CLI Event ID | Description
          Laser RX Power Alarm Upper | TCA_DOM_RX_POWER_ALARM_UPPER | Transceiver input power (mW) for the digital optical module on a given switch or host interface exceeded user-defined maximum alarm threshold
          Laser RX Power Alarm Lower | TCA_DOM_RX_POWER_ALARM_LOWER | Transceiver input power (mW) for the digital optical module on a given switch or host dropped below user-defined minimum alarm threshold
          Laser RX Power Warning Upper | TCA_DOM_RX_POWER_WARNING_UPPER | Transceiver input power (mW) for the digital optical module on a given switch or host exceeded user-defined maximum warning threshold
          Laser RX Power Warning Lower | TCA_DOM_RX_POWER_WARNING_LOWER | Transceiver input power (mW) for the digital optical module on a given switch or host dropped below user-defined minimum warning threshold
          Laser Bias Current Alarm Upper | TCA_DOM_BIAS_CURRENT_ALARM_UPPER | Laser bias current (mA) for the digital optical module on a given switch or host exceeded user-defined maximum alarm threshold
          Laser Bias Current Alarm Lower | TCA_DOM_BIAS__CURRENT_ALARM_LOWER | Laser bias current (mA) for the digital optical module on a given switch or host dropped below user-defined minimum alarm threshold
          Laser Bias Current Warning Upper | TCA_DOM_BIAS_CURRENT_WARNING_UPPER | Laser bias current (mA) for the digital optical module on a given switch or host exceeded user-defined maximum warning threshold
          Laser Bias Current Warning Lower | TCA_DOM_BIAS__CURRENT_WARNING_LOWER | Laser bias current (mA) for the digital optical module on a given switch or host dropped below user-defined minimum warning threshold
          Laser Output Power Alarm Upper | TCA_DOM_OUTPUT_POWER_ALARM_UPPER | Laser output power (mW) for the digital optical module on a given switch or host exceeded user-defined maximum alarm threshold
          Laser Output Power Alarm Lower | TCA_DOM_OUTPUT_POWER_ALARM_LOWER | Laser output power (mW) for the digital optical module on a given switch or host dropped below user-defined minimum alarm threshold
          Laser Output Power Warning Upper | TCA_DOM_OUTPUT_POWER_WARNING_UPPER | Laser output power (mW) for the digital optical module on a given switch or host exceeded user-defined maximum warning threshold
          Laser Output Power Warning Lower | TCA_DOM_OUTPUT_POWER_WARNING_LOWER | Laser output power (mW) for the digital optical module on a given switch or host dropped below user-defined minimum warning threshold
          Laser Module Temperature Alarm Upper | TCA_DOM_MODULE_TEMPERATURE_ALARM_UPPER | Digital optical module temperature (°C) on a given switch or host exceeded user-defined maximum alarm threshold
          Laser Module Temperature Alarm Lower | TCA_DOM_MODULE_TEMPERATURE_ALARM_LOWER | Digital optical module temperature (°C) on a given switch or host dropped below user-defined minimum alarm threshold
          Laser Module Temperature Warning Upper | TCA_DOM_MODULE_TEMPERATURE_WARNING_UPPER | Digital optical module temperature (°C) on a given switch or host exceeded user-defined maximum warning threshold
          Laser Module Temperature Warning Lower | TCA_DOM_MODULE_TEMPERATURE_WARNING_LOWER | Digital optical module temperature (°C) on a given switch or host dropped below user-defined minimum warning threshold
          Laser Module Voltage Alarm Upper | TCA_DOM_MODULE_VOLTAGE_ALARM_UPPER | Transceiver voltage (V) on a given switch or host exceeded user-defined maximum alarm threshold
          Laser Module Voltage Alarm Lower | TCA_DOM_MODULE_VOLTAGE_ALARM_LOWER | Transceiver voltage (V) on a given switch or host dropped below user-defined minimum alarm threshold
          Laser Module Voltage Warning Upper | TCA_DOM_MODULE_VOLTAGE_WARNING_UPPER | Transceiver voltage (V) on a given switch or host exceeded user-defined maximum warning threshold
          Laser Module Voltage Warning Lower | TCA_DOM_MODULE_VOLTAGE_WARNING_LOWER | Transceiver voltage (V) on a given switch or host dropped below user-defined minimum warning threshold

          Forwarding Resources

          NetQ UI Name | NetQ CLI Event ID | Description
          Total Route Entries % | TCA_TCAM_TOTAL_ROUTE_ENTRIES_UPPER | Number of routes on a given switch or host exceeded user-defined maximum threshold
          Mcast Routes % | TCA_TCAM_TOTAL_MCAST_ROUTES_UPPER | Number of multicast routes on a given switch or host exceeded user-defined maximum threshold
          MAC entries % | TCA_TCAM_MAC_ENTRIES_UPPER | Number of MAC addresses on a given switch or host exceeded user-defined maximum threshold
          IPv4 Routes % | TCA_TCAM_IPV4_ROUTE_UPPER | Number of IPv4 routes on a given switch or host exceeded user-defined maximum threshold
          IPv4 Hosts % | TCA_TCAM_IPV4_HOST_UPPER | Number of IPv4 hosts on a given switch or host exceeded user-defined maximum threshold
          IPv6 Routes % | TCA_TCAM_IPV6_ROUTE_UPPER | Number of IPv6 routes on a given switch or host exceeded user-defined maximum threshold
          IPv6 Hosts % | TCA_TCAM_IPV6_HOST_UPPER | Number of IPv6 hosts on a given switch or host exceeded user-defined maximum threshold
          ECMP Next Hop % | TCA_TCAM_ECMP_NEXTHOPS_UPPER | Number of equal cost multi-path (ECMP) next hop entries on a given switch or host exceeded user-defined maximum threshold

          Interface Errors

          NetQ UI Name | NetQ CLI Event ID | Description
          Oversize Errors | TCA_HW_IF_OVERSIZE_ERRORS | Number of times a frame longer than maximum size (1518 bytes) exceeded user-defined threshold
          Undersize Errors | TCA_HW_IF_UNDERSIZE_ERRORS | Number of times a frame shorter than minimum size (64 bytes) exceeded user-defined threshold
          Alignment Errors | TCA_HW_IF_ALIGNMENT_ERRORS | Number of times a frame with an uneven byte count and a CRC error exceeded user-defined threshold
          Jabber Errors | TCA_HW_IF_JABBER_ERRORS | Number of times a frame longer than maximum size (1518 bytes) and with a CRC error exceeded user-defined threshold
          Symbol Errors | TCA_HW_IF_SYMBOL_ERRORS | Number of times that undefined or invalid symbols were detected exceeded user-defined threshold

          Interface Statistics

          NetQ UI Name | NetQ CLI Event ID | Description | Example Message
          Broadcast Received Bytes | TCA_RXBROADCAST_UPPER | Number of broadcast receive bytes per second exceeded user-defined maximum threshold on a switch interface | RX broadcast upper threshold breached for host leaf04 ifname:swp45 value: 40200
          Received Bytes | TCA_RXBYTES_UPPER | Number of receive bytes exceeded user-defined maximum threshold on a switch interface | RX bytes upper threshold breached for host spine02 ifname:swp4 value: 20000
          Multicast Received Bytes | TCA_RXMULTICAST_UPPER | Number of multicast receive bytes (rx_multicast) per second exceeded user-defined maximum threshold on a switch interface
          Broadcast Transmitted Bytes | TCA_TXBROADCAST_UPPER | Number of broadcast transmit bytes per second exceeded user-defined maximum threshold on a switch interface | TX broadcast upper threshold breached for host leaf04 ifname:swp45 value: 40200
          Transmitted Bytes | TCA_TXBYTES_UPPER | Number of transmit bytes exceeded user-defined maximum threshold on a switch interface | TX bytes upper threshold breached for host spine02 ifname:swp4 value: 20000
          Multicast Transmitted Bytes | TCA_TXMULTICAST_UPPER | Number of multicast transmit bytes per second exceeded user-defined maximum threshold on a switch interface | TX multicast upper threshold breached for host leaf04 ifname:swp45 value: 30000
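
          Interface statistics and error events apply per interface, so the rule scope typically names both the switch and the port. As a sketch only, assuming the comma-separated hostname,interface scope form and a placeholder channel named my-netq-channel, a rule matching the Received Bytes example above might look like:

              netq add tca event_id TCA_RXBYTES_UPPER scope 'spine02,swp4' threshold 20000 channel my-netq-channel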

          Link Flaps

          NetQ UI Name | NetQ CLI Event ID | Description
          Link flap errors | TCA_LINK | Number of link flaps exceeded user-defined maximum threshold

          Resource Utilization

          NetQ UI Name | NetQ CLI Event ID | Description | Example Message
          CPU Utilization | TCA_CPU_UTILIZATION_UPPER | Percentage of CPU utilization exceeded user-defined maximum threshold on a switch or host | CPU Utilization for host leaf11 exceed configured mark 85
          Disk Utilization | TCA_DISK_UTILIZATION_UPPER | Percentage of disk utilization exceeded user-defined maximum threshold on a switch or host | Disk Utilization for host leaf11 exceed configured mark 90
          Memory Utilization | TCA_MEMORY_UTILIZATION_UPPER | Percentage of memory utilization exceeded user-defined maximum threshold on a switch or host | Memory Utilization for host leaf11 exceed configured mark 95
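
          For example, a rule corresponding to the CPU utilization message above (configured mark 85) could be sketched as follows, again with a placeholder channel name; verify the exact options in Configure Threshold-Based Event Notifications:

              netq add tca event_id TCA_CPU_UTILIZATION_UPPER scope '*' threshold 85 channel my-netq-channel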

          RoCE

          NetQ UI Name | NetQ CLI Event ID | Description
          Rx CNP Buffer Usage Cells | TCA_RX_CNP_BUFFER_USAGE_CELLS | Percentage of Rx General+CNP buffer usage exceeded user-defined maximum threshold on a switch interface
          Rx CNP No Buffer Discard | TCA_RX_CNP_NO_BUFFER_DISCARD | Rate of Rx General+CNP no buffer discard exceeded user-defined maximum threshold on a switch interface
          Rx CNP PG Usage Cells | TCA_RX_CNP_PG_USAGE_CELLS | Percentage of Rx General+CNP PG usage exceeded user-defined maximum threshold on a switch interface
          Rx RoCE Buffer Usage Cells | TCA_RX_ROCE_BUFFER_USAGE_CELLS | Percentage of Rx RoCE buffer usage exceeded user-defined maximum threshold on a switch interface
          Rx RoCE No Buffer Discard | TCA_RX_ROCE_NO_BUFFER_DISCARD | Rate of Rx RoCE no buffer discard exceeded user-defined maximum threshold on a switch interface
          Rx RoCE PG Usage Cells | TCA_RX_ROCE_PG_USAGE_CELLS | Percentage of Rx RoCE PG usage exceeded user-defined maximum threshold on a switch interface
          Rx RoCE PFC Pause Duration | TCA_RX_ROCE_PFC_PAUSE_DURATION | Number of Rx RoCE PFC pause duration exceeded user-defined maximum threshold on a switch interface
          Rx RoCE PFC Pause Packets | TCA_RX_ROCE_PFC_PAUSE_PACKETS | Rate of Rx RoCE PFC pause packets exceeded user-defined maximum threshold on a switch interface
          Tx CNP Buffer Usage Cells | TCA_TX_CNP_BUFFER_USAGE_CELLS | Percentage of Tx General+CNP buffer usage exceeded user-defined maximum threshold on a switch interface
          Tx CNP TC Usage Cells | TCA_TX_CNP_TC_USAGE_CELLS | Percentage of Tx CNP TC usage exceeded user-defined maximum threshold on a switch interface
          Tx CNP Unicast No Buffer Discard | TCA_TX_CNP_UNICAST_NO_BUFFER_DISCARD | Rate of Tx CNP unicast no buffer discard exceeded user-defined maximum threshold on a switch interface
          Tx ECN Marked Packets | TCA_TX_ECN_MARKED_PACKETS | Rate of Tx Port ECN marked packets exceeded user-defined maximum threshold on a switch interface
          Tx RoCE Buffer Usage Cells | TCA_TX_ROCE_BUFFER_USAGE_CELLS | Percentage of Tx RoCE buffer usage exceeded user-defined maximum threshold on a switch interface
          Tx RoCE PFC Pause Duration | TCA_TX_ROCE_PFC_PAUSE_DURATION | Number of Tx RoCE PFC pause duration exceeded user-defined maximum threshold on a switch interface
          Tx RoCE PFC Pause Packets | TCA_TX_ROCE_PFC_PAUSE_PACKETS | Rate of Tx RoCE PFC pause packets exceeded user-defined maximum threshold on a switch interface
          Tx RoCE TC Usage Cells | TCA_TX_ROCE_TC_USAGE_CELLS | Percentage of Tx RoCE TC usage exceeded user-defined maximum threshold on a switch interface
          Tx RoCE Unicast No Buffer Discard | TCA_TX_ROCE_UNICAST_NO_BUFFER_DISCARD | Rate of Tx RoCE unicast no buffer discard exceeded user-defined maximum threshold on a switch interface

          Sensors

          NetQ UI Name | NetQ CLI Event ID | Description | Example Message
          Fan Speed | TCA_SENSOR_FAN_UPPER | Fan speed exceeded user-defined maximum threshold on a switch | Sensor for spine03 exceeded threshold fan speed 700 for sensor fan2
          Power Supply Watts | TCA_SENSOR_POWER_UPPER | Power supply output exceeded user-defined maximum threshold on a switch | Sensor for leaf14 exceeded threshold power 120 watts for sensor psu1
          Power Supply Volts | TCA_SENSOR_VOLTAGE_UPPER | Power supply voltage exceeded user-defined maximum threshold on a switch | Sensor for leaf14 exceeded threshold voltage 12 volts for sensor psu2
          Switch Temperature | TCA_SENSOR_TEMPERATURE_UPPER | Temperature (°C) exceeded user-defined maximum threshold on a switch | Sensor for leaf14 exceeded threshold temperature 90 for sensor temp1
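
          Sensor rules can be limited to a subset of switches by hostname. As a sketch, assuming hostname wildcards are accepted in the rule scope and using a placeholder channel name, a fan-speed rule corresponding to the example above might be:

              netq add tca event_id TCA_SENSOR_FAN_UPPER scope 'spine*' threshold 700 channel my-netq-channel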

          What Just Happened

          NetQ UI Name | NetQ CLI Event ID | Drop Type | Reason/Port Down Reason | Description
          ACL Drop Aggregate Upper | TCA_WJH_ACL_DROP_AGG_UPPER | ACL | Egress port ACL | ACL action set to deny on the physical egress port or bond
          ACL Drop Aggregate Upper | TCA_WJH_ACL_DROP_AGG_UPPER | ACL | Egress router ACL | ACL action set to deny on the egress switch virtual interfaces (SVIs)
          ACL Drop Aggregate Upper | TCA_WJH_ACL_DROP_AGG_UPPER | ACL | Ingress port ACL | ACL action set to deny on the physical ingress port or bond
          ACL Drop Aggregate Upper | TCA_WJH_ACL_DROP_AGG_UPPER | ACL | Ingress router ACL | ACL action set to deny on the ingress switch virtual interfaces (SVIs)
          Buffer Drop Aggregate Upper | TCA_WJH_BUFFER_DROP_AGG_UPPER | Buffer | Packet Latency Threshold Crossed | Time a packet spent within the switch exceeded or dropped below the specified high or low threshold
          Buffer Drop Aggregate Upper | TCA_WJH_BUFFER_DROP_AGG_UPPER | Buffer | Port TC Congestion Threshold Crossed | Percentage of the occupancy buffer exceeded or dropped below the specified high or low threshold
          Buffer Drop Aggregate Upper | TCA_WJH_BUFFER_DROP_AGG_UPPER | Buffer | Tail drop | Tail drop is enabled, and buffer queue is filled to maximum capacity
          Buffer Drop Aggregate Upper | TCA_WJH_BUFFER_DROP_AGG_UPPER | Buffer | WRED | Weighted Random Early Detection is enabled, and buffer queue is filled to maximum capacity or the RED engine dropped the packet due to random congestion prevention
          CRC Error Upper | TCA_WJH_CRC_ERROR_UPPER | L1 | Auto-negotiation failure | Negotiation of port speed with peer has failed
          CRC Error Upper | TCA_WJH_CRC_ERROR_UPPER | L1 | Bad signal integrity | Integrity of the signal on port is not sufficient for good communication
          CRC Error Upper | TCA_WJH_CRC_ERROR_UPPER | L1 | Cable/transceiver is not supported | The attached cable or transceiver is not supported by this port
          CRC Error Upper | TCA_WJH_CRC_ERROR_UPPER | L1 | Cable/transceiver is unplugged | A cable or transceiver is missing or not fully inserted into the port
          CRC Error Upper | TCA_WJH_CRC_ERROR_UPPER | L1 | Calibration failure | Calibration failure
          CRC Error Upper | TCA_WJH_CRC_ERROR_UPPER | L1 | Link training failure | Link is not able to go operational up due to link training failure
          CRC Error Upper | TCA_WJH_CRC_ERROR_UPPER | L1 | Peer is sending remote faults | Peer node is not operating correctly
          CRC Error Upper | TCA_WJH_CRC_ERROR_UPPER | L1 | Port admin down | Port has been purposely set down by user
          Drop Aggregate Upper | TCA_WJH_DROP_AGG_UPPER | L2 | Destination MAC is reserved (DMAC=01-80-C2-00-00-0x) | The address cannot be used by this link
          Drop Aggregate Upper | TCA_WJH_DROP_AGG_UPPER | L2 | Ingress spanning tree filter | Port is in Spanning Tree blocking state
          Drop Aggregate Upper | TCA_WJH_DROP_AGG_UPPER | L2 | Ingress VLAN filtering | Frames whose port is not a member of the VLAN are discarded
          Drop Aggregate Upper | TCA_WJH_DROP_AGG_UPPER | L2 | MLAG port isolation | Not supported for port isolation implemented with system ACL
          Drop Aggregate Upper | TCA_WJH_DROP_AGG_UPPER | L2 | Multicast egress port list is empty | No ports are defined for multicast egress
          Drop Aggregate Upper | TCA_WJH_DROP_AGG_UPPER | L2 | Port loopback filter | Port is operating in loopback mode; packets are being sent to itself (source MAC address is the same as the destination MAC address)
          Drop Aggregate Upper | TCA_WJH_DROP_AGG_UPPER | L2 | Unicast MAC table action discard | Currently not supported
          Drop Aggregate Upper | TCA_WJH_DROP_AGG_UPPER | L2 | VLAN tagging mismatch | VLAN tags on the source and destination do not match
          Drop Aggregate Upper | TCA_WJH_DROP_AGG_UPPER | Router | Blackhole ARP/neighbor | Packet received with blackhole adjacency
          Drop Aggregate Upper | TCA_WJH_DROP_AGG_UPPER | Router | Blackhole route | Packet received with action equal to discard
          Drop Aggregate Upper | TCA_WJH_DROP_AGG_UPPER | Router | Checksum or IPver or IPv4 IHL too short | Cannot read packet due to a header checksum error, an IP version mismatch, or an IPv4 header length that is too short
          Drop Aggregate Upper | TCA_WJH_DROP_AGG_UPPER | Router | Destination IP is loopback address | Cannot read packet as destination IP address is a loopback address (dip=>127.0.0.0/8)
          Drop Aggregate Upper | TCA_WJH_DROP_AGG_UPPER | Router | Egress router interface is disabled | Packet destined to a different subnet cannot be routed because egress router interface is disabled
          Drop Aggregate Upper | TCA_WJH_DROP_AGG_UPPER | Router | Ingress router interface is disabled | Packet destined to a different subnet cannot be routed because ingress router interface is disabled
          Drop Aggregate Upper | TCA_WJH_DROP_AGG_UPPER | Router | IPv4 destination IP is link local | Packet has IPv4 destination address that is a local link (destination in 169.254.0.0/16)
          Drop Aggregate Upper | TCA_WJH_DROP_AGG_UPPER | Router | IPv4 destination IP is local network (destination=0.0.0.0/8) | Packet has IPv4 destination address that is a local network (destination=0.0.0.0/8)
          Drop Aggregate Upper | TCA_WJH_DROP_AGG_UPPER | Router | IPv4 routing table (LPM) unicast miss | No route available in routing table for packet
          Drop Aggregate Upper | TCA_WJH_DROP_AGG_UPPER | Router | IPv4 source IP is limited broadcast | Packet has broadcast source IP address
          Drop Aggregate Upper | TCA_WJH_DROP_AGG_UPPER | Router | IPv6 destination in multicast scope FFx0:/16 | Packet received with multicast destination address in FFx0:/16 address range
          Drop Aggregate Upper | TCA_WJH_DROP_AGG_UPPER | Router | IPv6 destination in multicast scope FFx1:/16 | Packet received with multicast destination address in FFx1:/16 address range
          Drop Aggregate Upper | TCA_WJH_DROP_AGG_UPPER | Router | IPv6 routing table (LPM) unicast miss | No route available in routing table for packet
          Drop Aggregate Upper | TCA_WJH_DROP_AGG_UPPER | Router | Multicast MAC mismatch | For IPv4, destination MAC address is not equal to {0x01-00-5E-0 (25 bits), DIP[22:0]} and DIP is multicast. For IPv6, destination MAC address is not equal to {0x3333, DIP[31:0]} and DIP is multicast
          Drop Aggregate Upper | TCA_WJH_DROP_AGG_UPPER | Router | Non IP packet | Cannot read packet header because it is not an IP packet
          Drop Aggregate Upper | TCA_WJH_DROP_AGG_UPPER | Router | Non-routable packet | Packet has no route in routing table
          Drop Aggregate Upper | TCA_WJH_DROP_AGG_UPPER | Router | Packet size is larger than router interface MTU | Packet has larger MTU configured than the VLAN
          Drop Aggregate Upper | TCA_WJH_DROP_AGG_UPPER | Router | Router interface loopback | Packet has destination IP address that is local. For example, SIP = 1.1.1.1, DIP = 1.1.1.128.
          Drop Aggregate Upper | TCA_WJH_DROP_AGG_UPPER | Router | Source IP equals destination IP | Packet has a source IP address equal to the destination IP address