NVIDIA® NetQ™ is a highly scalable, modern network operations tool set that provides visibility and troubleshooting of your overlay and underlay networks in real-time. NetQ delivers actionable insights and operational intelligence about the health of your data center—from the container, virtual machine, or host, all the way to the switch and port. NetQ correlates configuration and operational status, and instantly identifies and tracks state changes while simplifying management for the entire Linux-based data center. With NetQ, network operations change from a manual, reactive, node-by-node approach to an automated, informed, and agile one. Visit Network Operations and NetQ to learn more.
This user guide provides in-depth documentation for network administrators who are responsible for deploying, configuring, monitoring, and troubleshooting the network in their data center or campus environment.
For a list of the new features in this release, see What's New. For bug fixes and known issues present in this release, refer to the release notes.
What's New
This page summarizes the new features and improvements in the NetQ 4.3 release. For a complete list of open and fixed issues, see the release notes.
What’s New in NetQ 4.3.0
This release includes several performance and infrastructure improvements that make NetQ faster and more reliable. Additional features and improvements include:
SSO configuration that lets administrators add user accounts more efficiently.
You can upgrade to NetQ 4.3.0 directly from versions 4.0.0 or later. Upgrades from NetQ v3 releases require a fresh installation or an incremental upgrade to version 4.0.0 first.
Compatible Agent Versions
NetQ 4.3.0 is compatible with NetQ Agent versions 4.2.0 and above. You can install NetQ Agents on switches and servers running:
Cumulus Linux 3.7.12 and later
SONiC 202012 to 202106
CentOS 7
RHEL 7.1
Ubuntu 18.04
NetQ CLI Changes
Modified Commands
The following table summarizes the commands that have changed with this release.
Command | Summary of Changes | Version
 | Changed name <text-job-name> to job-name <text-job-name>. | 4.3.0
netq install standalone full, netq install cluster full, netq install opta standalone full, netq install opta cluster full | Added pod-ip-range <text-pod-ip-range> option to specify a range of IP addresses for the pod. | 4.3.0
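For example, a standalone on-premises installation that also sets the pod IP address range might look like the following sketch; the interface name, bundle path, and IP range are placeholders for your environment:
cumulus@netq-appliance:~$ netq install standalone full interface eth0 bundle /mnt/installables/NetQ-4.3.0.tgz pod-ip-range 10.100.0.0/16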
NetQ Overview
NetQ is a highly scalable, modern network operations tool set that provides visibility and troubleshooting of your overlay and underlay networks in real-time. NetQ delivers actionable insights and operational intelligence about the health of your data center—from the container, virtual machine, or host, all the way to the switch and port. NetQ correlates configuration and operational status, and instantly identifies and tracks state changes while simplifying management for the entire Linux-based data center. With NetQ, network operations change from a manual, reactive, node-by-node approach to an automated, informed, and agile one.
NetQ performs three primary functions:
Data collection: real-time and historical telemetry and network state information
Data analytics: deep processing of the data
Data visualization: rich graphical user interface (GUI) for actionable insight
NetQ is available as an on-site or in-cloud deployment.
Unlike other network operations tools, NetQ delivers significant operational improvements to your network management and maintenance processes. It simplifies the data center network by reducing the complexity through real-time visibility into hardware and software status and eliminating the guesswork associated with investigating issues through the analysis and presentation of detailed, focused data.
Demystify Overlay Networks
While overlay networks provide significant advantages in network management, it can be difficult to troubleshoot issues that occur in the overlay one node at a time. You are unable to correlate which events (configuration changes, power outages, and so forth) might have caused problems in the network and when they occurred. Only a sampling of data is available to use for your analysis. In contrast, with NetQ deployed, you have a networkwide view of the overlay network, can correlate events with what is happening now or in the past, and have real-time data to fill out the complete picture of your network health and operation.
In summary:
Without NetQ | With NetQ
Difficult to debug overlay network | View networkwide status of overlay network
Hard to find out what happened in the past | View historical activity with time-machine view
Periodically sampled data | Real-time collection of telemetry data for a more complete data set
Protect Network Integrity with NetQ Validation
Network configuration changes can generate many trouble tickets because you cannot test a new configuration before deploying it. When the tickets start pouring in, you are stuck with a large amount of data collected and stored in multiple tools, which makes it difficult to correlate events with their resolution. Isolating past faults is challenging. In contrast, with NetQ deployed, you can proactively verify inconsistencies caused by configuration changes and catch misconfigurations before deployment. Additionally, historical data is readily available to correlate past events with current issues.
In summary:
Without NetQ | With NetQ
Reactive to trouble tickets | Catch inconsistencies and misconfigurations before deployment with integrity checks/validation
Large amount of data and multiple tools to correlate the logs/events with the issues | Correlate network status, all in one place
Periodically sampled data | Readily available historical data for viewing and correlating changes in the past with current issues
Troubleshoot Issues Across the Network
Troubleshooting networks is challenging in the best of times, but trying to do so manually, one node at a time, and digging through a series of long and ugly logs makes the job harder than it needs to be. NetQ provides rolled up and correlated network status on a regular basis, enabling you to get down to the root of the problem quickly, whether it occurred recently or over a week ago. The graphical user interface helps you visualize problems so you can address them quickly.
In summary:
Without NetQ | With NetQ
Large amount of data and multiple tools to correlate the logs/events with the issues | Rolled up and correlated network status, view events and status together
Past events are lost | Historical data gathered and stored for comparison with current network state
Manual, node-by-node troubleshooting | View issues on all devices all at one time, pointing to the source of the problem
Track Connectivity with NetQ Trace
Conventional trace only traverses the data path looking for problems, and does so on a node-to-node basis. In large networks, this process can become very time consuming. NetQ verifies both the data and control paths, providing additional information. It discovers misconfigurations along all hops in one go, allowing you to resolve them quickly.
In summary:
Without NetQ | With NetQ
Trace covers only data path; hard to check control path | Verifies both data and control paths
View portion of entire path | View all paths between devices simultaneously to find problem paths
Node-to-node check on misconfigurations | View any misconfigurations along all hops from source to destination
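For example, the trace command described later in this guide takes a destination and a source:
cumulus@switch:~$ netq trace 10.0.0.13 from 10.0.0.21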
NetQ Basics
This section provides an overview of the NetQ hardware, software, and deployment models.
NetQ Components
NetQ contains the following applications and key components:
Telemetry data collection and aggregation via
NetQ switch agents
NetQ host agents
Database
Data streaming
Network services
User interfaces
While these functions apply to both the on-premises and cloud solutions, they are configured differently, as shown in the following diagrams.
NetQ Agents
NetQ Agents are software installed and running on every monitored node in the network, including Cumulus® Linux® switches, Linux bare metal hosts, and virtual machines. The NetQ Agents push network data regularly and send event information immediately to the NetQ Platform.
Switch Agents
The NetQ Agents running on Cumulus Linux or SONiC switches gather the following network data via Netlink:
Interfaces
IP addresses (v4 and v6)
IP routes (v4 and v6)
Links
Bridge FDB (MAC address table)
ARP Entries/Neighbors (IPv4 and IPv6)
for the following protocols:
Bridging protocols: LLDP, STP, MLAG
Routing protocols: BGP, OSPF
Network virtualization: EVPN, VXLAN
The NetQ Agent is supported on Cumulus Linux 3.7.12 and later and SONiC 202012 and later.
Host Agents
The NetQ Agents running on hosts gather the same information as that for switches, plus the following network data:
Network IP and MAC addresses
Container IP and MAC addresses
The NetQ Agent obtains container information by listening to the Kubernetes orchestration tool.
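For example, once the Kubernetes integration is enabled, you can list the clusters NetQ has discovered. The command below is a sketch of one such query; see the Kubernetes monitoring documentation for the full set of netq show kubernetes subcommands:
cumulus@host:~$ netq show kubernetes cluster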
The NetQ Agent is supported on hosts running Ubuntu 16.04, Red Hat® Enterprise Linux 7, and CentOS 7 Operating Systems.
NetQ Core
The NetQ core performs the data collection, storage, and processing for delivery to various user interfaces. It consists of a collection of scalable components running entirely within a single server. The NetQ software queries this server, rather than individual devices, enabling greater system scalability. Each of these components is described briefly below.
Data Aggregation
The data aggregation component collects data coming from all of the NetQ Agents. It then filters, compresses, and forwards the data to the streaming component. The server monitors for missing messages and also monitors the NetQ Agents themselves, sending notifications about events when appropriate. In addition to the telemetry data collected from the NetQ Agents, the aggregation component collects information from the switches and hosts, such as vendor, model, version, and basic operational state.
Data Stores
NetQ uses two types of data stores. The first stores the raw data, data aggregations, and discrete events needed for quick response to data requests. The second stores data based on correlations, transformations, and raw-data processing.
Real-time Streaming
The streaming component processes the incoming raw data from the aggregation server in real time. It reads the metrics and stores them as a time series, and triggers alarms based on anomaly detection, thresholds, and events.
Network Services
The network services component monitors protocols and services operation individually and on a networkwide basis and stores status details.
User Interfaces
NetQ data is available through several interfaces:
NetQ CLI (command line interface)
NetQ UI (graphical user interface)
NetQ RESTful API (representational state transfer application programming interface)
The CLI and UI query the RESTful API to present data. NetQ can integrate with event notification applications and third-party analytics tools.
Data Center Network Deployments
This section describes three common data center deployment types for network management:
Out-of-band management (recommended)
In-band management
High availability
NetQ operates over layer 3, and can operate in both layer 2 bridged and layer 3 routed environments. NVIDIA recommends a layer 3 routed environment whenever possible.
Out-of-band Management Deployment
NVIDIA recommends deploying NetQ on an out-of-band (OOB) management network to separate network management traffic from standard network data traffic.
The physical network hardware includes:
Spine switches: aggregate and distribute data; also known as an aggregation switch, end-of-row (EOR) switch or distribution switch
Leaf switches: where servers connect to the network; also known as a top-of-rack (TOR) or access switch
Server hosts: host applications and data served to the user through the network
Exit switch: where connections to outside the data center occur; also known as Border Leaf or Service Leaf
Edge server (optional): where the firewall is the demarcation point; peering can occur through the exit switch layer to Internet (PE) devices
Internet device: where provider edge (PE) equipment communicates at layer 3 with the network fabric
The following figure shows an example of a Clos network fabric design for a data center using an OOB management network overlaid on top, where NetQ resides. The physical connections (shown as gray lines) run between Spine 01 and four Leaf devices and two Exit devices, and between Spine 02 and the same four Leaf devices and two Exit devices. Leaf 01 and Leaf 02 connect to each other over a peerlink and act as an MLAG pair for Server 01 and Server 02. Leaf 03 and Leaf 04 connect to each other over a peerlink and act as an MLAG pair for Server 03 and Server 04. The Edge connects to both Exit devices, and the Internet node connects to Exit 01.
Data Center Network Example
The physical management hardware includes:
OOB management switch: aggregation switch that connects to all network devices through communications with the NetQ Agent on each node
NetQ Platform: hosts the telemetry software, database and user interfaces
These switches connect to each physical network device through a virtual network overlay, shown with purple lines.
In-band Management Deployment
While not the preferred deployment method, you might choose to implement NetQ within your data network. In this scenario, there is no overlay and all traffic to and from the NetQ Agents and the NetQ Platform traverses the data paths along with your regular network traffic. The roles of the switches in the Clos network are the same, except that the NetQ Platform performs the aggregation function that the OOB management switch performed. If your network goes down, you might not have access to the NetQ Platform for troubleshooting.
High Availability Deployment
NetQ supports a high availability deployment for users who prefer a solution in which the collected data and processing provided by the NetQ Platform remains available through alternate equipment should the platform fail for any reason. In this configuration, three NetQ Platforms are deployed, with one as the master and two as workers (or replicas). Data from the NetQ Agents is sent to all three platforms so that if the master NetQ Platform fails, one of the replicas automatically becomes the master and continues to store and provide the telemetry data. The following example is based on an OOB management configuration, and modified to support high availability for NetQ.
NetQ Operation
In either in-band or out-of-band deployments, NetQ offers networkwide configuration and device management, proactive monitoring capabilities, and performance diagnostics for complete management of your network.
The NetQ Agent
From a software perspective, a network switch has software associated with the hardware platform, the operating system, and communications. For data centers, the software on a network switch is similar to the diagram shown here.
The NetQ Agent interacts with the various components and software on switches and hosts and provides the gathered information to the NetQ Platform. You can view the data using the NetQ CLI or UI.
The NetQ Agent polls the user space applications for information about the performance of the various routing protocols and services that are running on the switch. Cumulus Linux supports BGP and OSPF routing protocols as well as static addressing through FRRouting (FRR). Cumulus Linux also supports LLDP and MSTP among other protocols, and a variety of services such as systemd and sensors. SONiC supports BGP and LLDP.
For hosts, the NetQ Agent also polls for performance of containers managed with Kubernetes. All of this information is used to provide the current health of the network and verify it is configured and operating correctly.
For example, if the NetQ Agent learns that an interface has gone down, a new BGP neighbor has been configured, or a container has moved, it provides that information to the NetQ Platform. That information can then be used to notify users of the operational state change through various channels. By default, data is logged in the database, but you can use the CLI (netq show events) or configure the Event Service in NetQ to send the information to a third-party notification application as well.
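For example, you can list recent events directly from the CLI. The second command is a sketch that uses the between time-range option described later in this guide:
cumulus@switch:~$ netq show events
cumulus@switch:~$ netq show events between now and 24h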
The NetQ Agent interacts with the Netlink communications between the Linux kernel and the user space, listening for changes to the network state, configurations, routes, and MAC addresses. NetQ uses this information to enable notifications about these changes so that network operators and administrators can respond quickly when changes are not expected or favorable.
For example, if a new route is added or a MAC address removed, the NetQ Agent records these changes and sends that information to the NetQ Platform. Based on the configuration of the Event Service, these changes can be sent to a variety of locations for end user response.
The NetQ Agent also interacts with the hardware platform to obtain performance information about various physical components, such as fans and power supplies, on the switch. Operational states and temperatures are measured and reported, along with cabling information to enable management of the hardware and cabling, and proactive maintenance.
For example, as thermal sensors in the switch indicate that it is becoming very warm, various levels of alarms are generated. These are then communicated through notifications according to the Event Service configuration.
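For example, you can review the reported sensor data and validate it networkwide from the CLI; the temp keyword shown here assumes the sensor subcommands available in your release:
cumulus@switch:~$ netq show sensors temp
cumulus@switch:~$ netq check sensors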
The NetQ Platform
After the collected data is sent to and stored in the NetQ database, you can:
Validate configurations, identifying misconfigurations in your current network, in the past, or prior to deployment
Monitor communication paths throughout the network
Notify users of issues and management information
Anticipate the impact of connectivity changes
and so forth.
Validate Configurations
The NetQ CLI enables validation of your network health through two sets of commands: netq check and netq show. They extract the information from the Network Service component and Event service. The Network Service component is continually validating the connectivity and configuration of the devices and protocols running on the network. Using the netq check and netq show commands displays the status of the various components and services on a networkwide and complete software stack basis. For example, you can perform a networkwide check on all sessions of BGP with a single netq check bgp command. The command lists any devices that have misconfigurations or other operational errors in seconds. When errors or misconfigurations are present, using the netq show bgp command displays the BGP configuration on each device so that you can compare and contrast each device, looking for potential causes. netq check and netq show commands are available for numerous components and services as shown in the following table.
Component or Service | Check | Show
Agents | X | X
LLDP | - | X
BGP | X | X
MACs | - | X
CLAG (MLAG) | X | X
MTU | X | -
Events | - | X
NTP | X | X
EVPN | X | X
OSPF | X | X
Interfaces | X | X
Sensors | X | X
Inventory | - | X
Services | - | X
IPv4/v6 | - | X
VLAN | X | X
Kubernetes | - | X
VXLAN | X | X
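For example, a typical validation workflow pairs a networkwide check with a follow-up show command for the protocol that reported problems:
cumulus@switch:~$ netq check bgp
cumulus@switch:~$ netq show bgp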
Monitor Communication Paths
The trace engine validates the available communication paths between two network devices. The corresponding netq trace command enables you to view all of the paths between the two devices and whether there are any breaks in the paths. This example shows two successful paths between server12 and leaf11, all with an MTU of 9152. The first command shows the output in path by path tabular mode. The second command shows the same output as a tree.
cumulus@switch:~$ netq trace 10.0.0.13 from 10.0.0.21
Number of Paths: 2
Number of Paths with Errors: 0
Number of Paths with Warnings: 0
Path MTU: 9152
Id Hop Hostname InPort InTun, RtrIf OutRtrIf, Tun OutPort
--- --- ----------- --------------- --------------- --------------- ---------------
1 1 server12 bond1.1002
2 leaf12 swp8 vlan1002 peerlink-1
3 leaf11 swp6 vlan1002 vlan1002
--- --- ----------- --------------- --------------- --------------- ---------------
2 1 server12 bond1.1002
2 leaf11 swp8 vlan1002
--- --- ----------- --------------- --------------- --------------- ---------------
cumulus@switch:~$ netq trace 10.0.0.13 from 10.0.0.21 pretty
Number of Paths: 2
Number of Paths with Errors: 0
Number of Paths with Warnings: 0
Path MTU: 9152
hostd-12 bond1.1002 -- swp8 leaf12 <vlan1002> peerlink-1 -- swp6 <vlan1002> leaf11 vlan1002
bond1.1002 -- swp8 leaf11 vlan1002
To better understand the output in greater detail:
Path 1 traverses the network from server12 out bond1.1002 into leaf12 interface swp8 out VLAN1002 peerlink-1 into VLAN1002 interface swp6 on leaf11
Path 2 traverses the network from server12 out bond1.1002 into VLAN1002 interface swp8 on leaf11
If the MTU does not match across the network, or any of the paths or parts of the paths have issues, that data appears in the summary at the top of the output and shown in red along the paths, giving you a starting point for troubleshooting.
View Historical State and Configuration
You can run all check, show and trace commands for the current status and for a prior point in time. For example, this is useful when you receive messages from the night before, but are not seeing any problems now. You can use the netq check command to look for configuration or operational issues around the time that NetQ timestamped the messages. Then use the netq show commands to see information about the configuration of the device in question at that time, or whether there were any changes in a given timeframe. Optionally, you can use the netq trace command to see what the connectivity looked like between any problematic nodes at that time. This example shows that problems occurred on spine01, leaf04, and server03 last night. The network administrator received notifications and wants to investigate. Below the diagram are the commands to run to determine the cause of a BGP error on spine01. Note that the commands use the around option to see the results for last night and that you can run them from any switch in the network.
cumulus@switch:~$ netq check bgp around 30m
Total Nodes: 25, Failed Nodes: 3, Total Sessions: 220 , Failed Sessions: 24,
Hostname VRF Peer Name Peer Hostname Reason Last Changed
----------------- --------------- ----------------- ----------------- --------------------------------------------- -------------------------
exit-1 DataVrf1080 swp6.2 firewall-1 BGP session with peer firewall-1 swp6.2: AFI/ 1d:2h:6m:21s
SAFI evpn not activated on peer
exit-1 DataVrf1080 swp7.2 firewall-2 BGP session with peer firewall-2 (swp7.2 vrf 1d:1h:59m:43s
DataVrf1080) failed,
reason: Peer not configured
exit-1 DataVrf1081 swp6.3 firewall-1 BGP session with peer firewall-1 swp6.3: AFI/ 1d:2h:6m:21s
SAFI evpn not activated on peer
exit-1 DataVrf1081 swp7.3 firewall-2 BGP session with peer firewall-2 (swp7.3 vrf 1d:1h:59m:43s
DataVrf1081) failed,
reason: Peer not configured
exit-1 DataVrf1082 swp6.4 firewall-1 BGP session with peer firewall-1 swp6.4: AFI/ 1d:2h:6m:21s
SAFI evpn not activated on peer
exit-1 DataVrf1082 swp7.4 firewall-2 BGP session with peer firewall-2 (swp7.4 vrf 1d:1h:59m:43s
DataVrf1082) failed,
reason: Peer not configured
exit-1 default swp6 firewall-1 BGP session with peer firewall-1 swp6: AFI/SA 1d:2h:6m:21s
FI evpn not activated on peer
exit-1 default swp7 firewall-2 BGP session with peer firewall-2 (swp7 vrf de 1d:1h:59m:43s
...
cumulus@switch:~$ netq exit-1 show bgp
Matching bgp records:
Hostname Neighbor VRF ASN Peer ASN PfxRx Last Changed
----------------- ---------------------------- --------------- ---------- ---------- ------------ -------------------------
exit-1 swp3(spine-1) default 655537 655435 27/24/412 Fri Feb 15 17:20:00 2019
exit-1 swp3.2(spine-1) DataVrf1080 655537 655435 14/12/0 Fri Feb 15 17:20:00 2019
exit-1 swp3.3(spine-1) DataVrf1081 655537 655435 14/12/0 Fri Feb 15 17:20:00 2019
exit-1 swp3.4(spine-1) DataVrf1082 655537 655435 14/12/0 Fri Feb 15 17:20:00 2019
exit-1 swp4(spine-2) default 655537 655435 27/24/412 Fri Feb 15 17:20:00 2019
exit-1 swp4.2(spine-2) DataVrf1080 655537 655435 14/12/0 Fri Feb 15 17:20:00 2019
exit-1 swp4.3(spine-2) DataVrf1081 655537 655435 14/12/0 Fri Feb 15 17:20:00 2019
exit-1 swp4.4(spine-2) DataVrf1082 655537 655435 13/12/0 Fri Feb 15 17:20:00 2019
exit-1 swp5(spine-3) default 655537 655435 28/24/412 Fri Feb 15 17:20:00 2019
exit-1 swp5.2(spine-3) DataVrf1080 655537 655435 14/12/0 Fri Feb 15 17:20:00 2019
exit-1 swp5.3(spine-3) DataVrf1081 655537 655435 14/12/0 Fri Feb 15 17:20:00 2019
exit-1 swp5.4(spine-3) DataVrf1082 655537 655435 14/12/0 Fri Feb 15 17:20:00 2019
exit-1 swp6(firewall-1) default 655537 655539 73/69/- Fri Feb 15 17:22:10 2019
exit-1 swp6.2(firewall-1) DataVrf1080 655537 655539 73/69/- Fri Feb 15 17:22:10 2019
exit-1 swp6.3(firewall-1) DataVrf1081 655537 655539 73/69/- Fri Feb 15 17:22:10 2019
exit-1 swp6.4(firewall-1) DataVrf1082 655537 655539 73/69/- Fri Feb 15 17:22:10 2019
exit-1 swp7 default 655537 - NotEstd Fri Feb 15 17:28:48 2019
exit-1 swp7.2 DataVrf1080 655537 - NotEstd Fri Feb 15 17:28:48 2019
exit-1 swp7.3 DataVrf1081 655537 - NotEstd Fri Feb 15 17:28:48 2019
exit-1 swp7.4 DataVrf1082 655537 - NotEstd Fri Feb 15 17:28:48 2019
Manage Network Events
The NetQ notifier manages the events it receives from the NetQ Agents for devices and components, protocols, and services. The notifier enables you to capture and filter these events to manage the behavior of your network. This is especially useful when an interface or routing protocol goes down and you want to get it back up and running as quickly as possible, preferably before anyone notices or complains. You can improve resolution time significantly by creating filters that focus on topics appropriate for a particular group of users. You can easily create filters around events related to BGP and MLAG session states, interfaces, links, NTP and other services, fans, power supplies, and physical sensor measurements.
For example, for operators responsible for routing, you can create an integration with a notification application that notifies them of routing issues as they occur. This is an example of a Slack message received on a netq-notifier channel indicating that the BGP session on switch leaf04 interface swp2 has gone down.
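As a sketch, such an integration typically starts by defining a notification channel that points at the Slack webhook, then attaching rules and filters for the events the team cares about. The channel name and webhook value below are placeholders, and the available options vary by release, so verify the syntax against the notification configuration reference for your version:
cumulus@switch:~$ netq add notification channel slack netq-notifier webhook <text-webhook-url>
cumulus@switch:~$ netq show notification channel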
Timestamps in NetQ
Every event or entry in the NetQ database is stored with a timestamp of when the event was captured by the NetQ Agent on the switch or server. This timestamp is based on the switch or server time where the NetQ Agent is running, and is pushed in UTC format. It is important to ensure that all devices are NTP synchronized to prevent events from being displayed out of order or not displayed at all when looking for events that occurred at a particular time or within a time window.
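For example, you can validate time synchronization across the fabric with the NTP check:
cumulus@switch:~$ netq check ntp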
Interface state, IP addresses, routes, ARP/ND table (IP neighbor) entries, and MAC table entries carry a timestamp that represents the time the event happened (such as when a route is deleted or an interface comes up), except the first time the NetQ Agent runs. If the network has been running and stable when a NetQ Agent is brought up for the first time, this time reflects when the agent started. Subsequent changes to these objects are captured with an accurate time of when the event happened.
Data that is captured and saved based on polling, and just about all other data in the NetQ database, including control plane state (such as BGP or MLAG), has a timestamp of when the information was captured rather than when the event actually happened. NetQ compensates for this when the extracted data provides additional information that can be used to compute a more precise time for the event; for example, BGP uptime can be used in conjunction with the capture timestamp to determine when the event actually happened.
When retrieving the timestamp, command outputs display the time in three ways:
For non-JSON output when the timestamp represents the Last Changed time, the time displays as the actual date and time when the change occurred
For non-JSON output when the timestamp represents an Uptime, time is displayed as days, hours, minutes, and seconds from the current time
For JSON output, time is displayed in microseconds that have passed since the Epoch time (January 1, 1970 at 00:00:00 GMT)
This example shows the difference between the timestamp displays.
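As a sketch, compare the standard and JSON forms of the same command: the standard output reports Last Changed as an actual date and time (or an uptime), while the JSON output reports the equivalent lastChanged value in microseconds since the Epoch.
cumulus@switch:~$ netq show agents
cumulus@switch:~$ netq show agents json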
Restarting a NetQ Agent on a device does not update the timestamps for existing objects to reflect this new restart time. NetQ preserves their timestamps relative to the original start time of the Agent. A rare exception is if you reboot the device between the time the Agent stops and restarts; in this case, the time is still relative to the start time of the Agent.
Exporting NetQ Data
You can export data from the NetQ Platform in a couple of ways:
Use the json option to output command results to JSON format for parsing in other applications
Use the UI to export data from the full screen cards
Example Using the CLI
You can check the state of BGP on your network with netq check bgp:
cumulus@leaf01:~$ netq check bgp
Total Nodes: 25, Failed Nodes: 3, Total Sessions: 220 , Failed Sessions: 24,
Hostname VRF Peer Name Peer Hostname Reason Last Changed
----------------- --------------- ----------------- ----------------- --------------------------------------------- -------------------------
exit01 DataVrf1080 swp6.2 firewall01 BGP session with peer firewall01 swp6.2: AFI/ Tue Feb 12 18:11:16 2019
SAFI evpn not activated on peer
exit01 DataVrf1080 swp7.2 firewall02 BGP session with peer firewall02 (swp7.2 vrf Tue Feb 12 18:11:27 2019
DataVrf1080) failed,
reason: Peer not configured
exit01 DataVrf1081 swp6.3 firewall01 BGP session with peer firewall01 swp6.3: AFI/ Tue Feb 12 18:11:16 2019
SAFI evpn not activated on peer
exit01 DataVrf1081 swp7.3 firewall02 BGP session with peer firewall02 (swp7.3 vrf Tue Feb 12 18:11:27 2019
DataVrf1081) failed,
reason: Peer not configured
...
When you show the output in JSON format, this same command looks like this:
cumulus@leaf01:~$ netq check bgp json
{
"failedNodes":[
{
"peerHostname":"firewall01",
"lastChanged":1549995080.0,
"hostname":"exit01",
"peerName":"swp6.2",
"reason":"BGP session with peer firewall01 swp6.2: AFI/SAFI evpn not activated on peer",
"vrf":"DataVrf1080"
},
{
"peerHostname":"firewall02",
"lastChanged":1549995449.7279999256,
"hostname":"exit01",
"peerName":"swp7.2",
"reason":"BGP session with peer firewall02 (swp7.2 vrf DataVrf1080) failed, reason: Peer not configured",
"vrf":"DataVrf1080"
},
{
"peerHostname":"firewall01",
"lastChanged":1549995080.0,
"hostname":"exit01",
"peerName":"swp6.3",
"reason":"BGP session with peer firewall01 swp6.3: AFI/SAFI evpn not activated on peer",
"vrf":"DataVrf1081"
},
{
"peerHostname":"firewall02",
"lastChanged":1549995449.7349998951,
"hostname":"exit01",
"peerName":"swp7.3",
"reason":"BGP session with peer firewall02 (swp7.3 vrf DataVrf1081) failed, reason: Peer not configured",
"vrf":"DataVrf1081"
},
...
],
"summary": {
"checkedNodeCount": 25,
"failedSessionCount": 24,
"failedNodeCount": 3,
"totalSessionCount": 220
}
}
Example Using the UI
Open the full screen Switch Inventory card, select the data to export, and click Export.
Important File Locations
To aid in troubleshooting issues with NetQ, the following configuration and log files can provide insight into root causes of issues:
File | Description
/etc/netq/netq.yml | The NetQ configuration file. This file appears only if you installed either the netq-apps package or the NetQ Agent on the system.
/var/log/netqd.log | The NetQ daemon log file for the NetQ CLI. This log file appears only if you installed the netq-apps package on the system.
/var/log/netq-agent.log | The NetQ Agent log file. This log file appears only if you installed the NetQ Agent on the system.
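For reference, a minimal /etc/netq/netq.yml typically points both the NetQ Agent and the NetQ CLI at the NetQ server. The sketch below is illustrative only; the server address and VRF are assumptions for an out-of-band management deployment, your fields can differ, and you should use the agent and CLI configuration commands for your release rather than relying on this exact layout.
netq-agent:
  # illustrative address of the NetQ Platform or appliance
  server: 192.168.1.254
  port: 31980
  vrf: mgmt
netq-cli:
  # the CLI talks to the API gateway port
  server: 192.168.1.254
  port: 32708
  vrf: mgmt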
Firewall and Port Requirements
You must open the following ports on your NetQ Platform:
Port or Protocol Number | Protocol | Component Access
8443 | TCP | Admin UI
443 | TCP | NetQ UI
31980 | TCP | NetQ Agent communication
31982 | TCP | NetQ Agent SSL communication
32708 | TCP | API Gateway
22 | TCP | SSH
179 | TCP | Calico networking (BGP)
4 | IP Protocol | Calico networking (IP-in-IP Protocol)
4789 | UDP | Calico networking (VxLAN)
6443 | TCP | kube-apiserver
2379 | TCP | etcd datastore
Port 32666 is no longer used for the NetQ UI.
NetQ User Interface Overview
The NetQ user interface (UI) lets you access NetQ through a web browser instead of a terminal window with a command line interface (CLI). With the UI, you can visualize network health, inventory, and system events, making it easy to find errors and misconfigurations, and to fix them.
You can access the UI from both the on-premises and cloud deployments. NetQ supports Google Chrome and Mozilla Firefox. Other browsers may have problems loading NetQ.
Before you get started, refer to the release notes for this version.
Access the NetQ UI
This page describes how to sign in and out of NetQ, and how to reset your password.
Log In to NetQ
To log in to the UI:
Open a new Chrome browser window or tab.
Enter the following URL into the address bar:
NetQ On-premises Appliance or VM: https://<hostname-or-ipaddress>:443
The following are the default usernames and passwords for UI access:
NetQ On-premises: admin, admin
NetQ Cloud: Use the credentials you created during setup. You should receive an email from NVIDIA titled NetQ Access Link.
Use your username and password to log in. You can also log in with SSO if your company has enabled it.
Username and Password
Locate the email you received from NVIDIA titled NetQ Access Link. Select Create Password.
Enter a new password. Then enter it again to confirm it.
Log in using your email address and new password.
Accept the Terms of Use after reading them.
The default NetQ Workbench opens, with your username and premise shown in the upper right corner of the application.
SSO
Follow the steps above until you reach the NetQ login screen.
Select Sign up for SSO and enter your organization’s name.
Enter your username and password.
Create a new password and enter the new password again to confirm it.
Click Update and Accept after reading the Terms of Use.
The default NetQ Workbench opens, with your username shown in the upper right corner of the application.
Enter your username.
Enter your password.
The user-specified home workbench is displayed. If a home workbench is not specified, then the default workbench is displayed.
Any workbench can be set as the home workbench. Click User Settings, click Profiles and Preferences, then on the Workbenches card, click the icon to the left of the workbench name you want to be your home workbench.
Click Forgot Password? and enter an email address. Look for a message with the subject NetQ Password Reset Link from netq-sre@cumulusnetworks.com.
Select the link in the email and follow the instructions to create a new password.
Log Out of NetQ
To log out of the NetQ UI:
Select the User Settings icon at the top right of the application.
Select Log Out.
Application Layout
The NetQ UI contains two main areas:
Application Header (1): Contains the main menu, NetQ version, recent actions history, search capabilities, quick health status chart, local time zone, premises list, and user account information.
Workbench (2): Contains a task bar and content cards (with status and configuration information about your network and its various components).
Main Menu
Found in the application header, click the menu icon to open the main menu, which provides navigation to the following:
Search: a search bar to quickly find an item on the main menu
Favorites: contains link to the user-defined favorite workbenches; Home points to the NetQ Workbench until reset by a user
Workbenches: contains links to all workbenches
Network: contains links to tabular data about various network elements and the What Just Happened feature
Notifications: contains link to threshold-based event rules and notification channel specifications
Admin: contains links to application management and lifecycle management features (only visible to users with Admin access role)
Search
The Global Search field in the UI header enables you to search for devices and cards. It behaves like most searches and can help you quickly find device information.
NVIDIA Logo
Clicking the NVIDIA logo takes you to your favorite workbench. For details about specifying your favorite workbench, refer to Set User Preferences.
Validation Summary View
Found in the header, the chart provides a view into the health of your network at a glance.
On initial start up of the application, it can take up to an hour to reach an accurate health indication as some processes only run every 30 minutes.
Workbenches
A workbench comprises a given set of cards. A pre-configured default workbench, NetQ Workbench, is available to get you started. You can create your own workbenches and add or remove cards to meet your particular needs. For more detail about managing your data using workbenches, refer to Focus Your Monitoring Using Workbenches.
Cards
Cards present information about your network for monitoring and troubleshooting. This is where you can expect to spend most of your time. Each card describes a particular aspect of the network. Cards are available in multiple sizes, from small to full screen. The level of the content on a card varies in accordance with the size of the card, with the highest level of information on the smallest card to the most detailed information on the full-screen view. Cards are collected onto a workbench where you see all of the data relevant to a task or set of tasks. You can add and remove cards from a workbench, move between cards and card sizes, and make copies of cards to show different levels of data at the same time. For details about working with cards, refer to Access Data with Cards.
User Settings
Each user can customize the NetQ application display, time zone and date format; change their account password; and manage their workbenches. This is all performed from User Settings > Profile & Preferences. For details, refer to Set User Preferences.
Focus Your Monitoring Using Workbenches
Workbenches are where you collect and view the data that is important to you.
Two types of workbenches are available:
Default: Provided by NVIDIA; you cannot save changes you make to these workbenches
Custom: Created by the user; changes made to these workbenches are saved automatically
Both types of workbenches display a set of cards. Default workbenches are public (accessible to all users), whereas custom workbenches are private (viewing is restricted to the user who created them).
Default Workbenches
The default workbench contains Device Inventory, Switch Inventory, Events, and Validation Summary cards, giving you a high-level view of how your network is operating.
On initial login, the NetQ Workbench opens. On subsequent logins, the last workbench you used opens.
Custom Workbenches
Users with either administrative or user roles can create and save as many custom workbenches as suits their needs. For example, a user might create a workbench that:
Shows all of the selected cards for the past week and one that shows all of the selected cards for the past 24 hours
Only has data about your virtual overlays; EVPN plus events cards
Has selected switches that you are troubleshooting
Is focused on application or user account management
Create a Workbench
To create a workbench:
Select New in the workbench header.
Enter a name for the workbench and choose if you would like to set this as your new default home workbench.
Select the cards you would like to display on your new workbench.
Click Create to create your new workbench.
Refer to Access Data with Cards for information about interacting with cards on your workbenches.
Clone a Workbench
To clone an existing workbench:
Select Clone in the workbench header.
Name the cloned workbench and select Clone.
Remove a Workbench
Administrative users can remove any workbench, except for the default NetQ Workbench. Users with a user role can only remove workbenches they have created.
To remove a workbench:
Select the User Settings icon in the application header to open the User Settings options.
Click Profile & Preferences.
Locate the Workbenches card.
Hover over the workbench you want to remove, and click Delete.
Open an Existing Workbench
There are several options for opening workbenches:
Open through the Workbench header
Click the dropdown next to the current workbench name and locate the workbench
Under My Home, click the name of your favorite workbench
Under My Most Recent, click the workbench if in list
Search by workbench name
Click All My WB to view all workbenches and select one from the list
Open through the main menu
Expand the menu and select the workbench from the Favorites or Workbenches sections
Open through the NVIDIA logo
Click the logo in the header to open your favorite workbench
Manage Auto-refresh for Your Workbenches
You can specify how often to update the data displayed on your workbenches. Three refresh rates are available:
Analyze: updates every 30 seconds
Debug: updates every minute
Monitor: updates every two (2) minutes
By default, auto-refresh is enabled and configured to update every 30 seconds.
Disable/Enable Auto-refresh
To disable or pause auto-refresh of your workbenches, select Refresh in the workbench header. This toggles between the two states, Running and Paused; the icon indicates whether auto-refresh is currently running or paused.
While having the workbenches update regularly is good most of the time, you might find that you want to pause the auto-refresh feature when you are troubleshooting and you do not want the data to change on a given set of cards temporarily. In this case, you can disable the auto-refresh and then enable it again when you are finished.
View Current Settings
To view the current auto-refresh rate and operational status, hover over Refresh in the workbench header. A tooltip displays the current settings.
Change Settings
To modify the auto-refresh setting:
Select the dropdown next to Refresh.
Select the refresh rate. A check mark is shown next to the current selection. The new refresh rate is applied immediately.
Manage Workbenches
To manage your workbenches as a group, either:
Click the dropdown next to the current workbench name, then click Manage My WB.
Click the User Settings icon, then select Profiles & Preferences.
Both of these open the Profiles & Preferences page. Look for the Workbenches card and refer to Manage Your Workbenches for more information.
Access Data with Cards
Cards present information about your network for monitoring and troubleshooting; each card describes a particular aspect of the network. Cards are collected onto a workbench where all data relevant to a task or set of tasks is visible. You can add and remove cards from a workbench, increase or decrease their sizes, change the time period of the data shown on a card, and make copies of cards to show different levels of data at the same time.
Card Sizes
Cards are available in multiple sizes, from small to full screen. The level of the content on a card varies with the size of the card, with the highest level of information on the smallest card to the most detailed information on the full-screen card.
Card Size Summary
Card Size | Primary Purpose
Small | Quick view of status, typically at the level of good or bad
Medium | Enable quick actions (run a validation or trace, for example); view key performance parameters or statistics
Large | Perform an action; look for potential issues; view detailed performance and statistics
Full Screen | Perform actions; compare and review related information; view all attributes for a given network aspect; free-form data analysis and visualization; export data to third-party tools
Small Cards
Small cards provide an overview of the performance or statistical value of a given aspect of your network. They typically include an icon to identify the aspect being monitored, summary performance or statistics in the form of a graph or counts, and an indication of any related events.
Medium Cards
Medium cards provide the key measurements for a given aspect of your network. They include the same content as the small cards with additional, relevant information, such as related events or components.
Large Cards
Large cards provide detailed information for monitoring specific components or functions of a given aspect of your network. This granular view can aid in isolating and resolving existing issues or preventing potential issues. These cards frequently display statistics or graphs that help visualize data.
Full-Screen Cards
Full-screen cards show all available data about an aspect of your network. They typically display data in a tabular view that can be filtered and sorted. When relevant, they also display visualizations of that data.
Card Interactions
Each card focuses on a particular aspect of your network. They include:
Validation summary: networkwide view of network health
Events: information about all error and info events in the system
What Just Happened: information about network issues and packet drops
Device groups: information about the distribution of device components
Inventory|Devices: information about all switches and hosts in the network
Inventory|Switches: information about the components on a given switch
Inventory|DPU: information about data processing units
Inventory|Hosts: information about hosts
Trace request: find available paths between two devices in the network fabric
There are five additional network services cards for session monitoring, including BGP, MLAG, EVPN, OSPF, and LLDP.
Add Cards to Your Workbench
Follow the steps in this section to add cards to your workbench. To add individual switch cards, refer to Add Switch Cards to Your Workbench.
To add one or more cards:
Click the add card button in the header.
Locate the card you want to add to your workbench. Use the categories in the side navigation or Search to help narrow down your options.
Click on each card you want to add to your workbench.
When you have selected all of the cards you want to add to your workbench, you can confirm which cards have been selected by clicking the Cards Selected link. Modify your selection as needed.
Click Open Cards to add the selected cards, or Cancel to return to your workbench without adding any cards.
The cards are placed at the end of the set of cards currently on the workbench. You might need to scroll down to see them. You can drag and drop the cards on the workbench to rearrange them.
Add Switch Cards to Your Workbench
You can add switch cards to a workbench through the Switches icon on the header or by searching for it through Global Search.
To add a switch card using the icon:
Click the Switches icon in the header, then select Open a device card.
Begin entering the hostname of the switch you want to monitor.
Select the device from the suggestions that appear.
If you attempt to enter a hostname that is unknown to NetQ, a red border appears around the entry field and you are unable to select Add. Try checking for spelling errors. If you feel your entry is valid, but not an available choice, consult with your network administrator.
Click Add to add the switch card to your workbench, or Cancel to return to your workbench without adding the switch card.
To open the switch card by searching:
Click the Global Search field.
Begin typing the name of a switch.
Select it from the options that appear.
Remove Cards from Your Workbench
To remove all the cards from your workbench, click the Clear icon in the header. To remove an individual card:
Hover over the card you want to remove.
Click the More Actions menu.
Click Remove.
The card is removed from the workbench, but not from the application.
Change the Time Period for the Card Data
All cards have a default time period for the data shown on the card, typically the last 24 hours. You can change the time period to view the data during a different time range to aid analysis of previous or existing issues.
To change the time period for a card:
Hover over the card and select the time-period dropdown in its header.
Select a time period from the dropdown list.
Changing the time period in this manner only changes the time period for the given card.
Change the Size of the Card
To change the card size:
Hover over the card.
Hover over the size picker and move the cursor to the right or left until the desired size option is highlighted.
One-quarter width opens a small card. One-half width opens a medium card. Three-quarters width opens a large card. Full width opens a full-screen card.
Click the picker. The card changes to the selected size, and might move its location on the workbench.
Table Settings
You can manipulate the tabular data displayed in a full-screen card by filtering and sorting the columns. To reposition the columns, drag and drop them using your mouse. You can also export the data presented in the table.
The following icons are common in the full-screen card view.
Action | Description
Select All | Selects all items in the list.
Clear All | Clears all existing selections in the list.
Add Item | Adds an item to the list.
Edit | Edits the selected item.
Delete | Removes the selected items.
Filter | Filters the list using available parameters.
Generate/Delete AuthKeys | Creates or removes NetQ CLI authorization keys.
Open Cards | Opens the corresponding validation or trace card(s).
Assign role | Opens role assignment options for switches.
Export | Exports selected data into either a .csv or JSON-formatted file.
When there are numerous items in a table, NetQ loads up to 25 by default and provides the rest in additional table pages. Pagination is displayed under the table.
Set User Preferences
Each user can customize the NetQ application display, change their account password, and manage their workbenches.
Configure Display Settings
The Display card contains the options for setting the application theme (light or dark), language, time zone, and date formats.
To configure the display settings:
Click the User Settings icon in the application header to open the User Settings options.
Click Profile & Preferences.
Locate the Display card.
In the Theme field, click to select either dark or light theme. The following figure shows the light theme.
In the Time Zone field, click to change the time zone from the default.
By default, the time zone is set to the user’s local time zone. If a time zone has not been selected, NetQ defaults to the current local time zone where NetQ is installed. All time values are based on this setting. This is displayed in the application header, and is based on Greenwich Mean Time (GMT). If your deployment is not local to you (for example, you want to view the data from the perspective of a data center in another time zone) you can change the display to a different time zone.
You can also change the time zone from the header display.
In the Date Format field, select the date and time format you want displayed on the cards.
Change Your Password
You can change your account password at any time.
To change your password:
Click the User Settings icon in the application header to open the User Settings options.
Click Profile & Preferences.
In the Basic Account Info card, select Change Password.
Enter your current password, followed by your new password.
Click Save to change to the new password.
Manage Your Workbenches
A workbench is similar to a dashboard. This is where you collect and view the data that is important to you. You can have more than one workbench and manage them with the Workbenches card located in Profile & Preferences. From the Workbenches card, you can view, sort, and delete workbenches. For a detailed overview of workbenches, see Focus Your Monitoring Using Workbenches.
NetQ Command Line Overview
The NetQ CLI provides access to all network state and event information collected by the NetQ Agents. It behaves the same way most CLIs behave, with groups of commands used to display related information, the ability to use TAB completion when entering commands, and to get help for given commands and options. There are four categories of commands: check, show, config, and trace.
The NetQ command line interface only runs on switches and server hosts implemented with Intel x86 or ARM-based architectures. If you are unsure what architecture your switch or server employs, check the Hardware Compatibility List and verify the value in the Platforms tab > CPU column.
CLI Access
When you install or upgrade NetQ, you can also install and enable the CLI on your NetQ server or appliance and hosts. Refer to the Install NetQ topic for details.
To access the CLI from a switch or server:
Log in to the device. This example uses the default username of cumulus and a hostname of switch.
<computer>:~<username>$ ssh cumulus@switch
Enter your password to reach the command prompt. The default password is CumulusLinux! For example:
Enter passphrase for key '/Users/<username>/.ssh/id_rsa': <enter CumulusLinux! here>
Welcome to Ubuntu 16.04.3 LTS (GNU/Linux 4.4.0-112-generic x86_64)
* Documentation: https://help.ubuntu.com
* Management: https://landscape.canonical.com
* Support: https://ubuntu.com/advantage
Last login: Tue Sep 15 09:28:12 2019 from 10.0.0.14
cumulus@switch:~$
Run commands. For example:
cumulus@switch:~$ netq show agents
cumulus@switch:~$ netq check bgp
Command Line Basics
This section describes the core structure and behavior of the NetQ CLI.
The NetQ command line has a flat structure as opposed to a modal structure: you can run all commands at the same level from the standard command prompt instead of only within a specific mode.
Command Syntax
NetQ CLI commands all begin with netq. NetQ commands fall into one of four syntax categories: validation (check), monitoring (show), configuration, and trace.
netq check <network-protocol-or-service> [options]
netq show <network-protocol-or-service> [options]
netq config <action> <object> [options]
netq trace <destination> from <source> [options]
Symbols | Meaning
Parentheses ( ) | Grouping of required parameters. Choose one.
Square brackets [ ] | Single or group of optional parameters. If more than one object or keyword is available, choose one.
Angle brackets < > | Required variable. Value for a keyword or option; enter according to your deployment nomenclature.
Pipe (|) | Separates object and keyword options, also separates value options; enter one object or keyword and zero or one value.
For example, in the netq check command:
[<hostname>] is an optional parameter with a variable value named hostname
<network-protocol-or-service> represents a number of possible key words, such as agents, bgp, evpn, and so forth
<options> represents a number of possible conditions for the given object, such as around, vrf, or json
Thus some valid commands are:
netq leaf02 check agents json
netq show bgp
netq config restart cli
netq trace 10.0.0.5 from 10.0.0.35
Command Output
The command output presents results in color for many commands. Results with errors appear in red, and warnings appear in yellow. Results without errors or warnings appear in either black or green. VTEPs appear in blue. A node in the pretty output appears in bold, and angle brackets (< >) wrap around a router interface. To view the output with only black text, run the netq config del color command. You can view output with colors again by running netq config add color.
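For example, to turn colored output off and then back on:
cumulus@switch:~$ netq config del color
cumulus@switch:~$ netq config add color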
All check and show commands have a default timeframe of now to one hour ago, unless you specify an approximate time using the around keyword or a range using the between keyword. For example, running netq check bgp shows the status of BGP over the last hour. Running netq show bgp around 3h shows the status of BGP three hours ago.
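For example, to check BGP now and to view its status as it was three hours ago:
cumulus@switch:~$ netq check bgp
cumulus@switch:~$ netq show bgp around 3h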
Command Prompts
NetQ code examples use the following prompts:
cumulus@switch:~$ Indicates the user cumulus is logged in to a switch to run the example command
cumulus@host:~$ Indicates the user cumulus is logged in to a host to run the example command
cumulus@netq-appliance:~$ Indicates the user cumulus is logged in to either the NetQ Appliance or NetQ Cloud Appliance to run the command
cumulus@hostname:~$ Indicates the user cumulus is logged in to a switch, host or appliance to run the example command
To use the NetQ CLI, the switches must be running the Cumulus Linux or SONiC operating system (OS), NetQ Platform or NetQ Collector software, the NetQ Agent, and the NetQ CLI. The hosts must be running CentOS, RHEL, or Ubuntu OS, the NetQ Agent, and the NetQ CLI. Refer to the Install NetQ topic for details.
Command Completion
As you enter commands, you can get help with the valid keywords or options using the Tab key. For example, using Tab completion with netq check displays the possible objects for the command, and returns you to the command prompt to complete the command.
cumulus@switch:~$ netq check <<press Tab>>
agents : Netq agent
bgp : BGP info
cl-version : Cumulus Linux version
clag : Cumulus Multi-chassis LAG
evpn : EVPN
interfaces : network interface port
mlag : Multi-chassis LAG (alias of clag)
mtu : Link MTU
ntp : NTP
ospf : OSPF info
sensors : Temperature/Fan/PSU sensors
vlan : VLAN
vxlan : VXLAN data path
cumulus@switch:~$ netq check
Command Help
As you enter commands, you can get help with command syntax by entering help at various points within a command entry. For example, to find out what options are available for a BGP check, enter help after entering some of the netq check command. In this example, you can see that there are no additional required parameters and you can use three optional parameters — hostnames, vrf and around — with a BGP check.
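For instance, to list the parameters available for a BGP check, you might enter:
cumulus@switch:~$ netq check bgp help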
The CLI stores commands issued within a session, which enables you to review and rerun commands that you already ran. At the command prompt, press the Up Arrow and Down Arrow keys to move back and forth through the list of commands previously entered. When you have found a given command, you can run the command by pressing Enter, just as you would if you had entered it manually. Optionally you can modify the command before you run it.
Command Categories
While the CLI has a flat structure, the commands can be conceptually grouped into these functional categories:
The netq check commands enable the network administrator to validate the current or historical state of the network by looking for errors and misconfigurations. The commands run fabric-wide validations against various configured protocols and services to determine how well the network is operating. You can perform validation checks for the following:
agents: NetQ Agents operation on all switches and hosts
bgp: BGP (Border Gateway Protocol) operation across the network fabric
clag: Cumulus Linux MLAG (multi-chassis LAG/link aggregation) operation
mtu: Link MTU (maximum transmission unit) consistency across paths
ntp: NTP (Network Time Protocol) operation
ospf: OSPF (Open Shortest Path First) operation
sensors: Temperature/Fan/PSU sensor operation
vlan: VLAN (Virtual Local Area Network) operation
vxlan: VXLAN (Virtual Extensible LAN) data path operation
The commands take the form of netq check <network-protocol-or-service> [options], where the options vary according to the protocol or service.
This example shows the output for the netq check bgp command. If there were any failures, they would appear below the summary results (or in the failedNodes section when you use the json option).
cumulus@switch:~$ netq check bgp
bgp check result summary:
Checked nodes : 8
Total nodes : 8
Rotten nodes : 0
Failed nodes : 0
Warning nodes : 0
Additional summary:
Total Sessions : 30
Failed Sessions : 0
Session Establishment Test : passed
Address Families Test : passed
Router ID Test : passed
The netq show commands enable the network administrator to view details about the current or historical configuration and status of the various protocols or services. You can view the configuration and status for the following:
address-history: Address history info for an IP address or prefix
agents: NetQ Agents status on switches and hosts
bgp: BGP status across the network fabric
cl-btrfs-info: BTRFS file system data for monitored Cumulus Linux switches
cl-manifest: Information about the versions of Cumulus Linux available on monitored switches
cl-pkg-info: Information about software packages installed on monitored switches
cl-resource: ACL and forwarding information
cl-ssd-util: SSD utilization information
clag: CLAG/MLAG status
dom: Digital Optical Monitoring
ethtool-stats: Interface statistics
events: Display changes over time
events-config: Events configured for suppression
evpn: EVPN status
interface-stats: Interface statistics
interface-utilization: Interface statistics plus utilization
interfaces: network interface port status
inventory: hardware component information
ip: IPv4 status
ipv6: IPv6 status
job-status: status of upgrade jobs running on the appliance or VM
kubernetes: Kubernetes cluster, daemon, pod, node, service and replication status
lldp: LLDP status
mac-commentary: MAC commentary info for a MAC address
mac-history: Historical information for a MAC address
macs: MAC table or address information
mlag: MLAG status (an alias for CLAG)
neighbor-history: Neighbor history info for an IP address
notification: Send notifications to Slack or PagerDuty
ntp: NTP status
opta-health: Display health of apps on the OPTA
opta-platform: NetQ Appliance version information and uptime
ospf: OSPF status
recommended-pkg-version: Current host information to be considered
resource-util: Display usage of memory, CPU and disk resources
roce-config: Display RoCE configuration
roce-counters: Displays RDMA over Converged Ethernet counters for a given switch
sensors: Temperature/Fan/PSU sensor status
services: System services status
tca: Threshold crossing alerts
trace: Control plane trace path across fabric
unit-tests: Show list of unit tests for netq check
validation: Schedule a validation check
vlan: VLAN status
vxlan: VXLAN data path status
wjh-drop: dropped packet data from NVIDIA® Mellanox® What Just Happened®
The commands take the form of netq [<hostname>] show <network-protocol-or-service> [options], where the options vary according to the protocol or service. You can restrict the commands from showing the information for all devices to showing information only for a selected device using the hostname option.
The following example shows the output for the netq show agents command filtered for the leaf01 switch.
cumulus@switch:~$ netq leaf01 show agents
Matching agents records:
Hostname Status NTP Sync Version Sys Uptime Agent Uptime Reinitialize Time Last Changed
----------------- ---------------- -------- ------------------------------------ ------------------------- ------------------------- -------------------------- -------------------------
leaf01 Fresh yes 3.2.0-cl4u30~1601410518.104fb9ed Mon Sep 21 16:49:04 2020 Tue Sep 29 21:24:49 2020 Tue Sep 29 21:24:49 2020 Thu Oct 1 16:26:33 2020
Configuration Commands
Various commands, including netq config, netq notification, and netq install, enable the network administrator to manage NetQ Agent and CLI server configuration, configure lifecycle management, set up container monitoring, and manage notifications.
NetQ Agent Configuration
The agent commands enable the network administrator to configure individual NetQ Agents. Refer to NetQ Components for a description of NetQ Agents, to Manage NetQ Agents, or to Install NetQ Agents for more detailed usage examples.
The agent configuration commands enable you to add and remove agents from switches and hosts, start and stop agent operations, debug the agent, specify default commands, and enable or disable a variety of monitoring features (including Kubernetes, sensors, FRR (FRRouting), CPU usage limit, and What Just Happened).
Commands apply to one agent at a time; you run them from the switch or host where the NetQ Agent resides.
This example shows how to view the NetQ Agent configuration:
cumulus@switch:~$ netq config show agent
netq-agent value default
--------------------- --------- ---------
enable-opta-discovery True True
exhibitport
agenturl
server 127.0.0.1 127.0.0.1
exhibiturl
vrf default default
agentport 8981 8981
port 31980 31980
After making configuration changes to your agents, you must restart the agent for the changes to take effect. Use the netq config restart agent command.
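For example:
cumulus@switch:~$ netq config restart agent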
CLI Configuration
The netq config cli commands enable the network administrator to configure and manage the CLI component. These commands enable you to add or remove CLI (essentially enabling/disabling the service), start and restart it, and view the configuration of the service.
Commands apply to one device at a time, and you run them from the switch or host where you run the CLI.
The CLI configuration commands include:
netq config add cli server
netq config del cli server
netq config show cli premises [json]
netq config show (cli|all) [json]
netq config (status|restart) cli
netq config select cli premise
This example shows how to restart the CLI instance:
cumulus@switch:~$ netq config restart cli
This example shows how to enable the CLI on a NetQ On-premises appliance or virtual machine (VM):
cumulus@switch:~$ netq config add cli server 10.1.3.101
This example shows how to enable the CLI on a NetQ Cloud Appliance or VM for the Chicago premises and the default port:
netq config add cli server api.netq.cumulusnetworks.com access-key <user-access-key> secret-key <user-secret-key> premises chicago port 443
NetQ System Configuration Commands
You use the following commands to manage the NetQ system itself:
bootstrap: Loads the installation program onto the network switches and hosts in either a single server or server cluster arrangement.
decommission: Decommissions a switch or host.
install: Installs NetQ in standalone or cluster deployments; also used to install patch software.
upgrade bundle: Upgrades NetQ on NetQ On-premises Appliances or VMs.
This example shows how to bootstrap a single server or master server in a server cluster:
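A representative invocation is sketched below; the interface name and the bootstrap tarball path are placeholders for your environment:
cumulus@netq-appliance:~$ netq bootstrap master interface eth0 tarball /mnt/installables/netq-bootstrap-4.3.0.tgz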
For information and examples on installing and upgrading the NetQ system, see Install NetQ and Upgrade NetQ.
Event Notification Commands
The notification configuration commands enable you to add, remove, and show notification application integrations. These commands create the channels, filters, and rules needed to control event messaging.
NetQ supports TCA events, a set of events that are triggered when a user-defined threshold is crossed. You configure and manage TCA events with the dedicated TCA commands (for example, netq show tca displays the configured threshold crossing alerts).
The netq lcm (lifecycle management) commands enable you to manage the deployment of NVIDIA product software onto your network devices (servers, appliances, and switches) in the most efficient way and with the most information about the process as possible. The LCM commands provide for:
Managing network OS and NetQ images in a local repository
Configuring switch access credentials for installations and upgrades
Managing switch inventory and roles
Upgrading NetQ (Agents and CLI) on switches with NetQ Agents
Installing or upgrading NetQ Agents and the CLI on switches with or without NetQ Agents, all in a single job
Upgrading the network OS on switches with NetQ Agents
Viewing a result history of upgrade attempts
This example shows the NetQ configuration profiles:
cumulus@switch:~$ netq lcm show netq-config
ID Name Default Profile VRF WJH CPU Limit Log Level Last Changed
------------------------- --------------- ------------------------------ --------------- --------- --------- --------- -------------------------
config_profile_3289efda36 NetQ default co Yes mgmt Disable Disable info Tue Apr 27 22:42:05 2021
db4065d56f91ebbd34a523b45 nfig
944fbfd10c5d75f9134d42023
eb2b
This example shows how to add a Cumulus Linux installation image to the NetQ repository on the switch:
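A sketch of what this can look like, assuming the cl-image keyword and an illustrative download path (substitute your own image location):
cumulus@switch:~$ netq lcm add cl-image /path/to/cumulus-linux-4.3.0-mlx-amd64.bin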
The trace commands enable the network administrator to view the available paths between two nodes on the network currently and at a time in the past. You can perform a layer 2 or layer 3 trace, and view the output in one of three formats (json, pretty, and detail). JSON output provides the output in a JSON file format for ease of importing to other applications or software. Pretty output lines up the paths in a pseudo-graphical manner to help visualize multiple paths. Detail output is useful for traces with higher hop counts where the pretty output wraps lines, making it harder to interpret the results. The detail output displays a table with a row for each path.
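For example, to view the paths between the two addresses used earlier in this topic in the pretty format:
cumulus@switch:~$ netq trace 10.0.0.5 from 10.0.0.35 pretty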
This section describes how to install, configure, and upgrade NetQ.
Before you begin, review the release notes for this version.
Before You Install
This overview is designed to help you understand the various NetQ deployment and installation options.
Installation Overview
Consider the following before you install the NetQ system:
Determine whether to deploy the solution fully on premises or as a remote solution.
Decide whether to deploy a virtual machine on your own hardware or use one of the NetQ appliances.
Choose whether to install the software on a single server or as a server cluster.
The following decision tree reflects these steps:
Deployment Type: On Premises or Remote
You can deploy NetQ in one of two ways.
Hosted on premises: Choose this deployment if you want to host all required hardware and software at your location, and you have the in-house skill set to install, configure, and maintain it—including performing data backups, acquiring and maintaining hardware and software, and integration management. This model is also a good choice if you want very limited or no access to the internet from switches and hosts in your network or you have data residency requirements like GDPR.
Hosted remotely: Choose this deployment to host a multi-site, on-premises deployment or use the NetQ Cloud service. In the multi-site deployment, you host multiple small servers at each site and a large server and database at another site. In the cloud service deployment, you host only a small local server on your premises that connects to the NetQ Cloud service over selected ports or through a proxy server. The cloud service supports only data aggregation and forwarding locally, and the majority of the NetQ applications use a hosted deployment strategy, storing data in the cloud. NVIDIA handles the backups and maintenance of the application and storage. This remote cloud service model is often chosen when it is untenable to support deployment in-house or if you need the flexibility to scale quickly, while also reducing capital expenses.
With either deployment model, the NetQ Agents reside on the switches and hosts they monitor in your network.
System: Virtual Machine or NetQ Appliances
The next installation consideration is whether you plan to use NetQ Cloud Appliances or your own servers with VMs. Both options provide the same services and features. The difference is in the implementation. When you install NetQ software on your own hardware, you create and maintain a KVM or VMware VM, and the software runs from there. This requires you to scope and order an appropriate hardware server to support the NetQ requirements, but might allow you to reuse an existing server in your stock.
When you choose to purchase and install NetQ Cloud Appliances, the initial configuration of the server with Ubuntu OS is already done for you, and the NetQ software components are pre-loaded, saving you time during the physical deployment.
Data Flow
The flow of data differs based on your deployment model.
For the on-premises deployment, the NetQ Agents collect and transmit data from the switches and hosts back to the NetQ On-premises Appliance or virtual machine running the NetQ Platform software, which in turn processes and stores the data in its database. This data is then displayed through the user interface.
For the remote, multi-site NetQ implementation, the NetQ Agents at each premises collect and transmit data from the switches and hosts at that premises to its NetQ Cloud Appliance or virtual machine running the NetQ Collector software. The NetQ Collectors then transmit this data to the common NetQ Cloud Appliance or virtual machine and database at one of your premises for processing and storage.
For the remote, cloud-service implementation, the NetQ Agents collect and transmit data from the switches and hosts to the NetQ Cloud Appliance or virtual machine running the NetQ Collector software. The NetQ Collector then transmits this data to the NVIDIA cloud-based infrastructure for further processing and storage.
For either remote solution, telemetry data is displayed through the same user interfaces as the on-premises solution. When using the cloud service implementation of the remote solution, the browser interface can be pointed to the local NetQ Cloud Appliance or VM, or directly to netq.nvidia.com.
Server Arrangement: Single or Cluster
The next installation step is deciding whether to deploy a single server or a server cluster. Both options provide the same services and features. The biggest difference is the number of servers deployed and the continued availability of services running on those servers should hardware failures occur.
A single server is easier to set up, configure and manage, but can limit your ability to scale your network monitoring quickly. Deploying multiple servers is a bit more complicated, but you limit potential downtime and increase availability by having more than one server that can run the software and store the data. Select the standalone single-server arrangements for smaller, simpler deployments. Be sure to consider the capabilities and resources needed on this server to support the size of your final deployment.
Select the server cluster arrangement to obtain scalability and high availability for your network. The default clustering implementation has three servers: 1 master and 2 workers. However, NetQ supports up to 10 worker nodes in a cluster, and up to 5000 devices in total (switches, servers, and hosts). When you configure the cluster, configure the NetQ Agents to connect to these three nodes in the cluster first by providing the IP addresses as a comma-separated list. If you decide to add additional nodes to the cluster, you do not need to configure these nodes again.
Cluster Deployments and Kubernetes
NetQ also monitors Kubernetes containers. If the master node ever goes down, all NetQ services should continue to work. However, keep in mind that the master hosts the Kubernetes control plane so anything that requires connectivity with the Kubernetes cluster—such as upgrading NetQ or rescheduling pods to other workers if a worker goes down—will not work.
Cluster Deployments and Load Balancers
You need a load balancer for high availability of the NetQ API and the NetQ UI. Be mindful of where you install the certificates for the NetQ UI (port 443); otherwise, you cannot access the NetQ UI.
If you are using a load balancer in your deployment, we recommend you install the certificates directly on the load balancer for SSL offloading. However, if you install the certificates on the master node, then configure the load balancer to allow for SSL passthrough.
Where to Go Next
After you’ve decided on your deployment type, you’re ready to install NetQ.
Install NetQ
The following sections provide installation instructions for the NetQ system and software. To install NetQ, follow the set of steps that matches your deployment.
Set Up Your VMware Virtual Machine for a Single On-premises Server
Follow these steps to set up and configure your VM on a single server in an on-premises deployment:
Verify that your system meets the VM requirements.
When using a VM, the following system resources must be allocated:
Resource                   Minimum Requirement
Processor                  Eight (8) virtual CPUs
Memory                     64 GB RAM
Local disk storage         256 GB (2 TB max) SSD with minimum disk IOPS of 1000 for a standard 4kb block size (Note: This must be an SSD; use of other storage options can lead to system instability and are not supported.)
Network interface speed    1 Gb NIC
Hypervisor                 VMware ESXi™ 6.5 or later (OVA image) for servers running Cumulus Linux, CentOS, Ubuntu and RedHat operating systems
Confirm that the needed ports are open for communications.
You must open the following ports on your NetQ Platform:
Open your hypervisor and set up your VM. You can use this example for reference or use your own hypervisor instructions.
VMware Example Configuration
This example shows the VM setup process using an OVA file with VMware ESXi.
Enter the address of the hardware in your browser.
Log in to VMware using credentials with root access.
Click Storage in the Navigator to verify you have an SSD installed.
Click Create/Register VM at the top of the right pane.
Select Deploy a virtual machine from an OVF or OVA file, and click Next.
Provide a name for the VM, for example NetQ.
Tip: Make note of the name used during install as this is needed in a later step.
Drag and drop the NetQ Platform image file you downloaded in Step 2 above.
Click Next.
Select the storage type and data store for the image to use, then click Next. In this example, only one is available.
Accept the default deployment options or modify them according to your network needs. Click Next when you are finished.
Review the configuration summary. Click Back to change any of the settings, or click Finish to continue with the creation of the VM.
The progress of the request is shown in the Recent Tasks window at the bottom of the application. This may take some time, so continue with your other work until the upload finishes.
Once completed, view the full details of the VM and hardware.
Log in to the VM and change the password.
Use the default credentials to log in the first time:
Username: cumulus
Password: cumulus
$ ssh cumulus@<ipaddr>
Warning: Permanently added '<ipaddr>' (ECDSA) to the list of known hosts.
Ubuntu 18.04.5 LTS
cumulus@<ipaddr>'s password:
You are required to change your password immediately (root enforced)
System information as of Thu Dec 3 21:35:42 UTC 2020
System load: 0.09 Processes: 120
Usage of /: 8.1% of 61.86GB Users logged in: 0
Memory usage: 5% IP address for eth0: <ipaddr>
Swap usage: 0%
WARNING: Your password has expired.
You must change your password now and login again!
Changing password for cumulus.
(current) UNIX password: cumulus
Enter new UNIX password:
Retype new UNIX password:
passwd: password updated successfully
Connection to <ipaddr> closed.
Log in again with your new password.
$ ssh cumulus@<ipaddr>
Warning: Permanently added '<ipaddr>' (ECDSA) to the list of known hosts.
Ubuntu 18.04.5 LTS
cumulus@<ipaddr>'s password:
System information as of Thu Dec 3 21:35:59 UTC 2020
System load: 0.07 Processes: 121
Usage of /: 8.1% of 61.86GB Users logged in: 0
Memory usage: 5% IP address for eth0: <ipaddr>
Swap usage: 0%
Last login: Thu Dec 3 21:35:43 2020 from <local-ipaddr>
cumulus@ubuntu:~$
Verify the platform is ready for installation. Fix any errors indicated before installing the NetQ software.
cumulus@hostname:~$ sudo opta-check
Change the hostname for the VM from the default value.
The default hostname for the NetQ Virtual Machines is ubuntu. Change the hostname to fit your naming conventions while meeting Internet and Kubernetes naming standards.
Kubernetes requires that hostnames are composed of a sequence of labels concatenated with dots. For example, “en.wikipedia.org” is a hostname. Each label must be from 1 to 63 characters long. The entire hostname, including the delimiting dots, has a maximum of 253 ASCII characters.
The Internet standards (RFCs) for protocols specify that labels may contain only the ASCII letters a through z (in lower case), the digits 0 through 9, and the hyphen-minus character ('-').
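One way to set the new name (assuming the systemd-based Ubuntu image shipped with the NetQ VM) is with hostnamectl, where NEW_HOSTNAME is a placeholder for your chosen name:
cumulus@hostname:~$ sudo hostnamectl set-hostname NEW_HOSTNAME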
Run the install command on the appliance. This example uses interface eno1; replace it with your interface name, or use the ip-addr option to specify an IP address instead.
cumulus@hostname:~$ netq install standalone full interface eno1 bundle /mnt/installables/NetQ-4.3.0.tgz
If this step fails for any reason, you can run netq bootstrap reset and then try again.
Verify Installation Status
To view the status of the installation, use the netq show status [verbose] command. The following example shows a successful on-premises installation:
State: Active
Version: 4.3.0
Installer Version: 4.3.0
Installation Type: Standalone
Activation Key: PKrgipMGEhVuZXRxLWVuZHBvaW50LWdhdGV3YXkYsagDIixUQmFLTUhzZU80RUdTL3pOT01uQ2lnRnrrUhTbXNPUGRXdnUwTVo5SEpBPTIHZGVmYXVsdDoHbmV0cWRldgz=
Master SSH Public Key: a3NoLXJzYSBBQUFBQjNOemFDMXljMkVBQUFBREFRQUJBQUFCQVFEazliekZDblJUajkvQVhOZ0hteXByTzZIb3Y2cVZBWFdsNVNtKzVrTXo3dmMrcFNZTGlOdWl1bEhZeUZZVDhSNmU3bFdqS3NrSE10bzArNFJsQVd6cnRvbVVzLzlLMzQ4M3pUMjVZQXpIU2N1ZVhBSE1TdTZHZ0JyUkpXYUpTNjJ2RTkzcHBDVjBxWWJvUFo3aGpCY3ozb0VVWnRsU1lqQlZVdjhsVjBNN3JEWW52TXNGSURWLzJ2eks3K0x2N01XTG5aT054S09hdWZKZnVOT0R4YjFLbk1mN0JWK3hURUpLWW1mbTY1ckoyS1ArOEtFUllrr5TkF3bFVRTUdmT3daVHF2RWNoZnpQajMwQ29CWDZZMzVST2hDNmhVVnN5OEkwdjVSV0tCbktrWk81MWlMSDAyZUpJbXJHUGdQa2s1SzhJdGRrQXZISVlTZ0RwRlpRb3Igcm9vdEBucXRzLTEwLTE4OC00NC0xNDc=
Is Cloud: False
Cluster Status:
IP Address Hostname Role Status
------------- ------------- ------ --------
10.188.44.147 10.188.44.147 Role Ready
NetQ... Active
Run the netq show opta-health command to verify all applications are operating properly. Allow 10-15 minutes for all applications to come up and report their status.
If any of the applications or services display Status as DOWN after 30 minutes, open a support ticket and attach the output of the opta-support command.
Set Up Your VMware Virtual Machine for a Single Cloud Server
Follow these steps to set up and configure your VM for a cloud deployment:
Verify that your system meets the VM requirements.
When using a VM, the following system resources must be allocated:
Resource                   Minimum Requirement
Processor                  Four (4) virtual CPUs
Memory                     8 GB RAM
Local disk storage         For NetQ 3.2.x and later: 64 GB (2 TB max); for NetQ 3.1 and earlier: 32 GB (2 TB max)
Network interface speed    1 Gb NIC
Hypervisor                 VMware ESXi™ 6.5 or later (OVA image) for servers running Cumulus Linux, CentOS, Ubuntu and RedHat operating systems
Confirm that the needed ports are open for communications.
You must open the following ports on your NetQ Platform:
Open your hypervisor and set up your VM. You can use this example for reference or use your own hypervisor instructions.
VMware Example Configuration
This example shows the VM setup process using an OVA file with VMware ESXi.
Enter the address of the hardware in your browser.
Log in to VMware using credentials with root access.
Click Storage in the Navigator to verify you have an SSD installed.
Click Create/Register VM at the top of the right pane.
Select Deploy a virtual machine from an OVF or OVA file, and click Next.
Provide a name for the VM, for example NetQ.
Tip: Make note of the name used during install as this is needed in a later step.
Drag and drop the NetQ Platform image file you downloaded in Step 2 above.
Click Next.
Select the storage type and data store for the image to use, then click Next. In this example, only one is available.
Accept the default deployment options or modify them according to your network needs. Click Next when you are finished.
Review the configuration summary. Click Back to change any of the settings, or click Finish to continue with the creation of the VM.
The progress of the request is shown in the Recent Tasks window at the bottom of the application. This may take some time, so continue with your other work until the upload finishes.
Once completed, view the full details of the VM and hardware.
Log in to the VM and change the password.
Use the default credentials to log in the first time:
Username: cumulus
Password: cumulus
$ ssh cumulus@<ipaddr>
Warning: Permanently added '<ipaddr>' (ECDSA) to the list of known hosts.
Ubuntu 18.04.5 LTS
cumulus@<ipaddr>'s password:
You are required to change your password immediately (root enforced)
System information as of Thu Dec 3 21:35:42 UTC 2020
System load: 0.09 Processes: 120
Usage of /: 8.1% of 61.86GB Users logged in: 0
Memory usage: 5% IP address for eth0: <ipaddr>
Swap usage: 0%
WARNING: Your password has expired.
You must change your password now and login again!
Changing password for cumulus.
(current) UNIX password: cumulus
Enter new UNIX password:
Retype new UNIX password:
passwd: password updated successfully
Connection to <ipaddr> closed.
Log in again with your new password.
$ ssh cumulus@<ipaddr>
Warning: Permanently added '<ipaddr>' (ECDSA) to the list of known hosts.
Ubuntu 18.04.5 LTS
cumulus@<ipaddr>'s password:
System information as of Thu Dec 3 21:35:59 UTC 2020
System load: 0.07 Processes: 121
Usage of /: 8.1% of 61.86GB Users logged in: 0
Memory usage: 5% IP address for eth0: <ipaddr>
Swap usage: 0%
Last login: Thu Dec 3 21:35:43 2020 from <local-ipaddr>
cumulus@ubuntu:~$
Verify the platform is ready for installation. Fix any errors indicated before installing the NetQ software.
cumulus@hostname:~$ sudo opta-check-cloud
Change the hostname for the VM from the default value.
The default hostname for the NetQ Virtual Machines is ubuntu. Change the hostname to fit your naming conventions while meeting Internet and Kubernetes naming standards.
Kubernetes requires that hostnames are composed of a sequence of labels concatenated with dots. For example, “en.wikipedia.org” is a hostname. Each label must be from 1 to 63 characters long. The entire hostname, including the delimiting dots, has a maximum of 253 ASCII characters.
The Internet standards (RFCs) for protocols specify that labels may contain only the ASCII letters a through z (in lower case), the digits 0 through 9, and the hyphen-minus character ('-').
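One way to set the new name (assuming the systemd-based Ubuntu image shipped with the NetQ VM) is with hostnamectl, where NEW_HOSTNAME is a placeholder for your chosen name:
cumulus@hostname:~$ sudo hostnamectl set-hostname NEW_HOSTNAME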
Add the same NEW_HOSTNAME value to /etc/hosts on your VM for the localhost entry. Example:
127.0.0.1 localhost NEW_HOSTNAME
The final step is to install and activate the NetQ software using the CLI:
Run the following command on your NetQ Cloud Appliance with the config-key obtained from the email you received from NVIDIA titled NetQ Access Link. You can also obtain the configuration key through the NetQ UI in the premise management configuration.
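A representative form of this command is sketched below; the interface name, bundle path, and config key are placeholders for your environment:
cumulus@hostname:~$ netq install opta standalone full interface eno1 bundle /mnt/installables/NetQ-4.3.0-opta.tgz config-key <your-config-key>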
You can specify the IP address instead of the interface name here: use ip-addr <IP address> in place of interface <ifname> above.
If you have changed the IP address or hostname of the NetQ OPTA after this step, you need to re-register this address with NetQ as follows:
Reset the VM:
cumulus@hostname:~$ netq bootstrap reset
Re-run the install CLI on the appliance. This example uses interface eno1. Replace this with your updated IP address, hostname or interface using the interface or ip-addr option.
If this step fails for any reason, you can run netq bootstrap reset and then try again.
Consider the following for container environments, and make adjustments as needed.
Flannel Virtual Networks
If you are using Flannel with a container environment on your network, you may need to change its default IP address ranges if they conflict with other addresses on your network. This can only be done one time during the first installation.
The address range is 10.244.0.0/16. NetQ overrides the original Flannel default, which is 10.1.0.0/16.
To change the default address range, use the install CLI with the pod-ip-range option. For example:
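A sketch for this deployment type (the bundle path, config key, and address range are illustrative placeholders):
cumulus@hostname:~$ netq install opta standalone full interface eno1 bundle /mnt/installables/NetQ-4.3.0-opta.tgz config-key <your-config-key> pod-ip-range 10.255.0.0/16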
The default Docker bridge interface is disabled in NetQ. If you need to reenable the interface, contact support.
Verify Installation Status
To view the status of the installation, use the netq show status [verbose] command. The following example shows a successful on-premises installation:
State: Active
Version: 4.3.0
Installer Version: 4.3.0
Installation Type: Standalone
Activation Key: PKrgipMGEhVuZXRxLWVuZHBvaW50LWdhdGV3YXkYsagDIixUQmFLTUhzZU80RUdTL3pOT01uQ2lnRnrrUhTbXNPUGRXdnUwTVo5SEpBPTIHZGVmYXVsdDoHbmV0cWRldgz=
Master SSH Public Key: a3NoLXJzYSBBQUFBQjNOemFDMXljMkVBQUFBREFRQUJBQUFCQVFEazliekZDblJUajkvQVhOZ0hteXByTzZIb3Y2cVZBWFdsNVNtKzVrTXo3dmMrcFNZTGlOdWl1bEhZeUZZVDhSNmU3bFdqS3NrSE10bzArNFJsQVd6cnRvbVVzLzlLMzQ4M3pUMjVZQXpIU2N1ZVhBSE1TdTZHZ0JyUkpXYUpTNjJ2RTkzcHBDVjBxWWJvUFo3aGpCY3ozb0VVWnRsU1lqQlZVdjhsVjBNN3JEWW52TXNGSURWLzJ2eks3K0x2N01XTG5aT054S09hdWZKZnVOT0R4YjFLbk1mN0JWK3hURUpLWW1mbTY1ckoyS1ArOEtFUllrr5TkF3bFVRTUdmT3daVHF2RWNoZnpQajMwQ29CWDZZMzVST2hDNmhVVnN5OEkwdjVSV0tCbktrWk81MWlMSDAyZUpJbXJHUGdQa2s1SzhJdGRrQXZISVlTZ0RwRlpRb3Igcm9vdEBucXRzLTEwLTE4OC00NC0xNDc=
Is Cloud: False
Cluster Status:
IP Address Hostname Role Status
------------- ------------- ------ --------
10.188.44.147 10.188.44.147 Role Ready
NetQ... Active
Run the netq show opta-health command to verify all applications are operating properly. Allow 10-15 minutes for all applications to come up and report their status.
If any of the applications or services display Status as DOWN after 30 minutes, open a support ticket and attach the output of the opta-support command.
Set Up Your VMware Virtual Machine for an On-premises Server Cluster
First configure the VM on the master node, and then configure the VM on each worker node.
Follow these steps to set up and configure your VM cluster for an on-premises deployment:
Verify that your master node meets the VM requirements.
When using a VM, the following system resources must be allocated:
Resource                   Minimum Requirement
Processor                  Eight (8) virtual CPUs
Memory                     64 GB RAM
Local disk storage         256 GB (2 TB max) SSD with minimum disk IOPS of 1000 for a standard 4kb block size (Note: This must be an SSD; use of other storage options can lead to system instability and are not supported.)
Network interface speed    1 Gb NIC
Hypervisor                 VMware ESXi™ 6.5 or later (OVA image) for servers running Cumulus Linux, CentOS, Ubuntu and RedHat operating systems
Confirm that the needed ports are open for communications.
You must open the following ports on your NetQ Platforms:
Port or Protocol Number    Protocol       Component Access
8443                       TCP            Admin UI
443                        TCP            NetQ UI
31980                      TCP            NetQ Agent communication
31982                      TCP            NetQ Agent SSL communication
32708                      TCP            API Gateway
22                         TCP            SSH
179                        TCP            Calico networking (BGP)
4                          IP Protocol    Calico networking (IP-in-IP Protocol)
4789                       UDP            Calico networking (VxLAN)
6443                       TCP            kube-apiserver
2379                       TCP            etcd datastore
Additionally, for internal cluster communication, you must open these ports:
Open your hypervisor and set up your VM. You can use this example for reference or use your own hypervisor instructions.
VMware Example Configuration
This example shows the VM setup process using an OVA file with VMware ESXi.
Enter the address of the hardware in your browser.
Log in to VMware using credentials with root access.
Click Storage in the Navigator to verify you have an SSD installed.
Click Create/Register VM at the top of the right pane.
Select Deploy a virtual machine from an OVF or OVA file, and click Next.
Provide a name for the VM, for example NetQ.
Tip: Make note of the name used during install as this is needed in a later step.
Drag and drop the NetQ Platform image file you downloaded in Step 2 above.
Click Next.
Select the storage type and data store for the image to use, then click Next. In this example, only one is available.
Accept the default deployment options or modify them according to your network needs. Click Next when you are finished.
Review the configuration summary. Click Back to change any of the settings, or click Finish to continue with the creation of the VM.
The progress of the request is shown in the Recent Tasks window at the bottom of the application. This may take some time, so continue with your other work until the upload finishes.
Once completed, view the full details of the VM and hardware.
Log in to the VM and change the password.
Use the default credentials to log in the first time:
Username: cumulus
Password: cumulus
$ ssh cumulus@<ipaddr>
Warning: Permanently added '<ipaddr>' (ECDSA) to the list of known hosts.
Ubuntu 18.04.5 LTS
cumulus@<ipaddr>'s password:
You are required to change your password immediately (root enforced)
System information as of Thu Dec 3 21:35:42 UTC 2020
System load: 0.09 Processes: 120
Usage of /: 8.1% of 61.86GB Users logged in: 0
Memory usage: 5% IP address for eth0: <ipaddr>
Swap usage: 0%
WARNING: Your password has expired.
You must change your password now and login again!
Changing password for cumulus.
(current) UNIX password: cumulus
Enter new UNIX password:
Retype new UNIX password:
passwd: password updated successfully
Connection to <ipaddr> closed.
Log in again with your new password.
$ ssh cumulus@<ipaddr>
Warning: Permanently added '<ipaddr>' (ECDSA) to the list of known hosts.
Ubuntu 18.04.5 LTS
cumulus@<ipaddr>'s password:
System information as of Thu Dec 3 21:35:59 UTC 2020
System load: 0.07 Processes: 121
Usage of /: 8.1% of 61.86GB Users logged in: 0
Memory usage: 5% IP address for eth0: <ipaddr>
Swap usage: 0%
Last login: Thu Dec 3 21:35:43 2020 from <local-ipaddr>
cumulus@ubuntu:~$
Verify the master node is ready for installation. Fix any errors indicated before installing the NetQ software.
cumulus@hostname:~$ sudo opta-check
Change the hostname for the VM from the default value.
The default hostname for the NetQ Virtual Machines is ubuntu. Change the hostname to fit your naming conventions while meeting Internet and Kubernetes naming standards.
Kubernetes requires that hostnames are composed of a sequence of labels concatenated with dots. For example, “en.wikipedia.org” is a hostname. Each label must be from 1 to 63 characters long. The entire hostname, including the delimiting dots, has a maximum of 253 ASCII characters.
The Internet standards (RFCs) for protocols specify that labels may contain only the ASCII letters a through z (in lower case), the digits 0 through 9, and the hyphen-minus character ('-').
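One way to set the new name (assuming the systemd-based Ubuntu image shipped with the NetQ VM) is with hostnamectl, where NEW_HOSTNAME is a placeholder for your chosen name:
cumulus@hostname:~$ sudo hostnamectl set-hostname NEW_HOSTNAME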
Add the same NEW_HOSTNAME value to /etc/hosts on your VM for the localhost entry. Example:
127.0.0.1 localhost NEW_HOSTNAME
Verify that your first worker node meets the VM requirements, as described in Step 1.
Confirm that the needed ports are open for communications, as described in Step 2.
Open your hypervisor and set up the VM in the same manner as for the master node.
Make a note of the private IP address you assign to the worker node. You need it for later installation steps.
Verify the worker node is ready for installation. Fix any errors indicated before installing the NetQ software.
cumulus@hostname:~$ sudo opta-check-cloud
Repeat Steps 8 through 11 for each additional worker node you want in your cluster.
The final step is to install and activate the NetQ software using the CLI:
Run the following command on your master node to initialize the cluster. Copy the output of the command to use on your worker nodes:
cumulus@<hostname>:~$ netq install cluster master-init
Please run the following command on all worker nodes:
netq install cluster worker-init c3NoLXJzYSBBQUFBQjNOemFDMXljMkVBQUFBREFRQUJBQUFCQVFDM2NjTTZPdVVUWWJ5c2Q3NlJ4SHdseHBsOHQ4N2VMRWVGR05LSWFWVnVNcy94OEE4RFNMQVhKOHVKRjVLUXBnVjdKM2lnMGJpL2hDMVhmSVVjU3l3ZmhvVDVZM3dQN1oySVZVT29ZTi8vR1lOek5nVlNocWZQMDNDRW0xNnNmSzVvUWRQTzQzRFhxQ3NjbndIT3dwZmhRYy9MWTU1a
Run the netq install cluster worker-init <ssh-key> command on each of your worker nodes.
Run the following commands on your master node, using the IP addresses of your worker nodes:
This example uses interface eno1; replace it with your interface name, or use the ip-addr option to specify an IP address instead. The worker addresses are the private IP addresses you noted earlier.
cumulus@hostname:~$ netq install cluster full interface eno1 bundle /mnt/installables/NetQ-4.3.0.tgz workers <worker-1-ip> <worker-2-ip>
If this step fails for any reason, you can run netq bootstrap reset and then try again.
Verify Installation Status
To view the status of the installation, use the netq show status [verbose] command. The following example shows a successful on-premises installation:
State: Active
Version: 4.3.0
Installer Version: 4.3.0
Installation Type: Standalone
Activation Key: PKrgipMGEhVuZXRxLWVuZHBvaW50LWdhdGV3YXkYsagDIixUQmFLTUhzZU80RUdTL3pOT01uQ2lnRnrrUhTbXNPUGRXdnUwTVo5SEpBPTIHZGVmYXVsdDoHbmV0cWRldgz=
Master SSH Public Key: a3NoLXJzYSBBQUFBQjNOemFDMXljMkVBQUFBREFRQUJBQUFCQVFEazliekZDblJUajkvQVhOZ0hteXByTzZIb3Y2cVZBWFdsNVNtKzVrTXo3dmMrcFNZTGlOdWl1bEhZeUZZVDhSNmU3bFdqS3NrSE10bzArNFJsQVd6cnRvbVVzLzlLMzQ4M3pUMjVZQXpIU2N1ZVhBSE1TdTZHZ0JyUkpXYUpTNjJ2RTkzcHBDVjBxWWJvUFo3aGpCY3ozb0VVWnRsU1lqQlZVdjhsVjBNN3JEWW52TXNGSURWLzJ2eks3K0x2N01XTG5aT054S09hdWZKZnVOT0R4YjFLbk1mN0JWK3hURUpLWW1mbTY1ckoyS1ArOEtFUllrr5TkF3bFVRTUdmT3daVHF2RWNoZnpQajMwQ29CWDZZMzVST2hDNmhVVnN5OEkwdjVSV0tCbktrWk81MWlMSDAyZUpJbXJHUGdQa2s1SzhJdGRrQXZISVlTZ0RwRlpRb3Igcm9vdEBucXRzLTEwLTE4OC00NC0xNDc=
Is Cloud: False
Cluster Status:
IP Address Hostname Role Status
------------- ------------- ------ --------
10.188.44.147 10.188.44.147 Role Ready
NetQ... Active
Run the netq show opta-health command to verify all applications are operating properly. Allow 10-15 minutes for all applications to come up and report their status.
If any of the applications or services display Status as DOWN after 30 minutes, open a support ticket and attach the output of the opta-support command.
Set Up Your VMware Virtual Machine for a Cloud Server Cluster
First configure the VM on the master node, and then configure the VM on each worker node.
Follow these steps to set up and configure your VM on a cluster of servers in a cloud deployment:
Verify that your master node meets the VM requirements.
When using a VM, the following system resources must be allocated:
Resource                   Minimum Requirement
Processor                  Four (4) virtual CPUs
Memory                     8 GB RAM
Local disk storage         For NetQ 3.2.x and later: 64 GB (2 TB max); for NetQ 3.1 and earlier: 32 GB (2 TB max)
Network interface speed    1 Gb NIC
Hypervisor                 VMware ESXi™ 6.5 or later (OVA image) for servers running Cumulus Linux, CentOS, Ubuntu and RedHat operating systems
Confirm that the needed ports are open for communications.
You must open the following ports on your NetQ Platforms:
Port or Protocol Number    Protocol       Component Access
8443                       TCP            Admin UI
443                        TCP            NetQ UI
31980                      TCP            NetQ Agent communication
31982                      TCP            NetQ Agent SSL communication
32708                      TCP            API Gateway
22                         TCP            SSH
179                        TCP            Calico networking (BGP)
4                          IP Protocol    Calico networking (IP-in-IP Protocol)
4789                       UDP            Calico networking (VxLAN)
6443                       TCP            kube-apiserver
2379                       TCP            etcd datastore
Additionally, for internal cluster communication, you must open these ports:
Open your hypervisor and set up your VM. You can use this example for reference or use your own hypervisor instructions.
VMware Example Configuration
This example shows the VM setup process using an OVA file with VMware ESXi.
Enter the address of the hardware in your browser.
Log in to VMware using credentials with root access.
Click Storage in the Navigator to verify you have an SSD installed.
Click Create/Register VM at the top of the right pane.
Select Deploy a virtual machine from an OVF or OVA file, and click Next.
Provide a name for the VM, for example NetQ.
Tip: Make note of the name used during install as this is needed in a later step.
Drag and drop the NetQ Platform image file you downloaded in Step 2 above.
Click Next.
Select the storage type and data store for the image to use, then click Next. In this example, only one is available.
Accept the default deployment options or modify them according to your network needs. Click Next when you are finished.
Review the configuration summary. Click Back to change any of the settings, or click Finish to continue with the creation of the VM.
The progress of the request is shown in the Recent Tasks window at the bottom of the application. This may take some time, so continue with your other work until the upload finishes.
Once completed, view the full details of the VM and hardware.
Log in to the VM and change the password.
Use the default credentials to log in the first time:
Username: cumulus
Password: cumulus
$ ssh cumulus@<ipaddr>
Warning: Permanently added '<ipaddr>' (ECDSA) to the list of known hosts.
Ubuntu 18.04.5 LTS
cumulus@<ipaddr>'s password:
You are required to change your password immediately (root enforced)
System information as of Thu Dec 3 21:35:42 UTC 2020
System load: 0.09 Processes: 120
Usage of /: 8.1% of 61.86GB Users logged in: 0
Memory usage: 5% IP address for eth0: <ipaddr>
Swap usage: 0%
WARNING: Your password has expired.
You must change your password now and login again!
Changing password for cumulus.
(current) UNIX password: cumulus
Enter new UNIX password:
Retype new UNIX password:
passwd: password updated successfully
Connection to <ipaddr> closed.
Log in again with your new password.
$ ssh cumulus@<ipaddr>
Warning: Permanently added '<ipaddr>' (ECDSA) to the list of known hosts.
Ubuntu 18.04.5 LTS
cumulus@<ipaddr>'s password:
System information as of Thu Dec 3 21:35:59 UTC 2020
System load: 0.07 Processes: 121
Usage of /: 8.1% of 61.86GB Users logged in: 0
Memory usage: 5% IP address for eth0: <ipaddr>
Swap usage: 0%
Last login: Thu Dec 3 21:35:43 2020 from <local-ipaddr>
cumulus@ubuntu:~$
Verify the master node is ready for installation. Fix any errors indicated before installing the NetQ software.
cumulus@hostname:~$ sudo opta-check-cloud
Change the hostname for the VM from the default value.
The default hostname for the NetQ Virtual Machines is ubuntu. Change the hostname to fit your naming conventions while meeting Internet and Kubernetes naming standards.
Kubernetes requires that hostnames are composed of a sequence of labels concatenated with dots. For example, “en.wikipedia.org” is a hostname. Each label must be from 1 to 63 characters long. The entire hostname, including the delimiting dots, has a maximum of 253 ASCII characters.
The Internet standards (RFCs) for protocols specify that labels may contain only the ASCII letters a through z (in lower case), the digits 0 through 9, and the hyphen-minus character ('-').
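One way to set the new name (assuming the systemd-based Ubuntu image shipped with the NetQ VM) is with hostnamectl, where NEW_HOSTNAME is a placeholder for your chosen name:
cumulus@hostname:~$ sudo hostnamectl set-hostname NEW_HOSTNAME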
Add the same NEW_HOSTNAME value to /etc/hosts on your VM for the localhost entry. Example:
127.0.0.1 localhost NEW_HOSTNAME
Verify that your first worker node meets the VM requirements, as described in Step 1.
Confirm that the needed ports are open for communications, as described in Step 2.
Open your hypervisor and set up the VM in the same manner as for the master node.
Make a note of the private IP address you assign to the worker node. You need it for later installation steps.
Verify the worker node is ready for installation. Fix any errors indicated before installing the NetQ software.
cumulus@hostname:~$ sudo opta-check-cloud
Repeat Steps 8 through 11 for each additional worker node you want in your cluster.
The final step is to install and activate the NetQ software using the CLI:
Run the following command on your master node to initialize the cluster. Copy the output of the command to use on your worker nodes:
cumulus@<hostname>:~$ netq install cluster master-init
Please run the following command on all worker nodes:
netq install cluster worker-init c3NoLXJzYSBBQUFBQjNOemFDMXljMkVBQUFBREFRQUJBQUFCQVFDM2NjTTZPdVVUWWJ5c2Q3NlJ4SHdseHBsOHQ4N2VMRWVGR05LSWFWVnVNcy94OEE4RFNMQVhKOHVKRjVLUXBnVjdKM2lnMGJpL2hDMVhmSVVjU3l3ZmhvVDVZM3dQN1oySVZVT29ZTi8vR1lOek5nVlNocWZQMDNDRW0xNnNmSzVvUWRQTzQzRFhxQ3NjbndIT3dwZmhRYy9MWTU1a
Run the netq install cluster worker-init <ssh-key> command on each of your worker nodes.
Run the following command on your NetQ Cloud Appliance with the config-key obtained from the email you received from NVIDIA titled NetQ Access Link. You can also obtain the configuration key through the NetQ UI in the premise management configuration.
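A representative form of this command is sketched below; the interface name, bundle path, config key, and worker addresses are placeholders for your environment:
cumulus@hostname:~$ netq install opta cluster full interface eth0 bundle /mnt/installables/NetQ-4.3.0-opta.tgz config-key <your-config-key> workers <worker-1-ip> <worker-2-ip>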
You can specify the IP address instead of the interface name here: use ip-addr <IP address> in place of interface <ifname> above.
If you have changed the IP address or hostname of the NetQ OPTA after this step, you need to re-register this address with NetQ as follows:
Reset the VM:
cumulus@hostname:~$ netq bootstrap reset
Re-run the install CLI on the appliance. This example uses interface eth0. Replace this with your updated IP address, hostname or interface using the interface or ip-addr option.
If this step fails for any reason, you can run netq bootstrap reset and then try again.
Consider the following for container environments, and make adjustments as needed.
Flannel Virtual Networks
If you are using Flannel with a container environment on your network, you may need to change its default IP address ranges if they conflict with other addresses on your network. This can only be done one time during the first installation.
The address range is 10.244.0.0/16. NetQ overrides the original Flannel default, which is 10.1.0.0/16.
To change the default address range, use the install CLI with the pod-ip-range option. For example:
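A sketch for this deployment type (the bundle path, config key, worker addresses, and address range are illustrative placeholders):
cumulus@hostname:~$ netq install opta cluster full interface eth0 bundle /mnt/installables/NetQ-4.3.0-opta.tgz config-key <your-config-key> workers <worker-1-ip> <worker-2-ip> pod-ip-range 10.255.0.0/16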
The default Docker bridge interface is disabled in NetQ. If you need to reenable the interface, contact support.
Verify Installation Status
To view the status of the installation, use the netq show status [verbose] command. The following example shows a successful on-premises installation:
State: Active
Version: 4.3.0
Installer Version: 4.3.0
Installation Type: Standalone
Activation Key: PKrgipMGEhVuZXRxLWVuZHBvaW50LWdhdGV3YXkYsagDIixUQmFLTUhzZU80RUdTL3pOT01uQ2lnRnrrUhTbXNPUGRXdnUwTVo5SEpBPTIHZGVmYXVsdDoHbmV0cWRldgz=
Master SSH Public Key: a3NoLXJzYSBBQUFBQjNOemFDMXljMkVBQUFBREFRQUJBQUFCQVFEazliekZDblJUajkvQVhOZ0hteXByTzZIb3Y2cVZBWFdsNVNtKzVrTXo3dmMrcFNZTGlOdWl1bEhZeUZZVDhSNmU3bFdqS3NrSE10bzArNFJsQVd6cnRvbVVzLzlLMzQ4M3pUMjVZQXpIU2N1ZVhBSE1TdTZHZ0JyUkpXYUpTNjJ2RTkzcHBDVjBxWWJvUFo3aGpCY3ozb0VVWnRsU1lqQlZVdjhsVjBNN3JEWW52TXNGSURWLzJ2eks3K0x2N01XTG5aT054S09hdWZKZnVOT0R4YjFLbk1mN0JWK3hURUpLWW1mbTY1ckoyS1ArOEtFUllrr5TkF3bFVRTUdmT3daVHF2RWNoZnpQajMwQ29CWDZZMzVST2hDNmhVVnN5OEkwdjVSV0tCbktrWk81MWlMSDAyZUpJbXJHUGdQa2s1SzhJdGRrQXZISVlTZ0RwRlpRb3Igcm9vdEBucXRzLTEwLTE4OC00NC0xNDc=
Is Cloud: False
Cluster Status:
IP Address Hostname Role Status
------------- ------------- ------ --------
10.188.44.147 10.188.44.147 Role Ready
NetQ... Active
Run the netq show opta-health command to verify all applications are operating properly. Allow 10-15 minutes for all applications to come up and report their status.
If any of the applications or services display Status as DOWN after 30 minutes, open a support ticket and attach the output of the opta-support command.
Set Up Your KVM Virtual Machine for a Single On-premises Server
Follow these steps to set up and configure your VM on a single server in an on-premises deployment:
Verify that your system meets the VM requirements.
When using a VM, the following system resources must be allocated:
Resource                   Minimum Requirement
Processor                  Eight (8) virtual CPUs
Memory                     64 GB RAM
Local disk storage         256 GB (2 TB max) SSD with minimum disk IOPS of 1000 for a standard 4kb block size (Note: This must be an SSD; use of other storage options can lead to system instability and are not supported.)
Network interface speed    1 Gb NIC
Hypervisor                 KVM/QCOW (QEMU Copy on Write) image for servers running CentOS, Ubuntu and RedHat operating systems
Confirm that the required ports are open for communications.
You must open the following ports on your NetQ Platform:
Copy the QCOW2 image to a directory where you want to run it.
Tip: Copy, rather than move, the original QCOW2 image that you downloaded so that you do not need to download it again if you have to repeat this process.
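Create and start the VM from the copied image. A representative virt-install invocation for a direct (macvtap) attachment is sketched below; the VM name matches the netq_ts console example later in this section, the disk path is a placeholder, and the sizing follows the minimum requirements above:
$ sudo virt-install --name=netq_ts --memory=65536 --vcpus=8 --os-variant=generic \
      --disk path=/path/to/netq-4.3.0.qcow2,format=qcow2,bus=virtio \
      --import --network=type=direct,source=eth0,model=virtio --noautoconsole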
Replace the disk path value with the location where the QCOW2 image resides. Replace the network model value (eth0 in the example above) with the name of the interface where the VM connects to the external network.
Or, for a bridged VM, where the VM attaches to a bridge that has already been set up to allow for external access:
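A bridged variant differs only in the network argument (br0 is a placeholder for your existing bridge):
$ sudo virt-install --name=netq_ts --memory=65536 --vcpus=8 --os-variant=generic \
      --disk path=/path/to/netq-4.3.0.qcow2,format=qcow2,bus=virtio \
      --import --network=bridge=br0,model=virtio --noautoconsole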
Replace the network bridge value (br0 in the example above) with the name of the (pre-existing) bridge interface where the VM connects to the external network.
Make note of the name used during install as this is needed in a later step.
Watch the boot process in another terminal window.
$ virsh console netq_ts
Log in to the VM and change the password.
Use the default credentials to log in the first time:
Username: cumulus
Password: cumulus
$ ssh cumulus@<ipaddr>
Warning: Permanently added '<ipaddr>' (ECDSA) to the list of known hosts.
Ubuntu 18.04.5 LTS
cumulus@<ipaddr>'s password:
You are required to change your password immediately (root enforced)
System information as of Thu Dec 3 21:35:42 UTC 2020
System load: 0.09 Processes: 120
Usage of /: 8.1% of 61.86GB Users logged in: 0
Memory usage: 5% IP address for eth0: <ipaddr>
Swap usage: 0%
WARNING: Your password has expired.
You must change your password now and login again!
Changing password for cumulus.
(current) UNIX password: cumulus
Enter new UNIX password:
Retype new UNIX password:
passwd: password updated successfully
Connection to <ipaddr> closed.
Log in again with your new password.
$ ssh cumulus@<ipaddr>
Warning: Permanently added '<ipaddr>' (ECDSA) to the list of known hosts.
Ubuntu 18.04.5 LTS
cumulus@<ipaddr>'s password:
System information as of Thu Dec 3 21:35:59 UTC 2020
System load: 0.07 Processes: 121
Usage of /: 8.1% of 61.86GB Users logged in: 0
Memory usage: 5% IP address for eth0: <ipaddr>
Swap usage: 0%
Last login: Thu Dec 3 21:35:43 2020 from <local-ipaddr>
cumulus@ubuntu:~$
Verify the platform is ready for installation. Fix any errors indicated before installing the NetQ software.
cumulus@hostname:~$ sudo opta-check
Change the hostname for the VM from the default value.
The default hostname for the NetQ Virtual Machines is ubuntu. Change the hostname to fit your naming conventions while meeting Internet and Kubernetes naming standards.
Kubernetes requires that hostnames are composed of a sequence of labels concatenated with dots. For example, “en.wikipedia.org” is a hostname. Each label must be from 1 to 63 characters long. The entire hostname, including the delimiting dots, has a maximum of 253 ASCII characters.
The Internet standards (RFCs) for protocols specify that labels may contain only the ASCII letters a through z (in lower case), the digits 0 through 9, and the hyphen-minus character ('-').
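One way to set the new name (assuming the systemd-based Ubuntu image shipped with the NetQ VM) is with hostnamectl, where NEW_HOSTNAME is a placeholder for your chosen name:
cumulus@hostname:~$ sudo hostnamectl set-hostname NEW_HOSTNAME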
Run the install command on the appliance. This example uses interface eno1; replace it with your interface name, or use the ip-addr option to specify an IP address instead.
cumulus@hostname:~$ netq install standalone full interface eno1 bundle /mnt/installables/NetQ-4.3.0.tgz
If this step fails for any reason, you can run netq bootstrap reset and then try again.
Verify Installation Status
To view the status of the installation, use the netq show status [verbose] command. The following example shows a successful on-premises installation:
State: Active
Version: 4.3.0
Installer Version: 4.3.0
Installation Type: Standalone
Activation Key: PKrgipMGEhVuZXRxLWVuZHBvaW50LWdhdGV3YXkYsagDIixUQmFLTUhzZU80RUdTL3pOT01uQ2lnRnrrUhTbXNPUGRXdnUwTVo5SEpBPTIHZGVmYXVsdDoHbmV0cWRldgz=
Master SSH Public Key: a3NoLXJzYSBBQUFBQjNOemFDMXljMkVBQUFBREFRQUJBQUFCQVFEazliekZDblJUajkvQVhOZ0hteXByTzZIb3Y2cVZBWFdsNVNtKzVrTXo3dmMrcFNZTGlOdWl1bEhZeUZZVDhSNmU3bFdqS3NrSE10bzArNFJsQVd6cnRvbVVzLzlLMzQ4M3pUMjVZQXpIU2N1ZVhBSE1TdTZHZ0JyUkpXYUpTNjJ2RTkzcHBDVjBxWWJvUFo3aGpCY3ozb0VVWnRsU1lqQlZVdjhsVjBNN3JEWW52TXNGSURWLzJ2eks3K0x2N01XTG5aT054S09hdWZKZnVOT0R4YjFLbk1mN0JWK3hURUpLWW1mbTY1ckoyS1ArOEtFUllrr5TkF3bFVRTUdmT3daVHF2RWNoZnpQajMwQ29CWDZZMzVST2hDNmhVVnN5OEkwdjVSV0tCbktrWk81MWlMSDAyZUpJbXJHUGdQa2s1SzhJdGRrQXZISVlTZ0RwRlpRb3Igcm9vdEBucXRzLTEwLTE4OC00NC0xNDc=
Is Cloud: False
Cluster Status:
IP Address Hostname Role Status
------------- ------------- ------ --------
10.188.44.147 10.188.44.147 Role Ready
NetQ... Active
Run the netq show opta-health command to verify all applications are operating properly. Allow 10-15 minutes for all applications to come up and report their status.
If any of the applications or services display Status as DOWN after 30 minutes, open a support ticket and attach the output of the opta-support command.
Set Up Your KVM Virtual Machine for a Single Cloud Server
Follow these steps to set up and configure your VM on a single server in a cloud deployment:
Verify that your system meets the VM requirements.
When using a VM, the following system resources must be allocated:
Resource                   Minimum Requirement
Processor                  Four (4) virtual CPUs
Memory                     8 GB RAM
Local disk storage         For NetQ 3.2.x and later: 64 GB (2 TB max); for NetQ 3.1 and earlier: 32 GB (2 TB max)
Network interface speed    1 Gb NIC
Hypervisor                 KVM/QCOW (QEMU Copy on Write) image for servers running CentOS, Ubuntu and RedHat operating systems
Confirm that the needed ports are open for communications.
You must open the following ports on your NetQ Platform:
Open your hypervisor and set up your VM. You can use this example for reference or use your own hypervisor instructions.
KVM Example Configuration
This example shows the VM setup process for a system with Libvirt and KVM/QEMU installed.
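The flow mirrors the KVM on-premises example earlier in this topic: copy the QCOW2 image, then create and start the VM. A representative direct-attach invocation with the cloud-server sizing above is sketched here; the VM name and disk path are placeholders:
$ sudo virt-install --name=netq_ts --memory=8192 --vcpus=4 --os-variant=generic \
      --disk path=/path/to/netq-4.3.0-opta.qcow2,format=qcow2,bus=virtio \
      --import --network=type=direct,source=eth0,model=virtio --noautoconsole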
Log in to the VM and change the password.
Use the default credentials to log in the first time:
Username: cumulus
Password: cumulus
$ ssh cumulus@<ipaddr>
Warning: Permanently added '<ipaddr>' (ECDSA) to the list of known hosts.
Ubuntu 18.04.5 LTS
cumulus@<ipaddr>'s password:
You are required to change your password immediately (root enforced)
System information as of Thu Dec 3 21:35:42 UTC 2020
System load: 0.09 Processes: 120
Usage of /: 8.1% of 61.86GB Users logged in: 0
Memory usage: 5% IP address for eth0: <ipaddr>
Swap usage: 0%
WARNING: Your password has expired.
You must change your password now and login again!
Changing password for cumulus.
(current) UNIX password: cumulus
Enter new UNIX password:
Retype new UNIX password:
passwd: password updated successfully
Connection to <ipaddr> closed.
Log in again with your new password.
$ ssh cumulus@<ipaddr>
Warning: Permanently added '<ipaddr>' (ECDSA) to the list of known hosts.
Ubuntu 18.04.5 LTS
cumulus@<ipaddr>'s password:
System information as of Thu Dec 3 21:35:59 UTC 2020
System load: 0.07 Processes: 121
Usage of /: 8.1% of 61.86GB Users logged in: 0
Memory usage: 5% IP address for eth0: <ipaddr>
Swap usage: 0%
Last login: Thu Dec 3 21:35:43 2020 from <local-ipaddr>
cumulus@ubuntu:~$
Verify the platform is ready for installation. Fix any errors indicated before installing the NetQ software.
cumulus@hostname:~$ sudo opta-check-cloud
Change the hostname for the VM from the default value.
The default hostname for the NetQ Virtual Machines is ubuntu. Change the hostname to fit your naming conventions while meeting Internet and Kubernetes naming standards.
Kubernetes requires that hostnames are composed of a sequence of labels concatenated with dots. For example, “en.wikipedia.org” is a hostname. Each label must be from 1 to 63 characters long. The entire hostname, including the delimiting dots, has a maximum of 253 ASCII characters.
The Internet standards (RFCs) for protocols specify that labels may contain only the ASCII letters a through z (in lower case), the digits 0 through 9, and the hyphen-minus character ('-').
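One common way to change the hostname is with hostnamectl (a sketch; NEW_HOSTNAME is the name you have chosen):
cumulus@hostname:~$ sudo hostnamectl set-hostname NEW_HOSTNAME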
Add the same NEW_HOSTNAME value to /etc/hosts on your VM for the localhost entry. Example:
127.0.0.1 localhost NEW_HOSTNAME
The final step is to install and activate the NetQ software using the CLI:
Run the following command on your NetQ Cloud Appliance with the config-key obtained from the email you received from NVIDIA titled NetQ Access Link. You can also obtain the configuration key through the NetQ UI in the premise management configuration.
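The exact command is not reproduced here. A sketch of its typical form, assembled from options documented elsewhere in this guide (the interface name, bundle path, and option ordering are assumptions; adapt them to your environment):
cumulus@hostname:~$ netq install opta standalone full interface eno1 bundle /mnt/installables/NetQ-4.3.0-opta.tgz config-key <your-config-key>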
You can specify the IP address instead of the interface name here: use ip-addr <IP address> in place of interface <ifname> above.
If you have changed the IP address or hostname of the NetQ OPTA after this step, you need to re-register this address with NetQ as follows:
Reset the VM:
cumulus@hostname:~$ netq bootstrap reset
Re-run the install CLI on the appliance. This example uses interface eno1. Replace this with your updated IP address, hostname or interface using the interface or ip-addr option.
If this step fails for any reason, you can run netq bootstrap reset and then try again.
Consider the following for container environments, and make adjustments as needed.
Flannel Virtual Networks
If you are using Flannel with a container environment on your network, you may need to change its default IP address ranges if they conflict with other addresses on your network. This can only be done one time during the first installation.
The address range is 10.244.0.0/16. NetQ overrides the original Flannel default, which is 10.1.0.0/16.
To change the default address range, use the install CLI with the pod-ip-range option. For example:
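A sketch of such a command for this cloud deployment (the base command mirrors the install sketch earlier in this section, and the range 10.255.0.0/16 is a hypothetical example value):
cumulus@hostname:~$ netq install opta standalone full interface eno1 bundle /mnt/installables/NetQ-4.3.0-opta.tgz config-key <your-config-key> pod-ip-range 10.255.0.0/16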
The default Docker bridge interface is disabled in NetQ. If you need to reenable the interface, contact support.
Verify Installation Status
To view the status of the installation, use the netq show status [verbose] command. The following example shows a successful on-premise installation:
State: Active
Version: 4.3.0
Installer Version: 4.3.0
Installation Type: Standalone
Activation Key: PKrgipMGEhVuZXRxLWVuZHBvaW50LWdhdGV3YXkYsagDIixUQmFLTUhzZU80RUdTL3pOT01uQ2lnRnrrUhTbXNPUGRXdnUwTVo5SEpBPTIHZGVmYXVsdDoHbmV0cWRldgz=
Master SSH Public Key: a3NoLXJzYSBBQUFBQjNOemFDMXljMkVBQUFBREFRQUJBQUFCQVFEazliekZDblJUajkvQVhOZ0hteXByTzZIb3Y2cVZBWFdsNVNtKzVrTXo3dmMrcFNZTGlOdWl1bEhZeUZZVDhSNmU3bFdqS3NrSE10bzArNFJsQVd6cnRvbVVzLzlLMzQ4M3pUMjVZQXpIU2N1ZVhBSE1TdTZHZ0JyUkpXYUpTNjJ2RTkzcHBDVjBxWWJvUFo3aGpCY3ozb0VVWnRsU1lqQlZVdjhsVjBNN3JEWW52TXNGSURWLzJ2eks3K0x2N01XTG5aT054S09hdWZKZnVOT0R4YjFLbk1mN0JWK3hURUpLWW1mbTY1ckoyS1ArOEtFUllrr5TkF3bFVRTUdmT3daVHF2RWNoZnpQajMwQ29CWDZZMzVST2hDNmhVVnN5OEkwdjVSV0tCbktrWk81MWlMSDAyZUpJbXJHUGdQa2s1SzhJdGRrQXZISVlTZ0RwRlpRb3Igcm9vdEBucXRzLTEwLTE4OC00NC0xNDc=
Is Cloud: False
Cluster Status:
IP Address Hostname Role Status
------------- ------------- ------ --------
10.188.44.147 10.188.44.147 Role Ready
NetQ... Active
Run the netq show opta-health command to verify all applications are operating properly. Allow 10-15 minutes for all applications to come up and report their status.
If any of the applications or services display Status as DOWN after 30 minutes, open a support ticket and attach the output of the opta-support command.
Set Up Your KVM Virtual Machine for an On-premises Server Cluster
First configure the VM on the master node, and then configure the VM on each worker node.
Follow these steps to set up and configure your VM on a cluster of servers in an on-premises deployment:
Verify that your master node meets the VM requirements.
When using a VM, the following system resources must be allocated:
Resource                   Minimum Requirement
Processor                  Eight (8) virtual CPUs
Memory                     64 GB RAM
Local disk storage         256 GB (2 TB max) SSD with minimum disk IOPS of 1000 for a standard 4kb block size (Note: This must be an SSD; use of other storage options can lead to system instability and is not supported.)
Network interface speed    1 Gb NIC
Hypervisor                 KVM/QCOW (QEMU Copy on Write) image for servers running CentOS, Ubuntu and RedHat operating systems
Confirm that the needed ports are open for communications.
You must open the following ports on your NetQ Platforms:
Port or Protocol Number    Protocol       Component Access
8443                       TCP            Admin UI
443                        TCP            NetQ UI
31980                      TCP            NetQ Agent communication
31982                      TCP            NetQ Agent SSL communication
32708                      TCP            API Gateway
22                         TCP            SSH
179                        TCP            Calico networking (BGP)
4                          IP Protocol    Calico networking (IP-in-IP Protocol)
4789                       UDP            Calico networking (VxLAN)
6443                       TCP            kube-apiserver
2379                       TCP            etcd datastore
Additionally, for internal cluster communication, you must open these ports:
Copy the QCOW2 image to a directory where you want to run it.
Tip: Copy the original QCOW2 image rather than moving it, so you do not need to download it again if you have to repeat this process.
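Create the VM. The original example command is not reproduced here; the following virt-install sketch assumes the VM name netq_ts (used with virsh console below), the sizing from the table above, and a placeholder image path, and attaches the VM directly to the external network through eth0:
$ virt-install --name=netq_ts --vcpus=8 --memory=65536 \
      --disk path=<path-to-qcow2-image>,format=qcow2,bus=virtio \
      --network=type=direct,source=eth0,model=virtio \
      --import --os-variant=generic --noautoconsole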
Replace the disk path value with the location where the QCOW2 image is to reside. Replace network model value (eth0 in the above example) with the name of the interface where the VM is connected to the external network.
Or, for a bridged VM, where the VM attaches to a bridge that has already been set up to allow for external access:
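A corresponding sketch for the bridged case, with the same assumptions:
$ virt-install --name=netq_ts --vcpus=8 --memory=65536 \
      --disk path=<path-to-qcow2-image>,format=qcow2,bus=virtio \
      --network=bridge=br0,model=virtio \
      --import --os-variant=generic --noautoconsole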
Replace network bridge value (br0 in the above example) with the name of the (pre-existing) bridge interface where the VM is connected to the external network.
Make note of the name used during install as this is needed in a later step.
Watch the boot process in another terminal window.
$ virsh console netq_ts
Log in to the VM and change the password.
Use the default credentials to log in the first time:
Username: cumulus
Password: cumulus
$ ssh cumulus@<ipaddr>
Warning: Permanently added '<ipaddr>' (ECDSA) to the list of known hosts.
Ubuntu 18.04.5 LTS
cumulus@<ipaddr>'s password:
You are required to change your password immediately (root enforced)
System information as of Thu Dec 3 21:35:42 UTC 2020
System load: 0.09 Processes: 120
Usage of /: 8.1% of 61.86GB Users logged in: 0
Memory usage: 5% IP address for eth0: <ipaddr>
Swap usage: 0%
WARNING: Your password has expired.
You must change your password now and login again!
Changing password for cumulus.
(current) UNIX password: cumulus
Enter new UNIX password:
Retype new UNIX password:
passwd: password updated successfully
Connection to <ipaddr> closed.
Log in again with your new password.
$ ssh cumulus@<ipaddr>
Warning: Permanently added '<ipaddr>' (ECDSA) to the list of known hosts.
Ubuntu 18.04.5 LTS
cumulus@<ipaddr>'s password:
System information as of Thu Dec 3 21:35:59 UTC 2020
System load: 0.07 Processes: 121
Usage of /: 8.1% of 61.86GB Users logged in: 0
Memory usage: 5% IP address for eth0: <ipaddr>
Swap usage: 0%
Last login: Thu Dec 3 21:35:43 2020 from <local-ipaddr>
cumulus@ubuntu:~$
Verify the master node is ready for installation. Fix any errors indicated before installing the NetQ software.
cumulus@hostname:~$ sudo opta-check
Change the hostname for the VM from the default value.
The default hostname for the NetQ Virtual Machines is ubuntu. Change the hostname to fit your naming conventions while meeting Internet and Kubernetes naming standards.
Kubernetes requires that hostnames are composed of a sequence of labels concatenated with dots. For example, “en.wikipedia.org” is a hostname. Each label must be from 1 to 63 characters long. The entire hostname, including the delimiting dots, has a maximum of 253 ASCII characters.
The Internet standards (RFCs) for protocols specify that labels may contain only the ASCII letters a through z (in lower case), the digits 0 through 9, and the hyphen-minus character ('-').
Add the same NEW_HOSTNAME value to /etc/hosts on your VM for the localhost entry. Example:
127.0.0.1 localhost NEW_HOSTNAME
Verify that your first worker node meets the VM requirements, as described in Step 1.
Confirm that the needed ports are open for communications, as described in Step 2.
Open your hypervisor and set up the VM in the same manner as for the master node.
Make a note of the private IP address you assign to the worker node. You need it for later installation steps.
Verify the worker node is ready for installation. Fix any errors indicated before installing the NetQ software.
cumulus@hostname:~$ sudo opta-check
Repeat Steps 8 through 11 for each additional worker node you want in your cluster.
The final step is to install and activate the NetQ software using the CLI:
Run the following command on your master node to initialize the cluster. Copy the output of the command to use on your worker nodes:
cumulus@<hostname>:~$ netq install cluster master-init
Please run the following command on all worker nodes:
netq install cluster worker-init c3NoLXJzYSBBQUFBQjNOemFDMXljMkVBQUFBREFRQUJBQUFCQVFDM2NjTTZPdVVUWWJ5c2Q3NlJ4SHdseHBsOHQ4N2VMRWVGR05LSWFWVnVNcy94OEE4RFNMQVhKOHVKRjVLUXBnVjdKM2lnMGJpL2hDMVhmSVVjU3l3ZmhvVDVZM3dQN1oySVZVT29ZTi8vR1lOek5nVlNocWZQMDNDRW0xNnNmSzVvUWRQTzQzRFhxQ3NjbndIT3dwZmhRYy9MWTU1a
Run the netq install cluster worker-init <ssh-key> command on each of your worker nodes.
Run the following commands on your master node, using the IP addresses of your worker nodes:
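A sketch of the cluster installation command (the workers keyword and the option ordering are assumptions; substitute your own interface, bundle path, and worker IP addresses):
cumulus@<hostname>:~$ netq install cluster full interface eno1 bundle /mnt/installables/NetQ-4.3.0.tgz workers <worker-1-ip> <worker-2-ip>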
Re-run the install CLI on the appliance. This example uses interface eno1. Replace this with your updated IP address, hostname or interface using the interface or ip-addr option.
cumulus@hostname:~$ netq install standalone full interface eno1 bundle /mnt/installables/NetQ-4.3.0.tgz
If this step fails for any reason, you can run netq bootstrap reset and then try again.
Verify Installation Status
To view the status of the installation, use the netq show status [verbose] command. The following example shows a successful on-premise installation:
State: Active
Version: 4.3.0
Installer Version: 4.3.0
Installation Type: Standalone
Activation Key: PKrgipMGEhVuZXRxLWVuZHBvaW50LWdhdGV3YXkYsagDIixUQmFLTUhzZU80RUdTL3pOT01uQ2lnRnrrUhTbXNPUGRXdnUwTVo5SEpBPTIHZGVmYXVsdDoHbmV0cWRldgz=
Master SSH Public Key: a3NoLXJzYSBBQUFBQjNOemFDMXljMkVBQUFBREFRQUJBQUFCQVFEazliekZDblJUajkvQVhOZ0hteXByTzZIb3Y2cVZBWFdsNVNtKzVrTXo3dmMrcFNZTGlOdWl1bEhZeUZZVDhSNmU3bFdqS3NrSE10bzArNFJsQVd6cnRvbVVzLzlLMzQ4M3pUMjVZQXpIU2N1ZVhBSE1TdTZHZ0JyUkpXYUpTNjJ2RTkzcHBDVjBxWWJvUFo3aGpCY3ozb0VVWnRsU1lqQlZVdjhsVjBNN3JEWW52TXNGSURWLzJ2eks3K0x2N01XTG5aT054S09hdWZKZnVOT0R4YjFLbk1mN0JWK3hURUpLWW1mbTY1ckoyS1ArOEtFUllrr5TkF3bFVRTUdmT3daVHF2RWNoZnpQajMwQ29CWDZZMzVST2hDNmhVVnN5OEkwdjVSV0tCbktrWk81MWlMSDAyZUpJbXJHUGdQa2s1SzhJdGRrQXZISVlTZ0RwRlpRb3Igcm9vdEBucXRzLTEwLTE4OC00NC0xNDc=
Is Cloud: False
Cluster Status:
IP Address Hostname Role Status
------------- ------------- ------ --------
10.188.44.147 10.188.44.147 Role Ready
NetQ... Active
Run the netq show opta-health command to verify all applications are operating properly. Allow 10-15 minutes for all applications to come up and report their status.
If any of the applications or services display Status as DOWN after 30 minutes, open a support ticket and attach the output of the opta-support command.
Set Up Your KVM Virtual Machine for a Cloud Server Cluster
First configure the VM on the master node, and then configure the VM on each worker node.
Follow these steps to set up and configure your VM on a cluster of servers in a cloud deployment:
Verify that your master node meets the VM requirements.
When using a VM, the following system resources must be allocated:
Resource                   Minimum Requirement
Processor                  Four (4) virtual CPUs
Memory                     8 GB RAM
Local disk storage         For NetQ 3.2.x and later: 64 GB (2 TB max); for NetQ 3.1 and earlier: 32 GB (2 TB max)
Network interface speed    1 Gb NIC
Hypervisor                 KVM/QCOW (QEMU Copy on Write) image for servers running CentOS, Ubuntu and RedHat operating systems
Confirm that the needed ports are open for communications.
You must open the following ports on your NetQ Platforms:
Port or Protocol Number    Protocol       Component Access
8443                       TCP            Admin UI
443                        TCP            NetQ UI
31980                      TCP            NetQ Agent communication
31982                      TCP            NetQ Agent SSL communication
32708                      TCP            API Gateway
22                         TCP            SSH
179                        TCP            Calico networking (BGP)
4                          IP Protocol    Calico networking (IP-in-IP Protocol)
4789                       UDP            Calico networking (VxLAN)
6443                       TCP            kube-apiserver
2379                       TCP            etcd datastore
Additionally, for internal cluster communication, you must open these ports:
Open your hypervisor and set up your VM. You can use this example for reference or use your own hypervisor instructions.
KVM Example Configuration
This example shows the VM setup process for a system with Libvirt and KVM/QEMU installed.
Log in to the VM and change the password.
Use the default credentials to log in the first time:
Username: cumulus
Password: cumulus
$ ssh cumulus@<ipaddr>
Warning: Permanently added '<ipaddr>' (ECDSA) to the list of known hosts.
Ubuntu 18.04.5 LTS
cumulus@<ipaddr>'s password:
You are required to change your password immediately (root enforced)
System information as of Thu Dec 3 21:35:42 UTC 2020
System load: 0.09 Processes: 120
Usage of /: 8.1% of 61.86GB Users logged in: 0
Memory usage: 5% IP address for eth0: <ipaddr>
Swap usage: 0%
WARNING: Your password has expired.
You must change your password now and login again!
Changing password for cumulus.
(current) UNIX password: cumulus
Enter new UNIX password:
Retype new UNIX password:
passwd: password updated successfully
Connection to <ipaddr> closed.
Log in again with your new password.
$ ssh cumulus@<ipaddr>
Warning: Permanently added '<ipaddr>' (ECDSA) to the list of known hosts.
Ubuntu 18.04.5 LTS
cumulus@<ipaddr>'s password:
System information as of Thu Dec 3 21:35:59 UTC 2020
System load: 0.07 Processes: 121
Usage of /: 8.1% of 61.86GB Users logged in: 0
Memory usage: 5% IP address for eth0: <ipaddr>
Swap usage: 0%
Last login: Thu Dec 3 21:35:43 2020 from <local-ipaddr>
cumulus@ubuntu:~$
Verify the master node is ready for installation. Fix any errors indicated before installing the NetQ software.
cumulus@hostname:~$ sudo opta-check-cloud
Change the hostname for the VM from the default value.
The default hostname for the NetQ Virtual Machines is ubuntu. Change the hostname to fit your naming conventions while meeting Internet and Kubernetes naming standards.
Kubernetes requires that hostnames are composed of a sequence of labels concatenated with dots. For example, “en.wikipedia.org” is a hostname. Each label must be from 1 to 63 characters long. The entire hostname, including the delimiting dots, has a maximum of 253 ASCII characters.
The Internet standards (RFCs) for protocols specify that labels may contain only the ASCII letters a through z (in lower case), the digits 0 through 9, and the hyphen-minus character ('-').
Add the same NEW_HOSTNAME value to /etc/hosts on your VM for the localhost entry. Example:
127.0.0.1 localhost NEW_HOSTNAME
Verify that your first worker node meets the VM requirements, as described in Step 1.
Confirm that the needed ports are open for communications, as described in Step 2.
Open your hypervisor and set up the VM in the same manner as for the master node.
Make a note of the private IP address you assign to the worker node. You need it for later installation steps.
Verify the worker node is ready for installation. Fix any errors indicated before installing the NetQ software.
cumulus@hostname:~$ sudo opta-check-cloud
Repeat Steps 8 through 11 for each additional worker node you want in your cluster.
The final step is to install and activate the NetQ software using the CLI:
Run the following command on your master node to initialize the cluster. Copy the output of the command to use on your worker nodes:
cumulus@<hostname>:~$ netq install cluster master-init
Please run the following command on all worker nodes:
netq install cluster worker-init c3NoLXJzYSBBQUFBQjNOemFDMXljMkVBQUFBREFRQUJBQUFCQVFDM2NjTTZPdVVUWWJ5c2Q3NlJ4SHdseHBsOHQ4N2VMRWVGR05LSWFWVnVNcy94OEE4RFNMQVhKOHVKRjVLUXBnVjdKM2lnMGJpL2hDMVhmSVVjU3l3ZmhvVDVZM3dQN1oySVZVT29ZTi8vR1lOek5nVlNocWZQMDNDRW0xNnNmSzVvUWRQTzQzRFhxQ3NjbndIT3dwZmhRYy9MWTU1a
Run the netq install cluster worker-init <ssh-key> command on each of your worker nodes.
Run the following command on your NetQ Cloud Appliance with the config-key obtained from the email you received from NVIDIA titled NetQ Access Link. You can also obtain the configuration key through the NetQ UI in the premise management configuration.
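A sketch of this command for a cloud cluster (the option names and ordering are assumptions; substitute your own interface, bundle path, config-key, and worker IP addresses):
cumulus@<hostname>:~$ netq install opta cluster full interface eth0 bundle /mnt/installables/NetQ-4.3.0-opta.tgz config-key <your-config-key> workers <worker-1-ip> <worker-2-ip>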
You can specify the IP address instead of the interface name here: use ip-addr <IP address> in place of interface <ifname> above.
If you have changed the IP address or hostname of the NetQ OPTA after this step, you need to re-register this address with NetQ as follows:
Reset the VM:
cumulus@hostname:~$ netq bootstrap reset
Re-run the install CLI on the appliance. This example uses interface eth0. Replace this with your updated IP address, hostname or interface using the interface or ip-addr option.
If this step fails for any reason, you can run netq bootstrap reset and then try again.
Consider the following for container environments, and make adjustments as needed.
Flannel Virtual Networks
If you are using Flannel with a container environment on your network, you may need to change its default IP address ranges if they conflict with other addresses on your network. This can only be done one time during the first installation.
The address range is 10.244.0.0/16. NetQ overrides the original Flannel default, which is 10.1.0.0/16.
To change the default address range, use the install CLI with the pod-ip-range option. For example:
The default Docker bridge interface is disabled in NetQ. If you need to reenable the interface, contact support.
Verify Installation Status
To view the status of the installation, use the netq show status [verbose] command. The following example shows a successful on-premise installation:
State: Active
Version: 4.3.0
Installer Version: 4.3.0
Installation Type: Standalone
Activation Key: PKrgipMGEhVuZXRxLWVuZHBvaW50LWdhdGV3YXkYsagDIixUQmFLTUhzZU80RUdTL3pOT01uQ2lnRnrrUhTbXNPUGRXdnUwTVo5SEpBPTIHZGVmYXVsdDoHbmV0cWRldgz=
Master SSH Public Key: a3NoLXJzYSBBQUFBQjNOemFDMXljMkVBQUFBREFRQUJBQUFCQVFEazliekZDblJUajkvQVhOZ0hteXByTzZIb3Y2cVZBWFdsNVNtKzVrTXo3dmMrcFNZTGlOdWl1bEhZeUZZVDhSNmU3bFdqS3NrSE10bzArNFJsQVd6cnRvbVVzLzlLMzQ4M3pUMjVZQXpIU2N1ZVhBSE1TdTZHZ0JyUkpXYUpTNjJ2RTkzcHBDVjBxWWJvUFo3aGpCY3ozb0VVWnRsU1lqQlZVdjhsVjBNN3JEWW52TXNGSURWLzJ2eks3K0x2N01XTG5aT054S09hdWZKZnVOT0R4YjFLbk1mN0JWK3hURUpLWW1mbTY1ckoyS1ArOEtFUllrr5TkF3bFVRTUdmT3daVHF2RWNoZnpQajMwQ29CWDZZMzVST2hDNmhVVnN5OEkwdjVSV0tCbktrWk81MWlMSDAyZUpJbXJHUGdQa2s1SzhJdGRrQXZISVlTZ0RwRlpRb3Igcm9vdEBucXRzLTEwLTE4OC00NC0xNDc=
Is Cloud: False
Cluster Status:
IP Address Hostname Role Status
------------- ------------- ------ --------
10.188.44.147 10.188.44.147 Role Ready
NetQ... Active
Run the netq show opta-health command to verify all applications are operating properly. Allow 10-15 minutes for all applications to come up and report their status.
If any of the applications or services display Status as DOWN after 30 minutes, open a support ticket and attach the output of the opta-support command.
Install the NetQ On-premises Appliance
This topic describes how to prepare your single, NetQ On-premises Appliance for installation of the NetQ Platform software.
Each system shipped to you contains:
Your NVIDIA NetQ On-premises Appliance (a Supermicro 6019P-WTR server)
Hardware accessories, such as power cables and rack mounting gear (note that network cables and optics ship separately)
Information regarding your order
For more detail about hardware specifications (including LED layouts and FRUs like the power supply or fans, and accessories like included cables) or safety and environmental information, refer to the user manual and quick reference guide.
Install the Appliance
After you unbox the appliance:
Mount the appliance in the rack.
Connect it to power following the procedures described in your appliance's user manual.
Connect the Ethernet cable to the 1G management port (eno1).
Power on the appliance.
If your network runs DHCP, you can configure NetQ over the network. If DHCP is not enabled, then you configure the appliance using the console cable provided.
Configure the Password, Hostname and IP Address
Change the password and specify the hostname and IP address for the appliance before installing the NetQ software.
Log in to the appliance using the default login credentials:
Username: cumulus
Password: cumulus
Change the password using the passwd command:
cumulus@hostname:~$ passwd
Changing password for cumulus.
(current) UNIX password: cumulus
Enter new UNIX password:
Retype new UNIX password:
passwd: password updated successfully
The default hostname for the NetQ On-premises Appliance is netq-appliance. Change the hostname to fit your naming conventions while meeting Internet and Kubernetes naming standards.
Kubernetes requires that hostnames comprise a sequence of labels concatenated with dots. For example, en.wikipedia.org is a hostname. Each label must be from 1 to 63 characters long. The entire hostname, including the delimiting dots, has a maximum of 253 ASCII characters.
The Internet standards (RFCs) for protocols specify that labels can contain only the ASCII letters a through z (in lower case), the digits 0 through 9, and the hyphen-minus character ('-').
The appliance contains two Ethernet ports. It uses port eno1 for out-of-band management. This is where NetQ Agents should send the telemetry data collected from your monitored switches and hosts. By default, eno1 uses DHCPv4 to get its IP address. You can view the assigned IP address using the following command:
cumulus@hostname:~$ ip -4 -brief addr show eno1
eno1 UP 10.20.16.248/24
Alternately, you can configure the interface with a static IP address by editing the /etc/netplan/01-ethernet.yaml Ubuntu Netplan configuration file.
For example, to set your network interface eno1 to a static IP address of 192.168.1.222 with gateway 192.168.1.1 and DNS server as 8.8.8.8 and 8.8.4.4:
# This file describes the network interfaces available on your system
# For more information, see netplan(5).
network:
    version: 2
    renderer: networkd
    ethernets:
        eno1:
            dhcp4: no
            addresses: [192.168.1.222/24]
            gateway4: 192.168.1.1
            nameservers:
                addresses: [8.8.8.8,8.8.4.4]
Apply the settings.
cumulus@hostname:~$ sudo netplan apply
Verify NetQ Software and Appliance Readiness
Now that the appliance is up and running, verify that the software is available and the appliance is ready for installation.
Verify that the needed packages are present and of the correct release, version 4.3 and update 38.
cumulus@hostname:~$ dpkg -l | grep netq
ii netq-agent 4.3.0-ub18.04u39~1659297239.34aa65d_amd64 Cumulus NetQ Telemetry Agent for Ubuntu
ii netq-apps 4.3.0-ub18.04u39~1659297239.34aa65d_amd64 Cumulus NetQ Fabric Validation Application for Ubuntu
Verify the installation images are present and of the correct release, version 4.3.
cumulus@hostname:~$ cd /mnt/installables/
cumulus@hostname:/mnt/installables$ ls
NetQ-4.3.0.tgz
Verify the appliance is ready for installation. Fix any errors indicated before installing the NetQ software.
cumulus@hostname:~$ sudo opta-check
The final step is to install and activate the NetQ software using the CLI:
Run the following command on your NetQ platform server or NetQ Appliance:
cumulus@hostname:~$ netq install standalone full interface eth0 bundle /mnt/installables/NetQ-4.3.0.tgz
You can specify the IP address instead of the interface name here: use ip-addr <IP address> in place of interface <ifname> above.
If you have changed the IP address or hostname of the NetQ On-premises VM after this step, you need to re-register this address with NetQ as follows:
Reset the VM, indicating whether you want to purge any NetQ DB data or keep it.
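For example (a sketch; the purge-db and keep-db keywords are assumptions about how the reset command expresses that choice):
cumulus@hostname:~$ netq bootstrap reset purge-db
cumulus@hostname:~$ netq bootstrap reset keep-db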
Re-run the install CLI on the appliance. This example uses interface eno1. Replace this with your updated IP address, hostname or interface using the interface or ip-addr option.
cumulus@hostname:~$ netq install standalone full interface eno1 bundle /mnt/installables/NetQ-4.3.0.tgz
If this step fails for any reason, you can run netq bootstrap reset and then try again.
Verify Installation Status
To view the status of the installation, use the netq show status [verbose] command. The following example shows a successful on-premise installation:
State: Active
Version: 4.3.0
Installer Version: 4.3.0
Installation Type: Standalone
Activation Key: PKrgipMGEhVuZXRxLWVuZHBvaW50LWdhdGV3YXkYsagDIixUQmFLTUhzZU80RUdTL3pOT01uQ2lnRnrrUhTbXNPUGRXdnUwTVo5SEpBPTIHZGVmYXVsdDoHbmV0cWRldgz=
Master SSH Public Key: a3NoLXJzYSBBQUFBQjNOemFDMXljMkVBQUFBREFRQUJBQUFCQVFEazliekZDblJUajkvQVhOZ0hteXByTzZIb3Y2cVZBWFdsNVNtKzVrTXo3dmMrcFNZTGlOdWl1bEhZeUZZVDhSNmU3bFdqS3NrSE10bzArNFJsQVd6cnRvbVVzLzlLMzQ4M3pUMjVZQXpIU2N1ZVhBSE1TdTZHZ0JyUkpXYUpTNjJ2RTkzcHBDVjBxWWJvUFo3aGpCY3ozb0VVWnRsU1lqQlZVdjhsVjBNN3JEWW52TXNGSURWLzJ2eks3K0x2N01XTG5aT054S09hdWZKZnVOT0R4YjFLbk1mN0JWK3hURUpLWW1mbTY1ckoyS1ArOEtFUllrr5TkF3bFVRTUdmT3daVHF2RWNoZnpQajMwQ29CWDZZMzVST2hDNmhVVnN5OEkwdjVSV0tCbktrWk81MWlMSDAyZUpJbXJHUGdQa2s1SzhJdGRrQXZISVlTZ0RwRlpRb3Igcm9vdEBucXRzLTEwLTE4OC00NC0xNDc=
Is Cloud: False
Cluster Status:
IP Address Hostname Role Status
------------- ------------- ------ --------
10.188.44.147 10.188.44.147 Role Ready
NetQ... Active
Run the netq show opta-health command to verify all applications are operating properly. Allow 10-15 minutes for all applications to come up and report their status.
If any of the applications or services display Status as DOWN after 30 minutes, open a support ticket and attach the output of the opta-support command.
Install the NetQ Cloud Appliance
This topic describes how to prepare your single, NetQ Cloud Appliance for installation of the NetQ Collector software.
Each system shipped to you contains:
Your NVIDIA NetQ Cloud Appliance (a Supermicro SuperServer E300-9D)
Hardware accessories, such as power cables and rack mounting gear (note that network cables and optics ship separately)
Information regarding your order
For more detail about hardware specifications (including LED layouts and FRUs like the power supply or fans, and accessories like included cables) or safety and environmental information, refer to the appliance's user manual.
Install the Appliance
After you unbox the appliance:
Mount the appliance in the rack.
Connect it to power following the procedures described in your appliance's user manual.
Connect the Ethernet cable to the 1G management port (eno1).
Power on the appliance.
If your network runs DHCP, you can configure NetQ over the network. If DHCP is not enabled, then you configure the appliance using the console cable provided.
Configure the Password, Hostname and IP Address
Log in to the appliance using the default login credentials:
Username: cumulus
Password: cumulus
Change the password using the passwd command:
cumulus@hostname:~$ passwd
Changing password for cumulus.
(current) UNIX password: cumulus
Enter new UNIX password:
Retype new UNIX password:
passwd: password updated successfully
The default hostname for the NetQ Cloud Appliance is netq-appliance. Change the hostname to fit your naming conventions while meeting Internet and Kubernetes naming standards.
Kubernetes requires that hostnames comprise a sequence of labels concatenated with dots. For example, en.wikipedia.org is a hostname. Each label must be from 1 to 63 characters long. The entire hostname, including the delimiting dots, has a maximum of 253 ASCII characters.
The Internet standards (RFCs) for protocols specify that labels can contain only the ASCII letters a through z (in lower case), the digits 0 through 9, and the hyphen-minus character ('-').
The appliance contains two Ethernet ports. It uses port eno1 for out-of-band management. This is where NetQ Agents should send the telemetry data collected from your monitored switches and hosts. By default, eno1 uses DHCPv4 to get its IP address. You can view the assigned IP address using the following command:
cumulus@hostname:~$ ip -4 -brief addr show eno1
eno1 UP 10.20.16.248/24
Alternately, you can configure the interface with a static IP address by editing the /etc/netplan/01-ethernet.yaml Ubuntu Netplan configuration file.
For example, to set your network interface eno1 to a static IP address of 192.168.1.222 with gateway 192.168.1.1 and DNS server as 8.8.8.8 and 8.8.4.4:
# This file describes the network interfaces available on your system
# For more information, see netplan(5).
network:
    version: 2
    renderer: networkd
    ethernets:
        eno1:
            dhcp4: no
            addresses: [192.168.1.222/24]
            gateway4: 192.168.1.1
            nameservers:
                addresses: [8.8.8.8,8.8.4.4]
Apply the settings.
cumulus@hostname:~$ sudo netplan apply
Verify NetQ Software and Appliance Readiness
Now that the appliance is up and running, verify that the software is available and the appliance is ready for installation.
Verify that the required packages are present and reflect the most current version.
cumulus@hostname:~$ dpkg -l | grep netq
ii netq-agent 4.3.0-ub18.04u39~1659297239.34aa65d_amd64 Cumulus NetQ Telemetry Agent for Ubuntu
ii netq-apps 4.3.0-ub18.04u39~1659297239.34aa65d_amd64 Cumulus NetQ Fabric Validation Application for Ubuntu
Verify the installation images are present and reflect the most current version.
cumulus@hostname:~$ cd /mnt/installables/
cumulus@hostname:/mnt/installables$ ls
NetQ-4.3.0-opta.tgz
Verify the appliance is ready for installation. Fix any errors before installing the NetQ software.
cumulus@hostname:~$ sudo opta-check-cloud
Install and activate the NetQ software using the CLI:
Run the following command on your NetQ Cloud Appliance with the config-key obtained from the email you received from NVIDIA titled NetQ Access Link. You can also obtain the configuration key through the NetQ UI in the premise management configuration.
You can specify the IP address instead of the interface name here: use ip-addr <IP address> in place of interface <ifname> above.
If you have changed the IP address or hostname of the NetQ OPTA after this step, you need to re-register this address with NetQ as follows:
Reset the VM:
cumulus@hostname:~$ netq bootstrap reset
Re-run the install CLI on the appliance. This example uses interface eno1. Replace this with your updated IP address, hostname or interface using the interface or ip-addr option.
If this step fails for any reason, you can run netq bootstrap reset and then try again.
Consider the following for container environments, and make adjustments as needed.
Flannel Virtual Networks
If you are using Flannel with a container environment on your network, you may need to change its default IP address ranges if they conflict with other addresses on your network. This can only be done one time during the first installation.
The address range is 10.244.0.0/16. NetQ overrides the original Flannel default, which is 10.1.0.0/16.
To change the default address range, use the install CLI with the pod-ip-range option. For example:
The default Docker bridge interface is disabled in NetQ. If you need to reenable the interface, contact support.
Verify Installation Status
To view the status of the installation, use the netq show status [verbose] command. The following example shows a successful on-premise installation:
State: Active
Version: 4.3.0
Installer Version: 4.3.0
Installation Type: Standalone
Activation Key: PKrgipMGEhVuZXRxLWVuZHBvaW50LWdhdGV3YXkYsagDIixUQmFLTUhzZU80RUdTL3pOT01uQ2lnRnrrUhTbXNPUGRXdnUwTVo5SEpBPTIHZGVmYXVsdDoHbmV0cWRldgz=
Master SSH Public Key: a3NoLXJzYSBBQUFBQjNOemFDMXljMkVBQUFBREFRQUJBQUFCQVFEazliekZDblJUajkvQVhOZ0hteXByTzZIb3Y2cVZBWFdsNVNtKzVrTXo3dmMrcFNZTGlOdWl1bEhZeUZZVDhSNmU3bFdqS3NrSE10bzArNFJsQVd6cnRvbVVzLzlLMzQ4M3pUMjVZQXpIU2N1ZVhBSE1TdTZHZ0JyUkpXYUpTNjJ2RTkzcHBDVjBxWWJvUFo3aGpCY3ozb0VVWnRsU1lqQlZVdjhsVjBNN3JEWW52TXNGSURWLzJ2eks3K0x2N01XTG5aT054S09hdWZKZnVOT0R4YjFLbk1mN0JWK3hURUpLWW1mbTY1ckoyS1ArOEtFUllrr5TkF3bFVRTUdmT3daVHF2RWNoZnpQajMwQ29CWDZZMzVST2hDNmhVVnN5OEkwdjVSV0tCbktrWk81MWlMSDAyZUpJbXJHUGdQa2s1SzhJdGRrQXZISVlTZ0RwRlpRb3Igcm9vdEBucXRzLTEwLTE4OC00NC0xNDc=
Is Cloud: False
Cluster Status:
IP Address Hostname Role Status
------------- ------------- ------ --------
10.188.44.147 10.188.44.147 Role Ready
NetQ... Active
Run the netq show opta-health command to verify all applications are operating properly. Allow 10-15 minutes for all applications to come up and report their status.
If any of the applications or services display Status as DOWN after 30 minutes, open a support ticket and attach the output of the opta-support command.
Install a NetQ On-premises Appliance Cluster
This topic describes how to prepare your cluster of NetQ On-premises Appliances for installation of the NetQ Platform software.
Each system shipped to you contains:
Your NVIDIA NetQ On-premises Appliance (a Supermicro 6019P-WTR server)
Hardware accessories, such as power cables and rack mounting gear (note that network cables and optics ship separately)
Information regarding your order
For more detail about hardware specifications (including LED layouts and FRUs like the power supply or fans, and accessories like included cables) or safety and environmental information, refer to the user manual and quick reference guide.
Install Each Appliance
After you unbox the appliance:
Mount the appliance in the rack.
Connect it to power following the procedures described in your appliance's user manual.
Connect the Ethernet cable to the 1G management port (eno1).
Power on the appliance.
If your network runs DHCP, you can configure NetQ over the network. If DHCP is not enabled, then you configure the appliance using the console cable provided.
Configure the Password, Hostname and IP Address
Change the password and specify the hostname and IP address for each appliance before installing the NetQ software.
Log in to the appliance that you intend to use as your master node using the default login credentials:
Username: cumulus
Password: cumulus
Change the password using the passwd command:
cumulus@hostname:~$ passwd
Changing password for cumulus.
(current) UNIX password: cumulus
Enter new UNIX password:
Retype new UNIX password:
passwd: password updated successfully
The default hostname for the NetQ On-premises Appliance is netq-appliance. Change the hostname to fit your naming conventions while meeting Internet and Kubernetes naming standards.
Kubernetes requires that hostnames comprise a sequence of labels concatenated with dots. For example, “en.wikipedia.org” is a hostname. Each label must be from 1 to 63 characters long. The entire hostname, including the delimiting dots, has a maximum of 253 ASCII characters.
The Internet standards (RFCs) for protocols specify that labels can contain only the ASCII letters a through z (in lower case), the digits 0 through 9, and the hyphen-minus character ('-').
The appliance contains two Ethernet ports. It uses port eno1 for out-of-band management. This is where NetQ Agents should send the telemetry data collected from your monitored switches and hosts. By default, eno1 uses DHCPv4 to get its IP address. You can view the assigned IP address using the following command:
cumulus@hostname:~$ ip -4 -brief addr show eno1
eno1 UP 10.20.16.248/24
Alternately, you can configure the interface with a static IP address by editing the /etc/netplan/01-ethernet.yaml Ubuntu Netplan configuration file.
For example, to set your network interface eno1 to a static IP address of 192.168.1.222 with gateway 192.168.1.1 and DNS server as 8.8.8.8 and 8.8.4.4:
# This file describes the network interfaces available on your system
# For more information, see netplan(5).
network:
    version: 2
    renderer: networkd
    ethernets:
        eno1:
            dhcp4: no
            addresses: [192.168.1.222/24]
            gateway4: 192.168.1.1
            nameservers:
                addresses: [8.8.8.8,8.8.4.4]
Apply the settings.
cumulus@hostname:~$ sudo netplan apply
Repeat these steps for each of the worker node appliances.
Verify NetQ Software and Appliance Readiness
Now that the appliances are up and running, verify that the software is available and the appliance is ready for installation.
On the master node, verify that the needed packages are present and of the correct release, version 4.3.
cumulus@hostname:~$ dpkg -l | grep netq
ii netq-agent 4.3.0-ub18.04u39~1659297239.34aa65d_amd64 Cumulus NetQ Telemetry Agent for Ubuntu
ii netq-apps 4.3.0-ub18.04u39~1659297239.34aa65d_amd64 Cumulus NetQ Fabric Validation Application for Ubuntu
Verify the installation images are present and of the correct release, version 4.3.
cumulus@hostname:~$ cd /mnt/installables/
cumulus@hostname:/mnt/installables$ ls
NetQ-4.3.0.tgz
Verify the master node is ready for installation. Fix any errors indicated before installing the NetQ software.
cumulus@hostname:~$ sudo opta-check
On one of your worker nodes, verify that the needed packages are present and of the correct release, version 4.3 and update 38 or later.
cumulus@hostname:~$ dpkg -l | grep netq
ii netq-agent 4.3.0-ub18.04u39~1659297239.34aa65d_amd64 Cumulus NetQ Telemetry Agent for Ubuntu
ii netq-apps 4.3.0-ub18.04u39~1659297239.34aa65d_amd64 Cumulus NetQ Fabric Validation Application for Ubuntu
Make a note of the private IP addresses you assign to the master and worker nodes. You need them for later installation steps.
Verify that the needed packages are present and of the correct release, version 4.3 and update 38.
cumulus@hostname:~$ dpkg -l | grep netq
ii netq-agent 4.3.0-ub18.04u39~1659297239.34aa65d_amd64 Cumulus NetQ Telemetry Agent for Ubuntu
ii netq-apps 4.3.0-ub18.04u39~1659297239.34aa65d_amd64 Cumulus NetQ Fabric Validation Application for Ubuntu
Verify that the needed files are present and of the correct release.
cumulus@hostname:~$ cd /mnt/installables/
cumulus@hostname:/mnt/installables$ ls
NetQ-4.3.0.tgz
Verify the appliance is ready for installation. Fix any errors indicated before installing the NetQ software.
cumulus@hostname:~$ sudo opta-check
Repeat Steps 4-9 for each additional worker node (NetQ On-premises Appliance).
The final step is to install and activate the NetQ software using the CLI:
Run the following command on your master node to initialize the cluster. Copy the output of the command to use on your worker nodes:
cumulus@<hostname>:~$ netq install cluster master-init
Please run the following command on all worker nodes:
netq install cluster worker-init c3NoLXJzYSBBQUFBQjNOemFDMXljMkVBQUFBREFRQUJBQUFCQVFDM2NjTTZPdVVUWWJ5c2Q3NlJ4SHdseHBsOHQ4N2VMRWVGR05LSWFWVnVNcy94OEE4RFNMQVhKOHVKRjVLUXBnVjdKM2lnMGJpL2hDMVhmSVVjU3l3ZmhvVDVZM3dQN1oySVZVT29ZTi8vR1lOek5nVlNocWZQMDNDRW0xNnNmSzVvUWRQTzQzRFhxQ3NjbndIT3dwZmhRYy9MWTU1a
Run the netq install cluster worker-init <ssh-key> command on each of your worker nodes.
Run the following commands on your master node, using the IP addresses of your worker nodes:
Re-run the install CLI on the appliance. This example uses interface eno1. Replace this with your updated IP address, hostname or interface using the interface or ip-addr option.
cumulus@hostname:~$ netq install standalone full interface eno1 bundle /mnt/installables/NetQ-4.3.0.tgz
If this step fails for any reason, you can run netq bootstrap reset and then try again.
Verify Installation Status
To view the status of the installation, use the netq show status [verbose] command. The following example shows a successful on-premise installation:
State: Active
Version: 4.3.0
Installer Version: 4.3.0
Installation Type: Standalone
Activation Key: PKrgipMGEhVuZXRxLWVuZHBvaW50LWdhdGV3YXkYsagDIixUQmFLTUhzZU80RUdTL3pOT01uQ2lnRnrrUhTbXNPUGRXdnUwTVo5SEpBPTIHZGVmYXVsdDoHbmV0cWRldgz=
Master SSH Public Key: a3NoLXJzYSBBQUFBQjNOemFDMXljMkVBQUFBREFRQUJBQUFCQVFEazliekZDblJUajkvQVhOZ0hteXByTzZIb3Y2cVZBWFdsNVNtKzVrTXo3dmMrcFNZTGlOdWl1bEhZeUZZVDhSNmU3bFdqS3NrSE10bzArNFJsQVd6cnRvbVVzLzlLMzQ4M3pUMjVZQXpIU2N1ZVhBSE1TdTZHZ0JyUkpXYUpTNjJ2RTkzcHBDVjBxWWJvUFo3aGpCY3ozb0VVWnRsU1lqQlZVdjhsVjBNN3JEWW52TXNGSURWLzJ2eks3K0x2N01XTG5aT054S09hdWZKZnVOT0R4YjFLbk1mN0JWK3hURUpLWW1mbTY1ckoyS1ArOEtFUllrr5TkF3bFVRTUdmT3daVHF2RWNoZnpQajMwQ29CWDZZMzVST2hDNmhVVnN5OEkwdjVSV0tCbktrWk81MWlMSDAyZUpJbXJHUGdQa2s1SzhJdGRrQXZISVlTZ0RwRlpRb3Igcm9vdEBucXRzLTEwLTE4OC00NC0xNDc=
Is Cloud: False
Cluster Status:
IP Address Hostname Role Status
------------- ------------- ------ --------
10.188.44.147 10.188.44.147 Role Ready
NetQ... Active
Run the netq show opta-health command to verify all applications are operating properly. Allow 10-15 minutes for all applications to come up and report their status.
If any of the applications or services display Status as DOWN after 30 minutes, open a support ticket and attach the output of the opta-support command.
Install a NetQ Cloud Appliance Cluster
This topic describes how to prepare your cluster of NetQ Cloud Appliances for installation of the NetQ Collector software.
Each system shipped to you contains:
Your NVIDIA NetQ Cloud Appliance (a Supermicro SuperServer E300-9D)
Hardware accessories, such as power cables and rack mounting gear (note that network cables and optics ship separately)
Information regarding your order
For more detail about hardware specifications (including LED layouts and FRUs like the power supply or fans and accessories like included cables) or safety and environmental information, refer to the user manual.
Install Each Appliance
After you unbox the appliance:
Mount the appliance in the rack.
Connect it to power following the procedures described in your appliance's user manual.
Connect the Ethernet cable to the 1G management port (eno1).
Power on the appliance.
If your network runs DHCP, you can configure NetQ over the network. If DHCP is not enabled, then you configure the appliance using the console cable provided.
Configure the Password, Hostname and IP Address
Change the password and specify the hostname and IP address for each appliance before installing the NetQ software.
Log in to the appliance that you intend to use as your master node using the default login credentials:
Username: cumulus
Password: cumulus
Change the password using the passwd command:
cumulus@hostname:~$ passwd
Changing password for cumulus.
(current) UNIX password: cumulus
Enter new UNIX password:
Retype new UNIX password:
passwd: password updated successfully
The default hostname for the NetQ Cloud Appliance is netq-appliance. Change the hostname to fit your naming conventions while meeting Internet and Kubernetes naming standards.
Kubernetes requires that hostnames comprise a sequence of labels concatenated with dots. For example, en.wikipedia.org is a hostname. Each label must be from 1 to 63 characters long. The entire hostname, including the delimiting dots, has a maximum of 253 ASCII characters.
The Internet standards (RFCs) for protocols specify that labels can contain only the ASCII letters a through z (in lower case), the digits 0 through 9, and the hyphen-minus character ('-').
The appliance contains two Ethernet ports. It uses port eno1 for out-of-band management. This is where NetQ Agents should send the telemetry data collected from your monitored switches and hosts. By default, eno1 uses DHCPv4 to get its IP address. You can view the assigned IP address using the following command:
cumulus@hostname:~$ ip -4 -brief addr show eno1
eno1 UP 10.20.16.248/24
Alternately, you can configure the interface with a static IP address by editing the /etc/netplan/01-ethernet.yaml Ubuntu Netplan configuration file.
For example, to set your network interface eno1 to a static IP address of 192.168.1.222 with gateway 192.168.1.1 and DNS server as 8.8.8.8 and 8.8.4.4:
# This file describes the network interfaces available on your system
# For more information, see netplan(5).
network:
    version: 2
    renderer: networkd
    ethernets:
        eno1:
            dhcp4: no
            addresses: [192.168.1.222/24]
            gateway4: 192.168.1.1
            nameservers:
                addresses: [8.8.8.8,8.8.4.4]
Apply the settings.
cumulus@hostname:~$ sudo netplan apply
Repeat these steps for each of the worker node appliances.
Verify NetQ Software and Appliance Readiness
Now that the appliances are up and running, verify that the software is available and each appliance is ready for installation.
On the master NetQ Cloud Appliance, verify that the needed packages are present and of the correct release, version 4.3.
cumulus@hostname:~$ dpkg -l | grep netq
ii netq-agent 4.3.0-ub18.04u39~1659297239.34aa65d_amd64 Cumulus NetQ Telemetry Agent for Ubuntu
ii netq-apps 4.3.0-ub18.04u39~1659297239.34aa65d_amd64 Cumulus NetQ Fabric Validation Application for Ubuntu
Verify the installation images are present and of the correct release, version 4.3.
cumulus@hostname:~$ cd /mnt/installables/
cumulus@hostname:/mnt/installables$ ls
NetQ-4.3.0-opta.tgz
Verify the master NetQ Cloud Appliance is ready for installation. Fix any errors indicated before installing the NetQ software.
cumulus@hostname:~$ sudo opta-check-cloud
On one of your worker NetQ Cloud Appliances, verify that the needed packages are present and of the correct release, version 4.3 and update 34.
cumulus@hostname:~$ dpkg -l | grep netq
ii netq-agent 4.3.0-ub18.04u39~1659297239.34aa65d_amd64 Cumulus NetQ Telemetry Agent for Ubuntu
ii netq-apps 4.3.0-ub18.04u39~1659297239.34aa65d_amd64 Cumulus NetQ Fabric Validation Application for Ubuntu
Make a note of the private IP addresses you assign to the master and worker nodes. You need them for later installation steps.
Verify that the needed packages are present and of the correct release, version 4.3.
cumulus@hostname:~$ dpkg -l | grep netq
ii netq-agent 4.3.0-ub18.04u39~1659297239.34aa65d_amd64 Cumulus NetQ Telemetry Agent for Ubuntu
ii netq-apps 4.3.0-ub18.04u39~1659297239.34aa65d_amd64 Cumulus NetQ Fabric Validation Application for Ubuntu
Verify that the needed files are present and of the correct release.
cumulus@hostname:~$ cd /mnt/installables/
cumulus@hostname:/mnt/installables$ ls
NetQ-4.3.0-opta.tgz
Verify the appliance is ready for installation. Fix any errors indicated before installing the NetQ software.
cumulus@hostname:~$ sudo opta-check-cloud
Repeat Steps 4-8 for each additional worker NetQ Cloud Appliance.
The final step is to install and activate the NetQ software using the CLI:
Run the following command on your master node to initialize the cluster. Copy the output of the command to use on your worker nodes:
cumulus@<hostname>:~$ netq install cluster master-init
Please run the following command on all worker nodes:
netq install cluster worker-init c3NoLXJzYSBBQUFBQjNOemFDMXljMkVBQUFBREFRQUJBQUFCQVFDM2NjTTZPdVVUWWJ5c2Q3NlJ4SHdseHBsOHQ4N2VMRWVGR05LSWFWVnVNcy94OEE4RFNMQVhKOHVKRjVLUXBnVjdKM2lnMGJpL2hDMVhmSVVjU3l3ZmhvVDVZM3dQN1oySVZVT29ZTi8vR1lOek5nVlNocWZQMDNDRW0xNnNmSzVvUWRQTzQzRFhxQ3NjbndIT3dwZmhRYy9MWTU1a
Run the netq install cluster worker-init <ssh-key> command on each of your worker nodes.
Run the following command on your NetQ Cloud Appliance with the config-key obtained from the email you received from NVIDIA titled NetQ Access Link. You can also obtain the configuration key through the NetQ UI in the premise management configuration.
You can specify the IP address instead of the interface name here: use ip-addr <IP address> in place of interface <ifname> above.
If you have changed the IP address or hostname of the NetQ OPTA after this step, you need to re-register this address with NetQ as follows:
Reset the VM:
cumulus@hostname:~$ netq bootstrap reset
Re-run the install CLI on the appliance. This example uses interface eth0. Replace this with your updated IP address, hostname or interface using the interface or ip-addr option.
If this step fails for any reason, you can run netq bootstrap reset and then try again.
Consider the following for container environments, and make adjustments as needed.
Flannel Virtual Networks
If you are using Flannel with a container environment on your network, you may need to change its default IP address ranges if they conflict with other addresses on your network. This can only be done one time during the first installation.
The address range is 10.244.0.0/16. NetQ overrides the original Flannel default, which is 10.1.0.0/16.
To change the default address range, use the install CLI with the pod-ip-range option. For example:
The default Docker bridge interface is disabled in NetQ. If you need to reenable the interface, contact support.
Verify Installation Status
To view the status of the installation, use the netq show status [verbose] command. The following example shows a successful on-premise installation:
State: Active
Version: 4.3.0
Installer Version: 4.3.0
Installation Type: Standalone
Activation Key: PKrgipMGEhVuZXRxLWVuZHBvaW50LWdhdGV3YXkYsagDIixUQmFLTUhzZU80RUdTL3pOT01uQ2lnRnrrUhTbXNPUGRXdnUwTVo5SEpBPTIHZGVmYXVsdDoHbmV0cWRldgz=
Master SSH Public Key: a3NoLXJzYSBBQUFBQjNOemFDMXljMkVBQUFBREFRQUJBQUFCQVFEazliekZDblJUajkvQVhOZ0hteXByTzZIb3Y2cVZBWFdsNVNtKzVrTXo3dmMrcFNZTGlOdWl1bEhZeUZZVDhSNmU3bFdqS3NrSE10bzArNFJsQVd6cnRvbVVzLzlLMzQ4M3pUMjVZQXpIU2N1ZVhBSE1TdTZHZ0JyUkpXYUpTNjJ2RTkzcHBDVjBxWWJvUFo3aGpCY3ozb0VVWnRsU1lqQlZVdjhsVjBNN3JEWW52TXNGSURWLzJ2eks3K0x2N01XTG5aT054S09hdWZKZnVOT0R4YjFLbk1mN0JWK3hURUpLWW1mbTY1ckoyS1ArOEtFUllrr5TkF3bFVRTUdmT3daVHF2RWNoZnpQajMwQ29CWDZZMzVST2hDNmhVVnN5OEkwdjVSV0tCbktrWk81MWlMSDAyZUpJbXJHUGdQa2s1SzhJdGRrQXZISVlTZ0RwRlpRb3Igcm9vdEBucXRzLTEwLTE4OC00NC0xNDc=
Is Cloud: False
Cluster Status:
IP Address Hostname Role Status
------------- ------------- ------ --------
10.188.44.147 10.188.44.147 Role Ready
NetQ... Active
Run the netq show opta-health command to verify all applications are operating properly. Allow 10-15 minutes for all applications to come up and report their status.
If any of the applications or services display Status as DOWN after 30 minutes, open a support ticket and attach the output of the opta-support command.
Install NetQ Agents
After installing the NetQ software, you should install the NetQ Agents on each switch you want to monitor. You can install NetQ Agents on switches and servers running:
Cumulus Linux 3.7.12 and later
SONiC 202012 and later
CentOS 7
RHEL 7.1
Ubuntu 18.04
Prepare for NetQ Agent Installation
For switches running Cumulus Linux and SONiC, you need to:
Install and configure NTP, if needed
Obtain NetQ software packages
For servers running RHEL, CentOS, or Ubuntu, you need to:
Verify you installed the minimum package versions
Verify the server is running lldpd
Install and configure NTP, if needed
Obtain NetQ software packages
If your network uses a proxy server for external connections, you should first configure a global proxy so apt-get can access the software package in the NVIDIA networking repository.
Verify NTP Is Installed and Configured
Verify that NTP is running on the switch. The switch must be in time synchronization with the NetQ Platform or NetQ Appliance to enable useful statistical analysis.
cumulus@switch:~$ sudo systemctl status ntp
[sudo] password for cumulus:
● ntp.service - LSB: Start NTP daemon
Loaded: loaded (/etc/init.d/ntp; bad; vendor preset: enabled)
Active: active (running) since Fri 2018-06-01 13:49:11 EDT; 2 weeks 6 days ago
Docs: man:systemd-sysv-generator(8)
CGroup: /system.slice/ntp.service
└─2873 /usr/sbin/ntpd -p /var/run/ntpd.pid -g -c /var/lib/ntp/ntp.conf.dhcp -u 109:114
If NTP is not installed, install and configure it before continuing.
If NTP is not running:
Verify the IP address or hostname of the NTP server in the /etc/ntp.conf file, and then
Reenable and start the NTP service using the systemctl [enable|start] ntp commands
If you are running NTP in your out-of-band management network with VRF, specify the VRF (ntp@<vrf-name> versus just ntp) in the above commands.
Obtain NetQ Agent Software Package
To install the NetQ Agent you need to install netq-agent on each switch or host. This is available from the NVIDIA networking repository.
To obtain the NetQ Agent package:
Edit the /etc/apt/sources.list file to add the repository for NetQ.
Note that NetQ has a separate repository from Cumulus Linux.
cumulus@switch:~$ sudo nano /etc/apt/sources.list
...
deb https://apps3.cumulusnetworks.com/repos/deb CumulusLinux-3 netq-4.3
...
You can use the deb https://apps3.cumulusnetworks.com/repos/deb CumulusLinux-3 netq-latest repository if you want to always retrieve the latest posted version of NetQ.
Cumulus Linux 4.4 and later includes the netq-agent package by default.
To add the repository, uncomment or add the following line in /etc/apt/sources.list:
cumulus@switch:~$ sudo nano /etc/apt/sources.list
...
deb https://apps3.cumulusnetworks.com/repos/deb CumulusLinux-4 netq-4.3
...
You can use the deb https://apps3.cumulusnetworks.com/repos/deb CumulusLinux-4 netq-latest repository if you want to always retrieve the latest posted version of NetQ.
Add the apps3.cumulusnetworks.com authentication key to Cumulus Linux:
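A sketch of this step (the exact key location is not reproduced here; substitute the public-key path provided for the repository):
cumulus@switch:~$ wget -qO - https://apps3.cumulusnetworks.com/<path-to-repository-public-key> | sudo apt-key add -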
Verify that NTP is running on the switch. The switch must be in time synchronization with the NetQ Platform or NetQ Appliance to enable useful statistical analysis.
admin@switch:~$ sudo systemctl status ntp
● ntp.service - Network Time Service
Loaded: loaded (/lib/systemd/system/ntp.service; enabled; vendor preset: enabled)
Active: active (running) since Tue 2021-06-08 14:56:16 UTC; 2min 18s ago
Docs: man:ntpd(8)
Process: 1444909 ExecStart=/usr/lib/ntp/ntp-systemd-wrapper (code=exited, status=0/SUCCESS)
Main PID: 1444921 (ntpd)
Tasks: 2 (limit: 9485)
Memory: 1.9M
CGroup: /system.slice/ntp.service
└─1444921 /usr/sbin/ntpd -p /var/run/ntpd.pid -x -u 106:112
If NTP is not installed, install and configure it before continuing.
If NTP is not running:
Verify the IP address or hostname of the NTP server in the /etc/sonic/config_db.json file, and then
Reenable and start the NTP service using the sudo config reload -n command
Verify NTP is operating correctly. Look for an asterisk (*) or a plus sign (+), which indicates that the clock is synchronized with NTP.
admin@switch:~$ show ntp
MGMT_VRF_CONFIG is not present.
synchronised to NTP server (104.194.8.227) at stratum 3
time correct to within 2014 ms
polling server every 64 s
remote refid st t when poll reach delay offset jitter
==============================================================================
-144.172.118.20 139.78.97.128 2 u 26 64 377 47.023 -1798.1 120.803
+208.67.75.242 128.227.205.3 2 u 32 64 377 72.050 -1939.3 97.869
+216.229.4.66 69.89.207.99 2 u 160 64 374 41.223 -1965.9 83.585
*104.194.8.227 164.67.62.212 2 u 33 64 377 9.180 -1934.4 97.376
Obtain NetQ Agent Software Package
To install the NetQ Agent you need to install netq-agent on each switch or host. This is available from the NVIDIA networking repository.
Note that NetQ has a separate repository from SONiC.
To obtain the NetQ Agent package:
Install the wget utility so you can install the GPG keys in step 3.
Before you install the NetQ Agent on a Red Hat or CentOS server, make sure you install and run at least the minimum versions of the following packages:
iproute-3.10.0-54.el7_2.1.x86_64
lldpd-0.9.7-5.el7.x86_64
ntp-4.2.6p5-25.el7.centos.2.x86_64
ntpdate-4.2.6p5-25.el7.centos.2.x86_64
Verify the Server is Running lldpd and wget
Make sure you are running lldpd, not lldpad. CentOS does not include lldpd by default, nor does it include wget; however, the installation requires them.
To install these packages, run the following commands:
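A sketch of these steps on CentOS 7, assuming lldpd is provided through the EPEL repository:
root@rhel7:~# sudo yum -y install epel-release
root@rhel7:~# sudo yum -y install lldpd
root@rhel7:~# sudo systemctl enable lldpd.service
root@rhel7:~# sudo systemctl start lldpd.service
root@rhel7:~# sudo yum -y install wget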
If NTP is not already installed and configured, follow these steps:
Install NTP on the server, if not already installed. Servers must be in time synchronization with the NetQ Platform or NetQ Appliance to enable useful statistical analysis.
root@ubuntu:~# sudo apt-get install ntp
Configure the network time server.
Open the /etc/ntp.conf file in your text editor of choice.
Under the Server section, specify the NTP server IP address or hostname.
Create the file /etc/apt/sources.list.d/cumulus-apps-deb-bionic.list and add the following line:
root@ubuntu:~# vi /etc/apt/sources.list.d/cumulus-apps-deb-bionic.list
...
deb [arch=amd64] https://apps3.cumulusnetworks.com/repos/deb bionic netq-latest
...
The use of netq-latest in these examples means that apt-get always retrieves the latest posted version of NetQ from the repository, even for a major version update. If you want to keep the repository on a specific version, such as netq-4.3, use that instead.
Install NetQ Agent
After completing the preparation steps, you can successfully install the agent onto your switch or host.
Cumulus Linux 4.4 and later includes the netq-agent package by default. To install the NetQ Agent on earlier versions of Cumulus Linux:
Update the local apt repository, then install the NetQ software on the switch.
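A typical sequence on the switch looks like this:
cumulus@switch:~$ sudo apt-get update
cumulus@switch:~$ sudo apt-get install netq-agent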
Continue with NetQ Agent Configuration in the next section.
Configure NetQ Agent
After you install the NetQ Agents on the switches you want to monitor, you must configure them to obtain useful and relevant data.
The NetQ Agent is aware of and communicates through the designated VRF. If you do not specify one, it uses the default VRF (named default). If you later change the VRF configured for the NetQ Agent (using a lifecycle management configuration profile, for example), you might cause the NetQ Agent to lose communication.
If you configure the NetQ Agent to communicate in a VRF that is not default or mgmt, the following line must be added to /etc/netq/netq.yml in the netq-agent section:
netq-agent:
netq_stream_address: 0.0.0.0
Two methods are available for configuring a NetQ Agent:
Edit the configuration file on the switch, or
Use the NetQ CLI
Configure NetQ Agents Using a Configuration File
You can configure the NetQ Agent in the netq.yml configuration file contained in the /etc/netq/ directory.
Open the netq.yml file using your text editor of choice. For example:
sudo nano /etc/netq/netq.yml
Locate the netq-agent section, or add it.
Set the parameters for the agent as follows (a sample snippet appears after this list):
port: 31980 (default configuration)
server: IP address of the NetQ Appliance or VM where the agent should send its collected data
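For example, a minimal netq-agent section might look like the following; the server address 192.168.1.254 is a placeholder, so substitute the address of your NetQ Appliance or VM and confirm the key names against the existing file on your switch:
netq-agent:
  port: 31980
  server: 192.168.1.254
After editing the file, restart the agent with sudo netq config restart agent.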
If you configured the NetQ CLI, you can use it to configure the NetQ Agent to send telemetry data to the NetQ Appliance or VM. To configure the NetQ CLI, refer to Install NetQ CLI.
A couple of additional options are available for configuring the NetQ Agent. If you are using VRFs, you can configure the agent to communicate over a specific VRF. You can also configure the agent to use a particular port.
Configure the Agent to Use a VRF
By default, NetQ uses the default VRF for communication between the NetQ Appliance or VM and NetQ Agents. While optional, NVIDIA strongly recommends that you configure NetQ Agents to communicate with the NetQ Appliance or VM only via a VRF, including a management VRF. To do so, you need to specify the VRF name when configuring the NetQ Agent. For example, if you configured the management VRF and you want the agent to communicate with the NetQ Appliance or VM over it, configure the agent like this:
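A sketch of the commands, following the same pattern as the port example later in this section (replace 192.168.1.254 with the address of your NetQ Appliance or VM):
sudo netq config add agent server 192.168.1.254 vrf mgmt
sudo netq config restart agent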
If you later change the VRF configured for the NetQ Agent (using a lifecycle management configuration profile, for example), you might cause the NetQ Agent to lose communication.
Configure the Agent to Communicate over a Specific Port
By default, NetQ uses port 31980 for communication between the NetQ Appliance or VM and NetQ Agents. If you want the NetQ Agent to communicate with the NetQ Appliance or VM via a different port, you need to specify the port number when configuring the NetQ Agent, like this:
sudo netq config add agent server 192.168.1.254 port 7379
sudo netq config restart agent
Configure the On-switch OPTA
On-switch OPTA functionality is an early access feature, and it does not support Flow Analysis or LCM.
On-switch OPTA is intended for use in small NetQ Cloud deployments where a dedicated OPTA might not be necessary. If you need help assessing the correct OPTA configuration for your deployment, contact your NVIDIA sales team.
Instead of installing a dedicated OPTA appliance, you can enable the OPTA service on every switch in your environment that will send data to the NetQ Cloud. To configure a switch for OPTA functionality, install the netq-opta package.
After the netq-opta package is installed, add your OPTA configuration key. Run the following command with the config-key obtained from the email you received from NVIDIA titled NetQ Access Link. You can also obtain the configuration key through the NetQ UI in the premise management configuration. For more information, see First Time Log In - NetQ Cloud.
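A sketch of these two steps on the switch; the config-key subcommand syntax shown here is an assumption, so confirm it with netq config help on your release:
cumulus@switch:~$ sudo apt-get install netq-opta
cumulus@switch:~$ sudo netq config add opta config-key <your-config-key> vrf mgmt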
The final step is configuring the local NetQ Agent on the switch to connect to the local OPTA service. Configure the agent on the switch to connect to localhost with the following command:
netq config add agent server localhost vrf mgmt
Install NetQ CLI
Installing the NetQ CLI on your NetQ Appliances, VMs, switches, or hosts is not required. However, the CLI can give you access to new features and bug fixes, and allows you to manage your network from multiple points in the network.
After installing the NetQ software and agent on each switch you want to monitor, you can also install the NetQ CLI on switches running:
Cumulus Linux 3.7.12 and later
SONiC 202012 and later
CentOS 7
RHEL 7.1
Ubuntu 18.04
If your network uses a proxy server for external connections, you should first configure a global proxy so apt-get can access the software package in the NetQ repository.
Prepare for NetQ CLI Installation on a RHEL, CentOS, or Ubuntu Server
For servers running RHEL 7, CentOS or Ubuntu OS, you need to:
Verify you installed the minimum service packages versions
Verify the server is running lldpd
Install and configure NTP, if needed
Obtain NetQ software packages
These steps are not required for Cumulus Linux or SONiC.
Verify Service Package Versions
For Red Hat and CentOS servers, verify you are running at least the following package versions:
iproute-3.10.0-54.el7_2.1.x86_64
lldpd-0.9.7-5.el7.x86_64
ntp-4.2.6p5-25.el7.centos.2.x86_64
ntpdate-4.2.6p5-25.el7.centos.2.x86_64
For Ubuntu servers, verify you are running at least the following package versions:
iproute 1:4.3.0-1ubuntu3.16.04.1 all
iproute2 4.3.0-1ubuntu3 amd64
lldpd 0.7.19-1 amd64
ntp 1:4.2.8p4+dfsg-3ubuntu5.6 amd64
Verify What CentOS and Ubuntu Are Running
Make sure you are running lldpd, not lldpad. CentOS and Ubuntu do not include lldpd by default, and CentOS also does not include wget; the installation requires both.
To install this package, run the following commands:
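A minimal sketch of the commands (on CentOS, lldpd typically comes from the EPEL repository, and the NTP service may be named ntpd rather than ntp):
CentOS or RHEL: sudo yum -y install lldpd wget
Ubuntu: sudo apt-get install lldpd
Then enable and start NTP:
sudo systemctl enable ntp
sudo systemctl start ntp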
If you are running NTP in your out-of-band management network with VRF, specify the VRF (ntp@<vrf-name> versus just ntp) in the above commands.
Verify NTP is operating correctly. Look for an asterisk (*) or a plus sign (+) that indicates the clock synchronized with NTP.
root@rhel7:~# ntpq -pn
remote refid st t when poll reach delay offset jitter
==============================================================================
+173.255.206.154 132.163.96.3 2 u 86 128 377 41.354 2.834 0.602
+12.167.151.2 198.148.79.209 3 u 103 128 377 13.395 -4.025 0.198
2a00:7600::41 .STEP. 16 u - 1024 0 0.000 0.000 0.000
*129.250.35.250 249.224.99.213 2 u 101 128 377 14.588 -0.299 0.243
Install NTP on the server, if not already installed. Servers must be in time synchronization with the NetQ Platform or NetQ Appliance to enable useful statistical analysis.
root@ubuntu:~# sudo apt-get install ntp
Configure the network time server.
Open the /etc/ntp.conf file in your text editor of choice.
Under the Server section, specify the NTP server IP address or hostname.
Create the file /etc/apt/sources.list.d/cumulus-apps-deb-bionic.list and add the following line:
root@ubuntu:~# vi /etc/apt/sources.list.d/cumulus-apps-deb-bionic.list
...
deb [arch=amd64] https://apps3.cumulusnetworks.com/repos/deb bionic netq-latest
...
The use of netq-latest in these examples means that a pull from the repository always retrieves the latest version of NetQ, even for a major version update. If you want to keep the repository on a specific version, such as netq-4.3, use that instead.
Install NetQ CLI
Follow these steps to install the NetQ CLI on a switch or host.
To install the NetQ CLI you need to install netq-apps on each switch. This is available from the NVIDIA networking repository.
Cumulus Linux 4.4 and later includes the netq-apps package by default.
If your network uses a proxy server for external connections, you should first configure a global proxy so apt-get can access the software package in the NVIDIA networking repository.
To obtain the NetQ CLI package:
Edit the /etc/apt/sources.list file to add the repository for NetQ.
Note that NetQ has a separate repository from Cumulus Linux.
cumulus@switch:~$ sudo nano /etc/apt/sources.list
...
deb https://apps3.cumulusnetworks.com/repos/deb CumulusLinux-3 netq-4.3
...
You can use the deb https://apps3.cumulusnetworks.com/repos/deb CumulusLinux-3 netq-latest repository if you want to always retrieve the latest posted version of NetQ.
Cumulus Linux 4.4 and later includes the netq-apps package by default.
To add the repository, uncomment or add the following line in /etc/apt/sources.list:
cumulus@switch:~$ sudo nano /etc/apt/sources.list
...
deb https://apps3.cumulusnetworks.com/repos/deb CumulusLinux-4 netq-4.3
...
You can use the deb https://apps3.cumulusnetworks.com/repos/deb CumulusLinux-4 netq-latest repository if you want to always retrieve the latest posted version of NetQ.
Update the local apt repository and install the software on the switch.
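For example:
cumulus@switch:~$ sudo apt-get update
cumulus@switch:~$ sudo apt-get install netq-apps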
Continue with NetQ CLI configuration in the next section.
To install the NetQ CLI you need to install netq-apps on each switch. This is available from the NVIDIA networking repository.
If your network uses a proxy server for external connections, you should first configure a global proxy so apt-get can access the software package in the NVIDIA networking repository.
To obtain the NetQ CLI package:
Edit the /etc/apt/sources.list file to add the repository for NetQ.
Continue with NetQ CLI configuration in the next section.
Configure the NetQ CLI
By default, you do not configure the NetQ CLI during the NetQ installation. The configuration resides in the /etc/netq/netq.yml file. Until the CLI is configured on a device, you can only run netq config and netq help commands, and you must use sudo to run them.
At minimum, you need to configure the NetQ CLI and NetQ Agent to communicate with the telemetry server. To do so, configure the NetQ Agent and the NetQ CLI so that they are running in the VRF where the routing tables have connectivity to the telemetry server (typically the management VRF).
To access and configure the CLI for your on-premises NetQ deployment, you must generate AuthKeys. You’ll need your username and password to generate them. These keys provide authorized access (access key) and user authentication (secret key).
To generate AuthKeys:
Enter your on-premises NetQ appliance hostname or IP address into your browser to open the NetQ UI login page.
Enter your username and password.
Expand the menu , and under Admin, select Management.
Select Manage on the User Accounts card.
Select your user and click above the table.
Copy these keys to a safe place. Select Copy to obtain the CLI configuration command to use on your devices.
The secret key is only shown once. If you do not copy these, you will need to regenerate them and reconfigure CLI access.
You can also save these keys to a YAML file for easy reference, and to avoid having to type or copy the key values. You can:
store the file wherever you like, for example in /home/cumulus/ or /etc/netq
name the file whatever you like, for example credentials.yml, creds.yml, or keys.yml
The following example uses the individual access key, a premises of datacenterwest, and the default Cloud address, port and VRF. Replace the key values with your generated keys if you are using this example on your server.
This example uses an optional keys file. Replace the keys filename and path with the full path and name of your keys file, and the datacenterwest premises name with your premises name if you are using this example on your server.
If you have multiple premises and want to query data from a different premises than you originally configured, rerun the netq config add cli server command with the desired premises name. You can only view the data for one premises at a time with the CLI.
To access and configure the CLI for your on-premises NetQ deployment, you must generate AuthKeys. You’ll need your username and password to generate them. These keys provide authorized access (access key) and user authentication (secret key). Your credentials and NetQ Cloud addresses were obtained during first login to the NetQ Cloud and premises activation.
To generate AuthKeys:
Enter netq.nvidia.com into your browser to open the NetQ UI login page.
Enter your username and password.
Expand the menu , and under Admin, select Management.
Select Manage on the User Accounts card.
Select your user and click above the table.
Copy these keys to a safe place. Select Copy to obtain the CLI configuration command to use on your devices.
The secret key is only shown once. If you do not copy these, you will need to regenerate them and reconfigure CLI access.
You can also save these keys to a YAML file for easy reference, and to avoid having to type or copy the key values. You can:
store the file wherever you like, for example in /home/cumulus/ or /etc/netq
name the file whatever you like, for example credentials.yml, creds.yml, or keys.yml
The following example uses the individual access key, a premises of datacenterwest, and the default Cloud address, port and VRF. Replace the key values with your generated keys if you are using this example on your server.
sudo netq config add cli server api.netq.cumulusnetworks.com access-key 123452d9bc2850a1726f55534279dd3c8b3ec55e8b25144d4739dfddabe8149e secret-key /vAGywae2E4xVZg8F+HtS6h6yHliZbBP6HXU3J98765= premises datacenterwest
Successfully logged into NetQ cloud at api.netq.cumulusnetworks.com:443
Updated cli server api.netq.cumulusnetworks.com vrf default port 443. Please restart netqd (netq config restart cli)
sudo netq config restart cli
Restarting NetQ CLI... Success!
The following example uses an optional keys file. Replace the keys filename and path with the full path and name of your keys file, and the datacenterwest premises name with your premises name if you are using this example on your server.
sudo netq config add cli server api.netq.cumulusnetworks.com cli-keys-file /home/netq/nq-cld-creds.yml premises datacenterwest
Successfully logged into NetQ cloud at api.netq.cumulusnetworks.com:443
Updated cli server api.netq.cumulusnetworks.com vrf default port 443. Please restart netqd (netq config restart cli)
sudo netq config restart cli
Restarting NetQ CLI... Success!
If you have multiple premises and want to query data from a different premises than you originally configured, rerun the netq config add cli server command with the desired premises name. You can only view the data for one premises at a time with the CLI.
Add More Nodes to Your Server Cluster
You can add additional nodes to your server cluster for both on-premises and cloud deployments using the CLI:
Run the following CLI command to add a new worker node for on-premises deployments:
netq install cluster add-worker <text-worker-01>
Run the following CLI command to add a new worker node for cloud deployments:
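Based on the on-premises form above and the opta variants of the install commands, the cloud form is expected to be the following; confirm the exact syntax with netq install help on your appliance:
netq install opta cluster add-worker <text-worker-01>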
Install a Custom Signed Certificate
The NetQ UI ships with a self-signed certificate that is sufficient for non-production environments or cloud deployments. For on-premises deployments, however, you receive a warning from your browser that this default certificate is not trusted when you first log in to the NetQ UI. You can avoid this by installing your own signed certificate.
If you already have a certificate installed and want to change or update it, run the kubectl delete secret netq-gui-ingress-tls [name] --namespace default command.
You need the following items to perform the certificate installation:
A valid X509 certificate.
A private key file for the certificate.
A DNS record name configured to access the NetQ UI.
The FQDN should match the common name of the certificate. If you use a wildcard in the common name (for example, *.example.com), then the NetQ telemetry server should reside on a subdomain of that domain, accessible via a URL like netq.example.com.
A functioning and healthy NetQ instance.
You can verify this by running the netq show opta-health command.
Install a certificate using the NetQ CLI:
Log in to the NetQ On-premises Appliance or VM via SSH and copy your certificate and key file there.
Generate a Kubernetes secret called netq-gui-ingress-tls.
cumulus@netq-ts:~$ kubectl create secret tls netq-gui-ingress-tls \
--namespace default \
--key <name of your key file>.key \
--cert <name of your cert file>.crt
Verify that you created the secret successfully.
cumulus@netq-ts:~$ kubectl get secret
NAME TYPE DATA AGE
netq-gui-ingress-tls kubernetes.io/tls 2 5s
Update the ingress rule file to install self-signed certificates.
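One way to do this, assuming your ingress rule is a standard Kubernetes Ingress resource in the default namespace (the resource name netq-gui-ingress used here is illustrative), is to edit the resource so its tls section references the secret you created:
cumulus@netq-ts:~$ kubectl edit ingress netq-gui-ingress --namespace default
  tls:
  - hosts:
    - <your-hostname>
    secretName: netq-gui-ingress-tls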
A message like the one here appears if your ingress rule is successfully configured.
Your custom certificate should now be working. Verify this by opening the NetQ UI at https://<your-hostname-or-ipaddr> in your browser.
Update Cloud Activation Key
NVIDIA provides a cloud activation key when you set up your premises. You use the cloud activation key (called the config-key) to access the cloud services. Note that these authorization keys are different from the ones you use to configure the CLI.
On occasion, you might want to update your cloud service activation key—for example, if you mistyped the key during installation and now your existing key does not work, or you received a new key for your premises from NVIDIA.
Update the activation key using the NetQ CLI:
Run the following command on your standalone or master NetQ Cloud Appliance or VM replacing text-opta-key with your new key.
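A sketch of the command, assuming the activate-job form of the install command family; confirm the exact syntax with netq install help on your appliance:
netq install standalone activate-job config-key <text-opta-key>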
Upgrade NetQ
This section describes how to upgrade from your current installation to NetQ 4.3. Refer to the release notes before you upgrade.
You must upgrade your NetQ On-premises or Cloud Appliances or virtual machines (VMs). While there is some backwards compatibility with the previous NetQ release for any version, upgrading NetQ Agents is always recommended. If you want access to new and updated commands, you can upgrade the CLI on your physical servers or VMs, and on monitored switches and hosts as well.
To complete the upgrade for either an on-premises or a cloud deployment:
If you are upgrading NetQ Platform software for a NetQ On-premises Appliance or VM, select NetQ SW 4.3 Appliance to download the NetQ-4.3.0.tgz file. If you are upgrading NetQ software for a NetQ Cloud Appliance or VM, select NetQ SW 4.3 Appliance Cloud to download the NetQ-4.3.0-opta.tgz file.
If prompted, agree to the license agreement and proceed with the download.
For enterprise customers, if you do not see a link to the NVIDIA Licensing Portal on the NVIDIA Application Hub, contact NVIDIA support.
cumulus@<hostname>:~$ sudo apt-get install -y netq-agent netq-apps
Reading package lists... Done
Building dependency tree
Reading state information... Done
...
The following NEW packages will be installed:
netq-agent netq-apps
...
Fetched 39.8 MB in 3s (13.5 MB/s)
...
Unpacking netq-agent (4.3.0-ub18.04u39~1659297239.34aa65d) ...
...
Unpacking netq-apps (4.3.0-ub18.04u39~1659297239.34aa65d) ...
Setting up netq-apps (4.3.0-ub18.04u39~1659297239.34aa65d) ...
Setting up netq-agent (4.3.0-ub18.04u39~1659297239.34aa65d) ...
Processing triggers for rsyslog (8.32.0-1ubuntu4) ...
Processing triggers for man-db (2.8.3-2ubuntu0.1) ...
Run the Upgrade
Verify the following items before upgrading NetQ. For cluster deployments, verify steps 1 and 3 on all nodes in the cluster:
Check if enough disk space is available before you proceed with the upgrade:
cumulus@netq-appliance:~$ df -h /
Filesystem Size Used Avail Use% Mounted on
/dev/sda1 248G 70G 179G 28% /
cumulus@netq-appliance:~$
The recommended Use% to proceed with installation is under 70%.
You can delete previous software tarballs in the /mnt/installables/ directory to regain some space.
If you cannot bring disk space under 70% usage, contact the NVIDIA support team.
Run the netq show opta-health command and check that all pods are in the READY state. If not, contact the NVIDIA support team.
Managing premises involves renaming existing premises or creating multiple premises.
Configure Multiple Premises
The NetQ Management dashboard lets you configure a single NetQ UI and CLI for monitoring data from multiple premises. This means you do not need to log in to each premises to view the data.
There are two ways to implement a multi-site, on-premises deployment: either as a full deployment at each premises or as a full deployment at the primary site with a smaller deployment at secondary sites.
Full NetQ Deployment at Each Premises In this implementation, there is a NetQ appliance or VM running the NetQ Platform software with a database. Each premises operates independently, with its own NetQ UI and CLI. The NetQ appliance or VM at one of the deployments acts as the primary premises for the premises in the other deployments. A list of these secondary premises is stored with the primary deployment.
Full NetQ Deployment at Primary Site and Smaller Deployment at Secondary Sites In this implementation, there is a NetQ appliance or VM at one of the deployments acting as the primary premises for the premises in the other deployments. The primary premises runs the NetQ Platform software (including the NetQ UI and CLI) and houses the database. All other deployments are secondary premises; they run the NetQ Controller software and send their data to the primary premises for storage and processing. A list of these secondary premises is stored with the primary deployment.
After the multiple premises are configured, you can view this list of premises in the NetQ UI at the primary premises, change the name of premises on the list, and delete premises from the list.
To configure secondary premises so that you can view their data using the primary site NetQ UI, follow the instructions for the relevant deployment type of the secondary premises.
In this deployment model, each NetQ deployment can be installed separately. The data is stored and can be viewed from the NetQ UI at each premises.
To configure these premises so that their data can be viewed from one premises:
On the workbench, under Premises, click .
Select Manage Premises, then External Premises.
Select Add External Premises.
Enter the IP address for the API gateway on the NetQ appliance or VM for one of the secondary premises.
Enter the access credentials for this host then click Next.
Select the premises you want to connect then click Finish.
Add additional secondary premises by clicking .
In this deployment model, the data is stored and can be viewed only from the NetQ UI at the primary premises.
The primary NetQ premises must be installed before the secondary premises can be added. For the secondary premises, create the premises here, then install them.
On the workbench, under Premises, click .
Click Manage Premises. Your primary premises (OPID0) is shown by default.
Click (Add Premises).
Enter the name of one of the secondary premises you want to add, then click Done.
Select the premises you just created.
Click to generate a configuration key.
Click Copy and save the key to a safe place, or click e-mail to send it to yourself or another administrator as appropriate. Then click Done.
Rename a Premises
To rename an existing premises:
On the workbench, under Premises, click , then Manage Premises.
To rename an external premises, click External Premises.
On the right side of the screen, select a premises to rename, then click .
Enter the new name for the premises, then click Done.
System Server Information
To view the physical server or VM configuration:
Click menu .
Under Admin, select Management.
Locate the System Server Info card.
If no data is present on this card, it is likely that the NetQ Agent on your server or VM is not running properly or the underlying streaming services are impaired.
User Accounts and Permissions
Sign in to NetQ as an admin to view and manage user accounts. If you are a user and would like to set individual preferences, visit Set User Preferences.
NetQ Management Workbench
Navigate to the NetQ Management dashboard to complete the tasks outlined in this section. To get there, expand the menu on the NetQ dashboard and select Management.
Cloud NetQ Management Dashboard
Add a User Account
This section outlines the steps to add a local user. To add an LDAP user, refer to LDAP Authentication.
To add a new account:
On the User Accounts card, select Manage to open a table listing all user accounts.
Above the table, select to add a user.
Enter the fields and select Save.
Be especially careful entering the email address as you cannot change it once you save the account. If you save a mistyped email address, you must delete the account and create a new one.
Edit a User Account
As an admin, you can:
edit a user’s first or last name
reset a user’s password
change a user’s access type (user or admin)
You cannot edit a user’s email address, because this is the identifier the system uses for authentication. If you need to change an email address, delete the user and create a new one.
To edit an account:
On the User Accounts card, select Manage to open a table listing all user accounts.
Select the user whose account you’d like to edit. Above the table, click to edit the user’s account information.
Delete a User Account
To delete one or more user accounts:
On the User Accounts card, select Manage to open a table listing all user accounts.
Select one or more accounts. Above the table, click to delete the selected account(s).
View User Activity
Administrators can view user activity in the activity log.
To view the log, expand the menu on the NetQ dashboard and select Management. Under Admin select Activity Log to open a table listing user activity. Use the controls above the table to filter or export the data.
Manage Login Policies
Administrators can configure a session expiration time and the number of times users can refresh before they must log in to NetQ again.
To configure these login policies:
On the Login Management card, select Manage.
Select how long a user can be logged in before logging in again. The default time for on-premises deployments is 6 hours; for cloud deployments, it is 30 minutes.
Click Update to save the changes.
The Login Management card shows the configuration.
Back Up and Restore NetQ
Back up your NetQ data according to your company policy. The following sections describe how to back up and restore your NetQ data for the NetQ On-premises Appliance and VMs.
These procedures do not apply to your NetQ Cloud Appliance or VM. The NetQ cloud service handles data backups automatically.
Back Up Your NetQ Data
NetQ stores its data in a Cassandra database. You perform backups by running scripts provided with the software and located in the /usr/sbin directory. When you run a backup, it creates a single tar file named netq_master_snapshot_<timestamp>.tar.gz on a local drive that you specify. NetQ supports one backup file and includes the entire set of data tables. A new backup replaces the previous backup.
If you select the rollback option during the lifecycle management upgrade process (the default behavior), LCM automatically creates a backup.
To manually create a backup:
Run the backup script to create a backup file in /opt/<backup-directory> being sure to replace the backup-directory option with the name of the directory you want to use for the backup file.
You can abbreviate the backup and localdir options of this command to -b and -l to reduce typing. If the backup directory identified does not already exist, the script creates the directory during the backup process.
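The full invocation matches the cron example later in this section; the script ships in /usr/sbin:
cumulus@netq-ts:~$ sudo /usr/sbin/backuprestore.sh --backup --localdir /opt/<backup-directory>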
This is a sample of what you see as the script is running:
[Fri 26 Jul 2019 02:35:35 PM UTC] - Received Inputs for backup ...
[Fri 26 Jul 2019 02:35:36 PM UTC] - Able to find cassandra pod: cassandra-0
[Fri 26 Jul 2019 02:35:36 PM UTC] - Continuing with the procedure ...
[Fri 26 Jul 2019 02:35:36 PM UTC] - Removing the stale backup directory from cassandra pod...
[Fri 26 Jul 2019 02:35:36 PM UTC] - Able to successfully cleanup up /opt/backuprestore from cassandra pod ...
[Fri 26 Jul 2019 02:35:36 PM UTC] - Copying the backup script to cassandra pod ....
/opt/backuprestore/createbackup.sh: line 1: cript: command not found
[Fri 26 Jul 2019 02:35:48 PM UTC] - Able to exeute /opt/backuprestore/createbackup.sh script on cassandra pod
[Fri 26 Jul 2019 02:35:48 PM UTC] - Creating local directory:/tmp/backuprestore/ ...
Directory /tmp/backuprestore/ already exists..cleaning up
[Fri 26 Jul 2019 02:35:48 PM UTC] - Able to copy backup from cassandra pod to local directory:/tmp/backuprestore/ ...
[Fri 26 Jul 2019 02:35:48 PM UTC] - Validate the presence of backup file in directory:/tmp/backuprestore/
[Fri 26 Jul 2019 02:35:48 PM UTC] - Able to find backup file:netq_master_snapshot_2019-07-26_14_35_37_UTC.tar.gz
[Fri 26 Jul 2019 02:35:48 PM UTC] - Backup finished successfully!
Verify the backup file creation was successful.
cumulus@switch:~$ cd /opt/<backup-directory>
cumulus@switch:~/opt/<backup-directory># ls
netq_master_snapshot_2019-06-04_07_24_50_UTC.tar.gz
To create a scheduled backup, add ./backuprestore.sh --backup --localdir /opt/<backup-directory> to an existing cron job, or create a new one.
Restore Your NetQ Data
You can restore NetQ data using the backup file you created in Back Up Your NetQ Data. You can restore your instance to the same NetQ Platform or NetQ Appliance or to a new platform or appliance. You do not need to stop the server where the backup file resides to perform the restoration, but logins to the NetQ UI fail during the restoration process. The restore option of the backup script copies the data from the backup file to the database, decompresses it, verifies the restoration, and starts all necessary services. You should not see any data loss as a result of a restore operation.
To restore NetQ on the same hardware where the backup file resides:
Run the restore script being sure to replace the backup-directory option with the name of the directory where the backup file resides.
You can abbreviate the restore and localdir options of this command to -r and -l to reduce typing.
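Based on the backup invocation earlier in this section, the restore form is:
cumulus@netq-ts:~$ sudo /usr/sbin/backuprestore.sh --restore --localdir /opt/<backup-directory>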
This is a sample of what you see while the script is running:
[Fri 26 Jul 2019 02:37:49 PM UTC] - Received Inputs for restore ...
WARNING: Restore procedure wipes out the existing contents of Database.
Once the Database is restored you loose the old data and cannot be recovered.
"Do you like to continue with Database restore:[Y(yes)/N(no)]. (Default:N)"
You must answer the above question to continue the restoration. After entering Y or yes, the output continues as follows:
[Fri 26 Jul 2019 02:37:50 PM UTC] - Able to find cassandra pod: cassandra-0
[Fri 26 Jul 2019 02:37:50 PM UTC] - Continuing with the procedure ...
[Fri 26 Jul 2019 02:37:50 PM UTC] - Backup local directory:/tmp/backuprestore/ exists....
[Fri 26 Jul 2019 02:37:50 PM UTC] - Removing any stale restore directories ...
Copying the file for restore to cassandra pod ....
[Fri 26 Jul 2019 02:37:50 PM UTC] - Able to copy the local directory contents to cassandra pod in /tmp/backuprestore/.
[Fri 26 Jul 2019 02:37:50 PM UTC] - copying the script to cassandra pod in dir:/tmp/backuprestore/....
Executing the Script for restoring the backup ...
/tmp/backuprestore//createbackup.sh: line 1: cript: command not found
[Fri 26 Jul 2019 02:40:12 PM UTC] - Able to exeute /tmp/backuprestore//createbackup.sh script on cassandra pod
[Fri 26 Jul 2019 02:40:12 PM UTC] - Restore finished successfully!
To restore NetQ on new hardware:
Copy the backup file from /opt/<backup-directory> on the older hardware to the backup directory on the new hardware.
Run the restore script on the new hardware, being sure to replace the backup-directory option with the name of the directory where the backup file resides.
This section describes the various integrations you can configure after installing NetQ.
LDAP Authentication
As an administrator, you can integrate the NetQ role-based access control (RBAC) with your lightweight directory access protocol (LDAP) server in on-premises deployments. NetQ maintains control over role-based permissions for the NetQ application. Currently there are two roles, admin and user. With the RBAC integration, user authentication is handled through LDAP and your directory service, such as Microsoft Active Directory, Kerberos, OpenLDAP, or Red Hat Directory Service. A copy of each user from LDAP is stored in the local NetQ database.
Integrating with an LDAP server does not prevent you from configuring local users (stored and managed in the NetQ database) as well.
Get Started
LDAP integration requires information about how to connect to your LDAP server, the type of authentication you plan to use, bind credentials, and, optionally, search attributes.
Provide Your LDAP Server Information
To connect to your LDAP server, you need the URI and bind credentials. The URI identifies the location of the LDAP server. It comprises a FQDN (fully qualified domain name) or IP address, and the port of the LDAP server where the LDAP client can connect. For example: myldap.mycompany.com or 192.168.10.2. Typically you use port 389 for connection over TCP or UDP. In production environments, you deploy a secure connection with SSL. In this case, the port used is typically 636. Setting the Enable SSL toggle automatically sets the server port to 636.
Specify Your Authentication Method
There are two types of user authentication: anonymous and basic.
Anonymous: LDAP client does not require any authentication. The user can access all resources anonymously. This is not commonly used for production environments.
Basic: (Also called Simple) LDAP client must provide a bind DN and password to authenticate the connection. When selected, the Admin credentials appear: Bind DN and Bind Password. You define the distinguished name (DN) using a string of variables. Some common variables include:
cn: Common name
ou: Organizational unit or group
dc: Domain name
dc: Domain extension
Bind DN: DN of user with administrator access to query the LDAP server; used for binding with the server. For example, uid=admin,ou=ntwkops,dc=mycompany,dc=com.
Bind Password: Password associated with Bind DN.
The Bind DN and password get sent as clear text. Only users with these credentials can perform LDAP operations.
If you are unfamiliar with the configuration of your LDAP server, contact your administrator to ensure you select the appropriate authentication method and credentials.
Define User Attributes
You need the following two attributes to define a user entry in a directory:
Base DN: Location in directory structure where search begins. For example, dc=mycompany,dc=com.
User ID: Type of identifier used to specify an LDAP user. This can vary depending on the authentication service you are using. For example, you can use the user ID (UID) or email address with OpenLDAP, whereas you might use the sAMAccountName with Active Directory.
Optionally, you can specify the first name, last name, and email address of the user.
Set Search Attributes
While optional, specifying search scope indicates where to start and how deep a given user can search within the directory. You specify the data to search for in the search query.
Search scope options include:
Subtree: Search for users from base, subordinates at any depth (default)
Base: Search for users at the base level only; no subordinates
One level: Search for immediate children of user; not at base or for any descendants
Subordinate: Search for subordinates at any depth of user; but not at base
A typical search query for users could be {userIdAttribute}={userId}.
Create an LDAP Configuration
You can configure one LDAP server per bind DN (distinguished name). After you configure LDAP, you can verify the connectivity and save the configuration.
To create an LDAP configuration:
Click menu . Under Admin, select Management.
Locate the LDAP Server Info card, and click Configure LDAP.
Fill out the LDAP server configuration form according to your particular configuration.
Note: Items with an asterisk (*) are required. All others are optional.
Click Save to complete the configuration, or click Cancel to discard the configuration.
LDAP config cannot be changed after it is configured. If you need to change the configuration, you must delete the current LDAP configuration and create a new one. Note that if you change the LDAP server configuration, all users created against that LDAP server remain in the NetQ database and continue to be visible, but are no longer viable. You must manually delete those users if you do not want to see them.
Example LDAP Configurations
A variety of example configurations are provided here. Scenarios 1-3 are based on using an OpenLDAP or similar authentication service. Scenario 4 is based on using the Active Directory service for authentication.
Scenario 1: Base Configuration
In this scenario, we are configuring the LDAP server with anonymous authentication, a User ID based on an email address, and a search scope of base.
Host Server URL: ldap1.mycompany.com
Host Server Port: 389
Authentication: Anonymous
Base DN: dc=mycompany,dc=com
User ID: email
Search Scope: Base
Search Query: {userIdAttribute}={userId}
Scenario 2: Basic Authentication and Subset of Users
In this scenario, we are configuring the LDAP server with basic authentication, for access only by the persons in the network operators group, and a limited search scope.
Host Server URL: ldap1.mycompany.com
Host Server Port: 389
Authentication: Basic
Admin Bind DN: uid=admin,ou=netops,dc=mycompany,dc=com
Admin Bind Password: nqldap!
Base DN: dc=mycompany,dc=com
User ID: UID
Search Scope: One Level
Search Query: {userIdAttribute}={userId}
Scenario 3: Scenario 2 with Widest Search Capability
In this scenario, we are configuring the LDAP server with basic authentication, for access only by the persons in the network administrators group, and an unlimited search scope.
Host Server URL: 192.168.10.2
Host Server Port: 389
Authentication: Basic
Admin Bind DN: uid=admin,ou=netadmin,dc=mycompany,dc=com
Admin Bind Password: 1dap*netq
Base DN: dc=mycompany,dc=net
User ID: UID
Search Scope: Subtree
Search Query: {userIdAttribute}={userId}
Scenario 4: Scenario 3 with Active Directory Service
In this scenario, we are configuring the LDAP server with basic authentication, for access only by the persons in the given Active Directory group, and an unlimited search scope.
Host Server URL: 192.168.10.2
Host Server Port: 389
Authentication: Basic
Admin Bind DN: cn=netq,ou=45,dc=mycompany,dc=com
Admin Bind Password: nq&4mAd!
Base DN: dc=mycompany,dc=net
User ID: sAMAccountName
Search Scope: Subtree
Search Query: {userIdAttribute}={userId}
Add LDAP Users to NetQ
Click menu . Under Admin, select Management.
Locate the User Accounts card, and click Manage.
On the User Accounts tab, click Add User.
Select LDAP User, then enter the user’s ID.
Enter your administrator password, then select Search.
If the user is found, the email address, first, and last name fields are automatically populated. If searching is not enabled on the LDAP server, you must enter the information manually.
If the fields are not automatically filled in, and searching is enabled on the LDAP server, you might require changes to the mapping file.
Select the NetQ user role for this user, admin or user, in the User Type dropdown.
Enter your admin password, and click Save, or click Cancel to discard the user account.
LDAP user passwords are not stored in the NetQ database and are always authenticated against LDAP.
Repeat these steps to add additional LDAP users.
Remove LDAP Users from NetQ
You can remove LDAP users in the same manner as local users.
Click menu . Under Admin, select Management.
Locate the User Accounts card, and click Manage.
Select the user or users you want to remove.
Click in the Edit menu.
If you delete an LDAP user in LDAP, it is not automatically deleted from NetQ; however, the login credentials for these LDAP users stop working immediately.
Integrate NetQ with Grafana
Switches collect statistics about the performance of their interfaces. The NetQ Agent on each switch collects these statistics every 15 seconds and then sends them to your NetQ Appliance or Virtual Machine.
NetQ collects statistics for physical interfaces; it does not collect statistics for virtual interfaces, such as bonds, bridges, and VXLANs.
NetQ displays:
Transmit with tx_ prefix: bytes, carrier, colls, drop, errs, packets
Receive with rx_ prefix: bytes, drop, errs, frame, multicast, packets
You can use Grafana, an open source analytics and monitoring tool, to view these statistics. The fastest way to achieve this is by installing Grafana on an application server or locally per user, and then installing the NetQ plugin.
If you do not have Grafana installed already, refer to grafana.com for instructions on installing and configuring the Grafana tool.
Install NetQ Plugin for Grafana
Use the Grafana CLI to install the NetQ plugin. For more detail about this command, refer to the Grafana CLI documentation.
The Grafana plugin comes unsigned. Before you can install it, you need to update the grafana.ini file then restart the Grafana service:
Edit /etc/grafana/grafana.ini and add allow_loading_unsigned_plugins = netq-dashboard to the file.
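After saving grafana.ini, restart Grafana and install the plugin. A sketch, assuming a systemd-based Grafana host; the plugin package URL is a placeholder, so substitute the NetQ plugin package location provided by NVIDIA:
sudo systemctl restart grafana-server
grafana-cli --pluginUrl <netq-plugin-package-url> plugins install netq-dashboard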
Configure the Net-Q data source in Grafana, entering the URL for your NetQ deployment. For Cumulus in the Cloud (CITC), use plugin.air.netq.nvidia.com.
Select procdevstats from the Module dropdown.
Enter your credentials (the ones used to log in).
For NetQ cloud deployments only, if you have more than one premises configured, you can select the premises you want to view, as follows:
If you leave the Premises field blank, the first premises name is selected by default
If you enter a premises name, that premises is selected for viewing
Note: If multiple premises are configured with the same name, then the first premises of that name is selected for viewing
Click Save & Test.
Create Your NetQ Dashboard
With the data source configured, you can create a dashboard with the transmit and receive statistics of interest to you.
Create a Dashboard
Click to open a blank dashboard.
Click (Dashboard Settings) at the top of the dashboard.
Add Variables
Click Variables.
Enter hostname into the Name field.
Enter hostname into the Label field.
Select Net-Q from the Data source list.
Select On Dashboard Load from the Refresh list.
Enter hostname into the Query field.
Click Add.
You should see a preview at the bottom of the hostname values.
Click Variables to add another variable for the interface name.
Enter ifname into the Name field.
Enter ifname into the Label field.
Select Net-Q from the Data source list.
Select On Dashboard Load from the Refresh list.
Enter ifname into the Query field.
Click Add.
You should see a preview at the bottom of the ifname values.
Click Variables to add another variable for metrics.
Enter metrics into the Name field.
Enter metrics into the Label field.
Select Net-Q from the Data source list.
Select On Dashboard Load from the Refresh list.
Enter metrics into the Query field.
Click Add.
You should see a preview at the bottom of the metrics values.
Add Charts
Now that the variables are defined, click to return to the new dashboard.
Click Add Query.
Select Net-Q from the Query source list.
Select the interface statistic you want to view from the Metric list.
Click the General icon.
Select hostname from the Repeat list.
Set any other parameters around how to display the data.
Return to the dashboard.
Select one or more hostnames from the hostname list.
Select one or more interface names from the ifname list.
Select one or more metrics to display for these hostnames and interfaces from the metrics list.
The following example shows a dashboard with two hostnames, two interfaces, and one metric selected. The more values you select from the variable options, the more charts appear on your dashboard.
Analyze the Data
When you have configured the dashboard, you can start analyzing the data. You can explore the data by modifying the viewing parameters in one of several ways using the dashboard tool set:
Select a different time period for the data by clicking the forward or back arrows. The default time range is dependent on the width of your browser window.
Zoom in on the dashboard by clicking the magnifying glass.
Manually refresh the dashboard data, or set an automatic refresh rate for the dashboard from the down arrow.
Add additional panels.
Click any chart title to edit or remove it from the dashboard.
Rename the dashboard by clicking the cog wheel and entering the new name.
SSO Authentication
You can integrate your NetQ Cloud deployment with a Microsoft Azure Active Directory (AD) or Google Cloud authentication server to support single sign-on (SSO) to NetQ. NetQ supports integration with SAML (Security Assertion Markup Language), OAuth (Open Authorization), and multi-factor authentication (MFA). Only one SSO configuration can be configured at a time.
You can create local user accounts with default access roles by enabling SSO. After enabling SSO, users logging in for the first time can sign up for SSO through the NetQ login screen or with a link provided by an admin.
Add SSO Configuration and User Accounts
To integrate your authentication server:
Expand the menu on the NetQ dashboard.
Under Admin, select Management. Locate the SSO Configuration card and select Manage.
Select either SAML or OpenID (which uses OAuth with OpenID Connect)
Specify the parameters:
You need several pieces of data from your Microsoft Azure or Google account and authentication server to complete the integration.
SSO Organization is typically a company’s name. The name entered in this field will appear in the SSO signup URL.
Access Type is the role (either user or admin) automatically assigned to users when they initialize their account via SSO login.
Name is a unique name for the SSO configuration.
Client ID is the identifier for your resource server.
Client Secret is the secret key for your resource server.
Authorization Endpoint is the URL of the authorization application.
Token Endpoint is the URL of the authorization token.
Select Test to verify the configuration and ensure that you can log in. If it is not working, you are logged out. Check your specification and retest the configuration until it is working properly.
Select Close. The card reflects the configuration:
To require users to log in using this SSO configuration, select change under the current Disabled status and confirm. The card reflects that SSO is enabled:
After an admin has configured and enabled SSO, users logging in for the first time can sign up for SSO.
Select Test to verify the configuration and ensure that you can log in. If it is not working, you are logged out. Check your specification and retest the configuration until it is working properly.
Select Close. The card reflects the configuration:
To require users to log in using this SSO configuration, select change under the current Disabled status and confirm. The card reflects that SSO is enabled.
Select Submit to enable the configuration. The SSO card reflects this new status:
After an admin has configured and enabled SSO, users logging in for the first time can sign up for SSO.
The SSO organization you entered during the configuration will replace SSO_Organization in the URL.
Modify Configuration
You can change the specifications for SSO integration with your authentication server at any time, including changing to an alternate SSO type, disabling the existing configuration, or reconfiguring SSO.
Change SSO Type
From the SSO Configuration card:
Select Disable, then Yes.
Select Manage then select the desired SSO type and complete the form.
Copy the redirect URL on the success dialog into your identity provider configuration.
Select Test to verify that the login is working. Modify your specification and retest the configuration until it is working properly.
Select Update.
Disable SSO Configuration
From the SSO Configuration card:
Select Disable.
Select Yes to disable the configuration, or Cancel to keep it enabled.
Uninstall NetQ
You can remove the NetQ software from your system server and switches when necessary.
Remove the NetQ Agent and CLI
Use the apt-get purge command to remove the NetQ Agent or CLI package from a Cumulus Linux switch or an Ubuntu host.
cumulus@switch:~$ sudo apt-get update
cumulus@switch:~$ sudo apt-get purge netq-agent netq-apps
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following packages will be REMOVED:
netq-agent* netq-apps*
0 upgraded, 0 newly installed, 2 to remove and 0 not upgraded.
After this operation, 310 MB disk space will be freed.
Do you want to continue? [Y/n] Y
Creating pre-apt snapshot... 2 done.
(Reading database ... 42026 files and directories currently installed.)
Removing netq-agent (3.0.0-cl3u27~1587646213.c5bc079) ...
/usr/sbin/policy-rc.d returned 101, not running 'stop netq-agent.service'
Purging configuration files for netq-agent (3.0.0-cl3u27~1587646213.c5bc079) ...
dpkg: warning: while removing netq-agent, directory '/etc/netq/config.d' not empty so not removed
Removing netq-apps (3.0.0-cl3u27~1587646213.c5bc079) ...
/usr/sbin/policy-rc.d returned 101, not running 'stop netqd.service'
Purging configuration files for netq-apps (3.0.0-cl3u27~1587646213.c5bc079) ...
dpkg: warning: while removing netq-apps, directory '/etc/netq' not empty so not removed
Processing triggers for man-db (2.7.0.2-5) ...
grep: extra.services.enabled: No such file or directory
Creating post-apt snapshot... 3 done.
If you only want to remove the agent or the CLI, but not both, specify just the relevant package in the apt-get purge command.
To verify the removal of the packages from the switch, run:
cumulus@switch:~$ dpkg-query -l netq-agent
dpkg-query: no packages found matching netq-agent
cumulus@switch:~$ dpkg-query -l netq-apps
dpkg-query: no packages found matching netq-apps
Use the yum remove command to remove the NetQ agent or CLI package from a RHEL7 or CentOS host.
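For example:
root@rhel7:~# sudo yum remove netq-agent netq-apps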
Verify the removal of the packages from the switch.
cumulus@switch:~$ dpkg-query -l netq-agent
dpkg-query: no packages found matching netq-agent
cumulus@switch:~$ dpkg-query -l netq-apps
dpkg-query: no packages found matching netq-apps
Delete the virtual machine according to the usual VMware or KVM practice.
Delete a virtual machine from the host computer using one of the following methods:
Right-click the name of the virtual machine in the Favorites list, then select Delete from Disk
Select the virtual machine and choose VM > Delete from disk
Delete a virtual machine from the host computer using one of the following methods:
Run virsh undefine <vm-domain> --remove-all-storage
Run virsh undefine <vm-domain> --wipe-storage
Configuration Management
From initial configuration and provisioning of devices to events and notifications, administrators and operators are responsible for setting up and managing the configuration of the network. The topics in this section provide instructions for managing the NetQ UI, physical and software inventory, events and notifications, and for provisioning your devices and network.
User Management
As an admin, you can manage your users and authentication settings from the NetQ Management dashboard.
NetQ enables you to provision your switches using the lifecycle management feature in the NetQ UI or the NetQ CLI. Also included here are management procedures for NetQ Agents and optional post-installation configurations.
Manage Switches through Their Lifecycle
Only administrative users can perform the tasks described in this topic.
As an administrator, you want to manage the deployment of NVIDIA product software onto your network devices (servers, appliances, and switches) in the most efficient way and with as much information about the process as possible.
Using the NetQ UI or CLI, lifecycle management enables you to:
Manage Cumulus Linux and NetQ images in a local repository
Configure switch access credentials (required for installations and upgrades)
Manage Cumulus Linux switch inventory and roles
Create snapshots of the network state at various times
Create NetQ configuration profiles
Upgrade NetQ (Agents and CLI) on Cumulus Linux switches running NetQ Agents
Install or upgrade NetQ (Agents and CLI) on Cumulus Linux switches
View a result history of upgrade attempts
This feature is fully enabled for on-premises deployments and fully disabled for cloud deployments. Contact your local NVIDIA sales representative or submit a support ticket to activate LCM on cloud deployments.
Access Lifecycle Management Features
To manage the various lifecycle management features using the NetQ UI, open the Manage Switch Assets page in one of the following ways:
Click , then select Manage Switches
Click in a workbench header
Click (Devices) in a workbench header, then select Manage switches
The Manage Switch Assets view provides access to switch management, image management, and configuration management features as well as job history. Each tab provides cards that let the administrator manage the relevant aspect of switch assets.
To manage the various lifecycle management features using the NetQ CLI, use the netq lcm command set.
LCM Summary
The following summarizes the UI cards and CLI commands available for each LCM function.
Switch Management: Discover switches, view switch inventory, assign roles, set user access credentials, and perform software installation and upgrade networkwide.
NetQ UI cards: Switches, Access
NetQ CLI commands: netq lcm show switches, netq lcm add role, netq lcm upgrade, netq lcm add/del/show credentials, netq lcm discover
Image Management: View, add, and remove images for software installation and upgrade.
NetQ UI cards: Cumulus Linux Images, NetQ Images
NetQ CLI commands: netq lcm add/del/show netq-image, netq lcm add/del/show cl-images, netq lcm add/show default-version
Job History: View the results of installation, upgrade, and configuration assignment jobs.
NetQ UI cards: CL Upgrade History, NetQ Install and Upgrade History, Config Assignment History
NetQ CLI commands: netq lcm show status, netq lcm show upgrade-jobs
NetQ and Network OS Images
NetQ and network OS (Cumulus Linux and SONiC) images are managed with LCM. This section details how to view images, check for missing images, and upgrade images.
The network OS and NetQ images are available in several variants based on the software version (x.y.z), the CPU architecture (ARM, x86), platform (based on ASIC vendor), SHA checksum, and so forth. When LCM discovers Cumulus Linux switches running NetQ in your network, it extracts the metadata needed to select the appropriate image for a given switch. Similarly, LCM discovers and extracts the metadata from NetQ images.
The Cumulus Linux Images and NetQ Images cards in the NetQ UI provide a summary of image status in LCM. They show the total number of images in the repository, a count of missing images, and the starting points for adding and managing your images.
The netq lcm show cl-images and netq lcm show netq-images commands also display a summary of the Cumulus Linux or NetQ images, respectively, uploaded to the LCM repo on the NetQ appliance or VM.
Default Version Assignment
You can assign a specific OS or NetQ version as the default version to use when installing or upgrading switches. We recommend that you choose the newest version that you intend to install or upgrade on all, or the majority, of your switches. The default selection can be overridden during individual installation and upgrade job creation if an alternate version is needed for a given set of switches.
Missing Images
You should upload images for each network OS and NetQ version currently installed in your inventory so you can support rolling back to a known good version should an installation or upgrade fail. The NetQ UI prompts you to upload any missing images to the repository.
For example, if you have both Cumulus Linux 3.7.3 and 3.7.11 versions, some running on ARM and some on x86 architectures, then LCM verifies the presence of each of these images. If only the 3.7.3 x86, 3.7.3 ARM, and 3.7.11 x86 images are in the repository, the NetQ UI would list the 3.7.11 ARM image as missing. For NetQ, you need both the netq-apps and netq-agent packages for each release variant.
If you have specified a default network OS and/or NetQ version, the NetQ UI also verifies that the necessary versions of the default image are available based on the known switch inventory, and if not, lists those that are missing.
While you do not have to upload images that NetQ determines to be missing, not doing so might cause failures when you attempt to upgrade your switches.
Upload Images
For fresh installations of NetQ 4.3, no images have yet been uploaded to the LCM repository. If you are upgrading from NetQ 3.0.x-3.2.x, the Cumulus Linux images you have previously added are still present.
In preparation for Cumulus Linux upgrades, the recommended image upload flow is:
In a fresh NetQ install, add images that match your current inventory.
Use the following instructions to upload missing network OS and NetQ images.
For network OS images:
On the Manage Switch Assets page, select Upgrade, then Image Management.
On the Cumulus Linux Images card, select View # missing CL images to see which images you need. This opens a list of missing images.
If you have already specified a default image, you must click Manage and then Missing to see the missing images.
Select one or more of the missing images and make note of the version, ASIC vendor, and CPU architecture for each.
Note the Disk Space Utilized information in the header to verify that you have enough space to upload disk images.
Download the network OS disk images (.bin files) from the NVIDIA Enterprise Support Portal. Log into the portal and from the Downloads tab, select Switches and Gateways. Under Switch Software, click All downloads next to Cumulus Linux for Mellanox Switches. Select the current version and the target version, then click Show Downloads Path. Download the file.
Back in the UI, select (Add Image) above the table.
Provide the .bin file from an external drive that matches the criteria for the selected image(s), either by dragging and dropping or by selecting it from a directory.
Click Import.
If the upload was successful, you will receive a confirmation dialog.
If the upload was not successful, an Image Import Failed message appears. Close the Import Image dialog and try uploading the file again.
Click Done.
Click Uploaded to verify the image is in the repository.
Click to return to the LCM dashboard.
The Cumulus Linux Images card now shows the number of images you uploaded.
Download the network OS disk images (.bin files) from the NVIDIA Enterprise Support Portal. Log into the portal and from the Downloads tab, select Switches and Gateways. Under Switch Software, click All downloads next to Cumulus Linux for Mellanox Switches. Select the current version and the target version, then click Show Downloads Path. Download the file.
Upload the images to the LCM repository. This example uses a Cumulus Linux 4.2.0 disk image.
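A sketch of the upload command, assuming the downloaded 4.2.0 disk image is at the illustrative path shown (the filename will vary with the exact build you downloaded):
cumulus@switch:~$ netq lcm add cl-image /path/to/cumulus-linux-4.2.0-mlx-amd64.bin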
Repeat step 2 for each image you need to upload to the LCM repository.
For NetQ images:
Click Upgrade, then click Image Management.
On the NetQ Images card, select View # missing NetQ images to see which images you need. This opens a list of missing images.
If you have already specified a default image, you must click Manage and then Missing to see the missing images.
Select one or all of the missing images and make note of the OS version, CPU architecture, and image type. Remember that you need both netq-apps and netq-agent for NetQ to perform the installation or upgrade.
Download the NetQ Debian packages needed for upgrade from the NetQ repository, selecting the appropriate OS version and architecture. Place the files in an accessible part of your local network.
Back in the UI, click (Add Image) above the table.
Provide the .deb file(s) from an external drive that matches the criteria for the selected image, either by dragging and dropping it onto the dialog or by selecting it from a directory.
Click Import.
On successful completion, you receive confirmation of the upload.
If the upload was not successful, an Image Import Failed message appears. Close the Import Image dialog and try uploading the file again.
Click Done.
Click Uploaded to verify the images are in the repository.
After you upload all the missing images, the Missing list is empty.
Click to return to the LCM dashboard.
The NetQ Images card now shows the number of images you uploaded.
Download the NetQ Debian packages needed for upgrade from the NetQ repository, selecting the appropriate version and hypervisor/platform. Place them in an accessible part of your local network.
Upload the images to the LCM repository. This example uploads the two packages (netq-agent and netq-apps) needed for NetQ version 4.0.0 for a NetQ appliance or VM running Ubuntu 18.04 with an x86 architecture.
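A sketch of the two upload commands; the package filenames are illustrative and will vary with the exact build you downloaded:
cumulus@switch:~$ netq lcm add netq-image /path/to/netq-agent_4.0.0_amd64.deb
cumulus@switch:~$ netq lcm add netq-image /path/to/netq-apps_4.0.0_amd64.deb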
To upload the network OS or NetQ images that you want to use for upgrade, first download the Cumulus Linux or SONiC disk images (.bin files) and NetQ Debian packages needed for upgrade from the NVIDIA Enterprise Support Portal and NetQ repository, respectively. Place them in an accessible part of your local network.
If you are upgrading the network OS on switches with different ASIC vendors or CPU architectures, you need more than one image. For NetQ, you need both the netq-apps and netq-agent packages for each variant.
Then continue with the instructions here based on whether you want to use the NetQ UI or CLI.
Click Upgrade, then click Image Management.
Click Add Image on the Cumulus Linux Images or NetQ Images card.
Provide one or more images from an external drive, either by dragging and dropping onto the dialog or by selecting from a directory.
Click Import.
Monitor the progress until it completes. Click Done.
Click to return to the LCM dashboard.
The NetQ Images card is updated to show the number of additional images you uploaded.
Use the netq lcm add cl-image <text-image-path> and netq lcm add netq-image <text-image-path> commands to upload the images. Run the relevant command for each image that needs to be uploaded.
Lifecycle management does not have a default network OS or NetQ upgrade version specified automatically. With the NetQ UI, you can specify the version that is appropriate for your network to ease the upgrade process.
To specify a default version in the NetQ UI:
Click Upgrade, then click Image Management.
Click the Click here to set the default CL version link in the middle of the Cumulus Linux Images card, or click the Click here to set the default NetQ version link in the middle of the NetQ Images card.
Select the version you want to use as the default for switch upgrades.
Click Save. The default version is now displayed on the relevant Images card.
In the CLI, you can check which version of the network OS or NetQ is the default.
To see which version of Cumulus Linux is the default, run netq lcm show default-version cl-images:
cumulus@switch:~$ netq lcm show default-version cl-images
ID Name CL Version CPU ASIC Last Changed
------------------------- --------------- ----------- -------- --------------- -------------------------
image_cc97be3955042ca4185 cumulus-linux-4 4.2.0 x86_64 VX Tue Jan 5 22:10:59 2021
7c4d0fe95296bcea3e372b437 .2.0-vx-amd64-1
a535a4ad23ca300d52c3 594775435.dirty
zc24426ca.bin
To see which version of NetQ is the default, run netq lcm show default-version netq-images:
cumulus@switch:~$ netq lcm show default-version netq-images
ID Name NetQ Version CL Version CPU Image Type Last Changed
------------------------- --------------- ------------- ----------- -------- -------------------- -------------------------
image_d23a9e006641c675ed9 netq-agent_4.0. 4.0.0 cl4u32 x86_64 NETQ_AGENT Tue Jan 5 22:23:50 2021
e152948a9d1589404e8b83958 0-cl4u32_160939
d53eb0ce7698512e7001 1187.7df4e1d2_a
md64.deb
image_68db386683c796d8642 netq-apps_4.0.0 4.0.0 cl4u32 x86_64 NETQ_CLI Tue Jan 5 22:23:54 2021
2f2172c103494fef7a820d003 -cl4u32_1609391
de71647315c5d774f834 187.7df4e1d2_am
d64.deb
Export Images
You can export a listing of the network OS and NetQ images stored in the LCM repository for reference.
To export image listings:
Open the LCM dashboard.
Click Upgrade, then click Image Management.
Click Manage on the Cumulus Linux Images or NetQ Images card.
Optionally, use the filter option above the table on the Uploaded tab to narrow down a large listing of images.
Click above the table.
Choose the export file type and click Export.
Use the json option with the netq lcm show cl-images command to output a list of the Cumulus Linux image files stored in the LCM repository.
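For example, to list the Cumulus Linux images in the repository in JSON format:
cumulus@switch:~$ netq lcm show cl-images json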
You must have switch access credentials to install and upgrade software on a switch. You can choose between basic authentication (SSH username/password) and SSH (Public/Private key) authentication. These credentials apply to all switches. If some of your switches have alternate access credentials, you must change them or modify the credential information before attempting installations or upgrades with the lifecycle management feature.
Specify Switch Credentials
Switch access credentials are not specified by default. You must add these.
To specify access credentials:
Open the LCM dashboard.
Click the Click here to add Switch access link on the Access card.
Select the authentication method you want to use; SSH or Basic Authentication. Basic authentication is selected by default.
Be sure to use credentials for a user account that has permission to configure switches.
The default credentials for Cumulus Linux have changed from cumulus/CumulusLinux! to cumulus/cumulus for releases 4.2 and later. For details, read Cumulus Linux User Accounts.
Enter a username.
Enter a password.
Click Save.
The Access card now indicates your credential configuration.
You must have sudoer permission to properly configure switches when using the SSH key method.
Create a pair of SSH private and public keys.
ssh-keygen -t rsa -C "<USER>"
Copy the SSH public key to each switch that you want to upgrade using one of the following methods:
Manually copy the SSH public key to the /home/<USER>/.ssh/authorized_keys file on each switch, or
Run ssh-copy-id USER@<switch_ip> on the server where you generated the SSH key pair for each switch
Copy the SSH private key into the entry field in the Create Switch Access card.
For security, your private key is stored in an encrypted format, and only provided to internal processes while encrypted.
The Access card now indicates your credential configuration.
The default credentials for Cumulus Linux have changed from cumulus/CumulusLinux! to cumulus/cumulus for releases 4.2 and later. For details, read Cumulus Linux User Accounts.
To configure SSH authentication using a public/private key:
You must have sudoer permission to properly configure switches when using the SSH Key method.
If the keys do not yet exist, create a pair of SSH private and public keys.
ssh-keygen -t rsa -C "<USER>"
Copy the SSH public key to each switch that you want to upgrade using one of the following methods:
Manually copy the SSH public key to the /home/<USER>/.ssh/authorized_keys file on each switch, or
Run ssh-copy-id USER@<switch_ip> on the server where you generated the SSH key pair for each switch
To change the basic authentication credentials, run the add credentials command with the new username and/or password. This example changes the password for the cumulus account created above:
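A minimal sketch, assuming the username and password keywords of the netq lcm add credentials command; the new password is shown as a placeholder:
cumulus@switch:~$ netq lcm add credentials username cumulus password <new-password>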
You can remove the access credentials for switches using the NetQ CLI. Note that without valid credentials, you cannot upgrade your switches.
To remove the credentials, run netq lcm del credentials. Verify their removal by running netq lcm show credentials.
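For example:
cumulus@switch:~$ netq lcm del credentials
cumulus@switch:~$ netq lcm show credentials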
Switch Inventory and Roles
On initial installation, the lifecycle management feature provides an inventory of switches that have been automatically discovered by NetQ and are available for software installation or upgrade through NetQ. This includes all switches running Cumulus Linux 3.7.12 or later, SONiC 202012 or later, and NetQ Agent 4.1.0 or later in your network. You assign network roles to switches and select switches for software installation and upgrade from this inventory listing.
View the LCM Switch Inventory
You can view the switch inventory from the NetQ UI and the NetQ CLI.
A count of the switches NetQ was able to discover and the network OS versions that are running on those switches is available from the LCM dashboard.
To view a list of all switches known to lifecycle management, click Manage on the Switches card.
Review the list:
Sort the list by any column; hover over column title and click to toggle between ascending and descending order
Filter the list: click and enter parameter value of interest
If you have more than one network OS version running on your switches, you can click a version segment on the Switches card graph to open a list of switches pre-filtered by that version.
To view a list of all switches known to lifecycle management, run:
netq lcm show switches [version <text-cumulus-linux-version>] [json]
Use the version option to only show switches with a given network OS version, X.Y.Z.
This example shows all switches known by lifecycle management.
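For example, to list the full inventory:
cumulus@switch:~$ netq lcm show switches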
This listing is the starting point for network OS upgrades or NetQ installations and upgrades. If the switches you want to upgrade are not present in the list, you can:
Work with the list you have and add them later
Verify the missing switches are reachable using ping
Verify the NetQ Agent is fresh and version 4.1.0 or later for switches that already have the agent installed (click , then click Agents or run netq show agents)
Four pre-defined switch roles are available based on the Clos architecture: Superspine, Spine, Leaf, and Exit. With this release, you cannot create your own roles.
Switch roles:
Identify switch dependencies and determine the order in which switches get upgraded
Determine when to stop the process if a failure occurs
When you assign roles, the upgrade process begins with switches having the superspine role, then continues with the spine switches, leaf switches, exit switches, and finally switches with no role assigned. All switches with a given role must upgrade successfully before the upgrade of switches with the next dependent role can begin.
For example, you select a group of seven switches to upgrade. Three are spine switches and four are leaf switches. After you successfully upgrade all the spine switches, then you upgrade all the leaf switches. If one of the spine switches fails to upgrade, NetQ upgrades the other two spine switches, but the upgrade process stops after that, leaving the leaf switches untouched, and the upgrade job fails.
When only some of the selected switches have roles assigned in an upgrade job, NetQ upgrades the switches with roles first, then upgrades all switches with no roles assigned.
While role assignment is optional, using roles can prevent switches from becoming unreachable due to dependencies between switches or single attachments. And when you deploy MLAG pairs, switch roles avoid upgrade conflicts. For these reasons, NVIDIA highly recommends assigning roles to all your switches.
Assign Switch Roles
You can assign roles to one or more switches using the NetQ UI or the NetQ CLI.
Open the LCM dashboard.
On the Switches card, click Manage.
Select one switch or multiple switches to assign to the same role.
Click .
Select the role that applies to the selected switch(es).
Click Assign.
Note that the Role column is updated with the role assigned to the selected switch(es). To return to the full list of switches, click All.
Continue selecting switches and assigning roles until most or all switches have roles assigned.
A bonus of assigning roles to switches is that you can then filter the list of switches by their roles by clicking the appropriate tab.
To assign multiple switches to the same role, separate the hostnames with commas (no spaces). This example configures leaf01 through leaf04 switches with the leaf role:
netq lcm add role leaf switches leaf01,leaf02,leaf03,leaf04
View Switch Roles
You can view the roles assigned to the switches in the LCM inventory at any time.
Open the LCM dashboard.
On the Switches card, click Manage.
The assigned role appears in the Role column of the listing.
To view all switch roles, run:
netq lcm show switches [version <text-cumulus-linux-version>] [json]
Use the version option to only show switches with a given network OS version, X.Y.Z.
This example shows the role of all switches in the Role column of the listing.
If you accidentally assign an incorrect role to a switch, you can easily change it to the correct role.
To change a switch role:
Open the LCM dashboard.
On the Switches card, click Manage.
Select the switches with the incorrect role from the list.
Click .
Select the correct role. (Note that you can select No Role here as well to remove the role from the switches.)
Click Assign.
You use the same command to assign a role as you use to change the role.
For a single switch, run:
netq lcm add role exit switches border01
To assign multiple switches to the same role, separate the hostnames with commas (no spaces). For example:
cumulus@switch:~$ netq lcm add role exit switches border01,border02
Export List of Switches
Using the Switch Management feature you can export a listing of all or a selected set of switches.
To export the switch listing:
Open the LCM dashboard.
On the Switches card, click Manage.
Select one or more switches, filtering as needed, or select all switches (click ).
Click .
Choose the export file type and click Export.
Use the json option with the netq lcm show switches command to output a list of all switches in the LCM repository. Alternately, output only switches running a particular network OS version by including the version option.
cumulus@switch:~$ netq lcm show switches json
cumulus@switch:~$ netq lcm show switches version 3.7.11 json
Upgrade NetQ Agent Using LCM
The lifecycle management (LCM) feature enables you to upgrade to NetQ 4.3.0 on switches with an existing NetQ Agent using the NetQ UI or NetQ CLI. You can upgrade only the NetQ Agent or upgrade both the NetQ Agent and the NetQ CLI at the same time. You can run up to five jobs simultaneously; however, a given switch can only appear in one running job at a time.
The upgrade workflow includes the following steps:
Upgrades can be performed from NetQ Agent versions 2.4.x and 3.0.x-3.2.x. Lifecycle management does not support upgrades from NetQ 2.3.1 or earlier releases; you must perform a new installation in these cases. Refer to Install NetQ Agents.
Prepare for a NetQ Agent Upgrade
Prepare for a NetQ Agent upgrade by uploading the required NetQ images, specifying switch access credentials, and optionally assigning roles to your switches, as described in the previous sections.
You can upgrade NetQ Agents on switches as follows:
In the Switch Management tab, click Manage on the Switches card.
Select the individual switches (or click to select all switches) with older NetQ releases that you want to upgrade. Filter by role (on left) to narrow the listing and sort by column heading (such as hostname or IP address) to order the list in a way that helps you find the switches you want to upgrade.
Click (Upgrade NetQ) above the table.
From this point forward, the software walks you through the upgrade process, beginning with a review of the switches that you selected for upgrade.
Verify that the number of switches selected for upgrade matches your expectation.
Enter a name for the upgrade job. The name can contain a maximum of 22 characters (including spaces).
Review each switch:
Is the NetQ Agent version between 2.4.0 and 3.2.1? If not, this switch can only be upgraded through the switch discovery process.
Is the configuration profile the one you want to apply? If not, click Change config, then select an alternate profile to apply to all selected switches.
You can apply different profiles to switches in a single upgrade job by selecting a subset of switches (click checkbox for each switch) and then choosing a different profile. You can also change the profile on a per switch basis by clicking the current profile link and selecting an alternate one.
Scroll down to view all selected switches or use Search to find a particular switch of interest.
After you are satisfied with the included switches, click Next.
Review the summary indicating the number of switches and the configuration profile to be used. If either is incorrect, click Back and review your selections.
Select the version of NetQ Agent for upgrade. If you have designated a default version, keep the Default selection. Otherwise, select an alternate version by clicking Custom and selecting it from the list.
By default, the NetQ Agent and CLI are upgraded on the selected switches. If you do not want to upgrade the NetQ CLI, click Advanced and change the selection to No.
Click Next.
Several checks are performed to eliminate preventable problems during the upgrade process.
These checks verify the following when applicable:
Selected switches are not currently scheduled for, or in the middle of, a Cumulus Linux or NetQ Agent upgrade
Selected version of NetQ Agent is a valid upgrade path
All mandatory parameters have valid values, including MLAG configurations
All switches are reachable
The order to upgrade the switches, based on roles and configurations
If any of the pre-checks fail, review the error messages and take appropriate action.
If all of the pre-checks pass, click Upgrade to initiate the upgrade job.
Watch the progress of the upgrade job.
You can watch the detailed progress for a given switch by clicking .
Click to return to Switches listing.
For the switches you upgraded, you can verify the version is correctly listed in the NetQ_Version column. Click to return to the lifecycle management dashboard.
The NetQ Install and Upgrade History card is now visible in the Job History tab and shows the status of this upgrade job.
To upgrade the NetQ Agent on one or more switches, run:
This example creates a NetQ Agent upgrade job called upgrade-cl430-nq330. It upgrades the spine01 and spine02 switches with NetQ Agents version 4.1.0.
cumulus@switch:~$ netq lcm upgrade name upgrade-cl430-nq330 netq-version 4.1.0 hostnames spine01,spine02
Analyze the NetQ Agent Upgrade Results
After starting the upgrade you can monitor the progress in the NetQ UI. You can monitor the progress from the preview page or the Upgrade History page.
From the preview page, a green circle with rotating arrows appears for each switch as it is working. Alternately, you can close the detail of the job and see a summary of all current and past upgrade jobs on the NetQ Install and Upgrade History page. The job started most recently appears at the top, and the data refreshes periodically.
If you get disconnected while the job is in progress, it might appear as if nothing is happening. Try closing (click ) and reopening your view (click ), or refreshing the page.
Monitor the NetQ Agent Upgrade Job
Several viewing options are available for monitoring the upgrade job.
Monitor the job with full details open:
Monitor the job with only summary information in the NetQ Install and Upgrade History page. Open this view by clicking in the full details view; useful when you have multiple jobs running simultaneously.
When multiple jobs are running, scroll down or use the filters above the jobs to find the jobs of interest:
Time Range: Enter a range of time in which the upgrade job was created, then click Done.
All switches: Search for or select individual switches from the list, then click Done.
All switch types: Search for or select individual switch series, then click Done.
All users: Search for or select individual users who created an upgrade job, then click Done.
All filters: Display all filters at once to apply multiple filters at once. Additional filter options are included here. Click Done when satisfied with your filter criteria.
By default, filters show all items of the given filter type until restricted by these settings.
Monitor the job through the NetQ Install and Upgrade History card in the Job History tab. Click twice to return to the LCM dashboard.
Sample Successful NetQ Agent Upgrade
This example shows that all four of the selected switches were upgraded successfully. You can see the results in the Switches list as well.
Sample Failed NetQ Agent Upgrade
This example shows that an error has occurred trying to upgrade two of the four switches in a job. The error indicates that the access permissions for the switches are invalid. In this case, you need to modify the switch access credentials and then create a new upgrade job.
If you were watching this job from the LCM dashboard view, click View on the NetQ Install and Upgrade History card to return to the detailed view to resolve any issues that occurred.
To view the progress of upgrade jobs, run:
netq lcm show upgrade-jobs netq-image [json]
netq lcm show status <text-lcm-job-id> [json]
You can view the progress of one upgrade job at a time. To do so, you first need the job identifier and then you can view the status of that job.
This example shows all upgrade jobs that are currently running or have completed, and then shows the status of the job with a job identifier of job_netq_install_7152a03a8c63c906631c3fb340d8f51e70c3ab508d69f3fdf5032eebad118cc7.
cumulus@switch:~$ netq lcm show upgrade-jobs netq-image json
[
{
"jobId": "job_netq_install_7152a03a8c63c906631c3fb340d8f51e70c3ab508d69f3fdf5032eebad118cc7",
"name": "Leaf01-02 to NetQ330",
"netqVersion": "4.1.0",
"overallStatus": "FAILED",
"pre-checkStatus": "COMPLETED",
"warnings": [],
"errors": [],
"startTime": 1611863290557.0
}
]
cumulus@switch:~$ netq lcm show status netq-image job_netq_install_7152a03a8c63c906631c3fb340d8f51e70c3ab508d69f3fdf5032eebad118cc7
NetQ Upgrade FAILED
Upgrade Summary
---------------
Start Time: 2021-01-28 19:48:10.557000
End Time: 2021-01-28 19:48:17.972000
Upgrade CLI: True
NetQ Version: 4.1.0
Pre Check Status COMPLETED
Precheck Task switch_precheck COMPLETED
Warnings: []
Errors: []
Precheck Task version_precheck COMPLETED
Warnings: []
Errors: []
Precheck Task config_precheck COMPLETED
Warnings: []
Errors: []
Hostname CL Version NetQ Version Prev NetQ Ver Config Profile Status Warnings Errors Start Time
sion
----------------- ----------- ------------- ------------- ---------------------------- ---------------- ---------------- ------------ --------------------------
leaf01 4.2.1 4.1.0 3.2.1 ['NetQ default config'] FAILED [] ["Unreachabl Thu Jan 28 19:48:10 2021
e at Invalid
/incorrect u
sername/pass
word. Skippi
ng remaining
10 retries t
o prevent ac
count lockou
t: Warning:
Permanently
added '192.1
68.200.11' (
ECDSA) to th
e list of kn
own hosts.\r
\nPermission
denied,
please try a
gain."]
leaf02 4.2.1 4.1.0 3.2.1 ['NetQ default config'] FAILED [] ["Unreachabl Thu Jan 28 19:48:10 2021
e at Invalid
/incorrect u
sername/pass
word. Skippi
ng remaining
10 retries t
o prevent ac
count lockou
t: Warning:
Permanently
added '192.1
68.200.12' (
ECDSA) to th
e list of kn
own hosts.\r
\nPermission
denied,
please try a
gain."]
Reasons for NetQ Agent Upgrade Failure
Upgrades can fail at any of the stages of the process, including when backing up data, upgrading the NetQ software, and restoring the data. Failures can also occur when attempting to connect to a switch or perform a particular task on the switch.
Some of the common reasons for upgrade failures and the errors they present:
Reason: Switch is not reachable via SSH
Error Message: Data could not be sent to remote host “192.168.0.15.” Make sure this host can be reached over ssh: ssh: connect to host 192.168.0.15 port 22: No route to host

Reason: Switch is reachable, but user-provided credentials are invalid
Error Message: Invalid/incorrect username/password. Skipping remaining 2 retries to prevent account lockout: Warning: Permanently added ‘<hostname-ipaddr>’ to the list of known hosts. Permission denied, please try again.

Reason: Upgrade task could not be run
Error Message: Failure message depends on why the task could not be run. For example: /etc/network/interfaces: No such file or directory

Reason: Upgrade task failed
Error Message: Failed at- <task that failed>. For example: Failed at- MLAG check for the peerLink interface status

Reason: Retry failed after five attempts
Error Message: FAILED In all retries to process the LCM Job
Upgrade Cumulus Linux Using LCM
LCM lets you upgrade Cumulus Linux on one or more switches in your network through the NetQ UI or the NetQ CLI. You can run up to five upgrade jobs simultaneously; however, a given switch can only appear in one running job at a time.
You can upgrade Cumulus Linux from:
3.7.12 to later versions of Cumulus Linux 3
3.7.12 or later to 4.2.0 or later versions of Cumulus Linux 4
4.0 to later versions of Cumulus Linux 4
4.4.0 or later to Cumulus Linux 5.0 releases
5.0.0 or later to Cumulus Linux 5.1 or 5.2 releases
When upgrading to Cumulus Linux 5.0 or later, LCM backs up and restores flat file configurations in Cumulus Linux. After you upgrade to Cumulus Linux 5, running NVUE configuration commands replaces any configuration restored by NetQ LCM. See Upgrading Cumulus Linux for additional information.
LCM does not support Cumulus Linux upgrades when NVUE is enabled.
Workflows for Cumulus Linux Upgrades Using LCM
Three methods are available through LCM for upgrading Cumulus Linux on your switches based on whether the NetQ Agent is already installed on the switch or not, and whether you want to use the NetQ UI or the NetQ CLI:
Use NetQ UI or NetQ CLI for switches with NetQ Agent already installed
Use NetQ UI for switches without NetQ Agent installed
The workflows vary slightly with each approach:
Using the NetQ UI for switches with NetQ Agent installed, the workflow is:
Using the NetQ CLI for switches with NetQ Agent installed, the workflow is:
Using the NetQ UI for switches without NetQ Agent installed, the workflow is:
Upgrade Cumulus Linux on Switches with NetQ Agent Installed
You can upgrade Cumulus Linux on switches that already have a NetQ Agent installed using either the NetQ UI or NetQ CLI.
Prepare for Upgrade
Click (Devices) in any workbench header, then click Manage switches.
Assign a role to each switch (optional, but recommended).
Your LCM dashboard should look similar to this after you have completed these steps:
Create a discovery job to locate Cumulus Linux switches on the network. Use the netq lcm discover command, specifying a single IP address, a range of IP addresses where your switches are located in the network, or a CSV file containing the IP address and, optionally, the hostname and port for each switch on the network. The CSV columns can appear in any order, but the data in each row must match the header order; if the port is blank, NetQ uses switch port 22 by default.
cumulus@switch:~$ netq lcm discover ip-range 10.0.1.12
NetQ Discovery Started with job id: job_scan_4f3873b0-5526-11eb-97a2-5b3ed2e556db
Assign a role to each switch (optional, but recommended).
Perform a Cumulus Linux Upgrade
Upgrade Cumulus Linux on switches through either the NetQ UI or NetQ CLI:
Click (Devices) in any workbench header, then select Manage switches.
Click Manage on the Switches card.
Select the individual switches (or click to select all switches) that you want to upgrade. If needed, use the filter to narrow the listing and find the relevant switches.
Click (Upgrade CL) above the table.
From this point forward, the software walks you through the upgrade process, beginning with a review of the switches that you selected for upgrade.
Give the upgrade job a name. A name is required and can contain a maximum of 22 characters, including spaces and special characters.
Verify that the switches you selected are included, and that they have the correct IP address and roles assigned.
If you accidentally included a switch that you do NOT want to upgrade, hover over the switch information card and click to remove it from the upgrade job.
If the role is incorrect or missing, click , then select a role for that switch from the dropdown. Click to discard a role change.
When you are satisfied that the list of switches is accurate for the job, click Next.
Verify that you want to use the default Cumulus Linux or NetQ version for this upgrade job. If not, click Custom and select an alternate image from the list.
Default CL Version Selected
Custom CL Version Selected
Note that the switch access authentication method, Using global access credentials, indicates you have chosen either basic authentication with a username and password or SSH key-based authentication for all of your switches. Authentication on a per switch basis is not currently available.
Click Next.
Verify the upgrade job options.
By default, NetQ takes a network snapshot before the upgrade and another after the upgrade is complete. It also rolls back to the original Cumulus Linux version on any switch that fails to upgrade.
You can exclude selected services and protocols from the snapshots. By default, node and services are included, but you can deselect any of the other items. Click on one to remove it; click again to include it. This is helpful when you are not running a particular protocol or you have concerns about the amount of time it will take to run the snapshot. Note that removing services or protocols from the job might produce non-equivalent results compared with prior snapshots.
While these options provide a smoother upgrade process and are highly recommended, you can disable either of them by clicking No next to one or both options.
Click Next.
After the pre-checks have completed successfully, click Preview. If there are failures, refer to Precheck Failures.
These checks verify the following:
Selected switches are not currently scheduled for, or in the middle of, a Cumulus Linux or NetQ Agent upgrade
Selected versions of Cumulus Linux and NetQ Agent are valid upgrade paths
All mandatory parameters have valid values, including MLAG configurations
All switches are reachable
The order to upgrade the switches, based on roles and configurations
Review the job preview.
When all of your switches have roles assigned, this view displays the chosen job options (top center), the pre-checks status (top right and left in Pre-Upgrade Tasks), the order in which the switches are planned for upgrade (center; upgrade starts from the left), and the post-upgrade tasks status (right).
Roles assigned
When none of your switches have roles assigned or they are all of the same role, this view displays the chosen job options (top center), the pre-checks status (top right and left in Pre-Upgrade Tasks), a list of switches planned for upgrade (center), and the post-upgrade tasks status (right).
All roles the same
When some of your switches have roles assigned, any switches without roles get upgraded last and get grouped under the label Stage1.
Some roles assigned
When you are happy with the job specifications, click Start Upgrade.
Click Yes to confirm that you want to continue with the upgrade, or click Cancel to discard the upgrade job.
Perform the upgrade using the netq lcm upgrade cl-image command, providing a name for the upgrade job, the Cumulus Linux and NetQ version, and a comma-separated list of the hostname(s) to be upgraded:
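For example:
cumulus@switch:~$ netq lcm upgrade cl-image name upgrade-430 cl-version 4.3.0 netq-version 4.0.0 hostnames spine01,spine02,leaf01,leaf02 order spine,leaf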
You can also generate a network snapshot before and after the upgrade by adding the run-snapshot-before-after option to the command:
cumulus@switch:~$ netq lcm upgrade cl-image name upgrade-430 cl-version 4.3.0 netq-version 4.0.0 hostnames spine01,spine02,leaf01,leaf02 order spine,leaf run-snapshot-before-after
Restore on an Upgrade Failure
You can have LCM restore the previous version of Cumulus Linux if the upgrade job fails by adding the run-restore-on-failure option to the command. This is highly recommended.
cumulus@switch:~$ netq lcm upgrade cl-image name upgrade-430 cl-version 4.3.0 netq-version 4.0.0 hostnames spine01,spine02,leaf01,leaf02 order spine,leaf run-restore-on-failure
Precheck Failures
If one or more of the pre-checks fail, resolve the related issue and start the upgrade again. In the NetQ UI, these failures appear on the Upgrade Preview page. In the NetQ CLI, they appear as error messages in the netq lcm show upgrade-jobs cl-image command output.
The following describes common failures, their causes, and corrective actions.
Precheck Failure Messages
Pre-check: (1) Switch Order
Message: <hostname1> switch cannot be upgraded without isolating <hostname2>, <hostname3> which are connected neighbors. Unable to upgrade
Type: Warning
Description: Switches hostname2 and hostname3 get isolated during an upgrade, making them unreachable. These switches are skipped if you continue with the upgrade.
Corrective Action: Reconfigure hostname2 and hostname3 to have redundant connections, or continue with upgrade knowing that connectivity is lost with these switches during the upgrade process.

Pre-check: (2) Version Compatibility
Message: Unable to upgrade <hostname> with CL version <#> to <#>
Type: Error
Description: LCM only supports the following Cumulus Linux upgrades: 3.7.12 to later versions of Cumulus Linux 3; 3.7.12 or later to 4.2.0 or later versions of Cumulus Linux 4; 4.0 to later versions of Cumulus Linux 4; 4.4.0 or later to Cumulus Linux 5.0 releases; 5.0.0 or later to Cumulus Linux 5.1 releases
Corrective Action: Perform a fresh install of CL.

Message: Image not uploaded for the combination: CL Version - <x.y.z>, Asic Vendor - <NVIDIA | Broadcom>, CPU Arch - <x86 | ARM >
Type: Error
Description: The specified Cumulus Linux image is not available in the LCM repository

Message: Restoration image not uploaded for the combination: CL Version - <x.y.z>, Asic Vendor - <Mellanox | Broadcom>, CPU Arch - <x86 | ARM >
Type: Error
Description: The specified Cumulus Linux image needed to restore the switch back to its original version if the upgrade fails is not available in the LCM repository. This applies only when the “Roll back on upgrade failure” job option is selected.

Description: LCM cannot upgrade a switch that is not in its inventory.
Corrective Action: Verify you have the correct hostname or IP address for the switch. Verify the switch has NetQ Agent 4.1.0 or later installed: click , then click Agents in the Network section, view Version column. Upgrade NetQ Agents if needed. Refer to Upgrade NetQ Agents.

Message: Switch <hostname> is rotten. Cannot select for upgrade.
Type: Error
Description: LCM must be able to communicate with the switch to upgrade it.
Corrective Action: Troubleshoot the connectivity issue and retry upgrade when the switch is fresh.

Message: Total number of jobs <running jobs count> exceeded Max jobs supported 50
Type: Error
Description: LCM can support a total of 50 upgrade jobs running simultaneously.
Corrective Action: Wait for the total number of simultaneous upgrade jobs to drop below 50.

Message: Switch <hostname> is already being upgraded. Cannot initiate another upgrade.
Type: Error
Description: Switch is already a part of another running upgrade job.
Corrective Action: Remove switch from current job or wait until the competing job has completed.

Message: Backup failed in previous upgrade attempt for switch <hostname>.
Type: Warning
Description: LCM was unable to back up switch during a previously failed upgrade attempt.
Corrective Action: You could back up the switch manually prior to upgrade if you want to restore the switch after upgrade. Refer to Back Up and Restore NetQ.

Message: Restore failed in previous upgrade attempt for switch <hostname>.
Type: Warning
Description: LCM was unable to restore switch after a previously failed upgrade attempt.

Description: One or more switches stopped responding to the MLAG checks.

Message: MLAG configuration checks failed
Type: Error
Description: One or more switches failed the MLAG checks.

Message: For switch <hostname>, the MLAG switch with Role: secondary and ClagSysmac: <MAC address> does not exist.
Type: Error
Description: Identified switch is the primary in an MLAG pair, but the defined secondary switch is not in NetQ inventory.
Corrective Action: Verify the switch has NetQ Agent 4.1.0 or later installed: click , then click Agents in the Network section, view Version column. Upgrade NetQ Agent if needed. Refer to Upgrade NetQ Agents. Add the missing peer switch to NetQ inventory.
Analyze Results
After starting the upgrade you can monitor the progress of your upgrade job and the final results. While the views are different, essentially the same information is available from either the NetQ UI or the NetQ CLI.
You can track the progress of your upgrade job from the Preview page or the Upgrade History page of the NetQ UI.
From the preview page, a green circle with rotating arrows appears for each step as it is working. Alternately, you can close the job detail and see a summary of all current and past upgrade jobs on the Upgrade History page. The most recently started job appears at the bottom, and the data refreshes every minute.
If you get disconnected while the job is in progress, it might appear as if nothing is happening. Try closing (click ) and reopening your view (click ), or refreshing the page.
Several viewing options are available for monitoring the upgrade job.
Monitor the job with full details open on the Preview page:
Single role
Multiple roles and some without roles
Each switch goes through a number of steps. To view these steps, click Details and scroll down as needed. Click to collapse the step detail. Click to close the detail popup.
Monitor the job with summary information only in the CL Upgrade History page. Open this view by clicking in the full details view:
This view is refreshed automatically. Click to view what stage the job is in.
Click to view the detailed view.
Monitor the job through the CL Upgrade History card in the Job History tab. Click twice to return to the LCM dashboard. As you perform more upgrades the graph displays the success and failure of each job.
Click View to return to the Upgrade History page as needed.
Sample Successful Upgrade
On successful completion, you can:
Compare the network snapshots taken before and after the upgrade.
Download details about the upgrade in the form of a JSON-formatted file, by clicking Download Report.
View the changes on the Switches card of the LCM dashboard.
Click , then Upgrade Switches.
In our example, all switches have been upgraded to Cumulus Linux 3.7.12.
Sample Failed Upgrade
If an upgrade job fails for any reason, you can view the associated error(s):
From the CL Upgrade History dashboard, find the job of interest.
Click .
Click .
Note in this example, all of the pre-upgrade tasks were successful, but backup failed on the spine switches.
To view what step in the upgrade process failed, click and scroll down. Click to close the step list.
To view details about the errors, either double-click the failed step or click Details and scroll down as needed. Click to collapse the step detail. Click to close the detail popup.
To see the progress of current upgrade jobs and the history of previous upgrade jobs, run netq lcm show upgrade-jobs cl-image:
cumulus@switch:~$ netq lcm show upgrade-jobs cl-image
Job ID Name CL Version Pre-Check Status Warnings Errors Start Time
------------ --------------- -------------------- -------------------------------- ---------------- ------------ --------------------
job_cl_upgra Leafs upgr to C 4.2.0 COMPLETED Fri Sep 25 17:16:10
de_ff9c35bc4 L410 2020
950e92cf49ac
bb7eb4fc6e3b
7feca7d82960
570548454c50
cd05802
job_cl_upgra Spines to 4.2.0 4.2.0 COMPLETED Fri Sep 25 16:37:08
de_9b60d3a1f 2020
dd3987f787c7
69fd92f2eef1
c33f56707f65
4a5dfc82e633
dc3b860
job_upgrade_ 3.7.12 Upgrade 3.7.12 WARNING Fri Apr 24 20:27:47
fda24660-866 2020
9-11ea-bda5-
ad48ae2cfafb
job_upgrade_ DataCenter 3.7.12 WARNING Mon Apr 27 17:44:36
81749650-88a 2020
e-11ea-bda5-
ad48ae2cfafb
job_upgrade_ Upgrade to CL3. 3.7.12 COMPLETED Fri Apr 24 17:56:59
4564c160-865 7.12 2020
3-11ea-bda5-
ad48ae2cfafb
To see details of a particular upgrade job, run netq lcm show status job-ID:
cumulus@switch:~$ netq lcm show status job_upgrade_fda24660-8669-11ea-bda5-ad48ae2cfafb
Hostname CL Version Backup Status Backup Start Time Restore Status Restore Start Time Upgrade Status Upgrade Start Time
---------- ------------ --------------- ------------------------ ---------------- ------------------------ ---------------- ------------------------
spine02 4.1.0 FAILED Fri Sep 25 16:37:40 2020 SKIPPED_ON_FAILURE N/A SKIPPED_ON_FAILURE N/A
spine03 4.1.0 FAILED Fri Sep 25 16:37:40 2020 SKIPPED_ON_FAILURE N/A SKIPPED_ON_FAILURE N/A
spine04 4.1.0 FAILED Fri Sep 25 16:37:40 2020 SKIPPED_ON_FAILURE N/A SKIPPED_ON_FAILURE N/A
spine01 4.1.0 FAILED Fri Sep 25 16:40:26 2020 SKIPPED_ON_FAILURE N/A SKIPPED_ON_FAILURE N/A
To see only Cumulus Linux upgrade jobs, run netq lcm show status cl-image job-ID.
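For example, assuming the job identifier from the previous output:
cumulus@switch:~$ netq lcm show status cl-image job_upgrade_fda24660-8669-11ea-bda5-ad48ae2cfafb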
Postcheck Failures
A successful upgrade can still have post-check warnings. For example, you updated the OS, but not all services are fully up and running after the upgrade. If one or more of the post-checks fail, warning messages appear in the Post-Upgrade Tasks section of the preview. Click the warning category to view the detailed messages.
The following describes common failures, their causes, and corrective actions.
Post-check Failure Messages
Post-check: Health of Services
Message: Service <service-name> is missing on Host <hostname> for <VRF default|VRF mgmt>.
Type: Warning
Description: A given service is not yet running on the upgraded host. For example: Service ntp is missing on Host Leaf01 for VRF default.
Corrective Action: Wait for up to x more minutes to see if the specified services come up.

Post-check: Switch Connectivity
Message: Service <service-name> is missing on Host <hostname> for <VRF default|VRF mgmt>.
Type: Warning
Description: A given service is not yet running on the upgraded host. For example: Service ntp is missing on Host Leaf01 for VRF default.
Corrective Action: Wait for up to x more minutes to see if the specified services come up.
Reasons for Upgrade Job Failure
Upgrades can fail at any of the stages of the process, including when backing up data, upgrading the Cumulus Linux software, and restoring the data. Failures can occur when attempting to connect to a switch or perform a particular task on the switch.
Some of the common reasons for upgrade failures and the errors they present:
Reason: Switch is not reachable via SSH
Error Message: Data could not be sent to remote host “192.168.0.15.” Make sure this host can be reached over ssh: ssh: connect to host 192.168.0.15 port 22: No route to host

Reason: Switch is reachable, but user-provided credentials are invalid
Error Message: Invalid/incorrect username/password. Skipping remaining 2 retries to prevent account lockout: Warning: Permanently added ‘<hostname-ipaddr>’ to the list of known hosts. Permission denied, please try again.

Reason: Upgrade task could not be run
Error Message: Failure message depends on why the task could not be run. For example: /etc/network/interfaces: No such file or directory

Reason: Upgrade task failed
Error Message: Failed at- <task that failed>. For example: Failed at- MLAG check for the peerLink interface status

Reason: Retry failed after five attempts
Error Message: FAILED In all retries to process the LCM Job
Upgrade Cumulus Linux on Switches Without NetQ Agent Installed
When you want to update Cumulus Linux on switches that do not have NetQ installed, NetQ provides the LCM switch discovery feature. The feature browses your network to find all Cumulus Linux switches, with and without NetQ currently installed, and determines the versions of Cumulus Linux and NetQ installed. The results of switch discovery are then used to install or upgrade Cumulus Linux and NetQ on all discovered switches in a single procedure rather than in two steps. You can run up to five jobs simultaneously; however, a given switch can only appear in one running job at a time.
If all your Cumulus Linux switches already have NetQ 2.4.x or later installed, you can upgrade them directly. Refer to Upgrade Cumulus Linux.
To discover switches running Cumulus Linux and upgrade Cumulus Linux and NetQ on them:
Click (Main Menu) and select Upgrade Switches, or click (Switches) in the workbench header, then click Manage switches.
On the Switches card, click Discover.
Enter a name for the scan.
Choose whether you want to look for switches by entering IP address ranges OR import switches using a comma-separated values (CSV) file.
If you do not have a switch listing, then you can manually add the address ranges where your switches are located in the network. This has the advantage of catching switches that might have been missed in a file.
A maximum of 50 addresses can be included in an address range. If necessary, break the range into smaller ranges.
To discover switches using address ranges:
Enter an IP address range in the IP Range field.
Ranges can be contiguous, for example 192.168.0.24-64, or non-contiguous, for example 192.168.0.24-64,128-190,235, but they must be contained within a single subnet.
Optionally, enter another IP address range (in a different subnet) by clicking .
For example, 198.51.100.0-128 or 198.51.100.0-128,190,200-253.
Add additional ranges as needed. Click to remove a range if needed.
If you decide to use a CSV file instead, the ranges you entered will remain if you return to using IP ranges again.
If you have a file of switches that you want to import, then it can be easier to use that, than to enter the IP address ranges manually.
To import switches through a CSV file:
Click Browse.
Select the CSV file containing the list of switches.
The CSV file must include a header containing hostname, ip, and port. The columns can appear in any order you like, but the data in each row must match the header order. You must have an IP address for each switch in your file; the hostname is optional, and if the port is blank, NetQ uses switch port 22 by default. For example, a CSV file representing part of the Cumulus reference topology could look like the sketch below.
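This is a minimal, hypothetical sketch; the hostnames, addresses, and ports shown are placeholders for your own switch data:
hostname,ip,port
leaf01,192.168.0.15,22
leaf02,192.168.0.16,
spine01,192.168.0.21,22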
Click Remove if you decide to use a different file or want to use IP address ranges instead. If you entered ranges before selecting the CSV file option, they remain.
Note that you can use the switch access credentials defined in Switch Credentials to access these switches. If you have issues accessing the switches, you might need to update your credentials.
Click Next.
When the network discovery is complete, NetQ presents the number of Cumulus Linux switches it found. Each switch can be in one of the following categories:
Discovered without NetQ: Switches found without NetQ installed
Discovered with NetQ: Switches found with some version of NetQ installed
Discovered but Rotten: Switches found that are unreachable
Incorrect Credentials: Switches found that are unreachable because the provided access credentials do not match those for the switches
OS not Supported: Switches found that are running Cumulus Linux version not supported by the LCM upgrade feature
Not Discovered: IP addresses which did not have an associated Cumulus Linux switch
If the discovery process does not find any switches for a particular category, then it does not display that category.
Select which switches you want to upgrade from each category by clicking the checkbox on each switch card.
Click Next.
Verify that the number of switches identified for upgrade and the configuration profile to be applied are correct.
Accept the default NetQ version or click Custom and select an alternate version.
By default, the NetQ Agent and CLI are upgraded on the selected switches. If you do not want to upgrade the NetQ CLI, click Advanced and change the selection to No.
Click Next.
Several checks are performed to eliminate preventable problems during the install process.
These checks verify the following:
Selected switches are not currently scheduled for, or in the middle of, a Cumulus Linux or NetQ Agent upgrade
Selected versions of Cumulus Linux and NetQ Agent are valid upgrade paths
All mandatory parameters have valid values, including MLAG configurations
All switches are reachable
The order to upgrade the switches, based on roles and configurations
If any of the pre-checks fail, review the error messages and take appropriate action.
If all of the pre-checks pass, click Install to initiate the job.
Monitor the job progress.
After starting the upgrade you can monitor the progress from the preview page or the Upgrade History page.
From the preview page, a green circle with rotating arrows is shown on each switch as it is working. Alternately, you can close the detail of the job and see a summary of all current and past upgrade jobs on the NetQ Install and Upgrade History page. The job started most recently is shown at the top, and the data is refreshed periodically.
If you are disconnected while the job is in progress, it might appear as if nothing is happening. Try closing (click ) and reopening your view (click ), or refreshing the page.
Several viewing options are available for monitoring the upgrade job.
Monitor the job with full details open:
Monitor the job with only summary information in the NetQ Install and Upgrade History page. Open this view by clicking in the full details view; useful when you have multiple jobs running simultaneously
Monitor the job through the NetQ Install and Upgrade History card on the LCM dashboard. Click twice to return to the LCM dashboard.
Investigate any failures and create new jobs to reattempt the upgrade.
If you previously ran a discovery job, as described above, you can show the results of that job by running the netq lcm show discovery-job command.
cumulus@switch:~$ netq lcm show discovery-job job_scan_921f0a40-5440-11eb-97a2-5b3ed2e556db
Scan COMPLETED
Summary
-------
Start Time: 2021-01-11 19:09:47.441000
End Time: 2021-01-11 19:09:59.890000
Total IPs: 1
Completed IPs: 1
Discovered without NetQ: 0
Discovered with NetQ: 0
Incorrect Credentials: 0
OS Not Supported: 0
Not Discovered: 1
Hostname IP Address MAC Address CPU CL Version NetQ Version Config Profile Discovery Status Upgrade Status
----------------- ------------------------- ------------------ -------- ----------- ------------- ---------------------------- ---------------- --------------
N/A 10.0.1.12 N/A N/A N/A N/A [] NOT_FOUND NOT_UPGRADING
cumulus@switch:~$
When the network discovery is complete, NetQ presents the number of Cumulus Linux switches it has found. The output displays their discovery status, which can be one of the following:
Discovered without NetQ: Switches found without NetQ installed
Discovered with NetQ: Switches found with some version of NetQ installed
Discovered but Rotten: Switches found that are unreachable
Incorrect Credentials: Switches found that are unreachable because the provided access credentials do not match those for the switches
OS not Supported: Switches found that are running Cumulus Linux version not supported by the LCM upgrade feature
NOT_FOUND: IP addresses which did not have an associated Cumulus Linux switch
After you determine which switches you need to upgrade, run the upgrade process as described above.
Network Snapshots
Creating and comparing network snapshots can be useful to validate that the network state has not changed. Snapshots are typically created when you upgrade or change the configuration of your switches in some way. This section describes the Snapshot card and content, as well as how to create and compare network snapshots at any time. Snapshots can be automatically created during the upgrade process for Cumulus Linux or SONiC. Refer to Perform a Cumulus Linux Upgrade.
Create a Network Snapshot
To create a snapshot:
From any workbench in the NetQ UI, click in the workbench header.
Click Create Snapshot.
Enter a name for the snapshot.
Choose the time for the snapshot:
For the current network state, click Now.
For the network state at a previous date and time, click Past, then click in Start Time field to use the calendar to step through selection of the date and time. You might need to scroll down to see the entire calendar.
Choose the services to include in the snapshot.
In the Choose options field, click any service name to remove that service from the snapshot. This would be appropriate if you do not support a particular service, or you are concerned that including it might cause the snapshot to take an excessive amount of time to complete. The checkmark next to the service and the service itself are grayed out when the service is removed. Click any service again to re-include it in the snapshot; the checkmark next to the service name is highlighted in green and is no longer grayed out.
The Node and Services options are mandatory, and cannot be selected or unselected.
If you remove services, be aware that snapshots taken in the past or future might not be equivalent when performing a network state comparison.
This example removes the OSPF and Route services from the snapshot being created.
Optionally, scroll down and click in the Notes field to add descriptive text for the snapshot to remind you of its purpose. For example: “This was taken before adding MLAG pairs,” or “Taken after removing the leaf36 switch.”
Click Finish.
A medium Snapshot card appears on your desktop. Spinning arrows are visible while it works. When it finishes you can see the number of items that have been captured, and if any failed. This example shows a successful result.
If you have already created other snapshots, Compare is active. Otherwise it is inactive (grayed-out).
When you are finished viewing the snapshot, click Dismiss to close the snapshot. The snapshot is not deleted, merely removed from the workbench.
Compare Network Snapshots
You can compare the state of your network before and after an upgrade or other configuration change to validate that the changes have not created an unwanted change in your network state.
To compare network snapshots:
Create a snapshot (as described in previous section) before you make any changes.
Make your changes.
Create a second snapshot.
Compare the results of the two snapshots.
Depending on what, if any, Snapshot cards are open on your workbench:
If both snapshot cards are open, place them next to each other to view a high-level comparison. Scroll down to see all of the items. To view a more detailed comparison, click Compare on one of the cards, then select the other snapshot from the list.
If only one of the snapshot cards is open, click Compare on the open card, then select the other snapshot to compare.
If neither snapshot card is open, click in the workbench header, then click Compare Snapshots. Click the two snapshots you want to compare, then click Finish. Note that you must select two snapshots before Finish becomes active.
In the latter two cases, the large Snapshot card opens. The only difference is in the card title. If you opened the comparison card from a snapshot on your workbench, the title includes the name of that card. If you open the comparison card through the Snapshot menu, the title is generic, indicating a comparison only. Functionally, you have reached the same point.
Scroll down to view all element comparisons.
Interpreting the Comparison Data
For each network element that is compared, count values and changes are shown:
In this example, there are changes to the MAC addresses and neighbors. The snapshot taken before the change (19JanGold) had a total count of 316 MAC addresses and 873 neighbors. The snapshot taken after the changes (Now) has a total count of 320 MAC addresses and 891 neighbors. Between the two totals you can see the number of items added, updated, and removed from one time to the next. In this case, four MAC addresses were added, nine MAC addresses were updated, and 18 neighbors were added.
The coloring does not indicate whether the addition, removal, or update of items is good or bad. It only indicates that a change occurred.
Be aware that the display order of the snapshots determines what is considered added or removed. Compare these two views of the same data.
More recent snapshot on right
More recent snapshot on left
You can also change which snapshots to compare. Select an alternate snapshot from one or both of the two snapshot dropdowns and then click Compare.
View Change Details
You can view additional details about the changes that have occurred between the two snapshots by clicking View Details. This opens the full screen Detailed Snapshot Comparison card.
From this card you can:
View changes for each of the elements that had added, updated, and removed items, and various information about each; only elements with changes are presented
Filter the added and removed items by clicking
Export all differences in JSON file format by clicking
The following table describes the information provided for each element type when changes are present:
Element
Data Descriptions
BGP
Hostname: Name of the host running the BGP session
VRF: Virtual route forwarding interface if used
BGP Session: Session that was removed or added
ASN: Autonomous system number
CLAG
Hostname: Name of the host running the CLAG session
CLAG Sysmac: MAC address for a bond interface pair that was removed or added
Interface
Hostname: Name of the host where the interface resides
IF Name: Name of the interface that was removed or added
IP Address
Hostname: Name of the host where address was removed or added
Prefix: IP address prefix
Mask: IP address mask
IF Name: Name of the interface that owns the address
Links
Hostname: Name of the host where the link was removed or added
You can decommission a switch or host from the NetQ UI using the Inventory | Devices card. This stops and disables the NetQ Agent service on the device, and decommissions it from the NetQ database.
Expand the Inventory | Devices card to list the devices in the current inventory:
This page tells you how to view the status of an agent, disable an agent, manage NetQ Agent logging, and configure the events the agent collects.
View NetQ Agent Status
To view the health of your NetQ Agents, run:
netq [<hostname>] show agents [fresh | dead | rotten | opta] [around <text-time>] [json]
You can view the status for a given switch, host, NetQ appliance, or virtual machine. You can also filter by status and view the status at a time in the past.
To view NetQ Agents that are not communicating, run:
cumulus@switch:~$ netq show agents rotten
No matching agents records found
To view NetQ Agent status on the NetQ appliance or VM, run:
cumulus@switch:~$ netq show agents opta
Matching agents records:
Hostname Status NTP Sync Version Sys Uptime Agent Uptime Reinitialize Time Last Changed
----------------- ---------------- -------- ------------------------------------ ------------------------- ------------------------- -------------------------- -------------------------
netq-ts Fresh yes 3.2.0-ub18.04u30~1601393774.104fb9e Mon Sep 21 16:46:53 2020 Tue Sep 29 21:13:07 2020 Tue Sep 29 21:13:07 2020 Thu Oct 1 16:29:51 2020
View NetQ Agent Configuration
You can view the current configuration of a NetQ Agent to determine what data it collects and where it sends that data. To view this configuration, run:
netq config show agent [kubernetes-monitor|loglevel|stats|sensors|frr-monitor|wjh|wjh-threshold|cpu-limit] [json]
This example shows a NetQ Agent in an on-premises deployment, talking to an appliance or VM at 127.0.0.1 using the default ports and VRF. There is no special configuration to monitor Kubernetes, FRR, interface statistics, sensors, or WJH, and there are no limits on CPU usage or change to the default logging level.
cumulus@switch:~$ netq config show agent
netq-agent value default
--------------------- --------- ---------
exhibitport
exhibiturl
server 127.0.0.1 127.0.0.1
cpu-limit 100 100
agenturl
enable-opta-discovery True True
agentport 8981 8981
port 31980 31980
vrf default default
()
To view the configuration of a particular aspect of a NetQ Agent, use the various options.
This example shows a NetQ Agent configured with a CPU limit of 60%.
cumulus@switch:~$ netq config show agent cpu-limit
CPU Quota
-----------
60%
()
Modify the Configuration of the NetQ Agent on a Node
The agent configuration commands enable you to do the following:
Add, Disable, and Remove a NetQ Agent
Start and Stop a NetQ Agent
Configure a NetQ Agent to Collect Selected Data (CPU usage limit, FRR, Kubernetes, sensors, WJH)
Configure a NetQ Agent to Send Data to a Server Cluster
Troubleshoot the NetQ Agent
Commands apply to one agent at a time, and you run them on the switch or host where the NetQ Agent resides.
Add and Remove a NetQ Agent
When you add or remove a NetQ Agent, you add or remove the IP address (and port and VRF, when specified) of the appliance or VM from the NetQ configuration file (/etc/netq/netq.yml). This information tells the agent where to send the data it collects.
To use the NetQ CLI to add or remove a NetQ Agent on a switch or host, run:
netq config add agent server <text-opta-ip> [port <text-opta-port>] [vrf <text-vrf-name>]
netq config del agent server
If you want to use a specific port on the appliance or VM, use the port option. If you want the data sent over a particular virtual route interface, use the vrf option.
This example shows how to add a NetQ Agent and tell it to send the data it collects to the NetQ Appliance or VM at the IPv4 address of 10.0.0.23 using the default port (on-premises = 31980; cloud = 443) and vrf (default).
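A minimal command sequence for this example, based on the syntax above (restarting the agent applies the change):
cumulus@switch:~$ netq config add agent server 10.0.0.23
cumulus@switch:~$ netq config restart agent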
You can temporarily disable the NetQ Agent on a node. Disabling the NetQ Agent maintains the data already collected in the NetQ database, but stops the NetQ Agent from collecting new data until you reenable it.
To disable a NetQ Agent, run:
cumulus@switch:~$ netq config stop agent
To reenable a NetQ Agent, run:
cumulus@switch:~$ netq config restart agent
Configure a NetQ Agent to Limit Switch CPU Usage
While not typically an issue, you can prevent the NetQ Agent from using more than a configurable amount of CPU resources on the switch. This setting requires Cumulus Linux version 3.6.x, 3.7.x, or 4.1.0 or later to be running on the switch.
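For example, to cap the agent at 60% CPU usage and apply the change, the commands would look like the following sketch; this assumes the add form mirrors the cpu-limit keyword shown in the netq config show agent syntax above:
cumulus@switch:~$ netq config add agent cpu-limit 60
cumulus@switch:~$ netq config restart agent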
Configure a NetQ Agent to Send Data to a Server Cluster
If you have a server cluster deployment, configure the NetQ Agent to send the data it collects to all of the servers in the cluster. You must separate the list of server IP addresses with commas, but no spaces. You can optionally specify a port or VRF.
This example configures the NetQ Agent on a switch to send the data to three servers located at 10.0.0.21, 10.0.0.22, and 10.0.0.23 using the rocket VRF.
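A sketch of the corresponding command, assuming the add form uses the same cluster-servers keyword as the del command below, with the servers listed as a comma-separated value and the VRF appended:
cumulus@switch:~$ netq config add agent cluster-servers 10.0.0.21,10.0.0.22,10.0.0.23 vrf rocket
cumulus@switch:~$ netq config restart agent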
To stop a NetQ Agent from sending data to a server cluster, run:
cumulus@switch:~$ netq config del agent cluster-servers
Configure Logging to Troubleshoot a NetQ Agent
The logging level used for a NetQ Agent determines what types of events get logged about the NetQ Agent on the switch or host.
First, you need to decide what level of logging you want to configure. You can configure the logging level to be the same for every NetQ Agent, or selectively increase or decrease the logging level for a NetQ Agent on a problematic node.
The following logging levels are available:
debug: Sends notifications for all debugging-related, informational, warning, and error messages.
info: Sends notifications for informational, warning, and error messages (default).
warning: Sends notifications for warning and error messages.
error: Sends notifications for error messages.
You can view the NetQ Agent log directly. Messages have the following structure:
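A generalized sketch of that structure (the exact fields can vary by release):
<timestamp> <hostname> netq-agent[<pid>]: <severity>: <message>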
If you set the logging level to debug for troubleshooting, NVIDIA recommends that you change the logging level back to a less verbose level, or disable agent logging, when you finish troubleshooting.
To change the logging level from debug to another level, run:
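A sketch of the commands, assuming the loglevel keyword from the netq config show agent syntax above accepts the level names listed earlier (restart the agent to apply the change):
cumulus@switch:~$ netq config add agent loglevel info
cumulus@switch:~$ netq config restart agent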
The NetQ Agent contains a pre-configured set of modular commands that run periodically and send event and resource data to the NetQ appliance or VM. You can fine tune which events the agent can poll and vary frequency of polling using the NetQ CLI.
For example, if your network does not run OSPF, you can disable the commands that poll for OSPF events. Or you can change the polling interval for LLDP from its default of 120 seconds. By not polling for selected data, or by polling less frequently, you can reduce the switch CPU usage of the NetQ Agent.
Depending on the switch platform, the NetQ Agent might not execute some supported protocol commands. For example, if a switch has no VXLAN capability, then the agent skips all VXLAN-related commands.
You cannot create new commands in this release.
Supported Commands
To see the list of supported modular commands, run:
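One way to list them is shown below; treat the exact keyword as an assumption and verify it with tab completion on your release:
cumulus@switch:~$ netq config show agent commands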
agent_stats: Collects statistics about the NetQ Agent every five (5) minutes.
agent_util_stats: Collects switch CPU and memory utilization by the NetQ Agent every 30 seconds.
cl-support-json: Polls the switch every three (3) minutes to determine if an agent generated a cl-support file.
config-mon-json: Polls the /etc/network/interfaces, /etc/frr/frr.conf, /etc/lldpd.d/README.conf and /etc/ptm.d/topology.dot files every two (2) minutes to determine if the contents of any of these files has changed. If a change occurred, the agent transmits the contents of the file and its modification time to the NetQ appliance or VM.
ports: Polls for optics plugged into the switch every hour.
proc-net-dev: Polls for network statistics on the switch every 30 seconds.
running-config-mon-json: Polls the clagctl parameters every 30 seconds and sends a diff of any changes to the NetQ appliance or VM.
Modify the Polling Frequency
You can change the polling frequency (in seconds) of a modular command. For example, to change the polling frequency of the lldp-json command to 60 seconds from its default of 120 seconds, run:
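A sketch of what that might look like, assuming a service-key and poll-period keyword pair for modular command configuration (verify the exact keywords with tab completion before using):
cumulus@switch:~$ netq config add agent command service-key lldp-json poll-period 60
cumulus@switch:~$ netq config restart agent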
You can disable any of these commands if they are not needed on your network. This can help reduce the compute resources the NetQ Agent consumes on the switch. For example, if your network does not run OSPF, you can disable the two OSPF commands:
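A sketch under the same assumption, using OSPF service-key names that are illustrative only (confirm the real key names from the supported commands list on your switch):
cumulus@switch:~$ netq config add agent command service-key ospf-neighbor-json enable False
cumulus@switch:~$ netq config add agent command service-key ospf-interface-json enable False
cumulus@switch:~$ netq config restart agent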
This topic describes how to use the NetQ UI and CLI to monitor your inventory from networkwide and device-specific perspectives.
You can monitor all hardware and software components installed and running on the switches and hosts across the entire network. This is very useful for understanding the dependence on various vendors and versions, when planning upgrades or the scope of any other required changes.
From a networkwide view, you can monitor all switches and hosts together, or all switches only. You cannot currently monitor all hosts separately from switches.
Networkwide Inventory
With the NetQ UI and CLI, a user can monitor the inventory on a networkwide basis for all switches and hosts, or all switches. Inventory includes such items as the number of each device and its operating system. Additional details are available about the hardware and software components on individual switches, such as the motherboard, ASIC, microprocessor, disk, memory, fan and power supply information. This is extremely useful for understanding the dependence on various vendors and versions when planning upgrades or evaluating the scope of any other required changes.
The commands and cards available to obtain this type of information help you to answer questions such as:
What switches are being monitored in the network?
What is the distribution of ASICs, CPUs, Agents, and so forth across my network?
The NetQ UI provides the Inventory|Devices card for monitoring networkwide inventory information for all switches, hosts and DPUs. Individual device summary cards provide a more detailed view of inventory information for all switches, hosts, and DPUs on a networkwide basis.
Access these cards from the NetQ Workbench, or add them to your own workbench by clicking (Add card) > Inventory, then selecting the Inventory|Devices, Inventory|Switches, Inventory|Hosts, or Inventory|DPUs card, and clicking Open Cards.
The NetQ CLI provides detailed network inventory information through its netq show inventory command.
View Networkwide Inventory Summary
You can view all devices in your network from either the NetQ UI or NetQ CLI.
View the Number of Each Device Type in Your Network
You can view the number of switches and hosts deployed in your network. As you grow your network this can be useful for validating the addition of devices as scheduled.
To view the quantity of devices in your network, locate or open the small or medium Inventory|Devices card. The medium-sized card provides operating system distribution across the network in addition to the device count. Hover over items in the chart’s outer circle to view operating system distribution, and hover over items in the chart’s inner circle to view device counts.
View All Switches
You can view all stored attributes for all switches in your network from either inventory card:
Open the full-screen Inventory|Devices card and click All Switches
Open the full-screen Inventory|Switches card and click Show All
To return to your workbench, click in the top right corner of the card.
View All Hosts
You can view all stored attributes for all hosts in your network. To view all host details, open the full screen Inventory|Devices card and click All Hosts.
To return to your workbench, click in the top right corner of the card.
To view a list of devices in your network, run:
netq show inventory brief [json]
This example shows that there are four spine switches, three leaf switches, two border switches, two firewall switches, seven hosts (servers), and an out-of-band management server in this network. For each of these you see the type of switch, operating system, CPU and ASIC.
You can view hardware components deployed on all switches and hosts, or on all switches in your network.
View Components Summary
It can be useful to know the quantity and ratio of many components deployed in your network to determine the scope of upgrade tasks, balance vendor reliance, or for detailed troubleshooting. Hardware and software component summary information is available from the NetQ UI and NetQ CLI.
Inventory|Devices card: view ASIC, NetQ Agent version, OS, and platform information on all devices
Inventory|Switches card: view ASIC, CPU, disk, NetQ Agent version, OS, and platform information on all switches
netq show inventory command: view ASIC, CPU, disk, OS, and ports on all devices
Locate the Inventory|Devices card on your workbench.
Hover over the card, and change to the large size card using the size picker.
By default the Switches tab shows the total number of switches, ASIC vendors, OS versions, NetQ Agent versions, and specific platforms deployed across all your switches.
You can hover over any of the segments in a component distribution chart to highlight a specific type of the given component. When you hover, a tooltip appears displaying:
Name or value of the component type, such as the version number or status
Total number of switches with that type of component deployed compared to the total number of switches
Percentage of this type as compared to all component types
Additionally, sympathetic highlighting is used to show the related component types relevant to the highlighted segment and the number of unique component types associated with this type (shown in light gray here).
Locate the Inventory|Switches card on your workbench.
Select a specific component from the dropdown menu.
Hover over any of the segments in the distribution chart to highlight a specific component.
When you hover, a tooltip appears displaying:
Name or value of the component type, such as the version number or status
Total number of switches with that type of component deployed compared to the total number of switches
Percentage of this type with respect to all component types
Change to the large size card. The same information is shown separated by hardware and software, and sympathetic highlighting is used to show the related component types relevant to the highlighted segment and the number of unique component types associated with this type (shown in blue here).
Locate the Inventory|Hosts card on your workbench.
Select a specific component from the dropdown menu.
Hover over any of the segments in the distribution chart to highlight a specific component.
When you hover, a tooltip appears displaying:
Name or value of the component type, such as the version number or status
Total number of hosts with that type of component deployed compared to the total number of hosts
Percentage of this type with respect to all component types
Change to the large size card. The same information is shown separated by hardware and software, and sympathetic highlighting is used to show the related component types relevant to the highlighted segment and the number of unique component types associated with this type (shown in blue here).
To view switch components, run:
netq show inventory brief [json]
This example shows the operating systems (Cumulus Linux and Ubuntu), CPU architecture (all x86_64), ASIC (virtual), and ports (N/A because Cumulus VX is virtual) for each device in the network. You can manually count the number of each of these, or export to a spreadsheet tool to sort and filter the list.
ASIC information is available from the NetQ UI and NetQ CLI.
Inventory|Devices card
Large: view ASIC distribution across all switches (graphic)
Full-screen: view ASIC vendor, model, model ID, ports, core bandwidth across all devices (table)
Inventory|Switches card
Medium/Large: view ASIC distribution across all switches (graphic)
Full-screen: view ASIC vendor, model, model ID, ports, core bandwidth and data across all switches (table)
netq show inventory asic command
View ASIC vendor, model, model ID, core bandwidth, and ports on all devices
Locate the medium Inventory|Devices card on your workbench.
Hover over the card, and change to the large size card using the size picker.
Click a segment of the ASIC graph in the component distribution charts.
Select the first option from the popup, Filter ASIC. The card data is filtered to show only the components associated with the selected component type. A filter tag appears next to the total number of switches, indicating the filter criteria.
Hover over the segments to view the related components.
To return to the full complement of components, click the in the filter tag.
Hover over the card, and change to the full-screen card using the size picker.
Scroll to the right to view the above ASIC information.
To return to your workbench, click in the top right corner of the card.
Locate the Inventory|Switches card on your workbench.
Hover over a segment of the ASIC graph in the distribution chart.
The same information is available on the summary tab of the large size card.
Hover over the card header and click to view the ASIC vendor and model distribution.
Hover over charts to view the name of the ASIC vendors or models, how many switches have that vendor or model deployed, and the percentage of this number compared to the total number of switches.
Change to the full-screen card to view all of the available ASIC information. Note that if you are running CumulusVX switches, no detailed ASIC information is available.
To return to your workbench, click in the top right corner of the card.
To view information about the ASIC installed on your devices, run:
netq show inventory asic [vendor <asic-vendor>|model <asic-model>|model-id <asic-model-id>] [json]
If you are running NetQ on a CumulusVX setup, there is no physical hardware to query and thus no ASIC information to display.
This example shows the ASIC information for all devices in your network:
cumulus@switch:~$ netq show inventory asic
Matching inventory records:
Hostname Vendor Model Model ID Core BW Ports
----------------- -------------------- ------------------------------ ------------------------- -------------- -----------------------------------
dell-z9100-05 Broadcom Tomahawk BCM56960 2.0T 32 x 100G-QSFP28
mlx-2100-05 Mellanox Spectrum MT52132 N/A 16 x 100G-QSFP28
mlx-2410a1-05 Mellanox Spectrum MT52132 N/A 48 x 25G-SFP28 & 8 x 100G-QSFP28
mlx-2700-11 Mellanox Spectrum MT52132 N/A 32 x 100G-QSFP28
qct-ix1-08 Broadcom Tomahawk BCM56960 2.0T 32 x 100G-QSFP28
qct-ix7-04 Broadcom Trident3 BCM56870 N/A 32 x 100G-QSFP28
st1-l1 Broadcom Trident2 BCM56854 720G 48 x 10G-SFP+ & 6 x 40G-QSFP+
st1-l2 Broadcom Trident2 BCM56854 720G 48 x 10G-SFP+ & 6 x 40G-QSFP+
st1-l3 Broadcom Trident2 BCM56854 720G 48 x 10G-SFP+ & 6 x 40G-QSFP+
st1-s1 Broadcom Trident2 BCM56850 960G 32 x 40G-QSFP+
st1-s2 Broadcom Trident2 BCM56850 960G 32 x 40G-QSFP+
You can filter the results of the command to view devices with a particular vendor, model, or model ID. This example shows ASIC information for all devices with a vendor of NVIDIA.
cumulus@switch:~$ netq show inventory asic vendor NVIDIA
Matching inventory records:
Hostname Vendor Model Model ID Core BW Ports
----------------- -------------------- ------------------------------ ------------------------- -------------- -----------------------------------
mlx-2100-05 NVIDIA Spectrum MT52132 N/A 16 x 100G-QSFP28
mlx-2410a1-05 NVIDIA Spectrum MT52132 N/A 48 x 25G-SFP28 & 8 x 100G-QSFP28
mlx-2700-11 NVIDIA Spectrum MT52132 N/A 32 x 100G-QSFP28
View Motherboard/Platform Information
Motherboard and platform information is available from the NetQ UI and NetQ CLI.
Inventory|Devices card
Full-screen: view platform vendor, model, manufacturing date, revision, serial number, MAC address, series across all devices (table)
Inventory|Switches card
Medium/Large: view platform distribution across all switches (graphic)
Full-screen: view platform vendor, model, manufacturing date, revision, serial number, MAC address, series across all switches (table)
netq show inventory board command
View motherboard vendor, model, base MAC address, serial number, part number, revision, and manufacturing date on all devices
Locate the Inventory|Devices card on your workbench.
Hover over the card, and change to the full-screen card using the size picker.
The All Switches tab is active by default. Scroll to the right to view the various Platform parameters for your switches. Optionally drag and drop the relevant columns next to each other.
Click All Hosts.
Scroll to the right to view the various Platform parameters for your hosts. Optionally drag and drop the relevant columns next to each other.
To return to your workbench, click in the top right corner of the card.
Locate the Inventory|Switches card on your workbench.
Hover over the card, and change to the large card using the size picker.
Hover over the header and click .
Hover over a segment in the Vendor or Platform graphic to view how many switches deploy the specified vendor or platform.
Context sensitive highlighting is also employed here, such that when you select a vendor, the corresponding platforms are also highlighted; and vice versa.
Click either Show All link to open the full-screen card.
Click Platform.
To return to your workbench, click in the top right corner of the card.
To view a list of motherboards installed in your switches and hosts, run:
netq show inventory board [vendor <board-vendor>|model <board-model>] [json]
This example shows all motherboard data for all devices.
You can filter the results of the command to capture only those devices with a particular motherboard vendor or model. This example shows only the devices with a Celestica motherboard.
cumulus@switch:~$ netq show inventory board vendor celestica
Matching inventory records:
Hostname Vendor Model Base MAC Serial No Part No Rev Mfg Date
----------------- -------------------- ------------------------------ ------------------ ------------------------- ---------------- ------ ----------
st1-l1 CELESTICA Arctica 4806xp 00:E0:EC:27:71:37 D2060B2F044919GD000011 R0854-F1004-01 Redsto 09/20/2014
ne-XP
st1-l2 CELESTICA Arctica 4806xp 00:E0:EC:27:6B:3A D2060B2F044919GD000060 R0854-F1004-01 Redsto 09/20/2014
ne-XP
View CPU Information
CPU information is available from the NetQ UI and NetQ CLI.
Inventory|Devices card
Full-screen: view CPU architecture, model, maximum operating frequency, and the number of cores on all devices (table)
Inventory|Switches card
Medium/Large: view CPU distribution across all switches (graphic)
Full-screen: view CPU architecture, model, maximum operating frequency, the number of cores, and data on all switches (table)
netq show inventory cpu command
View CPU architecture, model, maximum operating frequency, and the number of cores on all devices
Locate the Inventory|Devices card on your workbench.
Hover over the card, and change to the full-screen card using the size picker.
The All Switches tab is active by default. Scroll to the right to view the various CPU parameters. Optionally drag and drop relevant columns next to each other.
Click All Hosts to view the CPU information for your host servers.
To return to your workbench, click in the top right corner of the card.
Locate the Inventory|Switches card on your workbench.
Hover over a segment of the CPU graph in the distribution chart.
The same information is available on the summary tab of the large size card.
Hover over the card, and change to the full-screen card using the size picker.
Click CPU.
To return to your workbench, click in the top right corner of the card.
To view CPU information for all devices in your network, run:
netq show inventory cpu [arch <cpu-arch>] [json]
This example shows the CPU information for all devices.
You can filter the results of the command to view which switches employ a particular CPU architecture using the arch keyword. This example shows how to determine all the currently deployed architectures in your network, and then shows all devices with an x86_64 architecture.
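The filtered query itself follows directly from the arch keyword in the syntax above (output omitted here):
cumulus@switch:~$ netq show inventory cpu arch x86_64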
View Memory Information
Memory information is available from the NetQ CLI with the netq show inventory memory command. You can filter the results of the command to view devices with a particular memory type or vendor. This example shows all the devices with memory from QEMU.
cumulus@switch:~$ netq show inventory memory vendor QEMU
Matching inventory records:
Hostname Name Type Size Speed Vendor Serial No
----------------- --------------- ---------------- ---------- ---------- -------------------- -------------------------
leaf01 DIMM 0 RAM 1024 MB Unknown QEMU Not Specified
leaf02 DIMM 0 RAM 1024 MB Unknown QEMU Not Specified
leaf03 DIMM 0 RAM 1024 MB Unknown QEMU Not Specified
leaf04 DIMM 0 RAM 1024 MB Unknown QEMU Not Specified
oob-mgmt-server DIMM 0 RAM 4096 MB Unknown QEMU Not Specified
server01 DIMM 0 RAM 512 MB Unknown QEMU Not Specified
server02 DIMM 0 RAM 512 MB Unknown QEMU Not Specified
server03 DIMM 0 RAM 512 MB Unknown QEMU Not Specified
server04 DIMM 0 RAM 512 MB Unknown QEMU Not Specified
spine01 DIMM 0 RAM 1024 MB Unknown QEMU Not Specified
spine02 DIMM 0 RAM 1024 MB Unknown QEMU Not Specified
View Sensor Information
Fan, power supply unit (PSU), and temperature sensors are available to provide additional data about the operation of your switches and hosts.
Sensor information is available from the NetQ UI and NetQ CLI.
PSU Sensor card: view sensor name, current/previous state, input/output power, and input/output voltage on all devices (table)
Fan Sensor card: view sensor name, description, current/maximum/minimum speed, and current/previous state on all devices (table)
Temperature Sensor card: view sensor name, description, minimum/maximum threshold, current/critical(maximum)/lower critical (minimum) threshold, and current/previous state on all devices (table)
netq show sensors: view sensor name, description, current state, and time when data was last changed on all devices for all or one sensor type
Power Supply Unit Information
Click (main menu), then click Sensors in the Network heading.
The PSU tab is displayed by default.
It displays the following parameters:
Hostname: Name of the switch or host where the power supply is installed
Timestamp: Date and time the data was captured
Message Type: Type of sensor message; always PSU in this table
PIn(W): Input power (Watts) for the PSU on the switch or host
POut(W): Output power (Watts) for the PSU on the switch or host
Sensor Name: User-defined name for the PSU
Previous State: State of the PSU when data was captured in previous window
State: State of the PSU when data was last captured
VIn(V): Input voltage (Volts) for the PSU on the switch or host
VOut(V): Output voltage (Volts) for the PSU on the switch or host
To return to your workbench, click in the top right corner of the card.
Fan Information
Click (main menu), then click Sensors in the Network heading.
Click Fan.
The Fan tab displays the following parameters:
Hostname: Name of the switch or host where the fan is installed
Timestamp: Date and time the data was captured
Message Type: Type of sensor message; always Fan in this table
Description: User-specified description of the fan
Speed (RPM): Revolution rate of the fan (revolutions per minute)
Max: Maximum speed (RPM)
Min: Minimum speed (RPM)
Message: Message
Sensor Name: User-defined name for the fan
Previous State: State of the fan when data was captured in previous window
State: State of the fan when data was last captured
To return to your workbench, click in the top right corner of the card.
Temperature Information
Click (main menu), then click Sensors in the Network heading.
Click Temperature.
The Temperature tab displays the following parameters:
Hostname: Name of the switch or host where the temperature sensor is installed
Timestamp: Date and time the data was captured
Message Type: Type of sensor message; always Temp in this table
Critical: Current critical maximum temperature (°C) threshold setting
Description: User-specified description of the temperature sensor
Lower Critical: Current critical minimum temperature (°C) threshold setting
Max: Maximum temperature threshold setting
Min: Minimum temperature threshold setting
Message: Message
Sensor Name: User-defined name for the temperature sensor
Previous State: State of the temperature sensor when data was captured in previous window
State: State of the temperature sensor when data was last captured
Temperature(Celsius): Current temperature (°C) measured by sensor
To return to your workbench, click in the top right corner of the card.
View All Sensor Information
To view information for power supplies, fans, and temperature sensors on all switches and host servers, run:
netq show sensors all [around <text-time>] [json]
Use the around option to view sensor information for a time in the past.
This example shows all sensors on all devices.
cumulus@switch:~$ netq show sensors all
Matching sensors records:
Hostname Name Description State Message Last Changed
----------------- --------------- ----------------------------------- ---------- ----------------------------------- -------------------------
border01 fan5 fan tray 3, fan 1 ok Fri Aug 21 18:51:11 2020
border01 fan6 fan tray 3, fan 2 ok Fri Aug 21 18:51:11 2020
border01 fan1 fan tray 1, fan 1 ok Fri Aug 21 18:51:11 2020
...
fw1 fan2 fan tray 1, fan 2 ok Thu Aug 20 19:16:12 2020
...
fw2 fan3 fan tray 2, fan 1 ok Thu Aug 20 19:14:47 2020
...
leaf01 psu2fan1 psu2 fan ok Fri Aug 21 16:14:22 2020
...
leaf02 fan3 fan tray 2, fan 1 ok Fri Aug 21 16:14:14 2020
...
leaf03 fan2 fan tray 1, fan 2 ok Fri Aug 21 09:37:45 2020
...
leaf04 psu1fan1 psu1 fan ok Fri Aug 21 09:17:02 2020
...
spine01 psu2fan1 psu2 fan ok Fri Aug 21 05:54:14 2020
...
spine02 fan2 fan tray 1, fan 2 ok Fri Aug 21 05:54:39 2020
...
spine03 fan4 fan tray 2, fan 2 ok Fri Aug 21 06:00:52 2020
...
spine04 fan2 fan tray 1, fan 2 ok Fri Aug 21 05:54:09 2020
...
border01 psu1temp1 psu1 temp sensor ok Fri Aug 21 18:51:11 2020
border01 temp2 board sensor near virtual switch ok Fri Aug 21 18:51:11 2020
border01 temp3 board sensor at front left corner ok Fri Aug 21 18:51:11 2020
...
border02 temp1 board sensor near cpu ok Fri Aug 21 18:46:05 2020
...
fw1 temp4 board sensor at front right corner ok Thu Aug 20 19:16:12 2020
...
fw2 temp5 board sensor near fan ok Thu Aug 20 19:14:47 2020
...
leaf01 psu1temp1 psu1 temp sensor ok Fri Aug 21 16:14:22 2020
...
leaf02 temp5 board sensor near fan ok Fri Aug 21 16:14:14 2020
...
leaf03 psu2temp1 psu2 temp sensor ok Fri Aug 21 09:37:45 2020
...
leaf04 temp4 board sensor at front right corner ok Fri Aug 21 09:17:02 2020
...
spine01 psu1temp1 psu1 temp sensor ok Fri Aug 21 05:54:14 2020
...
spine02 temp3 board sensor at front left corner ok Fri Aug 21 05:54:39 2020
...
spine03 temp1 board sensor near cpu ok Fri Aug 21 06:00:52 2020
...
spine04 temp3 board sensor at front left corner ok Fri Aug 21 05:54:09 2020
...
border01 psu1 N/A ok Fri Aug 21 18:51:11 2020
border01 psu2 N/A ok Fri Aug 21 18:51:11 2020
border02 psu1 N/A ok Fri Aug 21 18:46:05 2020
border02 psu2 N/A ok Fri Aug 21 18:46:05 2020
fw1 psu1 N/A ok Thu Aug 20 19:16:12 2020
fw1 psu2 N/A ok Thu Aug 20 19:16:12 2020
fw2 psu1 N/A ok Thu Aug 20 19:14:47 2020
fw2 psu2 N/A ok Thu Aug 20 19:14:47 2020
leaf01 psu1 N/A ok Fri Aug 21 16:14:22 2020
leaf01 psu2 N/A ok Fri Aug 21 16:14:22 2020
leaf02 psu1 N/A ok Fri Aug 21 16:14:14 2020
leaf02 psu2 N/A ok Fri Aug 21 16:14:14 2020
leaf03 psu1 N/A ok Fri Aug 21 09:37:45 2020
leaf03 psu2 N/A ok Fri Aug 21 09:37:45 2020
leaf04 psu1 N/A ok Fri Aug 21 09:17:02 2020
leaf04 psu2 N/A ok Fri Aug 21 09:17:02 2020
spine01 psu1 N/A ok Fri Aug 21 05:54:14 2020
spine01 psu2 N/A ok Fri Aug 21 05:54:14 2020
spine02 psu1 N/A ok Fri Aug 21 05:54:39 2020
spine02 psu2 N/A ok Fri Aug 21 05:54:39 2020
spine03 psu1 N/A ok Fri Aug 21 06:00:52 2020
spine03 psu2 N/A ok Fri Aug 21 06:00:52 2020
spine04 psu1 N/A ok Fri Aug 21 05:54:09 2020
spine04 psu2 N/A ok Fri Aug 21 05:54:09 2020
View Only Power Supply Sensors
To view information from all PSU sensors or PSU sensors with a given name on your switches and host servers, run:
netq show sensors psu [<psu-name>] [around <text-time>] [json]
Use the psu-name option to view all PSU sensors with a particular name. Use the around option to view sensor information for a time in the past.
Use Tab completion to determine the names of the PSUs in your switches.
cumulus@switch:~$ netq show sensors psu <press tab>
around : Go back in time to around ...
json : Provide output in JSON
psu1 : Power Supply
psu2 : Power Supply
<ENTER>
This example shows information from all PSU sensors on all switches and hosts.
cumulus@switch:~$ netq show sensors psu
Matching sensors records:
Hostname Name State Pin(W) Pout(W) Vin(V) Vout(V) Message Last Changed
----------------- --------------- ---------- ------------ -------------- ------------ -------------- ----------------------------------- -------------------------
border01 psu1 ok Tue Aug 25 21:45:21 2020
border01 psu2 ok Tue Aug 25 21:45:21 2020
border02 psu1 ok Tue Aug 25 21:39:36 2020
border02 psu2 ok Tue Aug 25 21:39:36 2020
fw1 psu1 ok Wed Aug 26 00:08:01 2020
fw1 psu2 ok Wed Aug 26 00:08:01 2020
fw2 psu1 ok Wed Aug 26 00:02:13 2020
fw2 psu2 ok Wed Aug 26 00:02:13 2020
leaf01 psu1 ok Wed Aug 26 16:14:41 2020
leaf01 psu2 ok Wed Aug 26 16:14:41 2020
leaf02 psu1 ok Wed Aug 26 16:14:08 2020
leaf02 psu2 ok Wed Aug 26 16:14:08 2020
leaf03 psu1 ok Wed Aug 26 14:41:57 2020
leaf03 psu2 ok Wed Aug 26 14:41:57 2020
leaf04 psu1 ok Wed Aug 26 14:20:22 2020
leaf04 psu2 ok Wed Aug 26 14:20:22 2020
spine01 psu1 ok Wed Aug 26 10:53:17 2020
spine01 psu2 ok Wed Aug 26 10:53:17 2020
spine02 psu1 ok Wed Aug 26 10:54:07 2020
spine02 psu2 ok Wed Aug 26 10:54:07 2020
spine03 psu1 ok Wed Aug 26 11:00:44 2020
spine03 psu2 ok Wed Aug 26 11:00:44 2020
spine04 psu1 ok Wed Aug 26 10:52:00 2020
spine04 psu2 ok Wed Aug 26 10:52:00 2020
This example shows all PSUs with the name psu2.
cumulus@switch:~$ netq show sensors psu psu2
Matching sensors records:
Hostname Name State Message Last Changed
----------------- --------------- ---------- ----------------------------------- -------------------------
exit01 psu2 ok Fri Apr 19 16:01:17 2019
exit02 psu2 ok Fri Apr 19 16:01:33 2019
leaf01 psu2 ok Sun Apr 21 20:07:12 2019
leaf02 psu2 ok Fri Apr 19 16:01:41 2019
leaf03 psu2 ok Fri Apr 19 16:01:44 2019
leaf04 psu2 ok Fri Apr 19 16:01:36 2019
spine01 psu2 ok Fri Apr 19 16:01:52 2019
spine02 psu2 ok Fri Apr 19 16:01:08 2019
View Only Fan Sensors
To view information from all fan sensors or fan sensors with a given name on your switches and host servers, run:
netq show sensors fan [<fan-name>] [around <text-time>] [json]
Use the around option to view sensor information for a time in the past.
Use tab completion to determine the names of the fans in your switches:
cumulus@switch:~$ netq show sensors fan <press tab>
around : Go back in time to around ...
fan1 : Fan Name
fan2 : Fan Name
fan3 : Fan Name
fan4 : Fan Name
fan5 : Fan Name
fan6 : Fan Name
json : Provide output in JSON
psu1fan1 : Fan Name
psu2fan1 : Fan Name
<ENTER>
This example shows the state of all fans.
cumulus@switch:~$ netq show sensors fan
Matching sensors records:
Hostname Name Description State Speed Max Min Message Last Changed
----------------- --------------- ----------------------------------- ---------- ---------- -------- -------- ----------------------------------- -------------------------
border01 fan5 fan tray 3, fan 1 ok 2500 29000 2500 Tue Aug 25 21:45:21 2020
border01 fan6 fan tray 3, fan 2 ok 2500 29000 2500 Tue Aug 25 21:45:21 2020
border01 fan1 fan tray 1, fan 1 ok 2500 29000 2500 Tue Aug 25 21:45:21 2020
border01 fan4 fan tray 2, fan 2 ok 2500 29000 2500 Tue Aug 25 21:45:21 2020
border01 psu1fan1 psu1 fan ok 2500 29000 2500 Tue Aug 25 21:45:21 2020
border01 fan3 fan tray 2, fan 1 ok 2500 29000 2500 Tue Aug 25 21:45:21 2020
border01 fan2 fan tray 1, fan 2 ok 2500 29000 2500 Tue Aug 25 21:45:21 2020
border01 psu2fan1 psu2 fan ok 2500 29000 2500 Tue Aug 25 21:45:21 2020
border02 fan2 fan tray 1, fan 2 ok 2500 29000 2500 Tue Aug 25 21:39:36 2020
border02 psu2fan1 psu2 fan ok 2500 29000 2500 Tue Aug 25 21:39:36 2020
border02 psu1fan1 psu1 fan ok 2500 29000 2500 Tue Aug 25 21:39:36 2020
border02 fan4 fan tray 2, fan 2 ok 2500 29000 2500 Tue Aug 25 21:39:36 2020
border02 fan1 fan tray 1, fan 1 ok 2500 29000 2500 Tue Aug 25 21:39:36 2020
border02 fan6 fan tray 3, fan 2 ok 2500 29000 2500 Tue Aug 25 21:39:36 2020
border02 fan5 fan tray 3, fan 1 ok 2500 29000 2500 Tue Aug 25 21:39:36 2020
border02 fan3 fan tray 2, fan 1 ok 2500 29000 2500 Tue Aug 25 21:39:36 2020
fw1 fan2 fan tray 1, fan 2 ok 2500 29000 2500 Wed Aug 26 00:08:01 2020
fw1 fan5 fan tray 3, fan 1 ok 2500 29000 2500 Wed Aug 26 00:08:01 2020
fw1 psu1fan1 psu1 fan ok 2500 29000 2500 Wed Aug 26 00:08:01 2020
fw1 fan4 fan tray 2, fan 2 ok 2500 29000 2500 Wed Aug 26 00:08:01 2020
fw1 fan3 fan tray 2, fan 1 ok 2500 29000 2500 Wed Aug 26 00:08:01 2020
fw1 psu2fan1 psu2 fan ok 2500 29000 2500 Wed Aug 26 00:08:01 2020
fw1 fan6 fan tray 3, fan 2 ok 2500 29000 2500 Wed Aug 26 00:08:01 2020
fw1 fan1 fan tray 1, fan 1 ok 2500 29000 2500 Wed Aug 26 00:08:01 2020
fw2 fan3 fan tray 2, fan 1 ok 2500 29000 2500 Wed Aug 26 00:02:13 2020
fw2 psu2fan1 psu2 fan ok 2500 29000 2500 Wed Aug 26 00:02:13 2020
fw2 fan2 fan tray 1, fan 2 ok 2500 29000 2500 Wed Aug 26 00:02:13 2020
fw2 fan6 fan tray 3, fan 2 ok 2500 29000 2500 Wed Aug 26 00:02:13 2020
fw2 fan1 fan tray 1, fan 1 ok 2500 29000 2500 Wed Aug 26 00:02:13 2020
fw2 fan4 fan tray 2, fan 2 ok 2500 29000 2500 Wed Aug 26 00:02:13 2020
fw2 fan5 fan tray 3, fan 1 ok 2500 29000 2500 Wed Aug 26 00:02:13 2020
fw2 psu1fan1 psu1 fan ok 2500 29000 2500 Wed Aug 26 00:02:13 2020
leaf01 psu2fan1 psu2 fan ok 2500 29000 2500 Wed Aug 26 16:14:41 2020
leaf01 fan5 fan tray 3, fan 1 ok 2500 29000 2500 Wed Aug 26 16:14:41 2020
leaf01 fan3 fan tray 2, fan 1 ok 2500 29000 2500 Wed Aug 26 16:14:41 2020
leaf01 fan1 fan tray 1, fan 1 ok 2500 29000 2500 Wed Aug 26 16:14:41 2020
leaf01 fan6 fan tray 3, fan 2 ok 2500 29000 2500 Wed Aug 26 16:14:41 2020
leaf01 fan2 fan tray 1, fan 2 ok 2500 29000 2500 Wed Aug 26 16:14:41 2020
leaf01 psu1fan1 psu1 fan ok 2500 29000 2500 Wed Aug 26 16:14:41 2020
leaf01 fan4 fan tray 2, fan 2 ok 2500 29000 2500 Wed Aug 26 16:14:41 2020
leaf02 fan3 fan tray 2, fan 1 ok 2500 29000 2500 Wed Aug 26 16:14:08 2020
...
spine04 fan4 fan tray 2, fan 2 ok 2500 29000 2500 Wed Aug 26 10:52:00 2020
spine04 psu1fan1 psu1 fan ok 2500 29000 2500 Wed Aug 26 10:52:00 2020
This example shows the state of all fans with the name fan1.
cumulus@switch:~$ netq show sensors fan fan1
Matching sensors records:
Hostname Name Description State Speed Max Min Message Last Changed
----------------- --------------- ----------------------------------- ---------- ---------- -------- -------- ----------------------------------- -------------------------
border01 fan1 fan tray 1, fan 1 ok 2500 29000 2500 Tue Aug 25 21:45:21 2020
border02 fan1 fan tray 1, fan 1 ok 2500 29000 2500 Tue Aug 25 21:39:36 2020
fw1 fan1 fan tray 1, fan 1 ok 2500 29000 2500 Wed Aug 26 00:08:01 2020
fw2 fan1 fan tray 1, fan 1 ok 2500 29000 2500 Wed Aug 26 00:02:13 2020
leaf01 fan1 fan tray 1, fan 1 ok 2500 29000 2500 Tue Aug 25 18:30:07 2020
leaf02 fan1 fan tray 1, fan 1 ok 2500 29000 2500 Tue Aug 25 18:08:38 2020
leaf03 fan1 fan tray 1, fan 1 ok 2500 29000 2500 Tue Aug 25 21:20:34 2020
leaf04 fan1 fan tray 1, fan 1 ok 2500 29000 2500 Wed Aug 26 14:20:22 2020
spine01 fan1 fan tray 1, fan 1 ok 2500 29000 2500 Wed Aug 26 10:53:17 2020
spine02 fan1 fan tray 1, fan 1 ok 2500 29000 2500 Wed Aug 26 10:54:07 2020
spine03 fan1 fan tray 1, fan 1 ok 2500 29000 2500 Wed Aug 26 11:00:44 2020
spine04 fan1 fan tray 1, fan 1 ok 2500 29000 2500 Wed Aug 26 10:52:00 2020
View Only Temperature Sensors
To view information from all temperature sensors or temperature sensors with a given name on your switches and host servers, run:
netq show sensors temp [<temp-name>] [around <text-time>] [json]
Use the around option to view sensor information for a time in the past.
Use tab completion to determine the names of the temperature sensors on your devices:
cumulus@switch:~$ netq show sensors temp <press tab>
around : Go back in time to around ...
json : Provide output in JSON
psu1temp1 : Temp Name
psu2temp1 : Temp Name
temp1 : Temp Name
temp2 : Temp Name
temp3 : Temp Name
temp4 : Temp Name
temp5 : Temp Name
<ENTER>
This example shows the state of all temperature sensors.
cumulus@switch:~$ netq show sensors temp
Matching sensors records:
Hostname Name Description State Temp Critical Max Min Message Last Changed
----------------- --------------- ----------------------------------- ---------- -------- -------- -------- -------- ----------------------------------- -------------------------
border01 psu1temp1 psu1 temp sensor ok 25 85 80 5 Tue Aug 25 21:45:21 2020
border01 temp2 board sensor near virtual switch ok 25 85 80 5 Tue Aug 25 21:45:21 2020
border01 temp3 board sensor at front left corner ok 25 85 80 5 Tue Aug 25 21:45:21 2020
border01 temp1 board sensor near cpu ok 25 85 80 5 Tue Aug 25 21:45:21 2020
border01 temp4 board sensor at front right corner ok 25 85 80 5 Tue Aug 25 21:45:21 2020
border01 psu2temp1 psu2 temp sensor ok 25 85 80 5 Tue Aug 25 21:45:21 2020
border01 temp5 board sensor near fan ok 25 85 80 5 Tue Aug 25 21:45:21 2020
border02 temp1 board sensor near cpu ok 25 85 80 5 Tue Aug 25 21:39:36 2020
border02 temp5 board sensor near fan ok 25 85 80 5 Tue Aug 25 21:39:36 2020
border02 temp3 board sensor at front left corner ok 25 85 80 5 Tue Aug 25 21:39:36 2020
border02 temp4 board sensor at front right corner ok 25 85 80 5 Tue Aug 25 21:39:36 2020
border02 psu2temp1 psu2 temp sensor ok 25 85 80 5 Tue Aug 25 21:39:36 2020
border02 psu1temp1 psu1 temp sensor ok 25 85 80 5 Tue Aug 25 21:39:36 2020
border02 temp2 board sensor near virtual switch ok 25 85 80 5 Tue Aug 25 21:39:36 2020
fw1 temp4 board sensor at front right corner ok 25 85 80 5 Wed Aug 26 00:08:01 2020
fw1 temp3 board sensor at front left corner ok 25 85 80 5 Wed Aug 26 00:08:01 2020
fw1 psu1temp1 psu1 temp sensor ok 25 85 80 5 Wed Aug 26 00:08:01 2020
fw1 temp1 board sensor near cpu ok 25 85 80 5 Wed Aug 26 00:08:01 2020
fw1 temp2 board sensor near virtual switch ok 25 85 80 5 Wed Aug 26 00:08:01 2020
fw1 temp5 board sensor near fan ok 25 85 80 5 Wed Aug 26 00:08:01 2020
fw1 psu2temp1 psu2 temp sensor ok 25 85 80 5 Wed Aug 26 00:08:01 2020
fw2 temp5 board sensor near fan ok 25 85 80 5 Wed Aug 26 00:02:13 2020
fw2 temp2 board sensor near virtual switch ok 25 85 80 5 Wed Aug 26 00:02:13 2020
fw2 psu2temp1 psu2 temp sensor ok 25 85 80 5 Wed Aug 26 00:02:13 2020
fw2 temp3 board sensor at front left corner ok 25 85 80 5 Wed Aug 26 00:02:13 2020
fw2 temp4 board sensor at front right corner ok 25 85 80 5 Wed Aug 26 00:02:13 2020
fw2 temp1 board sensor near cpu ok 25 85 80 5 Wed Aug 26 00:02:13 2020
fw2 psu1temp1 psu1 temp sensor ok 25 85 80 5 Wed Aug 26 00:02:13 2020
leaf01 psu1temp1 psu1 temp sensor ok 25 85 80 5 Wed Aug 26 16:14:41 2020
leaf01 temp5 board sensor near fan ok 25 85 80 5 Wed Aug 26 16:14:41 2020
leaf01 temp4 board sensor at front right corner ok 25 85 80 5 Wed Aug 26 16:14:41 2020
leaf01 temp1 board sensor near cpu ok 25 85 80 5 Wed Aug 26 16:14:41 2020
leaf01 temp2 board sensor near virtual switch ok 25 85 80 5 Wed Aug 26 16:14:41 2020
leaf01 temp3 board sensor at front left corner ok 25 85 80 5 Wed Aug 26 16:14:41 2020
leaf01 psu2temp1 psu2 temp sensor ok 25 85 80 5 Wed Aug 26 16:14:41 2020
leaf02 temp5 board sensor near fan ok 25 85 80 5 Wed Aug 26 16:14:08 2020
...
spine04 psu2temp1 psu2 temp sensor ok 25 85 80 5 Wed Aug 26 10:52:00 2020
spine04 temp5 board sensor near fan ok 25 85 80 5 Wed Aug 26 10:52:00 2020
This example shows the state of all temperature sensors with the name psu2temp1.
cumulus@switch:~$ netq show sensors temp psu2temp1
Matching sensors records:
Hostname Name Description State Temp Critical Max Min Message Last Changed
----------------- --------------- ----------------------------------- ---------- -------- -------- -------- -------- ----------------------------------- -------------------------
border01 psu2temp1 psu2 temp sensor ok 25 85 80 5 Tue Aug 25 21:45:21 2020
border02 psu2temp1 psu2 temp sensor ok 25 85 80 5 Tue Aug 25 21:39:36 2020
fw1 psu2temp1 psu2 temp sensor ok 25 85 80 5 Wed Aug 26 00:08:01 2020
fw2 psu2temp1 psu2 temp sensor ok 25 85 80 5 Wed Aug 26 00:02:13 2020
leaf01 psu2temp1 psu2 temp sensor ok 25 85 80 5 Tue Aug 25 18:30:07 2020
leaf02 psu2temp1 psu2 temp sensor ok 25 85 80 5 Tue Aug 25 18:08:38 2020
leaf03 psu2temp1 psu2 temp sensor ok 25 85 80 5 Tue Aug 25 21:20:34 2020
leaf04 psu2temp1 psu2 temp sensor ok 25 85 80 5 Wed Aug 26 14:20:22 2020
spine01 psu2temp1 psu2 temp sensor ok 25 85 80 5 Wed Aug 26 10:53:17 2020
spine02 psu2temp1 psu2 temp sensor ok 25 85 80 5 Wed Aug 26 10:54:07 2020
spine03 psu2temp1 psu2 temp sensor ok 25 85 80 5 Wed Aug 26 11:00:44 2020
spine04 psu2temp1 psu2 temp sensor ok 25 85 80 5 Wed Aug 26 10:52:00 2020
View Digital Optics Information
Digital optics information is available from any digital optics modules in the system using the NetQ UI and NetQ CLI.
Digital Optics card: view laser bias current, laser output power, received signal average optical power, and module temperature/voltage (table)
netq show dom type command: view laser bias current, laser output power, received signal average optical power, and module temperature/voltage
Use the filter option to view laser power and bias current for a given interface and channel on a switch, and temperature and voltage for a given module. Select the relevant tab to view the data.
Click (main menu), then click Digital Optics in the Network heading.
The Laser Rx Power tab is displayed by default.
Each Laser tab displays the following parameters:
Hostname: Name of the switch or host where the digital optics module resides
Timestamp: Date and time the data was captured
If Name: Name of interface where the digital optics module is installed
Units: Measurement unit for the power (mW) or current (mA)
Channel 1–8: Value of the power or current on each channel where the digital optics module is transmitting

Each Module tab displays the following parameters:
Hostname: Name of the switch or host where the digital optics module resides
Timestamp: Date and time the data was captured
If Name: Name of interface where the digital optics module is installed
Degree C: Current module temperature, measured in degrees Celsius
Degree F: Current module temperature, measured in degrees Fahrenheit
Units: Measurement unit for module voltage; Volts
Value: Current module voltage
Click each of the other Laser or Module tabs to view that information for all devices.
To view digital optics information for your switches and host servers, run one of the following:
netq show dom type (laser_rx_power|laser_output_power|laser_bias_current) [interface <text-dom-port-anchor>] [channel_id <text-channel-id>] [around <text-time>] [json]
netq show dom type (module_temperature|module_voltage) [interface <text-dom-port-anchor>] [around <text-time>] [json]
This example shows module temperature information for all devices.
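The command itself follows directly from the syntax above (output omitted here):
cumulus@switch:~$ netq show dom type module_temperature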
You can view software components deployed on all switches and hosts, or on all the switches in your network.
View the Operating Systems Information
Knowing what operating systems (OSs) you have deployed across your network is useful for upgrade planning and understanding your relative dependence on a given OS in your network.
OS information is available from the NetQ UI and NetQ CLI.
Inventory|Devices card
Medium: view the distribution of OSs and versions across all devices
Large: view the distribution of OSs and versions across all switches
Full-screen: view OS vendor, version, and version ID on all devices (table)
Inventory|Switches card
Medium/Large: view the distribution of OSs and versions across all switches (graphic)
Full-screen: view OS vendor, version, and version ID on all switches (table)
netq show inventory os
View OS name and version on all devices
Locate the medium Inventory|Devices card on your workbench.
Hover over the pie charts to view the total number of devices with a given operating system installed.
Change to the large card using the size picker.
Hover over a segment in the OS distribution chart to view the total number of devices with a given operating system installed.
Note that sympathetic highlighting (in blue) is employed to show which versions of the other switch components are associated with this OS.
Click a segment in the OS distribution chart.
Click Filter OS at the top of the popup.
The card updates to show only the components associated with switches running the selected OS. To return to all OSs, click X in the OS tag to remove the filter.
Change to the full-screen card using the size picker.
The All Switches tab is selected by default. Scroll to the right to locate all of the OS parameter data.
Click All Hosts to view the OS parameters for all host servers.
To return to your workbench, click in the top right corner of the card.
Locate the Inventory|Switches card on your workbench.
Hover over a segment of the OS graph in the distribution chart.
The same information is available on the summary tab of the large size card.
Hover over the card, and change to the full-screen card using the size picker.
Click OS.
To return to your workbench, click in the top right corner of the card.
To view OS information for your switches and host servers, run:
netq show inventory os [version <os-version>|name <os-name>] [json]
This example shows the OS information for all devices.
You can filter the results of the command to view only devices with a particular operating system or version. This can be especially helpful when you suspect that a particular device upgrade did not work as expected.
This example shows all devices with the Cumulus Linux version 3.7.12 installed.
cumulus@switch:~$ netq show inventory os version 3.7.12
Matching inventory records:
Hostname Name Version Last Changed
----------------- --------------- ------------------------------------ -------------------------
spine01 CL 3.7.12 Mon Aug 10 19:55:06 2020
spine02 CL 3.7.12 Mon Aug 10 19:55:07 2020
spine03 CL 3.7.12 Mon Aug 10 19:55:09 2020
spine04 CL 3.7.12 Mon Aug 10 19:55:08 2020
View the Supported Cumulus Linux Packages
When you are troubleshooting an issue with a switch, you might want to know all the supported versions of the Cumulus Linux operating system that are available for that switch, and compare them with those available on a switch that is not having the same issue.
To view package information for your switches, run:
netq show cl-manifest [json]
This example shows the OS packages supported for all switches.
If you are having an issue with several switches, you should verify all the packages installed on them and compare that to the recommended packages for a given Cumulus Linux release.
To view installed package information for your switches, run:
netq show cl-pkg-info [<text-package-name>] [around <text-time>] [json]
Use the text-package-name option to narrow the results to a particular package or the around option to narrow the output to a particular time range.
This example shows all installed software packages for all devices.
cumulus@switch:~$ netq show cl-pkg-info
Matching package_info records:
Hostname Package Name Version CL Version Package Status Last Changed
----------------- ------------------------ -------------------- -------------------- -------------------- -------------------------
border01 libcryptsetup4 2:1.6.6-5 Cumulus Linux 3.7.13 installed Mon Aug 17 18:53:50 2020
border01 libedit2 3.1-20140620-2 Cumulus Linux 3.7.13 installed Mon Aug 17 18:53:50 2020
border01 libffi6 3.1-2+deb8u1 Cumulus Linux 3.7.13 installed Mon Aug 17 18:53:50 2020
...
border02 libdb5.3 9999-cl3u2 Cumulus Linux 3.7.13 installed Mon Aug 17 18:48:53 2020
border02 libnl-cli-3-200 3.2.27-cl3u15+1 Cumulus Linux 3.7.13 installed Mon Aug 17 18:48:53 2020
border02 pkg-config 0.28-1 Cumulus Linux 3.7.13 installed Mon Aug 17 18:48:53 2020
border02 libjs-sphinxdoc 1.2.3+dfsg-1 Cumulus Linux 3.7.13 installed Mon Aug 17 18:48:53 2020
...
fw1 libpcap0.8 1.8.1-3~bpo8+1 Cumulus Linux 3.7.13 installed Mon Aug 17 19:18:57 2020
fw1 python-eventlet 0.13.0-2 Cumulus Linux 3.7.13 installed Mon Aug 17 19:18:57 2020
fw1 libapt-pkg4.12 1.0.9.8.5-cl3u2 Cumulus Linux 3.7.13 installed Mon Aug 17 19:18:57 2020
fw1 libopts25 1:5.18.4-3 Cumulus Linux 3.7.13 installed Mon Aug 17 19:18:57 2020
...
This example shows the installed switchd package version.
cumulus@switch:~$ netq spine01 show cl-pkg-info switchd
Matching package_info records:
Hostname Package Name Version CL Version Package Status Last Changed
----------------- ------------------------ -------------------- -------------------- -------------------- -------------------------
spine01 switchd 1.0-cl3u40 Cumulus Linux 3.7.12 installed Thu Aug 27 01:58:47 2020
View Recommended Software Packages
You can determine whether any of your switches are using a software package other than the default package associated with the Cumulus Linux release that is running on the switches. Use this list to determine which packages to install/upgrade on all devices. Additionally, you can determine if a software package is missing.
To view recommended package information for your switches, run:
netq show recommended-pkg-version [release-id <text-release-id>] [package-name <text-package-name>] [json]
The output can be rather lengthy if you run this command for all releases and packages. If desired, run the command using the release-id and/or package-name options to shorten the output.
This example looks for switches running Cumulus Linux 3.7.1 and switchd. The result is a single switch, leaf12, that has older software and should get an update.
cumulus@switch:~$ netq show recommended-pkg-version release-id 3.7.1 package-name switchd
Matching manifest records:
Hostname Release ID ASIC Vendor CPU Arch Package Name Version Last Changed
----------------- -------------------- -------------------- -------------------- -------------------- -------------------- -------------------------
leaf12 3.7.1 vx x86_64 switchd 1.0-cl3u30 Wed Feb 5 04:36:30 2020
This example looks for switches running Cumulus Linux 3.7.1 and ptmd. The result is a single switch, server01, that has older software and should get an update.
cumulus@switch:~$ netq show recommended-pkg-version release-id 3.7.1 package-name ptmd
Matching manifest records:
Hostname Release ID ASIC Vendor CPU Arch Package Name Version Last Changed
----------------- -------------------- -------------------- -------------------- -------------------- -------------------- -------------------------
server01 3.7.1 vx x86_64 ptmd 3.0-2-cl3u8 Wed Feb 5 04:36:30 2020
This example looks for switches running Cumulus Linux 3.7.1 and lldpd. The result is a single switch, server01, that has older software and should get an update.
cumulus@switch:~$ netq show recommended-pkg-version release-id 3.7.1 package-name lldpd
Matching manifest records:
Hostname Release ID ASIC Vendor CPU Arch Package Name Version Last Changed
----------------- -------------------- -------------------- -------------------- -------------------- -------------------- -------------------------
server01 3.7.1 vx x86_64 lldpd 0.9.8-0-cl3u11 Wed Feb 5 04:36:30 2020
This example looks for switches running Cumulus Linux 3.6.2 and switchd. The result is a single switch, leaf04, that has older software and should get an update.
cumulus@noc-pr:~$ netq show recommended-pkg-version release-id 3.6.2 package-name switchd
Matching manifest records:
Hostname Release ID ASIC Vendor CPU Arch Package Name Version Last Changed
----------------- -------------------- -------------------- -------------------- -------------------- -------------------- -------------------------
leaf04 3.6.2 vx x86_64 switchd 1.0-cl3u27 Wed Feb 5 04:36:30 2020
View ACL Resources
Using the NetQ CLI, you can monitor the incoming and outgoing access control lists (ACLs) configured on all switches, currently or at a time in the past.
To view ACL resources for all your switches, run:
netq show cl-resource acl [ingress | egress] [around <text-time>] [json]
Use the egress or ingress options to show only the outgoing or incoming ACLs. Use the around option to show this information for a time in the past.
This example shows the ACL resources for all configured switches:
cumulus@switch:~$ netq show cl-resource acl
Matching cl_resource records:
Hostname In IPv4 filter In IPv4 Mangle In IPv6 filter In IPv6 Mangle In 8021x filter In Mirror In PBR IPv4 filter In PBR IPv6 filter Eg IPv4 filter Eg IPv4 Mangle Eg IPv6 filter Eg IPv6 Mangle ACL Regions 18B Rules Key 32B Rules Key 54B Rules Key L4 Port range Checke Last Updated
rs
----------------- -------------------- -------------------- -------------------- -------------------- -------------------- -------------------- -------------------- -------------------- -------------------- -------------------- -------------------- -------------------- -------------------- -------------------- -------------------- -------------------- -------------------- ------------------------
act-5712-09 40,512(7%) 0,0(0%) 30,768(3%) 0,0(0%) 0,0(0%) 0,0(0%) 0,0(0%) 0,0(0%) 32,256(12%) 0,0(0%) 0,0(0%) 0,0(0%) 0,0(0%) 0,0(0%) 0,0(0%) 0,0(0%) 2,24(8%) Tue Aug 18 20:20:39 2020
mlx-2700-04 0,0(0%) 0,0(0%) 0,0(0%) 0,0(0%) 0,0(0%) 0,0(0%) 0,0(0%) 0,0(0%) 0,0(0%) 0,0(0%) 0,0(0%) 0,0(0%) 4,400(1%) 2,2256(0%) 0,1024(0%) 2,1024(0%) 0,0(0%) Tue Aug 18 20:19:08 2020
The same information can be output to JSON format:
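For example, appending the json option to the same command returns the ACL resource records as JSON objects; the output is omitted here:
cumulus@switch:~$ netq show cl-resource acl json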
NetQ Agent information is available from the NetQ UI and NetQ CLI.
Agents list
Full-screen: view NetQ Agent version across all devices (table)
Inventory|Switches card
Medium: view the number of unique versions of the NetQ Agent running on all devices
Large: view the number of unique versions of the NetQ Agent running on all devices and the associated OS
Full-screen: view NetQ Agent status and version across all devices
netq show agents
View NetQ Agent status, uptime, and version across all devices
To view the NetQ Agents on all switches and hosts:
Click to open the Main menu.
Select Agents from the Network column.
View the Version column to determine which release of the NetQ Agent is running on your devices. Ideally, this version should match the NetQ release you are running and should be the same across all your devices.
Hostname: Name of the switch or host
Timestamp: Date and time the data was captured
Last Reinit: Date and time that the switch or host was reinitialized
Last Update Time: Date and time that the switch or host was updated
Lastboot: Date and time that the switch or host was last booted up
NTP State: Status of NTP synchronization on the switch or host; yes = in synchronization, no = out of synchronization
Sys Uptime: Amount of time the switch or host has been continuously up and running
Version: NetQ version running on the switch or host
When you upgrade NetQ, you should also upgrade the NetQ Agents. You can determine whether you have covered all of your agents using the medium or large Switch Inventory card. To view the NetQ Agent distribution by version:
Open the medium Switch Inventory card.
View the number in the Unique column next to Agent.
If the number is greater than one, you have multiple NetQ Agent versions deployed.
If you have multiple versions, hover over the Agent chart to view the count of switches using each version.
For more detail, switch to the large Switch Inventory card.
Hover over the card and click to open the Software tab.
Hover over the chart on the right to view the number of switches using the various versions of the NetQ Agent.
Hover over the Operating System chart to see which NetQ Agent versions are being run on each OS.
Click either chart to focus on a particular OS or agent version.
To return to the full view, click in the filter tag.
Filter the data on the card by switches that are having trouble communicating, by selecting Rotten Switches from the dropdown above the charts.
Open the full screen Inventory|Switches card. The Show All tab is displayed by default, and shows the NetQ Agent status and version for all devices.
To view the NetQ Agents on all switches and hosts, run:
netq show agents [fresh | rotten ] [around <text-time>] [json]
Use the fresh keyword to view only the NetQ Agents that are in current communication with the NetQ Platform or NetQ Collector. Use the rotten keyword to view those that are not. Use the around keyword to view the state of NetQ Agents at an earlier time.
This example shows the current NetQ Agent state on all devices. The Status column indicates whether the agent is up and current, labelled Fresh, or down and stale, labelled Rotten. Additional information includes whether the agent is time synchronized, how long it has been up, and the last time its state changed. You can also see the version running. Ideally, this version should match the NetQ release you are running and should be the same across all your devices.
cumulus@switch:~$ netq show agents
Matching agents records:
Hostname Status NTP Sync Version Sys Uptime Agent Uptime Reinitialize Time Last Changed
----------------- ---------------- -------- ------------------------------------ ------------------------- ------------------------- -------------------------- -------------------------
border01 Fresh yes 3.1.0-cl3u28~1594095615.8f00ba1 Tue Jul 28 18:48:31 2020 Tue Jul 28 18:49:46 2020 Tue Jul 28 18:49:46 2020 Sun Aug 23 18:56:56 2020
border02 Fresh yes 3.1.0-cl3u28~1594095615.8f00ba1 Tue Jul 28 18:43:29 2020 Tue Jul 28 18:44:42 2020 Tue Jul 28 18:44:42 2020 Sun Aug 23 18:49:57 2020
fw1 Fresh yes 3.1.0-cl3u28~1594095615.8f00ba1 Tue Jul 28 19:13:26 2020 Tue Jul 28 19:14:28 2020 Tue Jul 28 19:14:28 2020 Sun Aug 23 19:24:01 2020
fw2 Fresh yes 3.1.0-cl3u28~1594095615.8f00ba1 Tue Jul 28 19:11:27 2020 Tue Jul 28 19:12:51 2020 Tue Jul 28 19:12:51 2020 Sun Aug 23 19:21:13 2020
leaf01 Fresh yes 3.1.0-cl3u28~1594095615.8f00ba1 Tue Jul 14 21:04:03 2020 Wed Jul 29 16:12:22 2020 Wed Jul 29 16:12:22 2020 Sun Aug 23 16:16:09 2020
leaf02 Fresh yes 3.1.0-cl3u28~1594095615.8f00ba1 Tue Jul 14 20:59:10 2020 Wed Jul 29 16:12:23 2020 Wed Jul 29 16:12:23 2020 Sun Aug 23 16:16:48 2020
leaf03 Fresh yes 3.1.0-cl3u28~1594095615.8f00ba1 Tue Jul 14 21:04:03 2020 Tue Jul 14 21:18:23 2020 Tue Jul 14 21:18:23 2020 Sun Aug 23 21:25:16 2020
leaf04 Fresh yes 3.1.0-cl3u28~1594095615.8f00ba1 Tue Jul 14 20:57:30 2020 Tue Jul 14 20:58:48 2020 Tue Jul 14 20:58:48 2020 Sun Aug 23 21:09:06 2020
oob-mgmt-server Fresh yes 3.1.0-ub18.04u28~1594095612.8f00ba1 Mon Jul 13 17:07:59 2020 Mon Jul 13 21:01:35 2020 Tue Jul 14 19:36:19 2020 Sun Aug 23 15:45:05 2020
server01 Fresh yes 3.1.0-ub18.04u28~1594095612.8f00ba1 Mon Jul 13 18:30:46 2020 Mon Jul 13 22:09:19 2020 Tue Jul 14 19:36:22 2020 Sun Aug 23 19:43:34 2020
server02 Fresh yes 3.1.0-ub18.04u28~1594095612.8f00ba1 Mon Jul 13 18:30:46 2020 Mon Jul 13 22:09:19 2020 Tue Jul 14 19:35:59 2020 Sun Aug 23 19:48:07 2020
server03 Fresh yes 3.1.0-ub18.04u28~1594095612.8f00ba1 Mon Jul 13 18:30:46 2020 Mon Jul 13 22:09:20 2020 Tue Jul 14 19:36:22 2020 Sun Aug 23 19:47:47 2020
server04 Fresh yes 3.1.0-ub18.04u28~1594095612.8f00ba1 Mon Jul 13 18:30:46 2020 Mon Jul 13 22:09:20 2020 Tue Jul 14 19:35:59 2020 Sun Aug 23 19:47:52 2020
server05 Fresh yes 3.1.0-ub18.04u28~1594095612.8f00ba1 Mon Jul 13 18:30:46 2020 Mon Jul 13 22:09:20 2020 Tue Jul 14 19:36:02 2020 Sun Aug 23 19:46:27 2020
server06 Fresh yes 3.1.0-ub18.04u28~1594095612.8f00ba1 Mon Jul 13 18:30:46 2020 Mon Jul 13 22:09:21 2020 Tue Jul 14 19:36:37 2020 Sun Aug 23 19:47:37 2020
server07 Fresh yes 3.1.0-ub18.04u28~1594095612.8f00ba1 Mon Jul 13 17:58:02 2020 Mon Jul 13 22:09:21 2020 Tue Jul 14 19:36:01 2020 Sun Aug 23 18:01:08 2020
server08 Fresh yes 3.1.0-ub18.04u28~1594095612.8f00ba1 Mon Jul 13 17:58:18 2020 Mon Jul 13 22:09:23 2020 Tue Jul 14 19:36:03 2020 Mon Aug 24 09:10:38 2020
spine01 Fresh yes 3.1.0-cl3u28~1594095615.8f00ba1 Mon Jul 13 17:48:43 2020 Mon Aug 10 19:55:07 2020 Mon Aug 10 19:55:07 2020 Sun Aug 23 19:57:05 2020
spine02 Fresh yes 3.1.0-cl3u28~1594095615.8f00ba1 Mon Jul 13 17:47:39 2020 Mon Aug 10 19:55:09 2020 Mon Aug 10 19:55:09 2020 Sun Aug 23 19:56:39 2020
spine03 Fresh yes 3.1.0-cl3u28~1594095615.8f00ba1 Mon Jul 13 17:47:40 2020 Mon Aug 10 19:55:12 2020 Mon Aug 10 19:55:12 2020 Sun Aug 23 19:57:29 2020
spine04 Fresh yes 3.1.0-cl3u28~1594095615.8f00ba1 Mon Jul 13 17:47:56 2020 Mon Aug 10 19:55:11 2020 Mon Aug 10 19:55:11 2020 Sun Aug 23 19:58:23 2020
Switch Inventory
With the NetQ UI and NetQ CLI, you can monitor your inventory of switches across the network or individually. A user can monitor such items as operating system, motherboard, ASIC, microprocessor, disk, memory, fan and power supply information. Being able to monitor this inventory aids in upgrades, compliance, and other planning tasks.
The commands and cards available to obtain this type of information help you to answer questions such as:
What hardware is installed on my switch?
How many transmit and receive packets have been dropped?
The NetQ UI provides the Inventory | Switches card for monitoring the hardware and software component inventory on switches running NetQ in your network. Access this card from the NetQ Workbench, or add it to your own workbench by clicking (Add card) > Inventory > Inventory|Switches card > Open Cards.
The CLI provides detailed switch inventory information through its netq <hostname> show inventory command.
View Switch Inventory Summary
Component information for all of the switches in your network can be viewed from both the NetQ UI and NetQ CLI.
Inventory|Switches card:
Small: view count of switches and distribution of switch status
Medium: view count of OS, ASIC, platform, CPU model, Disk, and memory types or versions across all switches
netq show inventory command:
View ASIC, CPU, disk, OS, and ports on all switches
View the Number of Types of Any Component Deployed
For each of the components monitored on a switch, NetQ displays the variety of those components by way of a count. For example, if you have four operating systems running on your switches, say Cumulus Linux, SONiC, Ubuntu, and RHEL, NetQ indicates a total unique count of four OSs. If you only use Cumulus Linux, then the count shows as one.
To view this count for all of the components on the switch:
Open the medium Switch Inventory card.
Note the number in the Unique column for each component.
In the above example, there are four different disk sizes deployed, four different OSs running, four different ASIC vendors and models deployed, and so forth.
Scroll down to see additional components.
By default, the data is shown for switches with a fresh communication status. You can choose to look at the data for switches in the rotten state instead. For example, if you wanted to see if there was any correlation to a version of OS to the switch having a rotten status, you could select Rotten Switches from the dropdown at the top of the card and see if they all use the same OS (count would be 1). It might not be the cause of the lack of communication, but you get the idea.
View the Distribution of Any Component Deployed
NetQ monitors a number of switch components. For each component you can view the distribution of versions or models or vendors deployed across your network for that component.
To view the distribution:
Locate the Inventory|Switches card on your workbench.
From the medium or large card, view the distribution of hardware and software components across the network. On the medium card, drop down the selection menu to select the desired component.
Hover over any of the segments in the distribution chart to highlight a specific component. Scroll down to view additional components.
When you hover, a tooltip appears displaying:
Name or value of the component type, such as the version number or status
Total number of switches with that type of component deployed compared to the total number of switches
Percentage of this type with respect to all component types
On the large Switch Inventory card, hovering also highlights the related components for the selected component.
Choose Rotten Switches from the dropdown to see which, if any, switches are currently not communicating with NetQ.
Return to your fresh switches, then hover over the card header and change to the small size card using the size picker.
Here you can see the total switch count and the distribution of those that are communicating well with the NetQ appliance or VM and those that are not. In this example, there are a total of 13 switches and they are all fresh (communicating well).
To view the hardware and software components for a switch, run:
netq <hostname> show inventory brief
This example shows the type of switch (Cumulus VX), operating system (Cumulus Linux), CPU (x86_64), and ASIC (virtual) for the spine01 switch.
cumulus@switch:~$ netq spine01 show inventory brief
Matching inventory records:
Hostname Switch OS CPU ASIC Ports
----------------- -------------------- --------------- -------- --------------- -----------------------------------
spine01 VX CL x86_64 VX N/A
This example shows the components on the NetQ On-premises or Cloud Appliance.
cumulus@switch:~$ netq show inventory brief opta
Matching inventory records:
Hostname Switch OS CPU ASIC Ports
----------------- -------------------- --------------- -------- --------------- -----------------------------------
netq-ts N/A Ubuntu x86_64 N/A N/A
View Switch Hardware Inventory
You can view hardware components deployed on each switch in your network.
View ASIC Information for a Switch
You can view the ASIC information for a switch from either the NetQ CLI or NetQ UI.
Locate the medium Inventory|Switches card on your workbench.
Change to the full-screen card and click ASIC.
Note that if you are running CumulusVX switches, no detailed ASIC information is available because the hardware is virtualized.
Click to quickly locate a switch that does not appear on the first page of the switch list.
Select hostname from the Field dropdown.
Enter the hostname of the switch you want to view, and click Apply.
To return to your workbench, click in the top right corner of the card.
To view information about the ASIC on a switch, run:
netq [<hostname>] show inventory asic [opta] [json]
This example shows the ASIC information for the leaf02 switch.
cumulus@switch:~$ netq leaf02 show inventory asic
Matching inventory records:
Hostname Vendor Model Model ID Core BW Ports
----------------- -------------------- ------------------------------ ------------------------- -------------- -----------------------------------
leaf02 Mellanox Spectrum MT52132 N/A 32 x 100G-QSFP28
This example shows the ASIC information for the NetQ On-premises or Cloud Appliance.
cumulus@switch:~$ netq show inventory asic opta
Matching inventory records:
Hostname Vendor Model Model ID Core BW Ports
----------------- -------------------- ------------------------------ ------------------------- -------------- -----------------------------------
netq-ts Mellanox Spectrum MT52132 N/A 32 x 100G-QSFP28
View Motherboard Information for a Switch
Motherboard/platform information is available from the NetQ UI and NetQ CLI.
Inventory|Switches card
Medium/Large: view platform distribution across all switches (graphic)
Full-screen: view platform vendor, model, manufacturing date, revision, serial number, MAC address, series for a switch (table)
netq show inventory board command
View motherboard vendor, model, base MAC address, serial number, part number, revision, and manufacturing date on a switch
Locate the medium Inventory|Switches card on your workbench.
Hover over the card, and change to the full-screen card using the size picker.
Click Platform.
Note that if you are running CumulusVX switches, no detailed platform information is available because the hardware is virtualized.
Click to quickly locate a switch that does not appear on the first page of the switch list.
Select hostname from the Field dropdown.
Enter the hostname of the switch you want to view, and click Apply.
To return to your workbench, click in the top right corner of the card.
To view a list of motherboards installed in a switch, run:
netq [<hostname>] show inventory board [opta] [json]
This example shows all motherboard data for the spine01 switch.
cumulus@switch:~$ netq spine01 show inventory board
Matching inventory records:
Hostname Vendor Model Base MAC Serial No Part No Rev Mfg Date
----------------- -------------------- ------------------------------ ------------------ ------------------------- ---------------- ------ ----------
spine01 Dell S6000-ON 44:38:39:00:80:00 N/A N/A N/A N/A
Use the opta option without the hostname option to view the motherboard data for the NetQ On-premises or Cloud Appliance. No motherboard data is available for NetQ On-premises or Cloud VMs.
View CPU Information for a Switch
CPU information is available from the NetQ UI and NetQ CLI.
Inventory|Switches card: view CPU architecture, model, maximum operating frequency, the number of cores, and data on a switch (table)
netq show inventory cpu command: view CPU architecture, model, maximum operating frequency, and the number of cores on a switch
Locate the Inventory|Switches card on your workbench.
Hover over the card, and change to the full-screen card using the size picker.
Click CPU.
Click to quickly locate a switch that does not appear on the first page of the switch list.
Select hostname from the Field dropdown. Then enter the hostname of the switch you want to view.
To return to your workbench, click in the top right corner of the card.
To view CPU information for a switch in your network, run:
netq [<hostname>] show inventory cpu [arch <cpu-arch>] [opta] [json]
This example shows CPU information for the server02 switch.
cumulus@switch:~$ netq server02 show inventory cpu
Matching inventory records:
Hostname Arch Model Freq Cores
----------------- -------- ------------------------------ ---------- -----
server02 x86_64 Intel Core i7 9xx (Nehalem Cla N/A 1
ss Core i7)
This example shows the CPU information for the NetQ On-premises or Cloud Appliance.
cumulus@switch:~$ netq show inventory cpu opta
Matching inventory records:
Hostname Arch Model Freq Cores
----------------- -------- ------------------------------ ---------- -----
netq-ts x86_64 Intel Xeon Processor (Skylake, N/A 8
IBRS)
View Disk Information for a Switch
Disk information is available from the NetQ UI and NetQ CLI.
Inventory|Switches card: view disk vendor, size, revision, model, name, transport, and type on a switch (table)
netq show inventory disk command: view disk name, type, transport, size, vendor, and model on all devices
Locate the Inventory|Switches card on your workbench.
Hover over the card, and change to the full-screen card using the size picker.
Click Disk.
Note that if you are running CumulusVX switches, no detailed disk information is available because the hardware is virtualized.
Click to quickly locate a switch that does not appear on the first page of the switch list.
Select hostname from the Field dropdown. Then enter the hostname of the switch you want to view.
To return to your workbench, click in the top right corner of the card.
To view disk information for a switch in your network, run:
netq [<hostname>] show inventory disk [opta] [json]
This example shows the disk information for the leaf03 switch.
cumulus@switch:~$ netq leaf03 show inventory disk
Matching inventory records:
Hostname Name Type Transport Size Vendor Model
----------------- --------------- ---------------- ------------------ ---------- -------------------- ------------------------------
leaf03 vda disk N/A 6G 0x1af4 N/A
This example shows the disk information for the NetQ On-premises or Cloud Appliance.
cumulus@switch:~$ netq show inventory disk opta
Matching inventory records:
Hostname Name Type Transport Size Vendor Model
----------------- --------------- ---------------- ------------------ ---------- -------------------- ------------------------------
netq-ts vda disk N/A 265G 0x1af4 N/A
View Memory Information for a Switch
Memory information is available from the NetQ UI and NetQ CLI.
Inventory|Switches card: view memory chip vendor, name, serial number, size, speed, and type on a switch (table)
netq show inventory memory: view memory chip name, type, size, speed, vendor, and serial number on all devices
Locate the medium Inventory|Switches card on your workbench.
Hover over the card, and change to the full-screen card using the size picker.
Click Memory.
Click to quickly locate a switch that does not appear on the first page of the switch list.
Select hostname from the Field dropdown. Then enter the hostname of the switch you want to view.
To return to your workbench, click in the top right corner of the card.
To view memory information for your switches and host servers, run:
netq [<hostname>] show inventory memory [opta] [json]
This example shows all the memory characteristics for the leaf01 switch.
cumulus@switch:~$ netq leaf01 show inventory memory
Matching inventory records:
Hostname Name Type Size Speed Vendor Serial No
----------------- --------------- ---------------- ---------- ---------- -------------------- -------------------------
leaf01 DIMM 0 RAM 768 MB Unknown QEMU Not Specified
This example shows the memory information for the NetQ On-premises or Cloud Appliance.
cumulus@switch:~$ netq show inventory memory opta
Matching inventory records:
Hostname Name Type Size Speed Vendor Serial No
----------------- --------------- ---------------- ---------- ---------- -------------------- -------------------------
netq-ts DIMM 0 RAM 16384 MB Unknown QEMU Not Specified
netq-ts DIMM 1 RAM 16384 MB Unknown QEMU Not Specified
netq-ts DIMM 2 RAM 16384 MB Unknown QEMU Not Specified
netq-ts DIMM 3 RAM 16384 MB Unknown QEMU Not Specified
View Switch Software Inventory
You can view software components deployed on a given switch in your network.
View Operating System Information for a Switch
OS information is available from the NetQ UI and NetQ CLI.
Inventory|Switches card: view OS vendor, version, and version ID on a switch (table)
netq show inventory os: view OS name and version on a switch
Locate the Inventory|Switches card on your workbench.
Hover over the card, and change to the full-screen card using the size picker.
Click OS.
Click to quickly locate a switch that does not appear on the first page of the switch list.
Enter a hostname, then click Apply.
To return to your workbench, click in the top right corner of the card.
To view OS information for a switch, run:
netq [<hostname>] show inventory os [opta] [json]
This example shows the OS information for the leaf02 switch.
cumulus@switch:~$ netq leaf02 show inventory os
Matching inventory records:
Hostname Name Version Last Changed
----------------- --------------- ------------------------------------ -------------------------
leaf02 CL 3.7.5 Fri Apr 19 16:01:46 2019
This example shows the OS information for the NetQ On-premises or Cloud Appliance.
cumulus@switch:~$ netq show inventory os opta
Matching inventory records:
Hostname Name Version Last Changed
----------------- --------------- ------------------------------------ -------------------------
netq-ts Ubuntu 18.04 Tue Jul 14 19:27:39 2020
View the Cumulus Linux Packages on a Switch
When you are troubleshooting an issue with a switch, you might want to know which supported versions of the Cumulus Linux operating system are available for that switch and for a switch that is not experiencing the same issue.
To view package information for your switches, run:
netq <hostname> show cl-manifest [json]
This example shows the Cumulus Linux OS versions supported for the leaf01 switch, using the vx ASIC vendor (virtual, so simulated) and x86_64 CPU architecture.
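As above, the per-release package listing is lengthy and is not shown here; a minimal invocation for leaf01 looks like this:
cumulus@switch:~$ netq leaf01 show cl-manifest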
If you are having an issue with a particular switch, you should verify all the installed software and whether it needs updating.
To view package information for a switch, run:
netq <hostname> show cl-pkg-info [<text-package-name>] [around <text-time>] [json]
Use the text-package-name option to narrow the results to a particular package or the around option to narrow the output to a particular time range.
This example shows all installed software packages for spine01.
cumulus@switch:~$ netq spine01 show cl-pkg-info
Matching package_info records:
Hostname Package Name Version CL Version Package Status Last Changed
----------------- ------------------------ -------------------- -------------------- -------------------- -------------------------
spine01 libfile-fnmatch-perl 0.02-2+b1 Cumulus Linux 3.7.12 installed Wed Aug 26 19:58:45 2020
spine01 screen 4.2.1-3+deb8u1 Cumulus Linux 3.7.12 installed Wed Aug 26 19:58:45 2020
spine01 libudev1 215-17+deb8u13 Cumulus Linux 3.7.12 installed Wed Aug 26 19:58:45 2020
spine01 libjson-c2 0.11-4 Cumulus Linux 3.7.12 installed Wed Aug 26 19:58:45 2020
spine01 atftp 0.7.git20120829-1+de Cumulus Linux 3.7.12 installed Wed Aug 26 19:58:45 2020
b8u1
spine01 isc-dhcp-relay 4.3.1-6-cl3u14 Cumulus Linux 3.7.12 installed Wed Aug 26 19:58:45 2020
spine01 iputils-ping 3:20121221-5+b2 Cumulus Linux 3.7.12 installed Wed Aug 26 19:58:45 2020
spine01 base-files 8+deb8u11 Cumulus Linux 3.7.12 installed Wed Aug 26 19:58:45 2020
spine01 libx11-data 2:1.6.2-3+deb8u2 Cumulus Linux 3.7.12 installed Wed Aug 26 19:58:45 2020
spine01 onie-tools 3.2-cl3u6 Cumulus Linux 3.7.12 installed Wed Aug 26 19:58:45 2020
spine01 python-cumulus-restapi 0.1-cl3u10 Cumulus Linux 3.7.12 installed Wed Aug 26 19:58:45 2020
spine01 tasksel 3.31+deb8u1 Cumulus Linux 3.7.12 installed Wed Aug 26 19:58:45 2020
spine01 ncurses-base 5.9+20140913-1+deb8u Cumulus Linux 3.7.12 installed Wed Aug 26 19:58:45 2020
3
spine01 libmnl0 1.0.3-5-cl3u2 Cumulus Linux 3.7.12 installed Wed Aug 26 19:58:45 2020
spine01 xz-utils 5.1.1alpha+20120614- Cumulus Linux 3.7.12 installed Wed Aug 26 19:58:45 2020
...
This example shows the ntp package on the spine01 switch.
cumulus@switch:~$ netq spine01 show cl-pkg-info ntp
Matching package_info records:
Hostname Package Name Version CL Version Package Status Last Changed
----------------- ------------------------ -------------------- -------------------- -------------------- -------------------------
spine01 ntp 1:4.2.8p10-cl3u2 Cumulus Linux 3.7.12 installed Wed Aug 26 19:58:45 2020
View Recommended Software Packages
If you have a software manifest, you can determine the recommended packages and versions for a particular Cumulus Linux release. You can then compare that to the software already installed on your switch(es) to determine if it differs from the manifest. Such a difference might occur if you upgraded one or more packages separately from the Cumulus Linux software itself.
To view recommended package information for a switch, run:
netq <hostname> show recommended-pkg-version [release-id <text-release-id>] [package-name <text-package-name>] [json]
This example shows the recommended packages for upgrading the leaf12 switch, namely switchd.
cumulus@switch:~$ netq leaf12 show recommended-pkg-version
Matching manifest records:
Hostname Release ID ASIC Vendor CPU Arch Package Name Version Last Changed
----------------- -------------------- -------------------- -------------------- -------------------- -------------------- -------------------------
leaf12 3.7.1 vx x86_64 switchd 1.0-cl3u30 Wed Feb 5 04:36:30 2020
This example shows the recommended packages for upgrading the server01 switch, namely lldpd.
cumulus@switch:~$ netq server01 show recommended-pkg-version
Matching manifest records:
Hostname Release ID ASIC Vendor CPU Arch Package Name Version Last Changed
----------------- -------------------- -------------------- -------------------- -------------------- -------------------- -------------------------
server01 3.7.1 vx x86_64 lldpd 0.9.8-0-cl3u11 Wed Feb 5 04:36:30 2020
This example shows the recommended version of the switchd package for use with Cumulus Linux 3.7.2.
cumulus@switch:~$ netq act-5712-09 show recommended-pkg-version release-id 3.7.2 package-name switchd
Matching manifest records:
Hostname Release ID ASIC Vendor CPU Arch Package Name Version Last Changed
----------------- -------------------- -------------------- -------------------- -------------------- -------------------- -------------------------
act-5712-09 3.7.2 bcm x86_64 switchd 1.0-cl3u31 Wed Feb 5 04:36:30 2020
This example shows the recommended version of the switchd package for use with Cumulus Linux 3.1.0. Note the version difference from the example for Cumulus Linux 3.7.2.
cumulus@noc-pr:~$ netq act-5712-09 show recommended-pkg-version release-id 3.1.0 package-name switchd
Matching manifest records:
Hostname Release ID ASIC Vendor CPU Arch Package Name Version Last Changed
----------------- -------------------- -------------------- -------------------- -------------------- -------------------- -------------------------
act-5712-09 3.1.0 bcm x86_64 switchd 1.0-cl3u4 Wed Feb 5 04:36:30 2020
Validate NetQ Agents are Running
You can confirm that NetQ Agents are running on switches and hosts (if installed) using the netq show agents command. Viewing the Status column of the output indicates whether the agent is up and current, labelled Fresh, or down and stale, labelled Rotten. Additional information includes the agent status — whether it is time synchronized, how long it has been up, and the last time its state changed.
This example shows the NetQ Agent state on all devices; the output has the same format as the netq show agents example earlier in this topic. You can also narrow the output, as shown in the sample invocations after this list:
View the state of the NetQ Agent on a given device using the hostname keyword.
View only the NetQ Agents that are fresh or rotten using the fresh or rotten keyword.
View the state of NetQ Agents at an earlier time using the around keyword.
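These invocations cover each of the variations above; the hostname and time value are illustrative, and the output uses the same columns as the netq show agents example shown earlier.
cumulus@switch:~$ netq leaf01 show agents
cumulus@switch:~$ netq show agents fresh
cumulus@switch:~$ netq show agents rotten
cumulus@switch:~$ netq show agents around 24h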
Monitor Software Services
Cumulus Linux, SONiC, and NetQ run many services to deliver the various features of these products. You can monitor their status using the netq show services command. This section describes services related to system-level operation. For services related to other areas, such as routing, refer to the respective monitoring topics. NetQ automatically monitors the following services:
aclinit: aclinit service
acltool: acltool service
bgp: BGP (Border Gateway Protocol) service
bgpd: BGP daemon
chrony: chrony service
clagd: MLAG (Multi-chassis Link Aggregation) daemon
cumulus-chassis-ssh: cumulus-chassis-ssh
cumulus-chassisd: cumulus-chassisd
database: database
dhcp_relay: DHCP relay service
docker: Docker container service
ledmgrd: Switch LED manager daemon
lldp: LLDP (Link Layer Discovery Protocol) service
lldpd: LLDP daemon
mstpd: MSTP (Multiple Spanning Tree Protocol) daemon
neighmgrd: Neighbor manager daemon for BGP and OSPF
netq-agent: NetQ Agent service
netqd: NetQ application daemon
ntp: Network Time Protocol (NTP) service
pmon: Process monitor service
portwd: Port watch daemon
ptmd: PTM (Prescriptive Topology Manager) daemon
pwmd: Password manager daemon
radv: Route advertiser service
rsyslog: Rocket-fast system event logging processing service
smond: System monitor daemon
ssh: Secure shell service for switches and servers
status: Show services with a given status (ok, error, warning, fail)
switchd: Cumulus Linux switchd service for hardware acceleration
swss: SONiC switch state service daemon
sx_sdk: Spectrum ASIC SDK service
syncd: Synchronization service
syslog: System event logging service
teamd: Network team service
vrf: VRF (Virtual Route Forwarding) service
wd_keepalive: Software watchdog service
zebra: GNU Zebra routing daemon
The CLI syntax for viewing the status of services is:
netq [<hostname>] show services [<service-name>] [vrf <vrf>] [active|monitored] [around <text-time>] [json]
netq [<hostname>] show services [<service-name>] [vrf <vrf>] status (ok|warning|error|fail) [around <text-time>] [json]
netq [<hostname>] show events [level info | level error | level warning | level debug] type services [between <text-time> and <text-endtime>] [json]
View All Services on All Devices
This example shows all available services on each device and whether each is enabled, active, and monitored, along with how long the service has been running and the last time it changed.
It is useful to have colored output for this show command. To configure colored output, run the netq config add color command.
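For example, run the following once from the switch or server where you use the NetQ CLI; no options are required:
cumulus@switch:~$ netq config add color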
cumulus@switch:~$ netq show services
Hostname Service PID VRF Enabled Active Monitored Status Uptime Last Changed
----------------- -------------------- ----- --------------- ------- ------ --------- ---------------- ------------------------- -------------------------
leaf01 bgpd 2872 default yes yes yes ok 1d:6h:43m:59s Fri Feb 15 17:28:24 2019
leaf01 clagd n/a default yes no yes n/a 1d:6h:43m:35s Fri Feb 15 17:28:48 2019
leaf01 ledmgrd 1850 default yes yes no ok 1d:6h:43m:59s Fri Feb 15 17:28:24 2019
leaf01 lldpd 2651 default yes yes yes ok 1d:6h:43m:27s Fri Feb 15 17:28:56 2019
leaf01 mstpd 1746 default yes yes yes ok 1d:6h:43m:35s Fri Feb 15 17:28:48 2019
leaf01 neighmgrd 1986 default yes yes no ok 1d:6h:43m:59s Fri Feb 15 17:28:24 2019
leaf01 netq-agent 8654 mgmt yes yes yes ok 1d:6h:43m:29s Fri Feb 15 17:28:54 2019
leaf01 netqd 8848 mgmt yes yes yes ok 1d:6h:43m:29s Fri Feb 15 17:28:54 2019
leaf01 ntp 8478 mgmt yes yes yes ok 1d:6h:43m:29s Fri Feb 15 17:28:54 2019
leaf01 ptmd 2743 default yes yes no ok 1d:6h:43m:59s Fri Feb 15 17:28:24 2019
leaf01 pwmd 1852 default yes yes no ok 1d:6h:43m:59s Fri Feb 15 17:28:24 2019
leaf01 smond 1826 default yes yes yes ok 1d:6h:43m:27s Fri Feb 15 17:28:56 2019
leaf01 ssh 2106 default yes yes no ok 1d:6h:43m:59s Fri Feb 15 17:28:24 2019
leaf01 syslog 8254 default yes yes no ok 1d:6h:43m:59s Fri Feb 15 17:28:24 2019
leaf01 zebra 2856 default yes yes yes ok 1d:6h:43m:59s Fri Feb 15 17:28:24 2019
leaf02 bgpd 2867 default yes yes yes ok 1d:6h:43m:55s Fri Feb 15 17:28:28 2019
leaf02 clagd n/a default yes no yes n/a 1d:6h:43m:31s Fri Feb 15 17:28:53 2019
leaf02 ledmgrd 1856 default yes yes no ok 1d:6h:43m:55s Fri Feb 15 17:28:28 2019
leaf02 lldpd 2646 default yes yes yes ok 1d:6h:43m:30s Fri Feb 15 17:28:53 2019
...
You can also view services information in JSON format:
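For example, appending the json option to the command returns the same records as JSON objects; the output is omitted here:
cumulus@switch:~$ netq show services json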
If you want to view the service information for a given device, use the hostname option when running the command.
View Information about a Given Service on All Devices
You can view the status of a given service at the current time, at a prior point in time, or view the changes that have occurred for the service during a specified timeframe.
This example shows how to view the status of the NTP service across the network. In this case, the VRF configuration has the NTP service running on both the default and management interface. You can perform the same command with the other services, such as bgpd, lldpd, and clagd.
cumulus@switch:~$ netq show services ntp
Matching services records:
Hostname Service PID VRF Enabled Active Monitored Status Uptime Last Changed
----------------- -------------------- ----- --------------- ------- ------ --------- ---------------- ------------------------- -------------------------
exit01 ntp 8478 mgmt yes yes yes ok 1d:6h:52m:41s Fri Feb 15 17:28:54 2019
exit02 ntp 8497 mgmt yes yes yes ok 1d:6h:52m:36s Fri Feb 15 17:28:59 2019
firewall01 ntp n/a default yes yes yes ok 1d:6h:53m:4s Fri Feb 15 17:28:31 2019
hostd-11 ntp n/a default yes yes yes ok 1d:6h:52m:46s Fri Feb 15 17:28:49 2019
hostd-21 ntp n/a default yes yes yes ok 1d:6h:52m:37s Fri Feb 15 17:28:58 2019
hosts-11 ntp n/a default yes yes yes ok 1d:6h:52m:28s Fri Feb 15 17:29:07 2019
hosts-13 ntp n/a default yes yes yes ok 1d:6h:52m:19s Fri Feb 15 17:29:16 2019
hosts-21 ntp n/a default yes yes yes ok 1d:6h:52m:14s Fri Feb 15 17:29:21 2019
hosts-23 ntp n/a default yes yes yes ok 1d:6h:52m:4s Fri Feb 15 17:29:31 2019
noc-pr ntp 2148 default yes yes yes ok 1d:6h:53m:43s Fri Feb 15 17:27:52 2019
noc-se ntp 2148 default yes yes yes ok 1d:6h:53m:38s Fri Feb 15 17:27:57 2019
spine01 ntp 8414 mgmt yes yes yes ok 1d:6h:53m:30s Fri Feb 15 17:28:05 2019
spine02 ntp 8419 mgmt yes yes yes ok 1d:6h:53m:27s Fri Feb 15 17:28:08 2019
spine03 ntp 8443 mgmt yes yes yes ok 1d:6h:53m:22s Fri Feb 15 17:28:13 2019
leaf01 ntp 8765 mgmt yes yes yes ok 1d:6h:52m:52s Fri Feb 15 17:28:43 2019
leaf02 ntp 8737 mgmt yes yes yes ok 1d:6h:52m:46s Fri Feb 15 17:28:49 2019
leaf11 ntp 9305 mgmt yes yes yes ok 1d:6h:49m:22s Fri Feb 15 17:32:13 2019
leaf12 ntp 9339 mgmt yes yes yes ok 1d:6h:49m:9s Fri Feb 15 17:32:26 2019
leaf21 ntp 9367 mgmt yes yes yes ok 1d:6h:49m:5s Fri Feb 15 17:32:30 2019
leaf22 ntp 9403 mgmt yes yes yes ok 1d:6h:52m:57s Fri Feb 15 17:28:38 2019
This example shows the status of the BGP daemon.
cumulus@switch:~$ netq show services bgpd
Matching services records:
Hostname Service PID VRF Enabled Active Monitored Status Uptime Last Changed
----------------- -------------------- ----- --------------- ------- ------ --------- ---------------- ------------------------- -------------------------
exit01 bgpd 2872 default yes yes yes ok 1d:6h:54m:37s Fri Feb 15 17:28:24 2019
exit02 bgpd 2867 default yes yes yes ok 1d:6h:54m:33s Fri Feb 15 17:28:28 2019
firewall01 bgpd 21766 default yes yes yes ok 1d:6h:54m:54s Fri Feb 15 17:28:07 2019
spine01 bgpd 2953 default yes yes yes ok 1d:6h:55m:27s Fri Feb 15 17:27:34 2019
spine02 bgpd 2948 default yes yes yes ok 1d:6h:55m:23s Fri Feb 15 17:27:38 2019
spine03 bgpd 2953 default yes yes yes ok 1d:6h:55m:18s Fri Feb 15 17:27:43 2019
leaf01 bgpd 3221 default yes yes yes ok 1d:6h:54m:48s Fri Feb 15 17:28:13 2019
leaf02 bgpd 3177 default yes yes yes ok 1d:6h:54m:42s Fri Feb 15 17:28:19 2019
leaf11 bgpd 3521 default yes yes yes ok 1d:6h:51m:18s Fri Feb 15 17:31:43 2019
leaf12 bgpd 3527 default yes yes yes ok 1d:6h:51m:6s Fri Feb 15 17:31:55 2019
leaf21 bgpd 3512 default yes yes yes ok 1d:6h:51m:1s Fri Feb 15 17:32:00 2019
leaf22 bgpd 3536 default yes yes yes ok 1d:6h:54m:54s Fri Feb 15 17:28:07 2019
View Events Related to a Given Service
To view changes over a given time period, use the netq show events command. For more detailed information about events, refer to Events and Notifications.
This example shows changes to the bgpd service in the last 48 hours.
cumulus@switch:/$ netq show events type bgp between now and 48h
Matching events records:
Hostname Message Type Severity Message Timestamp
----------------- ------------ -------- ----------------------------------- -------------------------
leaf01 bgp info BGP session with peer spine-1 swp3. 1d:6h:55m:37s
3 vrf DataVrf1081 state changed fro
m failed to Established
leaf01 bgp info BGP session with peer spine-2 swp4. 1d:6h:55m:37s
3 vrf DataVrf1081 state changed fro
m failed to Established
leaf01 bgp info BGP session with peer spine-3 swp5. 1d:6h:55m:37s
3 vrf DataVrf1081 state changed fro
m failed to Established
leaf01 bgp info BGP session with peer spine-1 swp3. 1d:6h:55m:37s
2 vrf DataVrf1080 state changed fro
m failed to Established
leaf01 bgp info BGP session with peer spine-3 swp5. 1d:6h:55m:37s
2 vrf DataVrf1080 state changed fro
m failed to Established
leaf01 bgp info BGP session with peer spine-2 swp4. 1d:6h:55m:37s
2 vrf DataVrf1080 state changed fro
m failed to Established
leaf01 bgp info BGP session with peer spine-3 swp5. 1d:6h:55m:37s
4 vrf DataVrf1082 state changed fro
m failed to Established
System Inventory
In addition to network and switch inventory, the NetQ UI provides a view into the current status and configuration of the software network constructs in a tabular, networkwide view. These are helpful when you want to see all the data for all of a particular element in your network for troubleshooting, or you want to export a list view.
Some of these views provide data that is also available through the card workflows, but these views are not treated like cards. They only provide the current status; you cannot change the time period of the views, or graph the data within the UI.
Access these tables through the Main Menu, under the Network heading.
Tables can be manipulated using the settings above the tables, shown here and described in Table Settings.
Pagination options are shown when there are more than 25 results.
View All NetQ Agents
The Agents view provides all available parameter data about all NetQ Agents in the system.
Hostname: Name of the switch or host
Timestamp: Date and time the data was captured
Last Reinit: Date and time that the switch or host was reinitialized
Last Update Time: Date and time that the switch or host was updated
Lastboot: Date and time that the switch or host was last booted up
NTP State: Status of NTP synchronization on the switch or host; yes = in synchronization, no = out of synchronization
Sys Uptime: Amount of time the switch or host has been continuously up and running
Version: NetQ version running on the switch or host
View All Events
The Events view provides all available parameter data about all events in the system.
Hostname: Name of the switch or host that experienced the event
Timestamp: Date and time the event was captured
Message: Description of the event
Message Type: Network service or protocol that generated the event
Severity: Importance of the event. Values include critical, warning, info, and debug.
View All MACs
The MACs (media access control addresses) view provides all available parameter data about all MAC addresses in the system.
Hostname: Name of the switch or host where the MAC address resides
Timestamp: Date and time the data was captured
Egress Port: Port where traffic exits the switch or host
Is Remote: Indicates if the address was learned from a remote device (true) or locally (false)
Is Static: Indicates if the address is a static (true) or dynamic assignment (false)
MAC Address: MAC address
Nexthop: Next hop for traffic hitting this MAC address on this switch or host
Origin: Indicates if address is owned by this switch or host (true) or by a peer (false)
VLAN: VLAN associated with the MAC address, if any
View All VLANs
The VLANs (virtual local area networks) view provides all available parameter data about all VLANs in the system.
Hostname: Name of the switch or host where the VLAN(s) reside(s)
Timestamp: Date and time the data was captured
If Name: Name of interface used by the VLAN(s)
Last Changed: Date and time when this information was last updated
Ports: Ports on the switch or host associated with the VLAN(s)
SVI: Switch virtual interface associated with a bridge interface
VLANs: VLANs associated with the switch or host
View IP Routes
The IP Routes view provides all available parameter data about all IP routes. The list of routes can be filtered to view only the IPv4 or IPv6 routes by selecting the relevant tab.
Hostname: Name of the switch or host where the route(s) reside(s)
Timestamp: Date and time the data was captured
Is IPv6: Indicates if the address is an IPv6 (true) or IPv4 (false) address
Message Type: Network service or protocol; always Route in this table
Nexthops: Possible ports/interfaces where traffic can be routed to next
Origin: Indicates if this switch or host is the source of this route (true) or not (false)
Prefix: IPv4 or IPv6 address prefix
Priority: Rank of this route relative to others, where a lower number is more likely to be used; value determined by the routing protocol
Protocol: Protocol responsible for this route
Route Type: Type of route
Rt Table ID: The routing table identifier where the route resides
Src: Prefix of the address where the route is coming from (the previous hop)
VRF: Virtual route forwarding interface associated with this route
View IP Neighbors
The IP Neighbors view provides all available parameter data about all IP neighbors. The list of neighbors can be filtered to view only the IPv4 or IPv6 neighbors by selecting the relevant tab.
Hostname: Name of the neighboring switch or host
Timestamp: Date and time the data was captured
IF Index: Index of the interface used to communicate with this neighbor
If Name: Name of the interface used to communicate with this neighbor
IP Address: IPv4 or IPv6 address of the neighbor switch or host
Is IPv6: Indicates if the address is an IPv6 (true) or IPv4 (false) address
Is Remote: Indicates if the address was learned from a remote device (true) or locally (false)
MAC Address: MAC address of the neighbor switch or host
Message Type: Network service or protocol; always Neighbor in this table
VRF: Virtual route forwarding interface associated with this neighbor
View IP Addresses
The IP Addresses view provides all available parameter data about all IP addresses. The list of addresses can be filtered to view only the IPv4 or IPv6 addresses by selecting the relevant tab.
Hostname: Name of the switch or host where the address resides
Timestamp: Date and time the data was captured
If Name: Name of the interface the address is configured on
Is IPv6: Indicates if the address is an IPv6 (true) or IPv4 (false) address
Mask: Host portion of the address
Prefix: Network portion of the address
VRF: Virtual route interface associated with this address prefix and interface on this switch or host
Device Groups allow you to create a label for a subset of devices in the inventory. You can configure validation checks to run on select devices by referencing group names.
Create a Device Group
To create a device group, add the Device Groups card to your workbench. Click to navigate to the Device Groups section and click Open Cards after selecting the Device groups card:
The Device groups card will now be displayed on your workbench. Click Create New Group to create a new device group:
The Create New Group wizard will be displayed. To finish creating a new group:
Set the name of the group of devices
Declare a hostname-based rule to define which devices in the inventory should be added to the group
Confirm the expected matched devices appear in the inventory, and click Create device group
The following example shows a group name of “exit group” matching any device in the inventory with “exit” in the hostname:
Update a Device Group
When new devices that match existing group rules are added to the inventory, those devices are flagged for review to be added to the group inventory. The following example shows the switch “exit-2” being detected in the inventory after the group was already configured:
To add the new device to the group inventory, click and then click Update device group.
Remove a Device Group
To delete a device group:
Expand the Device Groups card:
Click on the desired group and select Delete.
DPU Inventory
DPU monitoring is an early access feature.
With the NetQ UI, you can monitor your inventory of DPUs across the network or individually. A user can monitor a DPU’s operating system, ASIC, CPU model, disk, and memory information to help manage upgrades, compliance, and other planning tasks.
The Inventory | DPU card monitors the hardware- and software-component inventory on DPUs in your network. Access this card from the NetQ Workbench, or add it to your own workbench by clicking (Add card) > Inventory > Inventory | DPU card > Open Cards.
View DPU Components
NetQ displays DPU status and components on the Inventory | DPU card as a donut chart. The card displays the number of fresh and rotten DPUs. Additionally, you can view data for the following DPU components:
Disk
Operating system
ASIC
Agent version
CPU
Platform
Memory
Hover over the chart in the default card view to view component details. To view the distribution of components, hover over the card header and increase the card’s size:
You can hover over the card header and select the desired icon to view a detailed chart for ASIC, platform, or software components:
To display the advanced view, use the size picker to expand the card to its largest size, then select the desired component:
Monitor Hardware Utilization
To monitor DPU hardware resource utilization, see Monitor DPUs.
Host Inventory
With the NetQ UI, you can monitor your inventory of hosts across the network or individually. A user can monitor a host’s operating system, ASIC, CPU model, disk, and memory information to help manage upgrades, compliance, and other planning tasks.
Access Host Inventory Data
The Inventory | Hosts card monitors the hardware- and software-component inventory on hosts running NetQ in your network. Access this card from the NetQ Workbench, or add it to your own workbench by clicking (Add card) > Inventory > Inventory | Hosts card > Open Cards.
View Host Components
NetQ displays host status and components on the Inventory | Hosts card as a donut chart. The card displays the number of fresh and rotten hosts. Additionally, you can view data for the following host components:
Disk
Operating system
ASIC
CPU
Platform
Memory
Hover over the chart in the default card view to view component details. To view the distribution of components, hover over the card header and increase the card’s size. You can hover over the card header and select the desired icon to view a detailed chart for ASIC, platform, or software components:
To display the advanced view, use the size picker to expand the card to its largest size, then select the desired component:
Monitor Container Environments Using Kubernetes API Server
The NetQ Agent monitors many aspects of containers on your network by integrating with the Kubernetes API server. In particular, the NetQ Agent tracks:
Identity: Every container’s IP and MAC address, name, image, and more. NetQ can locate containers across the fabric based on a container’s name, image, IP or MAC address, and protocol and port pair.
Port mapping on a network: Protocol and ports exposed by a container. NetQ can identify containers exposing a specific protocol and port pair on a network.
Connectivity: Information about network connectivity for a container, including adjacency and identifying a top of rack switch’s effects on containers.
This topic assumes a reasonable familiarity with Kubernetes terminology and architecture.
Use NetQ with Kubernetes Clusters
The NetQ Agent interfaces with the Kubernetes API server and listens to Kubernetes events. The NetQ Agent monitors network identity and physical network connectivity of Kubernetes resources like pods, daemon sets, services, and so forth. NetQ works with any container network interface (CNI), such as Calico or Flannel.
The NetQ Kubernetes integration enables network administrators to:
Identify and locate pods, deployments, replica sets, and services deployed within the network using IP address, name, label, and so forth.
Track network connectivity of all pods of a service, deployment, and replica set.
Locate which pods have been deployed adjacent to a top of rack (ToR) switch.
Check the impact of a specific ToR switch on a pod, service, replica set, or deployment.
NetQ also helps network administrators identify changes within a Kubernetes cluster and determine if such changes had an adverse effect on the network performance (caused by a noisy neighbor for example). Additionally, NetQ helps the infrastructure administrator determine the distribution of Kubernetes workloads within a network.
Requirements
The NetQ Agent supports Kubernetes version 1.9.2 or later.
Command Summary
A large set of commands is available to monitor Kubernetes configurations, including the ability to monitor clusters, nodes, daemon sets, deployments, pods, replication, and services. Run netq show kubernetes help to see all the possible commands.
After waiting for a minute, run the show command to view the cluster.
cumulus@host:~$ netq show kubernetes cluster
Next, you must enable the NetQ Agent on every worker node for complete insight into your container network. Repeat steps 2 and 3 on each worker node.
View Status of Kubernetes Clusters
Run the netq show kubernetes cluster command to view the status of all Kubernetes clusters in the fabric. The following example shows two clusters; one with server11 as the master server and the other with server12 as the master server. Both are healthy and both list their associated worker nodes.
cumulus@host:~$ netq show kubernetes cluster
Matching kube_cluster records:
Master Cluster Name Controller Status Scheduler Status Nodes
------------------------ ---------------- -------------------- ---------------- --------------------
server11:3.0.0.68 default Healthy Healthy server11 server13 se
rver22 server11 serv
er12 server23 server
24
server12:3.0.0.69 default Healthy Healthy server12 server21 se
rver23 server13 serv
er14 server21 server
22
For deployments with multiple clusters, you can use the hostname option to filter the output. This example shows filtering of the list by server11:
cumulus@host:~$ netq server11 show kubernetes cluster
Matching kube_cluster records:
Master Cluster Name Controller Status Scheduler Status Nodes
------------------------ ---------------- -------------------- ---------------- --------------------
server11:3.0.0.68 default Healthy Healthy server11 server13 se
rver22 server11 serv
er12 server23 server
24
Optionally, use the json option to present the results in JSON format.
If data collection from the NetQ Agents is not occurring as it did previously, verify whether any changes have been made to the Kubernetes cluster configuration, using the around option. Be sure to include the unit of measure with the around value. Valid units include:
w: weeks
d: days
h: hours
m: minutes
s: seconds
now
This example shows changes made to the cluster in the last hour, including the addition of the two master nodes and the various worker nodes for each cluster.
cumulus@host:~$ netq show kubernetes cluster around 1h
Matching kube_cluster records:
Master Cluster Name Controller Status Scheduler Status Nodes DBState Last changed
------------------------ ---------------- -------------------- ---------------- ---------------------------------------- -------- -------------------------
server11:3.0.0.68 default Healthy Healthy server11 server13 server22 server11 serv Add Fri Feb 8 01:50:50 2019
er12 server23 server24
server12:3.0.0.69 default Healthy Healthy server12 server21 server23 server13 serv Add Fri Feb 8 01:50:50 2019
er14 server21 server22
server12:3.0.0.69 default Healthy Healthy server12 server21 server23 server13 Add Fri Feb 8 01:50:50 2019
server11:3.0.0.68 default Healthy Healthy server11 Add Fri Feb 8 01:50:50 2019
server12:3.0.0.69 default Healthy Healthy server12 Add Fri Feb 8 01:50:50 2019
View Kubernetes Pod Information
You can show configuration and status of the pods in a cluster, including the names, labels, addresses, associated cluster and containers, and whether the pod is running. This example shows pods for FRR, nginx, Calico, and various Kubernetes components sorted by master node.
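A command of the following form produces this view (the exact option set may vary; run netq show kubernetes help to confirm):
cumulus@host:~$ netq show kubernetes pod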
You can view detailed information about a node, including its role in the cluster, pod CIDR, and kubelet status. This example shows all the nodes in the cluster with server11 as the master. Note that server11 acts as a worker node along with the other nodes in the cluster: server12, server13, server22, server23, and server24.
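A command of this form produces the node listing; the exact syntax is an assumption based on the components variant shown below:
cumulus@host:~$ netq server11 show kubernetes node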
To display the kubelet or Docker version, use the components option with the show command. This example lists the kubelet version, a proxy address if used, and the status of the container for the server11 master and worker nodes.
To view only the details for a selected node, use the name option with the hostname of that node following the components option:
cumulus@host:~$ netq server11 show kubernetes node components name server13
Matching kube_cluster records:
Master Cluster Name Node Name Kubelet KubeProxy Container Runt
ime
------------------------ ---------------- -------------------- ------------ ------------ ----------------- --------------
server11:3.0.0.68 default server13 v1.9.2 v1.9.2 docker://17.3.2 KubeletReady
View Kubernetes Replica Set on a Node
You can view information about the replica set, including the name, labels, and number of replicas present for each application. This example shows the number of replicas for each application in the server11 cluster:
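A command of this form produces the replica set listing; the replica-set keyword is an assumption, so confirm it with netq show kubernetes help:
cumulus@host:~$ netq server11 show kubernetes replica-set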
You can view information about the daemon set running on the node. This example shows that six copies of the cumulus-frr daemon are running on the server11 node:
cumulus@host:~$ netq server11 show kubernetes daemon-set namespace default
Matching kube_daemonset records:
Master Cluster Name Namespace Daemon Set Name Labels Desired Count Ready Count Last Changed
------------------------ ------------ ---------------- ------------------------------ -------------------- ------------- ----------- ----------------
server11:3.0.0.68 default default cumulus-frr k8s-app:cumulus-frr 6 6 14h:25m:37s
View Pods on a Node
You can view information about the pods on the node. The first example shows all pods running nginx in the default namespace for the server11 cluster. The second example shows all pods running any application in the default namespace for the server11 cluster.
cumulus@host:~$ netq server11 show kubernetes pod namespace default label nginx
Matching kube_pod records:
Master Namespace Name IP Node Labels Status Containers Last Changed
------------------------ ------------ -------------------- ---------------- ------------ -------------------- -------- ------------------------ ----------------
server11:3.0.0.68 default nginx-8586cf59-26pj5 10.244.9.193 server24 run:nginx Running nginx:6e2b65070c86 14h:25m:24s
server11:3.0.0.68 default nginx-8586cf59-c82ns 10.244.40.128 server12 run:nginx Running nginx:01b017c26725 14h:25m:24s
server11:3.0.0.68 default nginx-8586cf59-wjwgp 10.244.49.64 server22 run:nginx Running nginx:ed2b4254e328 14h:25m:24s
cumulus@host:~$ netq server11 show kubernetes pod namespace default label app
Matching kube_pod records:
Master Namespace Name IP Node Labels Status Containers Last Changed
------------------------ ------------ -------------------- ---------------- ------------ -------------------- -------- ------------------------ ----------------
server11:3.0.0.68 default httpd-5456469bfd-bq9 10.244.49.65 server22 app:httpd Running httpd:79b7f532be2d 14h:20m:34s
zm
server11:3.0.0.68 default influxdb-6cdb566dd-8 10.244.162.128 server13 app:influx Running influxdb:15dce703cdec 14h:20m:34s
9lwn
View Status of the Replication Controller on a Node
After you create the replicas, you can then view information about the replication controller:
cumulus@host:~$ netq server11 show kubernetes replication-controller
No matching kube_replica records found
View Kubernetes Deployment Information
For each deployment, you can view the number of replicas associated with an application. This example shows information for a deployment of the nginx application:
cumulus@host:~$ netq server11 show kubernetes deployment name nginx
Matching kube_deployment records:
Master Namespace Name Replicas Ready Replicas Labels Last Changed
------------------------ --------------- -------------------- ---------------------------------- -------------- ------------------------------ ----------------
server11:3.0.0.68 default nginx 3 3 run:nginx 14h:27m:20s
Search Using Labels
You can search for information about your Kubernetes clusters using labels. A label search is similar to a “contains” regular expression search. The following example looks for all nodes that contain kube in the replication set name or label:
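A search of this form would produce that result; the replica-set keyword and label filter are assumptions based on the commands shown elsewhere in this section:
cumulus@host:~$ netq show kubernetes replica-set label kube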
You can view the connectivity graph of a Kubernetes pod at the replica set, deployment, or service level. The connectivity graph starts with the server where you deployed the pod, and shows the peer for each server interface. This data appears in a similar manner as the netq trace command, showing the interface name, the outbound port on that interface, and the inbound port on the peer.
This example shows connectivity at the deployment level, where the nginx-8586cf59-wjwgp replica is in a pod on the server22 node. It has four possible communication paths, through interfaces swp1-4 out varying ports to peer interfaces swp7 and swp20 on the torc-21, torc-22, edge01, and edge02 nodes. Similarly, it shows the connections for two additional nginx replicas.
You can show details about the Kubernetes services in a cluster, including service name, labels associated with the service, type of service, associated IP address, an external address if a public service, and ports used. This example shows the services available in the Kubernetes cluster:
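The listing comes from the same command used below, without the name filter:
cumulus@host:~$ netq show kubernetes service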
You can filter the list to view details about a particular Kubernetes service using the name option, as shown here:
cumulus@host:~$ netq show kubernetes service name calico-etcd
Matching kube_service records:
Master Namespace Service Name Labels Type Cluster IP External IP Ports Last Changed
------------------------ ---------------- -------------------- ------------ ---------- ---------------- ---------------- ----------------------------------- ----------------
server11:3.0.0.68 kube-system calico-etcd k8s-app:cali ClusterIP 10.96.232.136 TCP:6666 2d:13h:48m:10s
co-etcd
server12:3.0.0.69 kube-system calico-etcd k8s-app:cali ClusterIP 10.96.232.136 TCP:6666 2d:13h:49m:3s
co-etcd
View Kubernetes Service Connectivity
To see the connectivity of a given Kubernetes service, include the connectivity option. This example shows the connectivity of the calico-etcd service:
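Appending the connectivity option to the service command shown above is assumed to take this form:
cumulus@host:~$ netq show kubernetes service name calico-etcd connectivity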
View the Impact of Connectivity Loss for a Service
You can preview the impact on service availability from the loss of a particular node using the impact option. The output is color coded (not shown in the example below) so you can clearly see the impact: green shows no impact, yellow shows partial impact, and red shows full impact.
cumulus@host:~$ netq server11 show impact kubernetes service name calico-etcd
calico-etcd -- calico-etcd-pfg9r -- server11:swp1:torbond1 -- swp6:hostbond2:torc-11
-- server11:swp2:torbond1 -- swp6:hostbond2:torc-12
-- server11:swp3:NetQBond-2 -- swp16:NetQBond-16:edge01
-- server11:swp4:NetQBond-2 -- swp16:NetQBond-16:edge02
View Kubernetes Cluster Configuration in the Past
You can use the around option to go back in time to check the network status and identify any changes that occurred on the network.
This example shows the current state of the network. Notice there is a node named server23. server23 is there because the node server22 went down and Kubernetes spun up a third replica on a different host to satisfy the deployment requirement.
View the Impact of Connectivity Loss for a Deployment
You can determine the impact on the Kubernetes deployment in the event a host or switch goes down. The output is color coded (not shown in the example below) so you can clearly see the impact: green shows no impact, yellow shows partial impact, and red shows full impact.
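A command of this form shows the deployment impact; the syntax is an assumption modeled on the service impact command above:
cumulus@host:~$ netq server11 show impact kubernetes deployment name nginx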
If you need to perform maintenance on the Kubernetes cluster itself, use the following commands to bring the cluster down and then back up.
If needed, get the list of all the nodes in the Kubernetes cluster:
cumulus@host:~$ kubectl get nodes
Have Kubernetes drain the node so that the pods running on it are gracefully rescheduled elsewhere:
cumulus@host:~$ kubectl drain <node name>
After the maintenance window is over, put the node back into the cluster so that Kubernetes can start scheduling pods on it again:
cumulus@host:~$ kubectl uncordon <node name>
Events and Notifications
Events provide information about how a network and its devices are operating during a given time period. Event notifications are available through Slack, PagerDuty, syslog, and email channels to aid troubleshooting and help resolve network problems before they become critical.
NetQ captures three types of events:
System: wide range of events generated by the system about network protocols and services operation, hardware and software status, and system services
Threshold-based (TCA): selected set of system-related events generated based on user-configured threshold values
What Just Happened (WJH): network hardware events generated when you enable the WJH feature on NVIDIA Spectrum™ switches
You can track events in the NetQ UI with the Events and WJH cards:
Events card: tracks all warning, info, error, debug, and TCA events for a given time frame
What Just Happened card: tracks network hardware events on NVIDIA Spectrum™ switches
The NetQ CLI provides the netq show events command to view system and TCA events for a given time frame. The netq show wjh-drop command lists all WJH events or those with a selected drop type.
Configure System Event Notifications
To receive the event messages generated and processed by NetQ, you must integrate a third-party event notification application into your workflow. You can integrate NetQ with Syslog, PagerDuty, Slack, and/or email. Alternately, you can send notifications to other third-party applications via a generic webhook channel.
In an on-premises deployment, the NetQ On-premises Appliance or VM receives the raw data stream from the NetQ Agents, processes the data, then stores and delivers events to the Notification function. The Notification function filters and sends messages to any configured notification applications. In a cloud deployment, the NetQ Cloud Appliance or VM passes the raw data stream to the NetQ Cloud service for processing and delivery.
You can implement a proxy server (that sits between the NetQ Appliance or VM and the integration channels) that receives, processes, and distributes the notifications rather than having them sent directly to the integration channel. If you use such a proxy, you must configure NetQ with the proxy information.
Notifications are generated for the following types of events:
Category
Events
Network Protocol Validations
BGP status and session state
MLAG (CLAG) status and session state
EVPN status and session state
LLDP status
OSPF status and session state
VLAN status and session state
VXLAN status and session state
Interfaces
Link status
Ports and cables status
MTU status
Services
NetQ Agent status
PTM
SSH *
NTP status
Traces
On-demand trace status
Scheduled trace status
Sensors
Fan status
PSU (power supply unit) status
Temperature status
System Software
Configuration File changes
Running Configuration File changes
Cumulus Linux Support status
Software Package status
Operating System version
Lifecycle Management status
System Hardware
Physical resources status
BTRFS status
SSD utilization status
* This type of event can only be viewed in the CLI with this release.
Event filters are based on rules you create. You must have at least one rule per filter. A select set of events can be triggered by a user-configured threshold. Refer to the System Event Messages Reference for descriptions and examples of these events.
Event Message Format
Messages have the following structure:
<message-type><timestamp><opid><hostname><severity><message>
opid
Identifier of the service or process that generated the event
hostname
Hostname of network device where event occurred
severity
Severity level in which the given event is classified: debug, error, info, or warning
message
Text description of event
For example:
You can integrate notification channels using the NetQ UI or the NetQ CLI.
To set up the integrations, you must configure NetQ with at least one channel, one rule, and one filter. To refine what messages you want to view and where to send them, you can add additional rules and filters and set thresholds on supported event types. You can also configure a proxy server to receive, process, and forward the messages. You can accomplish this using either the NetQ UI or the NetQ CLI.
Configure Basic NetQ Event Notifications
The simplest configuration you can create is one that sends all events generated by all interfaces to a single notification application. This is described here. For more granular configurations and examples, refer to Configure Advanced NetQ Event Notifications.
A notification configuration must contain one channel, one rule, and one filter. Creation of the configuration follows this same path:
Add a channel.
Add a rule that accepts a selected set of events.
Add a filter that associates this rule with the newly created channel.
Create a Channel
The first step is to create a PagerDuty, Slack, syslog, or email channel to receive the notifications.
You can use the NetQ UI or the NetQ CLI to create a Slack channel.
Click , and then click Notification Channels in the Notifications section.
The Slack tab is displayed by default.
Add a channel.
When no channels have been specified, click Add Slack Channel.
When at least one channel has been specified, click above the table.
Provide a unique name for the channel. Note that spaces are not allowed. Use dashes or camelCase instead.
Create an incoming webhook as described in the documentation for your version of Slack. Then copy and paste it here.
Click Add.
To verify the channel configuration, click Test.
Otherwise, click Close.
To return to your workbench, click in the top right corner of the card.
To create and verify the specification of a Slack channel, run:
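The syntax below mirrors the syslog and email channel commands shown later in this section; the exact option names are assumptions based on the webhook, severity, and tag parameters described later for Slack channels:
netq add notification channel slack <text-channel-name> webhook <text-webhook-url> [severity info | severity warning | severity error | severity debug] [tag <text-slack-tag>]
netq show notification channel [json]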
cumulus@switch:~$ netq show notification channel
Matching config_notify records:
Name Type Severity Channel Info
--------------- ---------------- -------- ----------------------
slk-netq-events slack info webhook:https://hooks.s
lack.com/services/text/
moretext/evenmoretext
You can use the NetQ UI or the NetQ CLI to create a PagerDuty channel.
Click , and then click Notification Channels in the Notifications section.
Click PagerDuty.
Add a channel.
When no channels have been specified, click Add PagerDuty Channel.
When at least one channel has been specified, click above the table.
Provide a unique name for the channel. Note that spaces are not allowed. Use dashes or camelCase instead.
Obtain and enter an integration key (also called a service key or routing key).
Click Add.
Verify it is correctly configured.
Otherwise, click Close.
To return to your workbench, click in the top right corner of the card.
To create and verify the specification of a PagerDuty channel, run:
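The syntax below follows the same pattern as the other channel commands in this section; the integration-key keyword matches the advanced PagerDuty example later, and the placeholder names are assumptions:
netq add notification channel pagerduty <text-channel-name> integration-key <text-integration-key> [severity info | severity warning | severity error | severity debug]
netq show notification channel [json]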
cumulus@switch:~$ netq show notification channel
Matching config_notify records:
Name Type Severity Channel Info
--------------- ---------------- ---------------- ------------------------
pd-netq-events pagerduty info integration-key: c6d666e
210a8425298ef7abde0d1998
You can use the NetQ UI or the NetQ CLI to create a syslog channel.
Click , and then click Notification Channels in the Notifications section.
Click Syslog.
Add a channel.
When no channels have been specified, click Add Syslog Channel.
When at least one channel has been specified, click above the table.
Provide a unique name for the channel. Note that spaces are not allowed. Use dashes or camelCase instead.
Enter the IP address and port of the Syslog server.
Click Add.
To verify the channel configuration, click Test.
Otherwise, click Close.
To return to your workbench, click in the top right corner of the card.
To create and verify the specification of a syslog channel, run:
netq add notification channel syslog <text-channel-name> hostname <text-syslog-hostname> port <text-syslog-port> [severity info | severity warning | severity error | severity debug]
netq show notification channel [json]
This example shows the creation of a syslog-netq-events channel and verifies the configuration.
Obtain the syslog server hostname (or IP address) and port.
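Then create the channel. This command is a reconstruction based on the syntax above and the verification output below:
cumulus@switch:~$ netq add notification channel syslog syslog-netq-events hostname syslog-server port 514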
cumulus@switch:~$ netq show notification channel
Matching config_notify records:
Name Type Severity Channel Info
--------------- ---------------- -------- ----------------------
syslog-netq-eve syslog info host:syslog-server
nts port: 514
You can use the NetQ UI or the NetQ CLI to create an email channel.
Click , and then click Notification Channels in the Notifications section.
Click Email.
Add a channel.
When no channels have been specified, click Add Email Channel.
When at least one channel has been specified, click above the table.
Provide a unique name for the channel. Note that spaces are not allowed. Use dashes or camelCase instead.
Enter a list of emails for the people who you want to receive notifications from this channel.
Enter the emails separated by commas, and no spaces. For example: user1@domain.com,user2@domain.com,user3@domain.com
The first time you configure an email channel, you must also specify the SMTP server information:
Host: hostname or IP address of the SMTP server
Port: port of the SMTP server; typically 587
User ID/Password: your administrative credentials
From: email address that indicates who sent the event messages
After the first time, any additional email channels you create can use this configuration, by clicking Existing.
Click Add.
To verify the channel configuration, click Test.
Otherwise, click Close.
To return to your workbench, click .
To create and verify the specification of an email channel, run:
netq add notification channel email <text-channel-name> to <text-email-toids> [smtpserver <text-email-hostname>] [smtpport <text-email-port>] [login <text-email-id>] [password <text-email-password>] [severity info | severity warning | severity error | severity debug]
netq add notification channel email <text-channel-name> to <text-email-toids>
netq show notification channel [json]
The configuration is different depending on whether you are using the on-premises or cloud version of NetQ. Do not configure SMTP for cloud deployments as the NetQ cloud service uses the NetQ SMTP server to push email notifications.
For an on-premises deployment:
Set up an SMTP server. The server can be internal or public.
Create a user account (login and password) on the SMTP server. NetQ sends notifications to this address.
Create the notification channel using this form of the CLI command:
This example creates a rule named all-interfaces, using the key ifname and the value ALL, which sends all events from all interfaces to any channel with this rule.
cumulus@switch:~$ netq add notification rule all-interfaces key ifname value ALL
Successfully added/updated rule all-interfaces
cumulus@switch:~$ netq show notification rule
Matching config_notify records:
Name Rule Key Rule Value
--------------- ---------------- --------------------
all-interfaces ifname ALL
If you want to create more granular notifications based on such items as selected devices, characteristics of devices, or protocols, or you want to use a proxy server, you need more than the basic notification configuration. The following section includes details for creating these more complex notification configurations.
Configure a Proxy Server
To send notification messages through a proxy server instead of directly to a notification channel, you configure NetQ with the hostname and optionally a port of a proxy server. If you do not specify a port, NetQ defaults to port 80. Only one proxy server is currently supported. To simplify deployment, configure your proxy server before configuring channels, rules, or filters.
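The add command takes roughly this form; it is an assumption mirroring the netq del notification proxy command shown later in this section:
netq add notification proxy <text-proxy-hostname> [port <text-proxy-port>]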
NetQ Notifier sends notifications to Slack as incoming webhooks for a Slack channel you configure.
To create and verify the specification of a Slack channel, run:
webhook <text-webhook-url>
WebHook URL for the desired channel. For example: https://hooks.slack.com/services/text/moretext/evenmoretext
severity <level>
The log level to set, which can be one of error, warning, info, or debug. The severity defaults to info.
tag <text-slack-tag>
Optional tag appended to the Slack notification to highlight particular channels or people. An @ sign must precede the tag value. For example, @netq-info.
This example shows the creation of a slk-netq-events channel and verifies the configuration.
Create an incoming webhook as described in the documentation for your version of Slack.
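Then add the channel. This command matches the one used in the advanced examples later in this section:
cumulus@switch:~$ netq add notification channel slack slk-netq-events webhook https://hooks.slack.com/services/text/moretext/evenmoretext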
This example creates an email channel named onprem-email that uses the smtpserver on port 587 to send messages to those persons with access to the smtphostlogin account.
Set up an SMTP server. The server can be internal or public.
Create a user account (login and password) on the SMTP server. NetQ sends notifications to this address.
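Then create the channel. This command is a reconstruction based on the email channel syntax above and the verification output below:
cumulus@switch:~$ netq add notification channel email onprem-email to netq-notifications@domain.com smtpserver smtp.domain.com smtpport 587 login smtphostlogin@domain.com password MyPassword123 severity warning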
cumulus@switch:~$ netq show notification channel
Matching config_notify records:
Name Type Severity Channel Info
--------------- ---------------- ---------------- ------------------------
onprem-email email warning password: MyPassword123,
port: 587,
isEncrypted: True,
host: smtp.domain.com,
from: smtphostlogin@doma
in.com,
id: smtphostlogin@domain
.com,
to: netq-notifications@d
omain.com
In cloud deployments, the NetQ cloud service uses the NetQ SMTP server to push email notifications, so you do not need to configure an SMTP server.
To create an email notification channel for a cloud deployment, run:
netq add notification channel email <text-channel-name> to <text-email-toids> [severity info | severity warning | severity error | severity debug]
netq show notification channel [json]
This example creates an email channel named cloud-email that uses the NetQ SMTP server to send messages to those persons with access to the netq-cloud-notifications account.
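For example (the recipient address is illustrative):
cumulus@switch:~$ netq add notification channel email cloud-email to netq-cloud-notifications@domain.com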
URL of the remote application to receive notifications
severity <level>
The log level to set, which can be one of error, warning, info, or debug. The severity defaults to info.
use-ssl [True | False]
Enable or disable SSL
auth-type [basic-auth | api-key]
Set authentication parameters. Either basic-auth with generic-username and generic-password or api-key with a key-name and key-value
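A sketch of the generic webhook channel command, assembled from the parameters above; the generic keyword and placeholder names are assumptions, so confirm the exact syntax with tab completion:
netq add notification channel generic <text-channel-name> webhook <text-webhook-url> [severity info | severity warning | severity error | severity debug] [use-ssl True | use-ssl False] [auth-type basic-auth generic-username <text-username> generic-password <text-password> | auth-type api-key key-name <text-key-name> key-value <text-key-value>]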
Create Rules
Each rule comprises a single key-value pair. The key-value pair indicates what messages to include or drop from event information sent to a notification channel. You can create more than one rule for a single filter. Creating multiple rules for a given filter can provide a more precisely defined filter. For example, you can specify rules around hostnames or interface names, enabling you to filter messages specific to those hosts or interfaces. You can only create rules after you have set up your notification channels.
NetQ includes a predefined fixed set of valid rule keys. You enter values as regular expressions, which vary according to your deployment.
Rule Keys and Values
Service
Rule Key
Description
Example Rule Values
BGP
message_type
Network protocol or service identifier
bgp
hostname
User-defined, text-based name for a switch or host
server02, leaf11, exit01, spine-4
peer
User-defined, text-based name for a peer switch or host
server4, leaf-3, exit02, spine06
desc
Text description
vrf
Name of VRF interface
mgmt, default
old_state
Previous state of the BGP service
Established, Failed
new_state
Current state of the BGP service
Established, Failed
old_last_reset_time
Previous time that BGP service was reset
Apr3, 2019, 4:17 PM
new_last_reset_time
Most recent time that BGP service was reset
Apr8, 2019, 11:38 AM
ConfigDiff
message_type
Network protocol or service identifier
configdiff
hostname
User-defined, text-based name for a switch or host
server02, leaf11, exit01, spine-4
vni
Virtual Network Instance identifier
12, 23
old_state
Previous state of the configuration file
created, modified
new_state
Current state of the configuration file
created, modified
EVPN
message_type
Network protocol or service identifier
evpn
hostname
User-defined, text-based name for a switch or host
server02, leaf-9, exit01, spine04
vni
Virtual Network Instance identifier
12, 23
old_in_kernel_state
Previous VNI state, in kernel or not
true, false
new_in_kernel_state
Current VNI state, in kernel or not
true, false
old_adv_all_vni_state
Previous VNI advertising state, advertising all or not
true, false
new_adv_all_vni_state
Current VNI advertising state, advertising all or not
true, false
LCM
message_type
Network protocol or service identifier
clag
hostname
User-defined, text-based name for a switch or host
server02, leaf-9, exit01, spine04
old_conflicted_bonds
Previous pair of interfaces in a conflicted bond
swp7 swp8, swp3 swp4
new_conflicted_bonds
Current pair of interfaces in a conflicted bond
swp11 swp12, swp23 swp24
old_state_protodownbond
Previous state of the bond
protodown, up
new_state_protodownbond
Current state of the bond
protodown, up
Link
message_type
Network protocol or service identifier
link
hostname
User-defined, text-based name for a switch or host
server02, leaf-6, exit01, spine7
ifname
Software interface name
eth0, swp53
LLDP
message_type
Network protocol or service identifier
lldp
hostname
User-defined, text-based name for a switch or host
server02, leaf41, exit01, spine-5, tor-36
ifname
Software interface name
eth1, swp12
old_peer_ifname
Previous software interface name
eth1, swp12, swp27
new_peer_ifname
Current software interface name
eth1, swp12, swp27
old_peer_hostname
Previous user-defined, text-based name for a peer switch or host
server02, leaf41, exit01, spine-5, tor-36
new_peer_hostname
Current user-defined, text-based name for a peer switch or host
server02, leaf41, exit01, spine-5, tor-36
MLAG (CLAG)
message_type
Network protocol or service identifier
clag
hostname
User-defined, text-based name for a switch or host
server02, leaf-9, exit01, spine04
old_conflicted_bonds
Previous pair of interfaces in a conflicted bond
swp7 swp8, swp3 swp4
new_conflicted_bonds
Current pair of interfaces in a conflicted bond
swp11 swp12, swp23 swp24
old_state_protodownbond
Previous state of the bond
protodown, up
new_state_protodownbond
Current state of the bond
protodown, up
Node
message_type
Network protocol or service identifier
node
hostname
User-defined, text-based name for a switch or host
server02, leaf41, exit01, spine-5, tor-36
ntp_state
Current state of NTP service
in sync, not sync
db_state
Current state of DB
Add, Update, Del, Dead
NTP
message_type
Network protocol or service identifier
ntp
hostname
User-defined, text-based name for a switch or host
server02, leaf-9, exit01, spine04
old_state
Previous state of service
in sync, not sync
new_state
Current state of service
in sync, not sync
Port
message_type
Network protocol or service identifier
port
hostname
User-defined, text-based name for a switch or host
server02, leaf13, exit01, spine-8, tor-36
ifname
Interface name
eth0, swp14
old_speed
Previous speed rating of port
10 G, 25 G, 40 G, unknown
old_transreceiver
Previous transceiver
40G Base-CR4, 25G Base-CR
old_vendor_name
Previous vendor name of installed port module
Amphenol, OEM, NVIDIA, Fiberstore, Finisar
old_serial_number
Previous serial number of installed port module
MT1507VS05177, AVE1823402U, PTN1VH2
old_supported_fec
Previous forward error correction (FEC) support status
Sensors
message_type
Network protocol or service identifier
sensor
hostname
User-defined, text-based name for a switch or host
server02, leaf-26, exit01, spine2-4
old_state
Previous state of a fan, power supply unit, or thermal sensor
Fan: ok, absent, bad PSU: ok, absent, bad Temp: ok, busted, bad, critical
new_state
Current state of a fan, power supply unit, or thermal sensor
Fan: ok, absent, bad PSU: ok, absent, bad Temp: ok, busted, bad, critical
old_s_state
Previous state of a fan or power supply unit.
Fan: up, down PSU: up, down
new_s_state
Current state of a fan or power supply unit.
Fan: up, down PSU: up, down
new_s_max
Current maximum temperature threshold value
Temp: 110
new_s_crit
Current critical high temperature threshold value
Temp: 85
new_s_lcrit
Current critical low temperature threshold value
Temp: -25
new_s_min
Current minimum temperature threshold value
Temp: -50
Services
message_type
Network protocol or service identifier
services
hostname
User-defined, text-based name for a switch or host
server02, leaf03, exit01, spine-8
name
Name of service
clagd, lldpd, ssh, ntp, netqd, netq-agent
old_pid
Previous process or service identifier
12323, 52941
new_pid
Current process or service identifier
12323, 52941
old_status
Previous status of service
up, down
new_status
Current status of service
up, down
Rule names are case sensitive, and you cannot use wildcards. Rule names can contain spaces, but you must enclose them with single quotes in commands. For better readability, use dashes in place of spaces or use mixed case. For example, use bgpSessionChanges or BGP-session-changes or BGPsessions instead of 'BGP Session Changes'. Use tab completion to view the command options syntax.
cumulus@switch:~$ netq add notification rule swp52 key port value swp52
Successfully added/updated rule swp52
View Rule Configurations
Use the netq show notification command to view the rules on your platform.
cumulus@switch:~$ netq show notification rule
Matching config_notify records:
Name Rule Key Rule Value
--------------- ---------------- --------------------
bgpHostname hostname spine-01
evpnVni vni 42
fecSupport new_supported_fe supported
c
overTemp new_s_crit 24
svcStatus new_status down
swp52 port swp52
sysconf configdiff updated
Create Filters
You can limit or direct event messages using filters. Filters are created based on rules you define and each filter contains one or more rules. When a message matches the rule, it is sent to the indicated destination. Before you can create filters, you need to have already defined rules and configured channels.
As you create filters, they are added to the bottom of a list of filters. By default, NetQ processes event messages against filters starting at the top of the filter list and works its way down until it finds a match. NetQ applies the first filter that matches an event message, ignoring the other filters. Then it moves to the next event message and reruns the process, starting at the top of the list of filters. NetQ ignores events that do not match any filter.
You might have to change the order of filters in the list to ensure you capture the events you want and drop the events you do not want. You can do this using the before or after keywords to ensure one filter is processed before or after another.
This diagram shows an example with four defined filters with sample output results.
Filter names can contain spaces, but must be enclosed with single quotes in commands. It is easier to use dashes in place of spaces or mixed case for better readability. For example, use bgpSessionChanges or BGP-session-changes or BGPsessions, instead of 'BGP Session Changes'. Filter names are also case sensitive.
Example Filters
Create a filter for BGP events on a particular device:
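The command below reuses the bgpHostname rule and pd-netq-events channel; it matches the advanced example later in this section:
cumulus@switch:~$ netq add notification filter bgpSpine rule bgpHostname channel pd-netq-events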
Create a filter to drop messages from a given interface, and match against this filter before any other filters. To create a drop-style filter, do not specify a channel. To list the filter first, use the before option.
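The command below matches the drop filter created in the advanced examples later in this section:
cumulus@switch:~$ netq add notification filter swp52Drop severity error rule swp52 before bgpSpine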
Use the netq show notification command to view the filters on your platform.
cumulus@switch:~$ netq show notification filter
Matching config_notify records:
Name Order Severity Channels Rules
--------------- ---------- ---------------- ---------------- ----------
swp52Drop 1 error NetqDefaultChann swp52
el
bgpSpine 2 info pd-netq-events bgpHostnam
e
vni42 3 warning pd-netq-events evpnVni
configChange 4 info slk-netq-events sysconf
newFEC 5 info slk-netq-events fecSupport
svcDown 6 critical slk-netq-events svcStatus
critTemp 7 critical onprem-email overTemp
Reorder Filters
In the netq show notification filter command above, the drop-based filter is listed first and the critical event filters are listed last. Because NetQ processes notifications based on the filters' order, it makes sense to reorder the filters so that the critical event filters appear higher in the list. To reorder them, use the before and after options.
For example, to put the two critical event filters just below the drop filter:
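Assuming the after keyword works like the before keyword shown in the drop filter example, the commands would be:
cumulus@switch:~$ netq add notification filter critTemp after swp52Drop
cumulus@switch:~$ netq add notification filter svcDown after critTemp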
You do not need to reenter all the severity, channel, and rule information for existing filters if you only want to change their processing order.
Run the netq show notification command again to verify the changes:
cumulus@switch:~$ netq show notification filter
Matching config_notify records:
Name Order Severity Channels Rules
--------------- ---------- ---------------- ---------------- ----------
swp52Drop 1 error NetqDefaultChann swp52
el
critTemp 2 critical onprem-email overTemp
svcDown 3 critical slk-netq-events svcStatus
bgpSpine 4 info pd-netq-events bgpHostnam
e
vni42 5 warning pd-netq-events evpnVni
configChange 6 info slk-netq-events sysconf
newFEC 7 info slk-netq-events fecSupport
Suppress Events
NetQ can generate many network events. You can create rules to suppress events so that they do not appear using either the Events card or the CLI. Suppressing events is particularly useful for reducing the number of event notifications attributable to known issues or false alarms.
You can set time parameters to suppress events in a given time period. If you do not configure time parameters, the event is suppressed for two years. If you are testing a new network configuration, a switch may generate many messages. Creating a suppression rule that applies over a short time frame can be useful for silencing messages to limit distractions.
You can suppress events for the following types of messages:
agent: NetQ Agent messages
bgp: BGP-related messages
btrfsinfo: Messages related to the BTRFS file system in Cumulus Linux
clag: MLAG-related messages
clsupport: Messages generated when creating the cl-support script
configdiff: Messages related to the difference between two configurations
evpn: EVPN-related messages
link: Messages related to links, including state and interface name
ntp: NTP-related messages
ospf: OSPF-related messages
sensor: Messages related to various sensors
services: Service-related information, including whether a service is active or inactive
ssdutil: Messages related to the storage on the switch
Add an Event Suppression Configuration
You can suppress events using the NetQ UI or NetQ CLI.
To suppress events using the NetQ UI:
Click (main menu).
In the side navigation under Network, click Events.
In the table, navigate to the column labeled Suppress Events.
Hover over the row and select Suppress events to create parameters for the suppression rule. You can configure individual suppression rules or you can create a group rule that suppresses events for all message types.
Enter the suppression rule parameters and click Create.
When you add a new configuration using the CLI, you can specify a scope, which limits the suppression in the following order:
Hostname.
Severity.
Message type-specific filters. For example, the target VNI for EVPN messages, or the interface name for a link message.
NetQ has a predefined set of filter conditions. To see these conditions, run netq show events-config show-filter-conditions:
cumulus@switch:~$ netq show events-config show-filter-conditions
Matching config_events records:
Message Name Filter Condition Name Filter Condition Hierarchy Filter Condition Description
------------------------ ------------------------------------------ ---------------------------------------------------- --------------------------------------------------------
evpn vni 3 Target VNI
evpn severity 2 Severity error/info
evpn hostname 1 Target Hostname
clsupport fileAbsName 3 Target File Absolute Name
clsupport severity 2 Severity error/info
clsupport hostname 1 Target Hostname
link new_state 4 up / down
link ifname 3 Target Ifname
link severity 2 Severity error/info
link hostname 1 Target Hostname
ospf ifname 3 Target Ifname
ospf severity 2 Severity error/info
ospf hostname 1 Target Hostname
sensor new_s_state 4 New Sensor State Eg. ok
sensor sensor 3 Target Sensor Name Eg. Fan, Temp
sensor severity 2 Severity error/info
sensor hostname 1 Target Hostname
configdiff old_state 5 Old State
configdiff new_state 4 New State
configdiff type 3 File Name
configdiff severity 2 Severity error/info
configdiff hostname 1 Target Hostname
ssdutil info 3 low health / significant health drop
ssdutil severity 2 Severity error/info
ssdutil hostname 1 Target Hostname
agent db_state 3 Database State
agent severity 2 Severity error/info
agent hostname 1 Target Hostname
ntp new_state 3 yes / no
ntp severity 2 Severity error/info
ntp hostname 1 Target Hostname
bgp vrf 4 Target VRF
bgp peer 3 Target Peer
bgp severity 2 Severity error/info
bgp hostname 1 Target Hostname
services new_status 4 active / inactive
services name 3 Target Service Name Eg.netqd, mstpd, zebra
services severity 2 Severity error/info
services hostname 1 Target Hostname
btrfsinfo info 3 high btrfs allocation space / data storage efficiency
btrfsinfo severity 2 Severity error/info
btrfsinfo hostname 1 Target Hostname
clag severity 2 Severity error/info
clag hostname 1 Target Hostname
For example, to create a configuration called mybtrfs that suppresses OSPF-related events on leaf01 for the next 10 minutes, run:
You can remove event suppression configurations using the NetQ UI or NetQ CLI.
To remove suppressed event configurations:
Click (main menu).
In the side navigation under Network, click Events.
Select Show suppression rules at the top of the page.
Navigate to the rule you would like to delete. Click the three-dot menu and select Delete. If you’d like to pause the rule instead of deleting it, click Disable.
To remove an event suppression configuration, run netq del events-config events_config_id <text-events-config-id-anchor>.
When you filter for a message type, you must include the show-filter-conditions keyword to show the conditions associated with that message type and the hierarchy in which they get processed.
The following section lists examples of advanced notification configurations.
Create a Notification for BGP Events from a Selected Switch
This example creates a notification integration with a PagerDuty channel called pd-netq-events. It then creates a rule bgpHostname and a filter called bgpSpine for any notifications from spine-01. The result is that any info severity event messages from spine-01 are filtered to the pd-netq-events channel.
cumulus@switch:~$ netq add notification channel pagerduty pd-netq-events integration-key 1234567890
Successfully added/updated channel pd-netq-events
cumulus@switch:~$ netq add notification rule bgpHostname key node value spine-01
Successfully added/updated rule bgpHostname
cumulus@switch:~$ netq add notification filter bgpSpine rule bgpHostname channel pd-netq-events
Successfully added/updated filter bgpSpine
cumulus@switch:~$ netq show notification channel
Matching config_notify records:
Name Type Severity Channel Info
--------------- ---------------- ---------------- ------------------------
pd-netq-events pagerduty info integration-key: 1234567
890
cumulus@switch:~$ netq show notification rule
Matching config_notify records:
Name Rule Key Rule Value
--------------- ---------------- --------------------
bgpHostname hostname spine-01
cumulus@switch:~$ netq show notification filter
Matching config_notify records:
Name Order Severity Channels Rules
--------------- ---------- ---------------- ---------------- ----------
bgpSpine 1 info pd-netq-events bgpHostnam
e
Create a Notification for Warnings on a Given EVPN VNI
This example creates a notification integration with a PagerDuty channel called pd-netq-events. It then creates a rule evpnVni and a filter called vni42 for any warning messages from VNI 42 on the EVPN overlay network. The result is that any event messages from VNI 42 with a severity level of 'warning' are filtered to the pd-netq-events channel.
cumulus@switch:~$ netq add notification channel pagerduty pd-netq-events integration-key 1234567890
Successfully added/updated channel pd-netq-events
cumulus@switch:~$ netq add notification rule evpnVni key vni value 42
Successfully added/updated rule evpnVni
cumulus@switch:~$ netq add notification filter vni42 rule evpnVni channel pd-netq-events
Successfully added/updated filter vni42
cumulus@switch:~$ netq show notification channel
Matching config_notify records:
Name Type Severity Channel Info
--------------- ---------------- ---------------- ------------------------
pd-netq-events pagerduty info integration-key: 1234567
890
cumulus@switch:~$ netq show notification rule
Matching config_notify records:
Name Rule Key Rule Value
--------------- ---------------- --------------------
bgpHostname hostname spine-01
evpnVni vni 42
cumulus@switch:~$ netq show notification filter
Matching config_notify records:
Name Order Severity Channels Rules
--------------- ---------- ---------------- ---------------- ----------
bgpSpine 1 info pd-netq-events bgpHostnam
e
vni42 2 warning pd-netq-events evpnVni
Create a Notification for Configuration File Changes
This example creates a notification integration with a Slack channel called slk-netq-events. It then creates a rule sysconf and a filter called configChange for any configuration file update messages. The result is that any configuration update messages are filtered to the slk-netq-events channel.
cumulus@switch:~$ netq add notification channel slack slk-netq-events webhook https://hooks.slack.com/services/text/moretext/evenmoretext
Successfully added/updated channel slk-netq-events
cumulus@switch:~$ netq add notification rule sysconf key configdiff value updated
Successfully added/updated rule sysconf
cumulus@switch:~$ netq add notification filter configChange severity info rule sysconf channel slk-netq-events
Successfully added/updated filter configChange
cumulus@switch:~$ netq show notification channel
Matching config_notify records:
Name Type Severity Channel Info
--------------- ---------------- -------- ----------------------
slk-netq-events slack info webhook:https://hooks.s
lack.com/services/text/
moretext/evenmoretext
cumulus@switch:~$ netq show notification rule
Matching config_notify records:
Name Rule Key Rule Value
--------------- ---------------- --------------------
bgpHostname hostname spine-01
evpnVni vni 42
sysconf configdiff updated
cumulus@switch:~$ netq show notification filter
Matching config_notify records:
Name Order Severity Channels Rules
--------------- ---------- ---------------- ---------------- ----------
bgpSpine 1 info pd-netq-events bgpHostnam
e
vni42 2 warning pd-netq-events evpnVni
configChange 3 info slk-netq-events sysconf
Create a Notification for When a Service Goes Down
This example creates a notification integration with a Slack channel called slk-netq-events. It then creates a rule svcStatus and a filter called svcDown for any services state messages indicating a service is no longer operational. The result is that any service down messages are filtered to the slk-netq-events channel.
cumulus@switch:~$ netq add notification channel slack slk-netq-events webhook https://hooks.slack.com/services/text/moretext/evenmoretext
Successfully added/updated channel slk-netq-events
cumulus@switch:~$ netq add notification rule svcStatus key new_status value down
Successfully added/updated rule svcStatus
cumulus@switch:~$ netq add notification filter svcDown severity error rule svcStatus channel slk-netq-events
Successfully added/updated filter svcDown
cumulus@switch:~$ netq show notification channel
Matching config_notify records:
Name Type Severity Channel Info
--------------- ---------------- -------- ----------------------
slk-netq-events slack info webhook:https://hooks.s
lack.com/services/text/
moretext/evenmoretext
cumulus@switch:~$ netq show notification rule
Matching config_notify records:
Name Rule Key Rule Value
--------------- ---------------- --------------------
bgpHostname hostname spine-01
evpnVni vni 42
svcStatus new_status down
sysconf configdiff updated
cumulus@switch:~$ netq show notification filter
Matching config_notify records:
Name Order Severity Channels Rules
--------------- ---------- ---------------- ---------------- ----------
bgpSpine 1 info pd-netq-events bgpHostnam
e
vni42 2 warning pd-netq-events evpnVni
configChange 3 info slk-netq-events sysconf
svcDown 4 error slk-netq-events svcStatus
Create a Filter to Drop Notifications from a Given Interface
This example creates a notification integration with a Slack channel called slk-netq-events. It then creates a rule swp52 and a filter called swp52Drop that drops all notifications for events from interface swp52.
cumulus@switch:~$ netq add notification channel slack slk-netq-events webhook https://hooks.slack.com/services/text/moretext/evenmoretext
Successfully added/updated channel slk-netq-events
cumulus@switch:~$ netq add notification rule swp52 key port value swp52
Successfully added/updated rule swp52
cumulus@switch:~$ netq add notification filter swp52Drop severity error rule swp52 before bgpSpine
Successfully added/updated filter swp52Drop
cumulus@switch:~$ netq show notification channel
Matching config_notify records:
Name Type Severity Channel Info
--------------- ---------------- -------- ----------------------
slk-netq-events slack info webhook:https://hooks.s
lack.com/services/text/
moretext/evenmoretext
cumulus@switch:~$ netq show notification rule
Matching config_notify records:
Name Rule Key Rule Value
--------------- ---------------- --------------------
bgpHostname hostname spine-01
evpnVni vni 42
svcStatus new_status down
swp52 port swp52
sysconf configdiff updated
cumulus@switch:~$ netq show notification filter
Matching config_notify records:
Name Order Severity Channels Rules
--------------- ---------- ---------------- ---------------- ----------
swp52Drop 1 error NetqDefaultChann swp52
el
bgpSpine 2 info pd-netq-events bgpHostnam
e
vni42 3 warning pd-netq-events evpnVni
configChange 4 info slk-netq-events sysconf
svcDown 5 error slk-netq-events svcStatus
Create a Notification for a Given Device that Has a Tendency to Overheat (Using Multiple Rules)
This example creates a notification when switch leaf04 exceeds the high temperature threshold. Two rules are necessary to create this notification: one to identify the specific device and one to identify the temperature trigger. NetQ then sends the message to the pd-netq-events channel.
cumulus@switch:~$ netq add notification channel pagerduty pd-netq-events integration-key 1234567890
Successfully added/updated channel pd-netq-events
cumulus@switch:~$ netq add notification rule switchLeaf04 key hostname value leaf04
Successfully added/updated rule switchLeaf04
cumulus@switch:~$ netq add notification rule overTemp key new_s_crit value 24
Successfully added/updated rule overTemp
cumulus@switch:~$ netq add notification filter critTemp rule switchLeaf04 channel pd-netq-events
Successfully added/updated filter critTemp
cumulus@switch:~$ netq add notification filter critTemp severity critical rule overTemp channel pd-netq-events
Successfully added/updated filter critTemp
cumulus@switch:~$ netq show notification channel
Matching config_notify records:
Name Type Severity Channel Info
--------------- ---------------- ---------------- ------------------------
pd-netq-events pagerduty info integration-key: 1234567
890
cumulus@switch:~$ netq show notification rule
Matching config_notify records:
Name Rule Key Rule Value
--------------- ---------------- --------------------
bgpHostname hostname spine-01
evpnVni vni 42
overTemp new_s_crit 24
svcStatus new_status down
switchLeaf04 hostname leaf04
swp52 port swp52
sysconf configdiff updated
cumulus@switch:~$ netq show notification filter
Matching config_notify records:
Name Order Severity Channels Rules
--------------- ---------- ---------------- ---------------- ----------
swp52Drop 1 error NetqDefaultChann swp52
el
bgpSpine 2 info pd-netq-events bgpHostnam
e
vni42 3 warning pd-netq-events evpnVni
configChange 4 info slk-netq-events sysconf
svcDown 5 critical slk-netq-events svcStatus
critTemp 6 critical pd-netq-events switchLeaf
04
overTemp
View Notification Configurations in JSON Format
You can view configured integrations using the netq show notification commands. To view the channels, filters, and rules, run the three flavors of the command. Include the json option to display JSON-formatted output.
You might need to modify event notification configurations at some point in the lifecycle of your deployment. You can add channels, rules, filters, and a proxy at any time. You can remove channels, rules, and filters if they are not part of an existing notification configuration.
If you retire selected channels from a given notification application, you might want to remove them from NetQ as well. You can remove channels if they are not part of an existing notification configuration using the NetQ UI or the NetQ CLI.
To remove notification channels:
Click , and then click Notification Channels in the Notifications section.
This opens the Channels view.
Click the tab for the type of channel you want to remove (Slack, PagerDuty, Syslog, Email).
Select one or more channels.
Click .
To remove notification channels, run:
netq del notification channel <text-channel-name-anchor>
This example removes a Slack integration and verifies it is no longer in the configuration:
cumulus@switch:~$ netq del notification channel slk-netq-events
cumulus@switch:~$ netq show notification channel
Matching config_notify records:
Name Type Severity Channel Info
--------------- ---------------- ---------------- ------------------------
pd-netq-events pagerduty info integration-key: 1234567
890
Delete an Event Notification Rule
You might find after some experience with a given rule that you want to edit or remove the rule to better meet your needs. You can remove rules if they are not part of an existing notification configuration using the NetQ CLI.
To remove notification rules, run:
netq del notification rule <text-rule-name-anchor>
This example removes a rule named swp52 and verifies it is no longer in the configuration:
cumulus@switch:~$ netq del notification rule swp52
cumulus@switch:~$ netq show notification rule
Matching config_notify records:
Name Rule Key Rule Value
--------------- ---------------- --------------------
bgpHostname hostname spine-01
evpnVni vni 42
overTemp new_s_crit 24
svcStatus new_status down
switchLeaf04 hostname leaf04
sysconf configdiff updated
Delete an Event Notification Filter
You might find after some experience with a given filter that you want to edit or remove the filter to better meet your current needs. You can remove filters if they are not part of an existing notification configuration using the NetQ CLI.
To remove notification filters, run:
netq del notification filter <text-filter-name-anchor>
This example removes a filter named bgpSpine and verifies it is no longer in the configuration:
cumulus@switch:~$ netq del notification filter bgpSpine
cumulus@switch:~$ netq show notification filter
Matching config_notify records:
Name Order Severity Channels Rules
--------------- ---------- ---------------- ---------------- ----------
swp52Drop 1 error NetqDefaultChann swp52
el
vni42 2 warning pd-netq-events evpnVni
configChange 3 info slk-netq-events sysconf
svcDown 4 critical slk-netq-events svcStatus
critTemp 5 critical pd-netq-events switchLeaf
04
overTemp
Delete an Event Notification Proxy
You can remove the proxy server by running the netq del notification proxy command. This changes the NetQ behavior to send events directly to the notification channels.
cumulus@switch:~$ netq del notification proxy
Successfully overwrote notifier proxy to null
Configure Threshold-Based Event Notifications
NetQ supports TCA events, which are a set of events that trigger at the crossing of a user-defined threshold. These events allow detection and prevention of network failures for selected ACL resources, digital optics, forwarding resources, interface errors and statistics, link flaps, resource utilization, and sensor events. You can find a complete list in the TCA Event Messages Reference.
A notification configuration must contain one rule. Each rule must contain a scope and a threshold. Optionally, you can specify an associated channel. If you want to deliver events to one or more notification channels (email, syslog, Slack, or PagerDuty), create them by following the instructions in Create a Channel, and then return here to define your rule.
If a rule is not associated with a channel, the event information is only reachable from the database.
Define a Scope
You use a scope to filter the events generated by a given rule. You set the scope values on a per TCA rule basis. You can filter all rules on the hostname. You can also filter some rules by other parameters.
Select Filter Parameters
For each event type, you can filter rules based on the following filter parameters.
Event ID
Scope Parameters
TCA_TCAM_IN_ACL_V4_FILTER_UPPER
Hostname
TCA_TCAM_EG_ACL_V4_FILTER_UPPER
Hostname
TCA_TCAM_IN_ACL_V4_MANGLE_UPPER
Hostname
TCA_TCAM_EG_ACL_V4_MANGLE_UPPER
Hostname
TCA_TCAM_IN_ACL_V6_FILTER_UPPER
Hostname
TCA_TCAM_EG_ACL_V6_FILTER_UPPER
Hostname
TCA_TCAM_IN_ACL_V6_MANGLE_UPPER
Hostname
TCA_TCAM_EG_ACL_V6_MANGLE_UPPER
Hostname
TCA_TCAM_IN_ACL_8021x_FILTER_UPPER
Hostname
TCA_TCAM_ACL_L4_PORT_CHECKERS_UPPER
Hostname
TCA_TCAM_ACL_REGIONS_UPPER
Hostname
TCA_TCAM_IN_ACL_MIRROR_UPPER
Hostname
TCA_TCAM_ACL_18B_RULES_UPPER
Hostname
TCA_TCAM_ACL_32B_RULES_UPPER
Hostname
TCA_TCAM_ACL_54B_RULES_UPPER
Hostname
TCA_TCAM_IN_PBR_V4_FILTER_UPPER
Hostname
TCA_TCAM_IN_PBR_V6_FILTER_UPPER
Hostname
Event ID
Scope Parameters
TCA_DOM_RX_POWER_ALARM_UPPER
Hostname, Interface
TCA_DOM_RX_POWER_ALARM_LOWER
Hostname, Interface
TCA_DOM_RX_POWER_WARNING_UPPER
Hostname, Interface
TCA_DOM_RX_POWER_WARNING_LOWER
Hostname, Interface
TCA_DOM_BIAS_CURRENT_ALARM_UPPER
Hostname, Interface
TCA_DOM_BIAS_CURRENT_ALARM_LOWER
Hostname, Interface
TCA_DOM_BIAS_CURRENT_WARNING_UPPER
Hostname, Interface
TCA_DOM_BIAS_CURRENT_WARNING_LOWER
Hostname, Interface
TCA_DOM_OUTPUT_POWER_ALARM_UPPER
Hostname, Interface
TCA_DOM_OUTPUT_POWER_ALARM_LOWER
Hostname, Interface
TCA_DOM_OUTPUT_POWER_WARNING_UPPER
Hostname, Interface
TCA_DOM_OUTPUT_POWER_WARNING_LOWER
Hostname, Interface
TCA_DOM_MODULE_TEMPERATURE_ALARM_UPPER
Hostname, Interface
TCA_DOM_MODULE_TEMPERATURE_ALARM_LOWER
Hostname, Interface
TCA_DOM_MODULE_TEMPERATURE_WARNING_UPPER
Hostname, Interface
TCA_DOM_MODULE_TEMPERATURE_WARNING_LOWER
Hostname, Interface
TCA_DOM_MODULE_VOLTAGE_ALARM_UPPER
Hostname, Interface
TCA_DOM_MODULE_VOLTAGE_ALARM_LOWER
Hostname, Interface
TCA_DOM_MODULE_VOLTAGE_WARNING_UPPER
Hostname, Interface
TCA_DOM_MODULE_VOLTAGE_WARNING_LOWER
Hostname, Interface
Event ID
Scope Parameters
TCA_TCAM_TOTAL_ROUTE_ENTRIES_UPPER
Hostname
TCA_TCAM_TOTAL_MCAST_ROUTES_UPPER
Hostname
TCA_TCAM_MAC_ENTRIES_UPPER
Hostname
TCA_TCAM_ECMP_NEXTHOPS_UPPER
Hostname
TCA_TCAM_IPV4_ROUTE_UPPER
Hostname
TCA_TCAM_IPV4_HOST_UPPER
Hostname
TCA_TCAM_IPV6_ROUTE_UPPER
Hostname
TCA_TCAM_IPV6_HOST_UPPER
Hostname
Event ID
Scope Parameters
TCA_HW_IF_OVERSIZE_ERRORS
Hostname, Interface
TCA_HW_IF_UNDERSIZE_ERRORS
Hostname, Interface
TCA_HW_IF_ALIGNMENT_ERRORS
Hostname, Interface
TCA_HW_IF_JABBER_ERRORS
Hostname, Interface
TCA_HW_IF_SYMBOL_ERRORS
Hostname, Interface
Event ID
Scope Parameters
TCA_RXBROADCAST_UPPER
Hostname, Interface
TCA_RXBYTES_UPPER
Hostname, Interface
TCA_RXMULTICAST_UPPER
Hostname, Interface
TCA_TXBROADCAST_UPPER
Hostname, Interface
TCA_TXBYTES_UPPER
Hostname, Interface
TCA_TXMULTICAST_UPPER
Hostname, Interface
Event ID
Scope Parameters
TCA_LINK
Hostname, Interface
Resource utilization

| Event ID | Scope Parameters |
| --- | --- |
| TCA_CPU_UTILIZATION_UPPER | Hostname |
| TCA_DISK_UTILIZATION_UPPER | Hostname |
| TCA_MEMORY_UTILIZATION_UPPER | Hostname |
RoCE

| Event ID | Scope Parameters |
| --- | --- |
| Tx CNP Unicast No Buffer Discard | Hostname, Interface |
| Rx RoCE PFC Pause Duration | Hostname |
| Rx RoCE PG Usage Cells | Hostname, Interface |
| Tx RoCE TC Usage Cells | Hostname, Interface |
| Rx RoCE No Buffer Discard | Hostname, Interface |
| Tx RoCE PFC Pause Duration | Hostname, Interface |
| Tx CNP Buffer Usage Cells | Hostname, Interface |
| Tx ECN Marked Packets | Hostname, Interface |
| Tx RoCE PFC Pause Packets | Hostname, Interface |
| Rx CNP No Buffer Discard | Hostname, Interface |
| Rx CNP PG Usage Cells | Hostname, Interface |
| Tx CNP TC Usage Cells | Hostname, Interface |
| Rx RoCE Buffer Usage Cells | Hostname, Interface |
| Tx RoCE Unicast No Buffer Discard | Hostname, Interface |
| Rx CNP Buffer Usage Cells | Hostname, Interface |
| Rx RoCE PFC Pause Packets | Hostname, Interface |
| Tx RoCE Buffer Usage Cells | Hostname, Interface |
Sensors

| Event ID | Scope Parameters |
| --- | --- |
| TCA_SENSOR_FAN_UPPER | Hostname, Sensor Name |
| TCA_SENSOR_POWER_UPPER | Hostname, Sensor Name |
| TCA_SENSOR_TEMPERATURE_UPPER | Hostname, Sensor Name |
| TCA_SENSOR_VOLTAGE_UPPER | Hostname, Sensor Name |
What Just Happened (WJH)

| Event ID | Scope Parameters |
| --- | --- |
| TCA_WJH_DROP_AGG_UPPER | Hostname, Reason |
| TCA_WJH_ACL_DROP_AGG_UPPER | Hostname, Reason, Ingress port |
| TCA_WJH_BUFFER_DROP_AGG_UPPER | Hostname, Reason |
| TCA_WJH_SYMBOL_ERROR_UPPER | Hostname, Port down reason |
| TCA_WJH_CRC_ERROR_UPPER | Hostname, Port down reason |
Specify the Scope
Rules require a scope, which can cover the entire complement of monitored devices or a subset of them. You define scopes as regular expressions, and NetQ displays them as regular expressions. Each event has a set of attributes you can use to apply the rule to a subset of all devices. The definition and display differ slightly between the NetQ UI and the NetQ CLI, but the results are the same.
You define the scope in the Choose Attributes step when creating a TCA event rule. You can apply the rule to all devices or narrow the scope using attributes. If you choose to narrow the scope but do not enter values for the available attributes, the rule still applies to all devices and attributes.
Scopes appear in TCA rule cards using the following format: Attribute, Operation, Value.
The available attributes depend on the event type. For one or more of these attributes, select the operation (equals or starts with) and enter a value. For WJH drop reasons, click in the value field and select a reason from the list; leave the drop type attribute blank.
| Create rule to show events from a… | Attribute | Operation | Value |
| --- | --- | --- | --- |
| Single device | hostname | Equals | <hostname> such as spine01 |
| Single interface | ifname | Equals | <interface-name> such as swp6 |
| Single sensor | s_name | Equals | <sensor-name> such as fan2 |
| Single WJH drop reason | reason or port_down_reason | Equals | <drop-reason> such as WRED |
| Single WJH ingress port | ingress_port | Equals | <port-name> such as 47 |
| Set of devices | hostname | Starts with | <partial-hostname> such as leaf |
| Set of interfaces | ifname | Starts with | <partial-interface-name> such as swp or eth |
| Set of sensors | s_name | Starts with | <partial-sensor-name> such as fan, temp, or psu |
Refer to the WJH Event Messages Reference for WJH drop types and reasons. Leaving an attribute value blank defaults to all: all hostnames, interfaces, sensors, forwarding resources, ACL resources, and so forth.
Each attribute is displayed on the rule card as a regular expression equivalent to your choices above:
Equals is displayed as an equals sign (=)
Starts with is displayed as a caret (^)
Blank (all) is displayed as an asterisk (*)
Scopes are defined with regular expressions. When a rule takes more than one scoping parameter, separate the parameters with a comma (no spaces) and specify them in order. When you use an asterisk (*) by itself, enclose it in either single or double quotes; the examples here use single quotes.
The single hostname scope parameter is used by the ACL resources, forwarding resources, and resource utilization events.
| Scope Value | Example | Result |
| --- | --- | --- |
| <hostname> | leaf01 | Deliver events for the specified device |
| <partial-hostname>* | leaf* | Deliver events for devices with hostnames starting with the specified text (leaf) |
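For illustration, the sketch below applies a hostname-only scope using the same assumed `netq add tca` command form shown earlier; the event ID comes from the forwarding resources table and the threshold value is arbitrary.

```
# Sketch (assumed syntax, arbitrary threshold): IPv4 route entries on one device
netq add tca event_id TCA_TCAM_IPV4_ROUTE_UPPER scope leaf01 threshold 20000

# Sketch: the same event for all devices whose hostnames start with "leaf"
netq add tca event_id TCA_TCAM_IPV4_ROUTE_UPPER scope leaf* threshold 20000
```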
The hostname and interface scope parameters are used by the digital optics, interface errors, interface statistics, and link flaps events.
| Scope Value | Example | Result |
| --- | --- | --- |
| <hostname>,<interface> | leaf01,swp9 | Deliver events for the specified interface (swp9) on the specified device (leaf01) |
| <hostname>,'*' | leaf01,'*' | Deliver events for all interfaces on the specified device (leaf01) |
| '*',<interface> | '*',swp9 | Deliver events for the specified interface (swp9) on all devices |
| <partial-hostname>*,<interface> | leaf*,swp9 | Deliver events for the specified interface (swp9) on all devices with hostnames starting with the specified text (leaf) |
| <hostname>,<partial-interface>* | leaf01,swp* | Deliver events for all interfaces with names starting with the specified text (swp) on the specified device (leaf01) |
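The two-parameter scope follows the same pattern. As before, this is only a sketch that assumes the `netq add tca` command form and uses an arbitrary byte-count threshold; the event ID comes from the interface statistics table.

```
# Sketch (assumed syntax, arbitrary threshold): received bytes on one interface of one device
netq add tca event_id TCA_RXBYTES_UPPER scope leaf01,swp9 threshold 20000

# Sketch: the same event for every interface on leaf01
# (a standalone asterisk must be quoted, per the scope rules above)
netq add tca event_id TCA_RXBYTES_UPPER scope leaf01,'*' threshold 20000
```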