gNMI Streaming

You can use gRPC Network Management Interface (gNMI) to collect system resource, interface, and counter information from Cumulus Linux and export it to your own gNMI client.

Configure the gNMI Agent

The gNMI agent is disabled by default. To enable it, run:

 cumulus@switch:~$ netq config add agent gnmi-enable true

The gNMI agent listens over port 9339. You can change the default port in case you use that port in another application. The /etc/netq/netq.yml file stores the configuration.

Use the following commands to adjust the settings:

  1. Disable the gNMI agent:

    cumulus@switch:~$ netq config add agent gnmi-enable false
    
  2. Change the default port over which the gNMI agent listens:

    cumulus@switch:~$ netq config add agent gnmi-port <gnmi_port>
    
  3. Restart the NetQ agent to incorporate the configuration changes:

    cumulus@switch:~$ netq config restart agent
    

Use the gNMI Agent Only

NVIDIA recommends collecting data with both the gNMI and NetQ agents. However, if you do not want to collect data with both agents, you can disable the NetQ agent. Data is then sent exclusively to the gNMI agent.

To disable the NetQ agent, use the following command:

cumulus@switch:~$ netq config add agent opta-enable false

You cannot disable both the NetQ and gNMI agents. If both agents are enabled on Cumulus Linux and a NetQ server is unreachable, the data from the following models are not sent to gNMI:

  • openconfig-interfaces
  • openconfig-if-ethernet
  • openconfig-if-ethernet-ext
  • openconfig-system
  • nvidia-if-ethernet-ext

WJH, openconfig-platform, and openconfig-lldp data continue streaming to gNMI in this state. If you are only using gNMI and a NetQ telemetry server does not exist, you should disable the NetQ agent by setting opta-enable to false.

Supported Models

Cumulus Linux supports the following OpenConfig models:

ModelSupported Data
openconfig-interfacesName, Operstatus, AdminStatus, IfIndex, MTU, LoopbackMode, Enabled, Counters (InPkts, OutPkts, InOctets, InUnicastPkts, InDiscards, InMulticastPkts, InBroadcastPkts, InErrors, OutOctets, OutUnicastPkts, OutMulticastPkts, OutBroadcastPkts, OutDiscards, OutErrors)
openconfig-if-ethernetAutoNegotiate, PortSpeed, MacAddress, NegotiatedPortSpeed, Counters (InJabberFrames, InOversizeFrames,​ InUndersizeFrames)
openconfig-if-ethernet-extFrame size counters (InFrames_64Octets, InFrames_65_127Octets, InFrames_128_255Octets, InFrames_256_511Octets, InFrames_512_1023Octets, InFrames_1024_1518Octets)
openconfig-systemMemory, CPU
openconfig-platformPlatform data (Name, Description, Version)
openconfig-lldpLLDP data (PortIdType, PortDescription, LastUpdate, SystemName, SystemDescription, ChassisId, Ttl, Age, ManagementAddress, ManagementAddressType, Capability)

gNMI clients can also use the following NVIDIA models:

ModelSupported Data
nvidia-if-wjh-drop-aggregateAggregated WJH drops, including L1, L2, router, ACL, tunnel, and buffer drops
nvidia-if-ethernet-extExtended Ethernet counters (AlignmentError, InAclDrops, InBufferDrops, InDot3FrameErrors, InDot3LengthErrors, InL3Drops, InPfc0Packets, InPfc1Packets, InPfc2Packets, InPfc3Packets, InPfc4Packets, InPfc5Packets, InPfc6Packets, InPfc7Packets, OutNonQDrops, OutPfc0Packets, OutPfc1Packets, OutPfc2Packets, OutPfc3Packets, OutPfc4Packets, OutPfc5Packets, OutPfc6Packets, OutPfc7Packets, OutQ0WredDrops, OutQ1WredDrops, OutQ2WredDrops, OutQ3WredDrops, OutQ4WredDrops, OutQ5WredDrops, OutQ6WredDrops, OutQ7WredDrops, OutQDrops, OutQLength, OutWredDrops, SymbolErrors, OutTxFifoFull)

The client should use the following YANG models as a reference:

nvidia-if-ethernet-ext
nvidia-if-wjh-drop-aggregate

Collect WJH Data Using gNMI

You can export What Just Happened data from the NetQ agent to your own gNMI client. Refer to the previous section for the nvidia-if-wjh-drop-aggregate reference YANG model.

Supported Features

  • The gNMI agent supports capability and stream subscribe requests for WJH events.
  • If you are using SONiC, WJH data can only be collected using gNMI.

WJH Drop Reasons

The data NetQ sends to the gNMI agent is in the form of WJH drop reasons. The reasons are generated by the SDK and are stored in the /usr/etc/wjh_lib_conf.xml file on the switch. Use this file as a guide to filter for specific reason types (L1, ACL, and so forth), reason IDs, or event severities.

L1 Drop Reasons

Reason IDReasonDescription
10021Port admin downValidate port configuration
10022Auto-negotiation failureSet port speed manually, disable auto-negotiation
10023Logical mismatch with peer linkCheck cable/transceiver
10024Link training failureCheck cable/transceiver
10025Peer is sending remote faultsReplace cable/transceiver
10026Bad signal integrityReplace cable/transceiver
10027Cable/transceiver is not supportedUse supported cable/transceiver
10028Cable/transceiver is unpluggedPlug cable/transceiver
10029Calibration failureCheck cable/transceiver
10030Cable/transceiver bad statusCheck cable/transceiver
10031Other reasonOther L1 drop reason

L2 Drop Reasons

Reason IDReasonSeverityDescription
201MLAG port isolationNoticeExpected behavior
202Destination MAC is reserved (DMAC=01-80-C2-00-00-0x)ErrorBad packet was received from the peer
203VLAN tagging mismatchErrorValidate the VLAN tag configuration on both ends of the link
204Ingress VLAN filteringErrorValidate the VLAN membership configuration on both ends of the link
205Ingress spanning tree filterNoticeExpected behavior
206Unicast MAC table action discardErrorValidate MAC table for this destination MAC
207Multicast egress port list is emptyWarningValidate why IGMP join or multicast router port does not exist
208Port loopback filterErrorValidate MAC table for this destination MAC
209Source MAC is multicastErrorBad packet was received from peer
210Source MAC equals destination MACErrorBad packet was received from peer

Router Drop Reasons

Reason IDReasonSeverityDescription
301Non-routable packetNoticeExpected behavior
302Blackhole routeWarningValidate routing table for this destination IP
303Unresolved neighbor/next hopWarningValidate ARP table for the neighbor/next hop
304Blackhole ARP/neighborWarningValidate ARP table for the next hop
305IPv6 destination in multicast scope FFx0:/16NoticeExpected behavior - packet is not routable
306IPv6 destination in multicast scope FFx1:/16NoticeExpected behavior - packet is not routable
307Non-IP packetNoticeDestination MAC is the router, packet is not routable
308Unicast destination IP but multicast destination MACErrorBad packet was received from the peer
309Destination IP is loopback addressErrorBad packet was received from the peer
310Source IP is multicastErrorBad packet was received from the peer
311Source IP is in class EErrorBad packet was received from the peer
312Source IP is loopback addressErrorBad packet was received from the peer
313Source IP is unspecifiedErrorBad packet was received from the peer
314Checksum or IPver or IPv4 IHL too shortErrorBad cable or bad packet was received from the peer
315Multicast MAC mismatchErrorBad packet was received from the peer
316Source IP equals destination IPErrorBad packet was received from the peer
317IPv4 source IP is limited broadcastErrorBad packet was received from the peer
318IPv4 destination IP is local network (destination=0.0.0.0/8)ErrorBad packet was received from the peer
320Ingress router interface is disabledWarningValidate your configuration
321Egress router interface is disabledWarningValidate your configuration
323IPv4 routing table (LPM) unicast missWarningValidate routing table for this destination IP
324IPv6 routing table (LPM) unicast missWarningValidate routing table for this destination IP
325Router interface loopbackWarningValidate the interface configuration
326Packet size is larger than router interface MTUWarningValidate the router interface MTU configuration
327TTL value is too smallWarningActual path is longer than the TTL

Tunnel Drop Reasons

Reason IDReasonSeverityDescription
402Overlay switch - Source MAC is multicastErrorThe peer sent a bad packet
403Overlay switch - Source MAC equals destination MACErrorThe peer sent a bad packet
404Decapsulation errorErrorThe peer sent a bad packet

ACL Drop Reasons

Reason IDReasonSeverityDescription
601Ingress port ACLNoticeValidate ACL configuration
602Ingress router ACLNoticeValidate ACL configuration
603Egress router ACLNoticeValidate ACL configuration
604Egress port ACLNoticeValidate ACL configuration

Buffer Drop Reasons

Reason IDReasonSeverityDescription
503Tail dropWarningMonitor network congestion
504WREDWarningMonitor network congestion
505Port TC congestion threshold crossedNoticeMonitor network congestion
506Packet latency threshold crossedNoticeMonitor network congestion

gNMI Client Requests

You can use your gNMI client on a host server to request capabilities and data that the agent is subscribed to.

The following example shows a gNMI client request for interface speed:

gnmi_client -target_addr 10.209.37.121:9339 -xpath "/interfaces/interface[name=swp1]/ethernet/state/port-speed" -once
{
   "Response": {
      "Update": {
         "update": [
            {
               "val": {
                  "Value": {
                     "StringVal": "SPEED_40GB"
                  }
               },
               "path": {
                  "elem": [
                     {
                        "name": "state"
                     },
                     {
                        "name": "port-speed"
                     }
                  ]
               }
            }
         ],
         "timestamp": 1636910588085654861,
         "prefix": {
            "target": "netq",
            "elem": [
               {
                  "name": "interfaces"
               },
               {
                  "name": "interface",
                  "key": {
                     "name": "swp1"
                  }
               },
               {
                  "name": "ethernet"
               }
            ]
         }
      }
   }
}


The following example shows a gNMI client request for WJH drop data:

gnmi_client -target_addr 10.209.37.121:9339 -xpath "/interfaces/interface[name=swp8]/wjh/aggregate/l2/reasons/reason[id=210]"
{
   "Response": {
      "Update": {
         "update": [
            {
               "val": {
                  "Value": {
                     "StringVal": "[{
									  "IngressPort": "swp8",
									  "DropType": "L2",
									  "Reason": "Source MAC equals destination MAC",
									  "Severity": "Error",
									  "Smac": "00:02:10:00:00:01",
									  "Dmac": "00:02:10:00:00:01",
									  "Proto": 6,
									  "Sport": 15,
									  "Dport": 16,
									  "Sip": "1.1.1.1"
									  "Dip": "2.2.2.2",
									  "AggCount": 192,
									  "FirstTimestamp": 1636907412,
									  "EndTimestamp": 1636907432,
								   }]"

                  }
               },
               "path": {
                  "elem": [
                     {
                        "name": "state"
                     },
                     {
                        "name": "drop"
                     }
                  ]
               }
            }
         ],
         "prefix": {
            "elem": [
               {
                  "name": "interfaces"
               },
               {
                  "key": {
                     "name": "swp8"
                  },
                  "name": "interface"
               },
               {
                  "name": "wjh"
               },
               {
                  "name": "aggregate"
               },
               {
                  "name": "l2"
               },
               {
                  "name": "reasons"
               },
               {
                  "key" : {
                     "severity": "error",
                     "id": "210"
                  },
                  "name" : "reason"
               }
            ],
            "target": "netq"
         },
         "timestamp": 1636907442362981645
      }
   }
}