Open Telemetry Export

Telemetry enables you to collect, send, and analyze large amounts of data, such as traffic statistics, port status, device health and configuration, and events. This data helps you monitor switch performance, health and behavior, traffic patterns, and QoS.

Cumulus Linux supports open telemetry (OTEL) export. You can use the OpenTelemetry Protocol (OTLP) to export metrics, such as interface counters and histogram collection data, to an external collector for analysis and visualization.

  • Cumulus Linux supports open telemetry export only on switches with the Spectrum-4 ASIC.
  • Open telemetry export is a beta feature.

Configure Open Telemetry

To enable open telemetry:

cumulus@switch:~$ nv set system telemetry export otlp state enabled 
cumulus@switch:~$ nv config apply

You can enable open telemetry for interface statistics, histogram collection, or both:

cumulus@switch:~$ nv set system telemetry interface-stats export state enabled
cumulus@switch:~$ nv config apply
cumulus@switch:~$ nv set system telemetry histogram export state enabled
cumulus@switch:~$ nv config apply

  • When you enable open telemetry for interface statistics, the switch exports counters on all configured interfaces.
  • When you enable open telemetry for histogram data, your histogram collection configuration defines the data that the switch exports.

You can enable additional interface statistic collection per interface for specific ingress buffer priority groups (0 through 7) and egress buffer traffic classes (0 through 15). When you enable these settings, the switch exports interface_pg and interface_tc counters for the defined priority groups and traffic classes:

cumulus@switch:~$ nv set system telemetry interface-stats ingress-buffer priority-group 4
cumulus@switch:~$ nv set system telemetry interface-stats egress-buffer traffic-class 12
cumulus@switch:~$ nv config apply

You can adjust the interface statistics sample interval to a value between 1 and 86400 seconds. The default value is 1.

cumulus@switch:~$ nv set system telemetry interface-stats sample-interval 100
cumulus@switch:~$ nv config apply
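If you generate this configuration from a script, the following minimal Python sketch shows one way to validate the ranges before building the commands. The function name interface_stats_commands is hypothetical. The sketch assumes priority groups range from 0 through 7 and traffic classes from 0 through 15, consistent with the example commands in this section, and it only builds command strings; it does not run them on a switch.

```python
def interface_stats_commands(priority_groups=(), traffic_classes=(), sample_interval=None):
    """Build the NVUE commands shown in this section (hypothetical helper)."""
    cmds = ["nv set system telemetry interface-stats export state enabled"]
    for pg in priority_groups:
        # Assumed range for ingress buffer priority groups: 0 through 7.
        if not 0 <= pg <= 7:
            raise ValueError(f"priority group {pg} is outside 0-7")
        cmds.append(
            f"nv set system telemetry interface-stats ingress-buffer priority-group {pg}"
        )
    for tc in traffic_classes:
        # Assumed range for egress buffer traffic classes: 0 through 15.
        if not 0 <= tc <= 15:
            raise ValueError(f"traffic class {tc} is outside 0-15")
        cmds.append(
            f"nv set system telemetry interface-stats egress-buffer traffic-class {tc}"
        )
    if sample_interval is not None:
        # The sample interval is in seconds, between 1 and 86400.
        if not 1 <= sample_interval <= 86400:
            raise ValueError(f"sample interval {sample_interval} is outside 1-86400")
        cmds.append(
            f"nv set system telemetry interface-stats sample-interval {sample_interval}"
        )
    cmds.append("nv config apply")
    return cmds


for line in interface_stats_commands(priority_groups=[4], traffic_classes=[12], sample_interval=100):
    print(line)
```

Rejecting out-of-range values in the script keeps an invalid command from reaching the switch.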

gRPC OTLP Export

To configure open telemetry export:

  1. Configure gRPC to communicate with the collector by providing the collector destination IP address or hostname. Specify the port to use for communication if it is different from the default port 8443:

    cumulus@switch:~$ nv set system telemetry export otlp grpc destination 10.1.1.100 port 4317
    cumulus@switch:~$ nv config apply
    
  2. Configure an X.509 certificate to secure the gRPC connection:

    cumulus@switch:~$ nv set system telemetry export otlp grpc cert-id <certificate>
    cumulus@switch:~$ nv config apply
    

By default, OTLP export runs in secure mode, which requires a certificate. To connect without a configured certificate, you must enable insecure mode with the nv set system telemetry export otlp grpc insecure enabled command.
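Before you point the switch at a collector, it can help to confirm from a management host that the collector address and port accept TCP connections. The following Python sketch is one way to do that, using only the standard library. The function name collector_reachable is hypothetical, and the address and port in the usage line are the example values from this section. The check confirms TCP reachability only; it does not validate gRPC, TLS, or the certificate.

```python
import socket


def collector_reachable(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds (hypothetical helper)."""
    try:
        # create_connection resolves the hostname and opens a TCP connection.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    # Example values from this section: destination 10.1.1.100, port 4317.
    print(collector_reachable("10.1.1.100", 4317))
```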

Show Telemetry Export Configuration

To show the telemetry export configuration, run the nv show system telemetry export command:

cumulus@switch:~$ nv show system telemetry export
                    applied   pending 
------------------  --------  --------
vrf                 default   default 
otlp                                  
  state             disabled  disabled
  grpc                                
    insecure        disabled  disabled
    port            8443      8443    
    [destination]             

To show the OTLP gRPC destination configuration, run the nv show system telemetry export otlp grpc destination command.

Telemetry Data Format

Cumulus Linux exports interface statistic and histogram data in the following format.

Interface Statistics

The interface statistic data samples that the switch exports to the OTEL collector are gauge streams that include the interface name as an attribute and the statistics value reported in the asDouble exemplar.
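If your collector writes the received metrics as OTLP JSON, you can extract the per-interface values with a short script. The following Python sketch assumes the standard OTLP JSON layout (resourceMetrics, scopeMetrics, metrics, gauge, dataPoints) and assumes the interface name is carried in a string attribute with the key interface. That key, the function name gauge_values, and the sample payload are illustrative assumptions, not output captured from a switch.

```python
import json


def gauge_values(otlp_json, metric_name, attr_key="interface"):
    """Map each interface name to the last asDouble value seen for one gauge metric."""
    payload = json.loads(otlp_json)
    values = {}
    for resource in payload.get("resourceMetrics", []):
        for scope in resource.get("scopeMetrics", []):
            for metric in scope.get("metrics", []):
                if metric.get("name") != metric_name:
                    continue
                for point in metric.get("gauge", {}).get("dataPoints", []):
                    # Flatten the OTLP attribute list into a plain dictionary.
                    attrs = {
                        a["key"]: a["value"].get("stringValue")
                        for a in point.get("attributes", [])
                    }
                    if "asDouble" in point:
                        values[attrs.get(attr_key)] = point["asDouble"]
    return values


def _point(interface, value):
    # Build one made-up gauge data point; the "interface" key is an assumption.
    return {
        "attributes": [{"key": "interface", "value": {"stringValue": interface}}],
        "asDouble": value,
    }


# Made-up payload for illustration only.
SAMPLE = json.dumps({
    "resourceMetrics": [{
        "scopeMetrics": [{
            "metrics": [{
                "name": "nvswitch_interface_oper_state",
                "gauge": {"dataPoints": [_point("swp1", 1.0), _point("swp2", 2.0)]},
            }]
        }]
    }]
})

print(gauge_values(SAMPLE, "nvswitch_interface_oper_state"))
# prints {'swp1': 1.0, 'swp2': 2.0}
```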

The following table describes the interface statistics:

Name                                                        Description
----------------------------------------------------------  ----------------------------------------------------------------------------------------
nvswitch_interface_oper_state                               Interface operational state as a bitmap: (None[0], Up[1], Down[2], Invalid[4], Error[8])
nvswitch_interface_dot3_control_in_unknown_opcodes          Input 802.3 unknown opcode counter.
nvswitch_interface_dot3_in_pause_frames                     Input 802.3 pause frame counter.
nvswitch_interface_dot3_out_pause_frames                    Output 802.3 pause frame counter.
nvswitch_interface_dot3_stats_alignment_errors              802.3 alignment error counter.
nvswitch_interface_dot3_stats_carrier_sense_errors          802.3 interface carrier sense error counter.
nvswitch_interface_dot3_stats_deferred_transmissions        802.3 deferred transmission counter.
nvswitch_interface_dot3_stats_excessive_collisions          802.3 excessive collisions counter.
nvswitch_interface_dot3_stats_fcs_errors                    802.3 FCS error counter.
nvswitch_interface_dot3_stats_frame_too_longs               802.3 excessive frame size counter.
nvswitch_interface_dot3_stats_internal_mac_receive_errors   802.3 internal MAC receive error counter.
nvswitch_interface_dot3_stats_internal_mac_transmit_errors  802.3 internal MAC transmit error counter.
nvswitch_interface_dot3_stats_late_collisions               802.3 late collisions counter.
nvswitch_interface_dot3_stats_multiple_collision_frames     802.3 multiple collision frames counter.
nvswitch_interface_dot3_stats_single_collision_frames       802.3 single collision frames counter.
nvswitch_interface_dot3_stats_sqe_test_errors               802.3 SQE test error counter.
nvswitch_interface_dot3_stats_symbol_errors                 802.3 symbol error counter.
nvswitch_interface_pg_rx_buffer_discard                     Interface ingress priority group receive buffer discard counter.
nvswitch_interface_pg_rx_frames                             Interface ingress priority group receive frames counter.
nvswitch_interface_pg_rx_octet                              Interface ingress priority group receive bytes counter.
nvswitch_interface_pg_rx_shared_buffer_discard              Interface ingress priority group receive shared buffer discard counter.
nvswitch_interface_tc_tx_bc_frames                          Interface egress traffic class transmit broadcast frames counter.
nvswitch_interface_tc_tx_ecn_marked_tc                      Interface egress traffic class transmit ECN marked counter.
nvswitch_interface_tc_tx_frames                             Interface egress traffic class transmit frames counter.
nvswitch_interface_tc_tx_mc_frames                          Interface egress traffic class transmit multicast frames counter.
nvswitch_interface_tc_tx_no_buffer_discard_uc               Interface egress traffic class transmit unicast no buffer discard counter.
nvswitch_interface_tc_tx_octet                              Interface egress traffic class transmit bytes counter.
nvswitch_interface_tc_tx_queue                              Interface egress traffic class transmit queue counter.
nvswitch_interface_tc_tx_uc_frames                          Interface egress traffic class transmit unicast frames counter.
nvswitch_interface_tc_tx_wred_discard                       Interface egress traffic class transmit WRED discard counter.
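Because nvswitch_interface_oper_state is a bitmap, a consumer needs to test individual bits instead of comparing the whole value. The following Python sketch decodes the value using the bit assignments listed in the table above. The function name decode_oper_state is hypothetical, and the sketch assumes the value arrives as a number (for example, a double) that converts cleanly to an integer.

```python
# Bit assignments from the table above; a value of 0 means None.
OPER_STATE_BITS = {1: "Up", 2: "Down", 4: "Invalid", 8: "Error"}


def decode_oper_state(value):
    """Return the list of state names set in an oper_state bitmap value."""
    bits = int(value)
    names = [name for bit, name in OPER_STATE_BITS.items() if bits & bit]
    return names or ["None"]


print(decode_oper_state(1.0))  # prints ['Up']
print(decode_oper_state(10))   # prints ['Down', 'Error']
print(decode_oper_state(0))    # prints ['None']
```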

Example JSON data for interface_oper_state:

Example JSON data for interface_dot3_stats_fcs_errors:

Histogram Data

The histogram data samples that the switch exports to the OTEL collector are histogram data points that include the histogram bucket (bin) counts and the respective queue length size boundaries for each bucket.

The switch sends a sample with the following names for each interface enabled for ingress and egress buffer histogram collection:

Name                                         Description
-------------------------------------------  -----------------------------------------------
nvswitch_histogram_interface_egress_buffer   Histogram interface egress buffer queue depth.
nvswitch_histogram_interface_ingress_buffer  Histogram interface ingress buffer queue depth.
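To turn the bucket counts and boundaries into a single figure, such as the queue depth that a given percentage of samples stayed under, you can walk the buckets cumulatively. The following Python sketch assumes the standard OTLP histogram data point layout, in which N boundaries (explicitBounds) define N+1 buckets (bucketCounts) and the last bucket has no upper bound. The function name percentile_upper_bound and the sample numbers are illustrative, not output from a switch.

```python
def percentile_upper_bound(bucket_counts, explicit_bounds, pct):
    """Return the upper boundary of the bucket that contains the pct-th percentile."""
    # OTLP JSON encodes 64-bit counts as strings, so convert them first.
    counts = [int(c) for c in bucket_counts]
    if len(counts) != len(explicit_bounds) + 1:
        raise ValueError("expected one more bucket count than boundaries")
    total = sum(counts)
    if total == 0:
        return None  # No samples were recorded.
    threshold = total * pct / 100.0
    running = 0
    for index, count in enumerate(counts):
        running += count
        if running >= threshold:
            # The last bucket is open-ended, so it has no finite upper bound.
            if index < len(explicit_bounds):
                return explicit_bounds[index]
            return float("inf")
    return float("inf")


# Made-up data point: 100 samples across four buckets.
counts = ["90", "8", "2", "0"]
bounds = [1000, 2000, 3000]
print(percentile_upper_bound(counts, bounds, 95))  # prints 2000
```

The result is a boundary value, not an exact queue depth, because a histogram records only how many samples fell into each bucket.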

Example JSON data for interface_ingress_buffer:

Example JSON data for interface_egress_buffer: