Open Telemetry Export

Telemetry enables you to collect, send, and analyze large amounts of data, such as traffic statistics, port status, device health and configuration, and events. This data helps you monitor switch performance, health and behavior, traffic patterns, and QoS.

Configure Open Telemetry

Cumulus Linux supports open telemetry (OTEL) export. You can use OTLP to export metrics, such as interface counters, buffer statistics, histogram collection, platform statistics, routing metrics, and systemd statistics to an external collector for analysis and visualization.

  • Cumulus Linux supports open telemetry export on switches with the Spectrum-2 ASIC and later.
  • When you enable and use Open Telemetry, do not enable and use gNMI streaming.

To enable open telemetry:

cumulus@switch:~$ nv set system telemetry export otlp state enabled 
cumulus@switch:~$ nv config apply

When you enable open telemetry, the switch collects and exports system information metrics to the configured external collector by default. In addition, you can enable open telemetry to collect and export interface statistics, buffer statistics, histogram data, control plane statistics, platform statistics, and routing metrics.

Adaptive Routing Statistics

When you enable open telemetry for adaptive routing, the switch exports adaptive routing statistics:

cumulus@switch:~$ nv set system telemetry adaptive-routing-stats export state enabled
cumulus@switch:~$ nv config apply

You can adjust the adaptive routing statistics sample interval (in seconds). You can specify a value between 1 and 86400. The default setting is 1 second.

cumulus@switch:~$ nv set system telemetry adaptive-routing-stats sample-interval 40
cumulus@switch:~$ nv config apply

To export adaptive routing metrics, you must enable the adaptive routing feature.

Buffer Statistics

When you enable open telemetry for buffer statistics, the switch exports interface and switch buffer occupancy and watermark metrics.

cumulus@switch:~$ nv set system telemetry buffer-stats export state enabled 
cumulus@switch:~$ nv config apply

To show buffer statistics configuration, run the nv show system telemetry buffer-stats command.

Control Plane Statistics

When you enable open telemetry for control plane statistics, the switch exports additional counters for control plane packets:

cumulus@switch:~$ nv set system telemetry control-plane-stats export state enabled
cumulus@switch:~$ nv config apply

You can adjust the control plane statistics sample interval (in seconds). You can specify a value between 1 and 86400. The default value is 1.

cumulus@switch:~$ nv set system telemetry control-plane-stats sample-interval 100
cumulus@switch:~$ nv config apply

To show control plane statistics configuration, run the nv show system telemetry control-plane-stats command.

Histogram Data

When you enable open telemetry for histogram data, your buffer, counter, and latency histogram collection configuration defines the data that the switch exports:

cumulus@switch:~$ nv set system telemetry histogram export state enabled
cumulus@switch:~$ nv config apply

Temporality Mode

Histogram temporality mode lets you choose how to aggregate and report histogram data over time.

Cumulus Linux supports the following temporality modes:

  • Delta mode captures only the new data recorded after the last export, reflecting the rate of change instead of cumulative totals. Each export includes only the counts collected within the latest time window; previous values do not carry over to the next reporting cycle. This is the default setting.
  • Cumulative mode reports the total count from the beginning of the measurement period. Each export includes all previously reported values along with newly recorded data, ensuring that the metric continues to grow until the measurement cycle resets. This approach prevents the metric from resetting between reports.
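
The difference between the two modes can be illustrated with a small Python sketch. The bucket counts below are hypothetical and the helper names are not part of Cumulus Linux; the sketch only shows how the same measurements look when reported as deltas or as cumulative totals.

```python
def cumulative_from_deltas(delta_exports):
    """Convert per-window (delta) bucket counts into running (cumulative) totals."""
    totals = None
    cumulative = []
    for counts in delta_exports:
        if totals is None:
            totals = list(counts)
        else:
            totals = [t + c for t, c in zip(totals, counts)]
        cumulative.append(list(totals))
    return cumulative


def deltas_from_cumulative(cumulative_exports):
    """Recover per-window counts by subtracting the previous cumulative export."""
    deltas = []
    previous = None
    for counts in cumulative_exports:
        if previous is None:
            deltas.append(list(counts))
        else:
            deltas.append([c - p for c, p in zip(counts, previous)])
        previous = counts
    return deltas


# Hypothetical bucket counts for three consecutive export windows.
delta_exports = [[5, 1, 0], [2, 0, 3], [0, 4, 1]]
cumulative_exports = cumulative_from_deltas(delta_exports)
print(cumulative_exports)  # [[5, 1, 0], [7, 1, 3], [7, 5, 4]]
```

In delta mode a collector must sum exports to obtain totals; in cumulative mode it must subtract consecutive exports to obtain per-window counts.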

Changing the temporality mode:

  • Impacts both snapshot file collection and metric data export.
  • Restarts the histogram service, initiating a new measurement cycle.

To change the temporality mode, run the nv set system telemetry histogram temporality <mode> command. The following command sets the temporality mode to cumulative:

cumulus@switch:~$ nv set system telemetry histogram temporality cumulative
cumulus@switch:~$ nv config apply

To reset the temporality mode to the default value (delta), run the nv unset system telemetry histogram temporality command or set the mode to delta with the nv set system telemetry histogram temporality delta command.

To show histogram data configuration, run the nv show system telemetry histogram command.

Interface Statistics

When you enable open telemetry for interface statistics, the switch exports interface statistics on all configured interfaces:

cumulus@switch:~$ nv set system telemetry interface-stats export state enabled
cumulus@switch:~$ nv config apply

You can adjust the interface statistics sample interval (in seconds). You can specify a value between 1 and 86400. The default value is 1.

cumulus@switch:~$ nv set system telemetry interface-stats sample-interval 100
cumulus@switch:~$ nv config apply

You can enable these additional interface statistics:

  • Traffic Class and Switch Priority metrics for ingress buffer priority groups (0 through 7) and egress buffer traffic classes (0 through 15)
  • PHY for interface PHY metrics

When you enable these settings, the switch exports interface_pg and interface_tc counters for the defined priority groups and traffic classes:

cumulus@switch:~$ nv set system telemetry interface-stats ingress-buffer priority-group 4
cumulus@switch:~$ nv set system telemetry interface-stats egress-buffer traffic-class 12
cumulus@switch:~$ nv config apply

You can enable additional switch priority interface statistic collection on all configured interfaces for specific switch priority values:

cumulus@switch:~$ nv set system telemetry interface-stats switch-priority 4
cumulus@switch:~$ nv config apply

When you enable the PHY class, the switch exports nvswitch_interface_phy and nvswitch_interface_raw interface PHY counters:

cumulus@switch:~$ nv set system telemetry interface-stats class phy state enabled
cumulus@switch:~$ nv config apply

To show interface statistics configuration, run the nv show system telemetry interface-stats command.

LLDP Statistics

When you enable open telemetry for LLDP statistics, the switch exports neighbor, port, and chassis information. LLDP metrics help you map the network topology, which makes it easier to configure network devices and to use network resources efficiently. To enable LLDP statistics:

cumulus@switch:~$ nv set system telemetry lldp export state enabled
cumulus@switch:~$ nv config apply

You can adjust the LLDP statistics sample interval (in seconds). You can specify a value between 1 and 86400. The default value is 5.

cumulus@switch:~$ nv set system telemetry lldp sample-interval 10
cumulus@switch:~$ nv config apply

Platform Statistics

When you enable open telemetry for platform statistics, the switch exports data about the CPU, disk, filesystem, memory, sensor health, and transceivers. To enable all platform statistics globally:

cumulus@switch:~$ nv set system telemetry platform-stats export state enabled
cumulus@switch:~$ nv config apply

If you do not want to enable all platform statistics, you can enable or disable individual platform telemetry components or adjust the sample interval for individual components. The default sample interval is 60 seconds.

To enable CPU statistics:

cumulus@switch:~$ nv set system telemetry platform-stats class cpu state enabled
cumulus@switch:~$ nv config apply

To adjust the sample interval for CPU statistics:

cumulus@switch:~$ nv set system telemetry platform-stats class cpu sample-interval 100
cumulus@switch:~$ nv config apply

To enable disk statistics:

cumulus@switch:~$ nv set system telemetry platform-stats class disk state enabled
cumulus@switch:~$ nv config apply

To adjust the sample interval for disk statistics:

cumulus@switch:~$ nv set system telemetry platform-stats class disk sample-interval 100
cumulus@switch:~$ nv config apply

To enable filesystem statistics:

cumulus@switch:~$ nv set system telemetry platform-stats class file-system state enabled
cumulus@switch:~$ nv config apply

To adjust the sample interval for filesystem statistics:

cumulus@switch:~$ nv set system telemetry platform-stats class file-system sample-interval 100
cumulus@switch:~$ nv config apply

To enable memory statistics:

cumulus@switch:~$ nv set system telemetry platform-stats class memory state enabled
cumulus@switch:~$ nv config apply

To adjust the sample interval for memory statistics:

cumulus@switch:~$ nv set system telemetry platform-stats class memory sample-interval 100
cumulus@switch:~$ nv config apply

To enable environment sensor statistics:

cumulus@switch:~$ nv set system telemetry platform-stats class environment-sensors state enabled
cumulus@switch:~$ nv config apply

To adjust the sample interval for environment sensor statistics:

cumulus@switch:~$ nv set system telemetry platform-stats class environment-sensors sample-interval 100
cumulus@switch:~$ nv config apply

To enable transceiver statistics:

cumulus@switch:~$ nv set system telemetry platform-stats class transceiver-info state enabled
cumulus@switch:~$ nv config apply

To adjust the sample interval for transceiver statistics:

cumulus@switch:~$ nv set system telemetry platform-stats class transceiver-info sample-interval 40
cumulus@switch:~$ nv config apply
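
Because the per-class commands above follow one pattern, they can be generated rather than typed. The following Python sketch only builds command strings from the class names shown in this section; the function name is hypothetical and nothing runs on the switch.

```python
# Class names as they appear in the commands in this section.
PLATFORM_CLASSES = [
    "cpu",
    "disk",
    "file-system",
    "memory",
    "environment-sensors",
    "transceiver-info",
]


def platform_stats_commands(intervals):
    """Return NVUE commands that enable each class in `intervals` (class -> seconds)."""
    commands = []
    for name, seconds in intervals.items():
        if name not in PLATFORM_CLASSES:
            raise ValueError(f"unknown platform class: {name}")
        base = f"nv set system telemetry platform-stats class {name}"
        commands.append(f"{base} state enabled")
        commands.append(f"{base} sample-interval {seconds}")
    commands.append("nv config apply")
    return commands


for line in platform_stats_commands({"cpu": 100, "transceiver-info": 40}):
    print(line)
```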

To show platform statistics configuration, run the nv show system telemetry platform-stats command.

Routing Metrics

To enable open telemetry for layer 3 routing metrics, enable the OTEL routing service:

cumulus@switch:~$ nv set system telemetry router export state enabled
cumulus@switch:~$ nv config apply

To export any of the routing metrics, you must first enable the OTEL routing service.

To enable collection and export of BGP peer state statistics across all VRFs:

cumulus@switch:~$ nv set system telemetry router bgp export state enabled
cumulus@switch:~$ nv config apply

To enable collection and export of statistics for all BGP peers under a specific VRF, run the nv set system telemetry router vrf <vrf-id> bgp export state enabled command; for example:

cumulus@switch:~$ nv set system telemetry router vrf RED bgp export state enabled
cumulus@switch:~$ nv config apply

To enable collection and export of statistics for a specific BGP peer under a specific VRF, run the nv set system telemetry router vrf <vrf-id> bgp peer <peer-id> export state enabled command; for example:

cumulus@switch:~$ nv set system telemetry router vrf RED bgp peer swp1 export state enabled
cumulus@switch:~$ nv config apply

To enable collection and export of statistics for the IP routing table across all VRFs:

cumulus@switch:~$ nv set system telemetry router rib export state enabled
cumulus@switch:~$ nv config apply 

To enable collection and export of statistics for the IP routing table for a specific VRF, run the nv set system telemetry router vrf <vrf-id> rib export state enabled command; for example:

cumulus@switch:~$ nv set system telemetry router vrf RED rib export state enabled
cumulus@switch:~$ nv config apply

You can adjust the sample interval (in seconds) for routing metrics. You can specify a value in multiples of 10 up to 60. The default value is 30.

cumulus@switch:~$ nv set system telemetry router sample-interval 40
cumulus@switch:~$ nv config apply
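
The sample interval rules stated so far differ by component, so a pre-check before applying configuration can be useful. This Python sketch encodes only the ranges given in this document; the function name is hypothetical, and the lower bound of 10 for routing metrics is an assumption drawn from the "multiples of 10" wording.

```python
# Ranges as stated in this document; the keys mirror the NVUE command paths
# and are used here only as labels.
GENERAL_RANGE = range(1, 86401)  # 1 through 86400 seconds
GENERAL_COMPONENTS = {
    "adaptive-routing-stats",
    "control-plane-stats",
    "interface-stats",
    "lldp",
}


def valid_sample_interval(component, seconds):
    """Return True if `seconds` satisfies the documented rule for `component`."""
    if component == "router":
        # Multiples of 10 up to 60; a lower bound of 10 is an assumption.
        return seconds in range(10, 61, 10)
    if component in GENERAL_COMPONENTS:
        return seconds in GENERAL_RANGE
    raise ValueError(f"no documented range for {component!r}")


print(valid_sample_interval("router", 40))  # True
print(valid_sample_interval("router", 45))  # False
```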

  • You can disable BGP export across all VRFs with the nv set system telemetry router bgp export state disabled command and enable it only for specific VRFs with the nv set system telemetry router vrf <vrf-name> bgp export state enabled command.
  • You can also disable BGP export across all peers in a VRF with the nv set system telemetry router vrf <vrf-name> bgp export state disabled command, and enable telemetry only for specific peers in the VRF with the nv set system telemetry router vrf <vrf-name> bgp peer <peer> export state enabled command.
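
The two notes above imply that the most specific BGP export setting takes effect. The Python sketch below models that reading (peer, then VRF, then global); it is an illustration of the described behavior under that assumption, not the switch's actual resolution logic, and all names in it are hypothetical.

```python
def bgp_export_enabled(global_state, vrf_states, peer_states, vrf, peer):
    """Resolve the effective export state; the most specific setting wins."""
    if (vrf, peer) in peer_states:
        return peer_states[(vrf, peer)]
    if vrf in vrf_states:
        return vrf_states[vrf]
    return global_state


# Export disabled globally, enabled only for VRF RED.
print(bgp_export_enabled(False, {"RED": True}, {}, "RED", "swp1"))   # True
print(bgp_export_enabled(False, {"RED": True}, {}, "BLUE", "swp1"))  # False

# Export disabled for VRF RED, enabled only for peer swp1 in RED.
print(bgp_export_enabled(True, {"RED": False}, {("RED", "swp1"): True}, "RED", "swp2"))  # False
```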

To show routing metrics configuration settings, run the nv show system telemetry router command.

Software Statistics

Software statistics telemetry currently includes systemd unit metrics. When you enable systemd metrics, the switch exports unit-level metrics, efficiently reporting only relevant data depending on the service state to ensure minimal performance overhead. You can collect additional systemd metrics by enabling process-level metrics.

To enable systemd unit metrics:

cumulus@switch:~$ nv set system telemetry software-stats systemd export state enabled
cumulus@switch:~$ nv config apply

To enable systemd process-level statistics:

cumulus@switch:~$ nv set system telemetry software-stats systemd process-level enabled 
cumulus@switch:~$ nv config apply

You can adjust the systemd statistics sample interval (in seconds). You can specify a value between 60 and 86400. The default setting is 60 seconds.

cumulus@switch:~$ nv set system telemetry software-stats systemd sample-interval 100
cumulus@switch:~$ nv config apply

By default, the switch collects statistics for the units in the default profile, which the nv show system telemetry software-stats systemd unit-profile command lists.

If a systemd unit is not active, the switch only collects the unit state, reducing unnecessary data processing.

You can configure custom profiles to collect statistics for specific units. To configure a custom profile, run the nv set system telemetry software-stats systemd unit-profile <profile-name> unit <unit> command to provide a custom profile name and the unit you want to monitor. You must then set the custom profile you want to use as the active profile. You can configure multiple units in a custom profile. Only one profile can be active at a time.

The following example configures a custom profile called CUSTOM1 that collects statistics about the NGINX unit and the NVUE unit, and a custom profile called CUSTOM2 that collects statistics about the FRR unit. The example then sets CUSTOM2 as the active profile:

cumulus@switch:~$ nv set system telemetry software-stats systemd unit-profile CUSTOM1 unit nginx.service
cumulus@switch:~$ nv set system telemetry software-stats systemd unit-profile CUSTOM1 unit nvued.service
cumulus@switch:~$ nv set system telemetry software-stats systemd unit-profile CUSTOM2 unit frr.service
cumulus@switch:~$ nv set system telemetry software-stats systemd active-profile CUSTOM2
cumulus@switch:~$ nv config apply
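
The profile rules above (several units per profile, one active profile) can be captured in a short Python sketch that produces the same commands as the example. The function name is hypothetical, and requiring the active profile to be one of the supplied custom profiles is a simplification made for this sketch.

```python
def systemd_profile_commands(profiles, active):
    """Build NVUE commands for custom unit profiles and a single active profile."""
    if active not in profiles:
        raise ValueError(f"active profile {active!r} is not defined")
    base = "nv set system telemetry software-stats systemd"
    commands = [
        f"{base} unit-profile {name} unit {unit}"
        for name, units in profiles.items()
        for unit in units
    ]
    commands.append(f"{base} active-profile {active}")
    commands.append("nv config apply")
    return commands


profiles = {
    "CUSTOM1": ["nginx.service", "nvued.service"],
    "CUSTOM2": ["frr.service"],
}
for line in systemd_profile_commands(profiles, "CUSTOM2"):
    print(line)
```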

To show systemd software statistics configuration, run the nv show system telemetry software-stats systemd command:

cumulus@switch:~$ nv show system telemetry software-stats systemd 
                 applied 
---------------  --------
sample-interval  100     
process-level    disabled
active-profile   CUSTOM2 
export                   
  state          enabled 
[unit-profile]   CUSTOM1 
[unit-profile]   CUSTOM2 
[unit-profile]   default

To show the default profile and all configured custom profiles, run the nv show system telemetry software-stats systemd unit-profile command:

cumulus@switch:~$ nv show system telemetry software-stats systemd unit-profile 
         Summary                               
-------  --------------------------------------
CUSTOM1  unit:                    nginx.service
         unit:                    nvued.service
CUSTOM2  unit:                    frr.service
default  unit:             asic-monitor.service
         unit:                      frr.service
         unit:                  hostapd.service
         unit:       hw-management-sync.service
         unit:               netq-agent.service
         unit:                    netqd.service
         unit:                    nginx.service
         unit:                   ntpsec.service
         unit:             nv-telemetry.service
         unit:                    nvued.service
         unit: prometheus-node-exporter.service
         unit:     prometheus-sdk-stats.service
         unit:                    ptp4l.service
         unit:                    snmpd.service
         unit:                  switchd.service
         unit:                   sx_sdk.service
         unit:             wd_keepalive.service

To show the units configured for a specific profile, run the nv show system telemetry software-stats systemd unit-profile <profile-id> command:

cumulus@switch:~$ nv show system telemetry software-stats systemd unit-profile CUSTOM1
        operational    applied      
------  -------------  -------------
[unit]  nginx.service  nginx.service
[unit]  nvued.service  nvued.service

To show if exporting software statistics is enabled, run the nv show system telemetry software-stats systemd export command:

cumulus@switch:~$ nv show system telemetry software-stats systemd export  
       applied 
-----  --------
state  enabled

gRPC OTLP Export

To configure the open telemetry export destination:

  1. Configure gRPC to communicate with the collector by providing the collector destination IP address or hostname. Specify the port to use for communication if it is different from the default port 8443:

    cumulus@switch:~$ nv set system telemetry export otlp grpc destination 10.1.1.100 port 4317
    cumulus@switch:~$ nv config apply
    
  2. Configure an X.509 certificate to secure the gRPC connection:

    cumulus@switch:~$ nv set system telemetry export otlp grpc cert-id <ca-certificate>
    cumulus@switch:~$ nv config apply
    

By default, OTLP export is in secure mode that requires a CA certificate. For connections without a configured certificate, you must enable insecure mode with the nv set system telemetry export otlp grpc insecure enabled command.

Customize Export

By default, the switch exports all statistics enabled globally (with the nv set system telemetry <statistics> command) to all configured OTLP destinations. If you want to export different metrics to different OTLP destinations, you can customize the export by specifying a statistics group to control which statistics you export and the sample interval for a destination.

Statistics groups inherit global OTLP export configurations by default. More specific configuration under a statistics group, such as enabling or disabling a statistic type or changing the sample interval, overrides any global OTLP configuration.
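
The inheritance rule above behaves like a layered merge: start from the global settings, then apply whatever the statistics group sets explicitly. The Python sketch below models that idea with hypothetical dictionaries; the keys and values are illustrative and do not represent how the switch stores configuration.

```python
def effective_config(global_config, group_overrides):
    """Merge statistics-group overrides on top of a copy of the global settings."""
    merged = {name: dict(settings) for name, settings in global_config.items()}
    for name, settings in group_overrides.items():
        merged.setdefault(name, {}).update(settings)
    return merged


# Hypothetical global settings.
global_config = {
    "platform-stats": {"export": "enabled", "sample-interval": 60},
    "interface-stats": {"export": "enabled", "sample-interval": 1},
    "router": {"export": "enabled", "sample-interval": 30},
}

# A group that changes only the router sample interval inherits everything else.
group = {"router": {"sample-interval": 100}}
print(effective_config(global_config, group)["router"])
# {'export': 'enabled', 'sample-interval': 100}
```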

The following example:

  • Configures STAT-GROUP1 to export all platform statistics (platform-stats) but not interface statistics (interface-stats).
  • Applies the STAT-GROUP1 configuration to the OTLP destination 10.1.1.100.

cumulus@switch:~$ nv set system telemetry stats-group STAT-GROUP1 platform-stats export state enabled
cumulus@switch:~$ nv set system telemetry stats-group STAT-GROUP1 interface-stats export state disabled
cumulus@switch:~$ nv set system telemetry export otlp grpc destination 10.1.1.100 stats-group STAT-GROUP1
cumulus@switch:~$ nv config apply

The following example:

  • Configures STAT-GROUP2 to inherit all statistic configuration from the global telemetry configuration, but changes the sample interval of router statistics to 100.
  • Applies the STAT-GROUP2 configuration to the OTLP destination 10.1.1.200.

cumulus@switch:~$ nv set system telemetry stats-group STAT-GROUP2 router sample-interval 100
cumulus@switch:~$ nv set system telemetry export otlp grpc destination 10.1.1.200 stats-group STAT-GROUP2
cumulus@switch:~$ nv config apply

The following example:

  • Configures STAT-GROUP3 to disable histogram (histogram) and buffer (buffer-stats) statistics, and to enable all platform statistics (platform-stats) except for disk statistics.
  • Applies the STAT-GROUP3 configuration to the OTLP destination 10.1.1.30.

cumulus@switch:~$ nv set system telemetry stats-group STAT-GROUP3 buffer-stats export state disabled
cumulus@switch:~$ nv set system telemetry stats-group STAT-GROUP3 histogram export state disabled
cumulus@switch:~$ nv set system telemetry stats-group STAT-GROUP3 platform-stats export state enabled
cumulus@switch:~$ nv set system telemetry stats-group STAT-GROUP3 platform-stats class disk state disabled
cumulus@switch:~$ nv set system telemetry export otlp grpc destination 10.1.1.30 stats-group STAT-GROUP3
cumulus@switch:~$ nv config apply

The following example:

  • Configures STAT-GROUP4 to disable histogram (histogram) statistics, and to enable LLDP statistics (lldp) and software statistics (software-stats).
  • Sets the sample interval of LLDP statistics to 40.
  • Applies the STAT-GROUP4 configuration to the OTLP destination 10.1.1.30.

cumulus@switch:~$ nv set system telemetry stats-group STAT-GROUP4 histogram export state disabled
cumulus@switch:~$ nv set system telemetry stats-group STAT-GROUP4 lldp export state enabled
cumulus@switch:~$ nv set system telemetry stats-group STAT-GROUP4 lldp sample-interval 40
cumulus@switch:~$ nv set system telemetry stats-group STAT-GROUP4 software-stats systemd export state enabled
cumulus@switch:~$ nv set system telemetry export otlp grpc destination 10.1.1.30 stats-group STAT-GROUP4
cumulus@switch:~$ nv config apply

Show Telemetry Export Configuration

To show the telemetry export configuration, run the nv show system telemetry export command:

cumulus@switch:~$ nv show system telemetry export
                    applied   pending 
------------------  --------  --------
vrf                 default   default 
otlp                                  
  state             disabled  disabled
  grpc                                
    insecure        disabled  disabled
    port            8443      8443    
    [destination]             

To show the OTLP gRPC destination configuration, run the nv show system telemetry export otlp grpc destination command.

Static Labels

You can apply static labels to switches and individual interfaces to configure descriptions for devices and interface roles. Exported OTLP data includes these label names and descriptions.

  • Cumulus Linux supports up to 10 device labels and up to 10 interface labels.
  • Label name and description strings can include alphanumeric characters with underscores, periods, or dashes. If spaces are included in the string, wrap the entire string inside double or single quotes.
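
The label rules above can be checked before building a command. The Python sketch below validates the character set described in this section and quotes strings that contain spaces; the function names are hypothetical, and the sketch only produces a command string for the nv set system telemetry label command.

```python
import re
import shlex

# Characters this section allows in label names and descriptions, plus spaces
# (which require quoting on the command line).
_ALLOWED = re.compile(r"[A-Za-z0-9_.\- ]+")
MAX_DEVICE_LABELS = 10  # limit stated in this section


def device_label_command(name, description):
    """Return the NVUE command for one device label, quoting where needed."""
    for text in (name, description):
        if not _ALLOWED.fullmatch(text):
            raise ValueError(f"unsupported characters in {text!r}")
    return (
        "nv set system telemetry label "
        f"{shlex.quote(name)} description {shlex.quote(description)}"
    )


def device_label_commands(labels):
    """Return commands for a mapping of label name -> description."""
    if len(labels) > MAX_DEVICE_LABELS:
        raise ValueError(f"at most {MAX_DEVICE_LABELS} device labels are supported")
    return [device_label_command(n, d) for n, d in labels.items()]


print(device_label_command("Data Center Location", "Data Center B"))
# nv set system telemetry label 'Data Center Location' description 'Data Center B'
```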

To configure a switch device label Data Center Location with the description Data Center B:

cumulus@switch:~$ nv set system telemetry label "Data Center Location" description "Data Center B"
cumulus@switch:~$ nv config apply

Validate device label configuration with the nv show system telemetry label command:

cumulus@switch:~$ nv show system telemetry label
                      description  
--------------------  -------------
Data Center Location  Data Center B

To configure a switch interface label interface_swp10_label with the description Server 10 connection:

cumulus@switch:~$ nv set interface swp10 telemetry label "interface_swp10_label" description "Server 10 connection"
cumulus@switch:~$ nv config apply

Validate interface label configuration with the nv show interface <interface> telemetry label command:

cumulus@switch:~$ nv show interface swp10 telemetry label
                       description         
---------------------  --------------------
interface_swp10_label  Server 10 connection

Telemetry Data Format

Cumulus Linux exports statistics and histogram data in the formats defined in this section.

Adaptive Routing Statistic Format

When you enable adaptive routing telemetry, the switch exports the following statistics:

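Metric                          Description
------------------------------  --------------------------------------------------------------------------------------
nvswitch_ar_congestion_changes  The number of adaptive routing change events triggered due to congestion or link-down.

Example JSON data for nvswitch_ar_congestion_changes: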

Buffer Statistic Format

The switch collects and exports the following interface and switch buffer occupancy and watermark statistics when you run the nv set system telemetry buffer-stats export state enabled command:

Name                                                                            Description
------------------------------------------------------------------------------  --------------------------------------------------------------------------------------------------
nvswitch_interface_shared_buffer_port_pg_time_since_clear                       Time in milliseconds since buffer watermarks were last cleared.
nvswitch_interface_shared_buffer_port_pg_curr_occupancy                         Current buffer occupancy.
nvswitch_interface_shared_buffer_port_pg_watermark                              Maximum buffer occupancy.
nvswitch_interface_shared_buffer_port_pg_desc_curr_occupancy                    Current buffer occupancy for descriptors.
nvswitch_interface_shared_buffer_port_pg_desc_watermark                         Maximum buffer occupancy for descriptors.
nvswitch_interface_shared_buffer_port_pg_watermark_recorded_max                 Highest maximum buffer occupancy recorded since running sdk_stats.
nvswitch_interface_shared_buffer_port_pg_desc_watermark_recorded_max            Highest maximum buffer occupancy for descriptors recorded since running sdk_stats.
nvswitch_interface_shared_buffer_port_ingress_pool_curr_occupancy               Current ingress pool buffer occupancy.
nvswitch_interface_shared_buffer_port_ingress_pool_watermark                    Maximum ingress pool buffer occupancy.
nvswitch_interface_shared_buffer_port_ingress_pool_desc_curr_occupancy          Current ingress pool buffer occupancy for descriptors.
nvswitch_interface_shared_buffer_port_ingress_pool_desc_watermark               Maximum ingress pool buffer occupancy for descriptors.
nvswitch_interface_shared_buffer_port_ingress_pool_watermark_recorded_max       Highest maximum ingress pool buffer occupancy recorded since running sdk_stats.
nvswitch_interface_shared_buffer_port_ingress_pool_desc_watermark_recorded_max  Highest maximum ingress pool buffer occupancy for descriptors recorded since running sdk_stats.
nvswitch_interface_shared_buffer_port_tc_curr_occupancy                         Current buffer occupancy for traffic class.
nvswitch_interface_shared_buffer_port_tc_time_since_clear                       Time in milliseconds since buffer watermarks were last cleared.
nvswitch_interface_shared_buffer_port_tc_watermark                              Maximum buffer occupancy for traffic class.
nvswitch_interface_shared_buffer_port_tc_desc_curr_occupancy                    Current buffer occupancy for descriptors.
nvswitch_interface_shared_buffer_port_tc_desc_watermark                         Maximum buffer occupancy for descriptors.
nvswitch_interface_shared_buffer_port_tc_watermark_recorded_max                 Highest maximum buffer occupancy recorded since running sdk_stats.
nvswitch_interface_shared_buffer_port_tc_desc_watermark_recorded_max            Highest maximum buffer occupancy for TC descriptors recorded since running sdk_stats.
nvswitch_interface_shared_buffer_port_egress_pool_curr_occupancy                Current egress pool buffer occupancy.
nvswitch_interface_shared_buffer_port_egress_pool_watermark                     Maximum egress pool buffer occupancy.
nvswitch_interface_shared_buffer_port_egress_pool_desc_curr_occupancy           Current egress pool buffer occupancy for descriptors.
nvswitch_interface_shared_buffer_port_egress_pool_desc_watermark                Maximum egress pool buffer occupancy for descriptors.
nvswitch_interface_shared_buffer_port_egress_pool_watermark_recorded_max        Highest maximum egress pool buffer occupancy recorded since running sdk_stats.
nvswitch_interface_shared_buffer_port_egress_pool_desc_watermark_recorded_max   Highest maximum egress pool buffer occupancy for pool desc recorded since running sdk_stats.
nvswitch_interface_shared_buffer_mc_port_curr_occupancy                         Current buffer occupancy for multicast port.
nvswitch_interface_shared_buffer_mc_port_watermark                              Maximum buffer occupancy for multicast port.
nvswitch_interface_shared_buffer_mc_port_watermark_max                          Highest maximum buffer occupancy for multicast port recorded since running sdk_stats.
nvswitch_shared_buffer_mc_sp_curr_occupancy                                     Current buffer occupancy for multicast switch priority.
nvswitch_shared_buffer_mc_sp_watermark                                          Maximum buffer occupancy for multicast switch priority.
nvswitch_shared_buffer_mc_sp_watermark_max                                      Highest maximum buffer occupancy for multicast switch priority recorded since running sdk_stats.
nvswitch_shared_buffer_pool_curr_occupancy                                      Current pool buffer occupancy.
nvswitch_shared_buffer_pool_watermark                                           Maximum pool buffer occupancy.
nvswitch_shared_buffer_pool_watermark_max                                       Highest maximum pool buffer occupancy for multicast switch priority recorded since running sdk_stats.
nvswitch_interface_headroom_buffer_pg_curr_occupancy                            Current headroom buffer occupancy for port buffer.
nvswitch_interface_headroom_buffer_pg_watermark                                 Maximum pool headroom buffer occupancy for port buffer.
nvswitch_interface_headroom_buffer_pg_watermark_recorded_max                    Highest maximum headroom buffer occupancy for port buffer recorded since running sdk_stats.
nvswitch_interface_headroom_shared_buffer_curr_occupancy                        Current headroom buffer occupancy for port shared buffer.
nvswitch_interface_headroom_shared_buffer_watermark                             Maximum headroom buffer occupancy for port shared buffer.
nvswitch_interface_headroom_shared_buffer_watermark_recorded_max                Highest maximum headroom buffer occupancy for port shared buffer recorded since running sdk_stats.
nvswitch_interface_headroom_buffer_pool_curr_occupancy                          Current headroom buffer occupancy for port shared pool buffer.
nvswitch_interface_headroom_buffer_pool_watermark                               Maximum headroom buffer occupancy for port shared pool buffer.
nvswitch_interface_headroom_buffer_pool_watermark_recorded_max                  Highest maximum headroom buffer occupancy for port shared pool buffer.

Example JSON data for nvswitch_interface_shared_buffer_port_tc_time_since_clear:

Example JSON data for nvswitch_interface_shared_buffer_port_pg_time_since_clear:

Control Plane Statistic Format

When you enable control plane statistic telemetry, the switch exports the following statistics:

Name                                              Description
------------------------------------------------  -------------------------------------------
nvswitch_control_plane_tx_packets                 Control plane transmit packets.
nvswitch_control_plane_tx_bytes                   Control plane transmit bytes.
nvswitch_control_plane_rx_packets                 Control plane receive packets.
nvswitch_control_plane_rx_bytes                   Control plane receive bytes.
nvswitch_control_plane_rx_buffer_drops            Control plane receive buffer drops.
nvswitch_control_plane_trap_rx_packets            Control plane trap group receive packets.
nvswitch_control_plane_trap_rx_event_count        Control plane trap group receive events.
nvswitch_control_plane_trap_rx_drop               Control plane trap group receive drops.
nvswitch_control_plane_trap_rx_bytes              Control plane trap group receive bytes.
nvswitch_control_plane_trap_group_rx_packets      Control plane trap group receive packets.
nvswitch_control_plane_trap_group_rx_bytes        Control plane trap group receive bytes.
nvswitch_control_plane_trap_group_pkt_violations  Control plane trap group packet violations.

Example JSON data for nvswitch_control_plane_trap_rx_drop:

Histogram Data Format

The histogram data samples that the switch exports to the OTEL collector are histogram data points that include the histogram bucket (bin) counts and the respective queue length size boundaries for each bucket. Latency and counter histogram data are also exported, if configured.

Latency histogram bucket counts do not increment in exported telemetry data if there are no packets transmitted in the traffic class during the sample interval.
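
To make the relationship between bucket counts and boundaries concrete, the following Python sketch pairs each count with its range. It assumes the OTLP JSON field names explicitBounds and bucketCounts, where there is one more count than there are bounds and 64-bit counts may arrive as strings. The numbers are hypothetical and are not output from a switch.

```python
import math


def label_buckets(explicit_bounds, bucket_counts):
    """Pair each bucket count with its (lower, upper] boundary range."""
    if len(bucket_counts) != len(explicit_bounds) + 1:
        raise ValueError("expected one more bucket count than explicit bounds")
    edges = [-math.inf, *explicit_bounds, math.inf]
    return [
        ((edges[i], edges[i + 1]), int(count))
        for i, count in enumerate(bucket_counts)
    ]


# Hypothetical data point: three boundaries, therefore four buckets.
for (low, high), count in label_buckets([960, 2880, 4800], ["10", "3", "0", "1"]):
    print(f"({low}, {high}]: {count}")
```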

The switch sends a sample with the following names for each interface enabled for ingress and egress buffer, latency, and counter histogram collection:

Name                                         Description
-------------------------------------------  -----------------------------------------------
nvswitch_histogram_interface_egress_buffer   Histogram interface egress buffer queue depth.
nvswitch_histogram_interface_ingress_buffer  Histogram interface ingress buffer queue depth.
nvswitch_histogram_interface_counter         Histogram interface counter data.
nvswitch_histogram_interface_latency         Histogram interface latency data.

Example JSON data for interface_ingress_buffer:

Example JSON data for interface_egress_buffer:

Example JSON data for interface_counter:

Example JSON data for interface_latency:

Interface Statistic Format

The interface statistic data samples that the switch exports to the OTEL collector are gauge streams that include the interface name as an attribute and the statistics value reported in the asDouble exemplar.
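
On the collector side, a gauge sample of this kind can be read with a few lines of Python. The sketch below assumes the standard OTLP JSON layout (resourceMetrics, scopeMetrics, metrics, gauge, dataPoints) and assumes the interface name is carried in an attribute keyed "interface"; that key and the sample payload are hypothetical, so adjust them to match the data your collector receives.

```python
def gauge_values(payload, metric_name, attribute_key="interface"):
    """Map an attribute value (for example, the interface name) to asDouble."""
    values = {}
    for resource in payload.get("resourceMetrics", []):
        for scope in resource.get("scopeMetrics", []):
            for metric in scope.get("metrics", []):
                if metric.get("name") != metric_name:
                    continue
                for point in metric.get("gauge", {}).get("dataPoints", []):
                    attrs = {
                        a["key"]: a["value"].get("stringValue")
                        for a in point.get("attributes", [])
                    }
                    values[attrs.get(attribute_key)] = point.get("asDouble")
    return values


def _point(interface, value):
    # Helper that builds one hypothetical gauge data point.
    return {
        "attributes": [{"key": "interface", "value": {"stringValue": interface}}],
        "asDouble": value,
    }


# Hypothetical payload for illustration only.
sample = {
    "resourceMetrics": [{
        "scopeMetrics": [{
            "metrics": [{
                "name": "nvswitch_interface_dot3_in_pause_frames",
                "gauge": {"dataPoints": [_point("swp1", 0.0), _point("swp2", 12.0)]},
            }]
        }]
    }]
}

print(gauge_values(sample, "nvswitch_interface_dot3_in_pause_frames"))
# {'swp1': 0.0, 'swp2': 12.0}
```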

NameDescription
nvswitch_interface_oper_stateInterface operational state as a bitmap: (None[0], Up[1], Down[2], Invalid[4], Error[8])
nvswitch_interface_dot3_control_in_unknown_opcodesInput 802.3 unknown opcode counter.
nvswitch_interface_dot3_in_pause_framesInput 802.3 pause frame counter.
nvswitch_interface_dot3_out_pause_framesOutput 802.3 pause frame counter.
nvswitch_interface_dot3_stats_alignment_errors802.3 alignment error counter.
nvswitch_interface_dot3_stats_carrier_sense_errors802.3 interface carrier sense error counter.
nvswitch_interface_dot3_stats_deferred_transmissions802.3 deferred transmission counter.
nvswitch_interface_dot3_stats_excessive_collisions802.3 excessive collisions counter.
nvswitch_interface_dot3_stats_fcs_errors802.3 FCS error counter.
nvswitch_interface_dot3_stats_frame_too_longs802.3 excessive frame size counter.
nvswitch_interface_dot3_stats_internal_mac_receive_errors802.3 internal MAC receive error counter.
nvswitch_interface_dot3_stats_internal_mac_transmit_errors802.3 internal MAC transmit error counter.
nvswitch_interface_dot3_stats_late_collisions802.3 late collisions counter.
nvswitch_interface_dot3_stats_multiple_collision_frames802.3 multiple collision frames counter.
nvswitch_interface_dot3_stats_single_collision_frames802.3 single collision frames counter.
nvswitch_interface_dot3_stats_sqe_test_errors802.3 SQE test error counter.
nvswitch_interface_dot3_stats_symbol_errors802.3 symbol error counter.
nvswitch_interface_802_dot3_a_frames_transmitted_okNumber of 802.3a frames transmitted.
nvswitch_interface_performance_marked_packetsInterface performance marked packets, with marking as ECE or ECN.
nvswitch_interface_discards_ingress_generalInterface ingress general discards counter.
nvswitch_interface_discards_ingress_policy_engineInterface ingress policy engine discards counter.
nvswitch_interface_discards_ingress_vlan_membershipInterface ingress VLAN membership filter discards counter.
nvswitch_interface_discards_ingress_tag_frame_typeInterface ingress VLAN tag filter discards counter.
nvswitch_interface_discards_egress_vlan_membershipInterface egress VLAN membership filter discards counter.
nvswitch_interface_discards_loopback_filterInterface loopback filter discards counter.
nvswitch_interface_discards_egress_generalInterface egress general discards counter.
nvswitch_interface_discards_egress_link_downInterface egress link down discards counter.
nvswitch_interface_discards_egress_hoqInterface egress head-of-queue timeout discards.
nvswitch_interface_discards_port_isolationInterface port isolation filter discards.
nvswitch_interface_discards_egress_policy_engineInterface egress policy engine discards.
nvswitch_interface_discards_ingress_tx_link_downInterface ingress transmit link down discards.
nvswitch_interface_discards_egress_stp_filterInterface egress spanning tree filter discards.
nvswitch_interface_discards_egress_hoq_stallInterface egress head-of-queue stall discards.
nvswitch_interface_discards_egress_sllInterface egress switch lifetime limit discards.
nvswitch_interface_discards_ingress_discard_allInterface total ingress discards.
nvswitch_interface_tx_stats_pkts64octetsTotal packets transmitted, 64 octets in length.
nvswitch_interface_tx_stats_pkts65-to127octetsTotal packets transmitted, 65-127 octets in length.
nvswitch_interface_tx_stats_pkts256-to511octetsTotal packets transmitted, 256-511 octets in length.
nvswitch_interface_tx_stats_pkts512-to1023octetsTotal packets transmitted, 512-1023 octets in length.
nvswitch_interface_tx_stats_pkts1024-to1518octetsTotal packets transmitted, 1024-1518 octets in length.
nvswitch_interface_tx_stats_pkts1519-to2047octetsTotal packets transmitted, 1519-2047 octets in length.
nvswitch_interface_tx_stats_pkts2048-to4095octetsTotal packets transmitted, 2048-4095 octets in length.
nvswitch_interface_tx_stats_pkts4096-to8191octetsTotal packets transmitted, 4096-8191 octets in length.
nvswitch_interface_tx_stats_pkts8192-to10239octetsTotal packets transmitted, 8192-10239 octets in length.
nvswitch_interface_ether_stats_pkts64octetsTotal packets received, 64 octets in length.
nvswitch_interface_ether_stats_pkts65to127octetsTotal packets received, 65-127 octets in length.
nvswitch_interface_ether_stats_pkts128to255octetsTotal packets received, 128-255 octets in length.
nvswitch_interface_ether_stats_pkts256to511octetsTotal packets received, 256-511 octets in length.
nvswitch_interface_ether_stats_pkts512to1023octetsTotal packets received, 512-1023 octets in length.
nvswitch_interface_ether_stats_pkts1024to1518octetsTotal packets received, 1024-1518 octets in length.
nvswitch_interface_ether_stats_pkts1519to2047octetsTotal packets received, 1519-2047 octets in length.
nvswitch_interface_ether_stats_pkts2048to4095octetsTotal packets received, 2048-4095 octets in length.
nvswitch_interface_ether_stats_pkts4096to8191octetsTotal packets received, 4096-8191 octets in length.
nvswitch_interface_ether_stats_pkts8192to10239octetsTotal packets received, 8192-10239 octets in length.
nvswitch_interface_carrier_up_changes_totalTotal number of carrier up transitions for the interface.
nvswitch_interface_carrier_last_change_time_msTime of last carrier change for the interface as Unix epoch timestamp, with millisecond granularity.
nvswitch_interface_carrier_down_changes_totalTotal number of carrier down transitions for the interface.
nvswitch_interface_carrier_changes_totalTotal number of carrier changes for the interface.
nvswitch_interface_mtu_bytesOperational MTU for the interface in bytes.
nvswitch_interface_infoProvides information about the interface: MAC address, duplex, ifalias, interface name, operstate.
nvswitch_interface_iface_idThe ifindex for the interface.
nvswitch_interface_flagsKernel device flags set for an interface as an integer representing the kernel net_device flags bitmask.
nvswitch_interface_proto_downInterface protocol down status.
nvswitch_interface_oper_aggregate_speedSpeed in bits per second for the connected interface.
nvswitch_interface_number_of_lanesNumber of lanes used by the interface.
nvswitch_interface_if_in_broadcast_pktsNumber of interface in broadcast packets.
nvswitch_interface_if_in_discardsNumber of interface in discards.
nvswitch_interface_if_in_errorsNumber of interface in errors.
nvswitch_interface_if_in_multicast_pktsNumber of interface in multicast packets.
nvswitch_interface_if_in_octetsNumber of interface in octets.
nvswitch_interface_if_in_ucast_pktsNumber of interface in unicast packets.
nvswitch_interface_if_in_unknown_protosNumber of interface in unknown protocols.
nvswitch_interface_if_out_broadcast_pktsNumber of interface out broadcast packets.
nvswitch_interface_if_out_discardsNumber of interface out discards.
nvswitch_interface_if_out_errorsNumber of interface out errors.
nvswitch_interface_if_out_multicast_pktsNumber of interface out multicast packets.
nvswitch_interface_if_out_octetsNumber of interface out octets.
nvswitch_interface_if_out_ucast_pktsNumber of interface out unicast packets.
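
The bitmap listed for nvswitch_interface_oper_state can be decoded on the collector side. The following Python sketch is illustrative only: the constant and function names are hypothetical, and it assumes the exported gauge value carries the bitmap as a number.

```python
# Bit values as listed for nvswitch_interface_oper_state; 0 means None.
OPER_STATE_BITS = {1: "Up", 2: "Down", 4: "Invalid", 8: "Error"}


def decode_oper_state(value):
    """Return the state names whose bits are set in the bitmap."""
    bits = int(value)
    if bits == 0:
        return ["None"]
    return [name for bit, name in OPER_STATE_BITS.items() if bits & bit]


print(decode_oper_state(1.0))
print(decode_oper_state(0))
```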

The switch collects and exports the following additional interface traffic class statistics when you configure the nv set system telemetry interface-stats egress-buffer traffic-class <class> command:

NameDescription
nvswitch_interface_tc_tx_bc_framesInterface egress traffic class transmit broadcast frames counter.
nvswitch_interface_tc_tx_ecn_marked_tcInterface egress traffic class transmit ECN marked counter.
nvswitch_interface_tc_tx_framesInterface egress traffic class transmit frames counter.
nvswitch_interface_tc_tx_mc_framesInterface egress traffic class transmit multicast frames counter.
nvswitch_interface_tc_tx_no_buffer_discard_ucInterface egress traffic class transmit unicast no buffer discard counter.
nvswitch_interface_tc_tx_octetInterface egress traffic class transmit bytes counter.
nvswitch_interface_tc_tx_queueInterface egress traffic class transmit queue counter.
nvswitch_interface_tc_tx_uc_framesInterface egress traffic class transmit unicast frames counter.
nvswitch_interface_tc_tx_wred_discardInterface egress traffic class transmit WRED discard counter.

The switch collects and exports the following additional interface priority group statistics when you configure the nv set system telemetry interface-stats ingress-buffer priority-group <priority> command:

NameDescription
nvswitch_interface_pg_rx_buffer_discardInterface ingress priority group receive buffer discard counter.
nvswitch_interface_pg_rx_framesInterface ingress priority group receive frames counter.
nvswitch_interface_pg_rx_octetsInterface ingress priority group receive bytes counter.
nvswitch_interface_pg_rx_shared_buffer_discardInterface ingress priority group receive shared buffer discard counter.
nvswitch_interface_pg_rx_uc_framesInterface receive priority group unicast frames counter.
nvswitch_interface_pg_rx_mc_framesInterface receive priority group multicast frames counter.
nvswitch_interface_pg_rx_bc_framesInterface receive priority group broadcast frames counter.
nvswitch_interface_pg_tx_octetsInterface receive priority group transmit bytes counter.
nvswitch_interface_pg_tx_uc_framesInterface receive priority group transmit unicast frames counter.
nvswitch_interface_pg_tx_mc_framesInterface receive priority group transmit multicast frames counter.
nvswitch_interface_pg_tx_bc_framesInterface receive priority group transmit broadcast frames counter.
nvswitch_interface_pg_tx_framesInterface receive priority group transmit frames counter.
nvswitch_interface_pg_rx_pauseInterface receive priority group receive pause counter.
nvswitch_interface_pg_rx_pause_durationInterface receive priority group receive pause duration counter.
nvswitch_interface_pg_tx_pauseInterface receive priority group transmit pause counter.
nvswitch_interface_pg_tx_pause_durationInterface receive priority group transmit pause duration counter.
nvswitch_interface_pg_rx_pause_transitionInterface receive priority group receive pause transition counter.
nvswitch_interface_pg_rx_discardInterface receive priority group receive discard counter.

The switch collects and exports the following additional interface switch priority statistics when you configure the nv set system telemetry interface-stats switch-priority <priority> command:

NameDescription
nvswitch_interface_sp_rx_bc_framesReceive broadcast frame counter for the switch priority.
nvswitch_interface_sp_rx_discardReceive discard counter for the switch priority.
nvswitch_interface_sp_rx_framesReceive frame counter for the switch priority.
nvswitch_interface_sp_rx_mc_framesReceive multicast frame counter for the switch priority.
nvswitch_interface_sp_rx_octetsReceive octets counter for the switch priority.
nvswitch_interface_sp_rx_pauseReceive pause counter for the switch priority.
nvswitch_interface_sp_rx_pause_durationReceive pause duration counter for the switch priority.
nvswitch_interface_sp_rx_pause_transitionReceive pause transition counter for the switch priority.
nvswitch_interface_sp_rx_uc_framesReceive unicast frame counter for the switch priority.
nvswitch_interface_sp_tx_bc_framesTransmit broadcast frame counter for the switch priority.
nvswitch_interface_sp_tx_framesTransmit frame counter for the switch priority.
nvswitch_interface_sp_tx_mc_framesTransmit multicast frame counter for the switch priority.
nvswitch_interface_sp_tx_octetsTransmit octets counter for the switch priority.
nvswitch_interface_sp_tx_pauseTransmit pause counter for the switch priority.
nvswitch_interface_sp_tx_pause_durationTransmit pause duration for the switch priority.
nvswitch_interface_sp_tx_uc_framesTransmit unicast frame counter for the switch priority.

The switch collects and exports the following additional interface statistics when you configure the nv set system telemetry interface-stats class phy state enabled command:

NameDescription
nvswitch_interface_phy_stats_phy_received_bitsTotal amount of traffic (bits) received.
nvswitch_interface_phy_stats_phy_symbol_errorsError bits not corrected by the FEC correction algorithm or when FEC is not active.
nvswitch_interface_phy_stats_phy_effective_errorsNumber of errors after FEC is applied.
nvswitch_interface_phy_stats_phy_raw_errorsError bits identified on lane 0 through lane 7. When FEC is enabled, this indication corresponds to corrected errors.
nvswitch_interface_phy_stats_raw_berRaw BER, calculated as raw_ber_coef_laneX*10^(raw_ber_magnitude).
nvswitch_interface_phy_stats_symbol_berSymbol BER errors.
nvswitch_interface_phy_layer_time_since_last_clearTime since counters were cleared.
nvswitch_interface_phy_layer_fec_per_lane_correctionsFEC corrections per lane.
nvswitch_interface_phy_layer_fec_block_state_countNumber of FEC block states.
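
The formula given for nvswitch_interface_phy_stats_raw_ber can be expressed as a short Python sketch. The function name is hypothetical, and the sketch assumes the coefficient and magnitude are already available as numbers. It applies the exponent exactly as the formula is written, so a magnitude that represents a negative power of ten must be passed with its sign.

```python
import math


def raw_ber(raw_ber_coef, raw_ber_magnitude):
    """Compute raw_ber_coef * 10^(raw_ber_magnitude), as the formula is written."""
    return raw_ber_coef * 10.0 ** raw_ber_magnitude


# Hypothetical inputs, invented for illustration.
value = raw_ber(5, -12)
print(f"{value:.1e}")
```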

Interface Example JSON

Example JSON data for nvswitch_interface_oper_state:

Example JSON data for nvswitch_interface_dot3_stats_fcs_errors:

Example JSON data for nvswitch_interface_if_in_broadcast_pkts:

Example JSON data for nvswitch_interface_if_in_discards:

Example JSON data for nvswitch_interface_if_in_errors:

Example JSON data for nvswitch_interface_if_in_multicast_pkts:

Example JSON data for nvswitch_interface_if_in_octets:

Example JSON data for nvswitch_interface_if_in_ucast_pkts:

Example JSON data for nvswitch_interface_if_in_unknown_protos:

Example JSON data for nvswitch_interface_if_out_broadcast_pkts:

Example JSON data for nvswitch_interface_if_out_discards:

Example JSON data for nvswitch_interface_if_out_errors:

Example JSON data for nvswitch_interface_if_out_multicast_pkts:

Example JSON data for nvswitch_interface_if_out_octets:

Example JSON data for nvswitch_interface_if_out_ucast_pkts:

Example JSON data for nvswitch_interface_802_dot3_a_frames_transmitted_ok:

PHY Example JSON

Example JSON data for nvswitch_interface_phy_layer_fec_block_state_count:

Example JSON data for nvswitch_interface_phy_layer_fec_per_lane_corrections:

Example JSON data for nvswitch_interface_phy_layer_time_since_last_clear:

Example JSON data for nvswitch_interface_phy_stats_effective_ber:

Example JSON data for nvswitch_interface_phy_stats_symbol_ber:

LLDP Statistic Format

When you enable LLDP statistic telemetry, the switch exports the following statistics:

NameDescription
nvswitch_lldp_chassis_infoLLDP chassis information.
nvswitch_lldp_chassis_capabilitiesLLDP chassis capabilities as a bitmap. The capabilities are defined in IEEE 802.1AB.
nvswitch_lldp_neighbor_ageLLDP neighbor age information in seconds.
nvswitch_lldp_neighbor_capabilitiesLLDP neighbor capabilities as a bitmap. The capabilities are defined in IEEE 802.1AB.
nvswitch_lldp_neighbor_infoLLDP neighbor information.
nvswitch_lldp_neighbor_ttlLLDP neighbor port TTL in seconds.
nvswitch_lldp_neighbor_management_address_infoLLDP neighbor management address information.

Example JSON data for nvswitch_lldp_chassis_info:

Example JSON data for nvswitch_lldp_chassis_capabilities:

Example JSON data for nvswitch_lldp_neighbor_age:

Example JSON data for nvswitch_lldp_neighbor_capabilities:

Example JSON data for nvswitch_lldp_neighbor_info:

Example JSON data for nvswitch_lldp_neighbor_ttl:

Example JSON data for nvswitch_lldp_neighbor_management_address_info:

Platform Statistic Format

When you enable platform statistic telemetry globally, or when you enable telemetry for the individual components, the switch exports the following statistics:

CPU statistics include the CPU core number and operation mode (user, system, idle, iowait, irq, softirq, steal, guest, guest_nice).

NameDescription
node_cpu_core_throttles_totalNumber of times a CPU core has been throttled.
node_cpu_frequency_max_hertzMaximum CPU thread frequency in hertz.
node_cpu_frequency_min_hertzMinimum CPU thread frequency in hertz.
node_cpu_guest_seconds_totalSeconds the CPUs spent in guests for each mode.
node_cpu_package_throttles_totalNumber of times the CPU package has been throttled.
node_cpu_scaling_frequency_hertzCurrent scaled CPU thread frequency in hertz.
node_cpu_scaling_frequency_max_hertzMaximum scaled CPU thread frequency in hertz.
node_cpu_scaling_frequency_min_hertzMinimum scaled CPU thread frequency in hertz.
node_cpu_seconds_totalSeconds the CPU spent in each mode.

NameDescription
node_disk_ata_rotation_rate_rpmATA disk rotation rate in RPM (0 for SSDs).
node_disk_ata_write_cacheATA disk write cache presence.
node_disk_ata_write_cache_enabledATA disk write cache status (enabled or disabled).
node_disk_discard_time_seconds_totalTotal number of seconds spent by all discards.
node_disk_discarded_sectors_totalTotal number of sectors discarded successfully.
node_disk_discards_completed_totalTotal number of discards completed.
node_disk_discards_merged_totalTotal number of discards merged.
node_disk_flush_requests_time_seconds_totalTotal number of seconds spent by all flush requests.
node_disk_flush_requests_totalThe total number of flush requests completed successfully.
node_disk_infoDisk information from /sys/block/<block_device>.
node_disk_io_nowNumber of I/Os in progress.
node_disk_io_time_seconds_totalTotal seconds spent during I/O.
node_disk_io_time_weighted_seconds_totalWeighted number of seconds spent during I/O.
node_disk_read_bytes_totalTotal number of bytes read successfully.
node_disk_read_time_seconds_totalTotal number of seconds spent by all reads.
node_disk_reads_completed_totalTotal number of reads completed successfully.
node_disk_reads_merged_totalTotal number of reads merged.
node_disk_write_time_seconds_totalTotal number of seconds spent by all writes.
node_disk_writes_completed_totalTotal number of writes completed successfully.
node_disk_writes_merged_totalNumber of writes merged.
node_disk_written_bytes_totalTotal number of bytes written successfully.

NameDescription
node_filesystem_avail_bytesFilesystem space available to non-root users in bytes.
node_filesystem_device_errorWhether an error occurred while getting statistics for the given device.
node_filesystem_filesFilesystem total file nodes.
node_filesystem_files_freeFilesystem total free file nodes.
node_filesystem_free_bytesFilesystem free space in bytes.
node_filesystem_readonlyFilesystem read-only status.
node_filesystem_size_bytesFilesystem size in bytes.

NameDescription
node_memory_Active_anon_bytes/proc/meminfo Active_anon bytes.
node_memory_Active_bytes/proc/meminfo Active bytes.
node_memory_Active_file_bytes/proc/meminfo Active_file bytes.
node_memory_AnonHugePages_bytes/proc/meminfo AnonHugePages bytes.
node_memory_AnonPages_bytes/proc/meminfo AnonPages bytes.
node_memory_Bounce_bytes/proc/meminfo Bounce bytes.
node_memory_Buffers_bytes/proc/meminfo Buffers bytes.
node_memory_Cached_bytes/proc/meminfo Cached bytes.
node_memory_CommitLimit_bytes/proc/meminfo CommitLimit bytes.
node_memory_Committed_AS_bytes/proc/meminfo Committed_AS bytes.
node_memory_DirectMap1G_bytes/proc/meminfo DirectMap1G bytes.
node_memory_DirectMap2M_bytes/proc/meminfo DirectMap2M bytes.
node_memory_DirectMap4k_bytes/proc/meminfo DirectMap4k bytes.
node_memory_Dirty_bytes/proc/meminfo Dirty bytes.
node_memory_FileHugePages_bytes/proc/meminfo FileHugePages bytes.
node_memory_FilePmdMapped_bytes/proc/meminfo FilePmdMapped bytes.
node_memory_HardwareCorrupted_bytes/proc/meminfo HardwareCorrupted bytes.
node_memory_HugePages_Free/proc/meminfo HugePages_Free.
node_memory_HugePages_Rsvd/proc/meminfo HugePages_Rsvd.
node_memory_HugePages_Surp/proc/meminfo HugePages_Surp.
node_memory_HugePages_Total/proc/meminfo HugePages_Total.
node_memory_Hugepagesize_bytes/proc/meminfo Hugepagesize bytes.
node_memory_Hugetlb_bytes/proc/meminfo Hugetlb bytes.
node_memory_Inactive_anon_bytes/proc/meminfo Inactive_anon bytes.
node_memory_Inactive_bytes/proc/meminfo Inactive bytes.
node_memory_Inactive_file_bytes/proc/meminfo Inactive_file bytes.
node_memory_KReclaimable_bytes/proc/meminfo KReclaimable bytes.
node_memory_KernelStack_bytes/proc/meminfo KernelStack bytes.
node_memory_Mapped_bytes/proc/meminfo Mapped bytes.
node_memory_MemAvailable_bytes/proc/meminfo MemAvailable bytes.
node_memory_MemFree_bytes/proc/meminfo MemFree bytes.
node_memory_MemTotal_bytes/proc/meminfo MemTotal bytes.
node_memory_Mlocked_bytes/proc/meminfo Mlocked bytes.
node_memory_NFS_Unstable_bytes/proc/meminfo NFS_Unstable bytes.
node_memory_PageTables_bytes/proc/meminfo PageTables bytes.
node_memory_Percpu_bytes/proc/meminfo Percpu bytes.
node_memory_SReclaimable_bytes/proc/meminfo SReclaimable bytes.
node_memory_SUnreclaim_bytes/proc/meminfo SUnreclaim bytes.
node_memory_SecPageTables_bytes/proc/meminfo SecPageTables bytes.
node_memory_ShmemHugePages_bytes/proc/meminfo ShmemHugePages bytes.
node_memory_ShmemPmdMapped_bytes/proc/meminfo ShmemPmdMapped bytes.
node_memory_Shmem_bytes/proc/meminfo Shmem bytes.
node_memory_Slab_bytes/proc/meminfo Slab bytes.
node_memory_SwapCached_bytes/proc/meminfo SwapCached bytes.
node_memory_SwapFree_bytes/proc/meminfo SwapFree bytes.
node_memory_SwapTotal_bytes/proc/meminfo SwapTotal bytes.
node_memory_Unevictable_bytes/proc/meminfo Unevictable bytes.
node_memory_VmallocChunk_bytes/proc/meminfo VmallocChunk bytes.
node_memory_VmallocTotal_bytes/proc/meminfo VmallocTotal bytes.
node_memory_VmallocUsed_bytes/proc/meminfo VmallocUsed bytes.
node_memory_WritebackTmp_bytes/proc/meminfo WritebackTmp bytes.
node_memory_Writeback_bytes/proc/meminfo Writeback bytes.
node_memory_Zswap_bytes/proc/meminfo Zswap bytes.
node_memory_Zswapped_bytes/proc/meminfo Zswapped bytes.

NameDescription
nvswitch_env_fan_cur_speedCurrent fan speed in RPM.
nvswitch_env_fan_dirFan direction (0: Front2Back, 1: Back2Front).
nvswitch_env_fan_max_speedFan maximum speed in RPM.
nvswitch_env_fan_min_speedFan minimum speed in RPM.
nvswitch_env_fan_stateFan status (0: ABSENT, 1: OK, 2: FAILED, 3: BAD).
nvswitch_env_psu_capacityPSU capacity in watts.
nvswitch_env_psu_currentPSU current in amperes.
nvswitch_env_psu_powerPSU power in watts.
nvswitch_env_psu_statePSU state (0: ABSENT, 1: OK, 2: FAILED, 3: BAD).
nvswitch_env_psu_voltagePSU voltage in volts.
nvswitch_env_temp_critCritical temperature threshold in centigrade.
nvswitch_env_temp_currentCurrent temperature in centigrade.
nvswitch_env_temp_maxMaximum temperature threshold in centigrade.
nvswitch_env_temp_minMinimum temperature threshold in centigrade.
nvswitch_env_temp_stateTemperature sensor status (0: ABSENT, 1: OK, 2: FAILED, 3: BAD).

MetricDescription
nvswitch_platform_transceiver_vendor_infoThe transceiver vendor information, such as which port the transceiver plugs into, the date of manufacture, the revision, the name of the manufacturer, the manufacturer part number, the serial number, and the IEEE company ID of the vendor.
nvswitch_platform_transceiver_infoGeneral information for the transceiver, such as which port the transceiver plugs into, the cable type, the cable length in meters, the status (plugged-enabled, plugged-disabled, plugged-error, or unplugged), the error status, the identifier, and the Ethernet compliance revision.
nvswitch_platform_transceiver_temperatureThe temperature of the module in degrees Celsius as a 64-bit decimal value.
nvswitch_platform_transceiver_temperature_alarmThe alarm status when the temperature crosses the thresholds defined for the module. The value is a bit mask:
Bit 0: high_temp_alarm
Bit 1: low_temp_alarm
Bit 2: high_temp_warning
Bit 3: low_temp_warning
nvswitch_platform_transceiver_temperature_threshold_infoTemperature thresholds defined for the module (low or high).
nvswitch_platform_transceiver_voltageThe internally measured supply voltage for the module in volts as a 64-bit decimal value.
nvswitch_platform_transceiver_voltage_alarmThe alarm status when the voltage crosses the thresholds defined for the module. The value is a bit mask:
Bit 0: high_vcc_alarm
Bit 1: low_vcc_alarm
Bit 2: high_vcc_warning
Bit 3: low_vcc_warning
nvswitch_platform_transceiver_voltage_threshold_infoVoltage thresholds defined for the module. The level is alarm or warning. The threshold is low or high.
nvswitch_platform_transceiver_channel_powerThe transceiver channel power value in dBm units (logarithmic scale) for each channel in both rx and tx directions.
nvswitch_platform_transceiver_channel_power_alarmThe alarm state of the channel power compared with the thresholds defined for the module, as a bit mask value for each channel in both the rx and tx directions:
Bit 0: tx_power_hi_al
Bit 1: tx_power_lo_al
Bit 2: tx_power_hi_war
Bit 3: tx_power_lo_war
nvswitch_platform_transceiver_channel_power_threshold_infoPower threshold information for both the rx and tx directions. These threshold values apply to all channels. The values are in dBm, represented as a 32-bit decimal value.
nvswitch_platform_transceiver_channel_tx_bias_currentThe tx bias current measured for the channel in amperes, represented as a 32-bit decimal value.
nvswitch_platform_transceiver_channel_tx_bias_current_alarmThe alarm state of the tx bias current measured for the channel compared with the thresholds defined for the module. The value is a bit mask:
Bit 0: tx_bias_hi_al
Bit 1: tx_bias_lo_al
Bit 2: tx_bias_hi_war
Bit 3: tx_bias_lo_war
nvswitch_platform_transceiver_channel_tx_bias_current_threshold_infoThe tx bias current thresholds defined for the channel in amperes, represented as a 32-bit decimal value.
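
The transceiver alarm metrics in the table above report bit masks. The following Python sketch shows one way to turn such a value into the names of the active alarms, using the bit positions listed for the temperature and voltage alarms. The constant and function names are hypothetical, and the sketch assumes the exported value carries the mask as a number.

```python
# Bit positions 0 through 3, in the order listed for each alarm metric.
TEMPERATURE_ALARM_BITS = [
    "high_temp_alarm",
    "low_temp_alarm",
    "high_temp_warning",
    "low_temp_warning",
]
VOLTAGE_ALARM_BITS = [
    "high_vcc_alarm",
    "low_vcc_alarm",
    "high_vcc_warning",
    "low_vcc_warning",
]


def decode_alarm(value, bit_names):
    """Return the names of the bits that are set in the alarm mask."""
    mask = int(value)
    return [name for position, name in enumerate(bit_names) if mask & (1 << position)]


# A mask of 5 has bits 0 and 2 set.
print(decode_alarm(5, TEMPERATURE_ALARM_BITS))
```

An empty list means that no alarm or warning bit is set.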

Environment Sensor Example JSON

Example JSON data for nvswitch_platform_environment_psu_state:

Example JSON data for nvswitch_platform_environment_temp_crit:

Example JSON data for nvswitch_platform_environment_temp_current:

Transceiver Example JSON

Example JSON data for nvswitch_platform_transceiver_info:

Example JSON data for nvswitch_platform_transceiver_vendor_info:

Routing Metrics Format

When you enable layer 3 routing metrics telemetry, the switch exports the following statistics:

NameDescription
nvrouting_bgp_peer_stateBGP peer state: Established, Idle, Connect, Active, OpenSent.
nvrouting_bgp_peer_fsm_established_transitionsNumber of BGP peer state transitions to the Established state for the peer session.
nvrouting_bgp_peer_rib_adj_in_installedTracks the number of prefixes received from the neighbor that are installed in the RIB and actively used for forwarding.
nvrouting_bgp_peer_rib_adj_out_advertisedTracks the number of prefixes that are advertised to the neighbor after applying any policies.
nvrouting_bgp_peer_total_msgs_sentNumber of BGP messages sent to the neighbor.
nvrouting_bgp_peer_total_msgs_recvdNumber of BGP messages received from the neighbor.
nvrouting_bgp_peer_rib_adj_inNumber of IPv4, IPv6, and EVPN prefixes received from the peer after applying any policies. This count is the number of prefixes present in the post-policy Adj-RIB-In for the peer.
nvrouting_bgp_peer_socket_in_queueNumber of messages queued to be received from the BGP neighbor.
nvrouting_bgp_peer_socket_out_queueNumber of messages queued to be sent to the BGP neighbor.
nvrouting_bgp_peer_rx_updatesNumber of BGP update messages received from the neighbor.
nvrouting_bgp_peer_tx_updatesNumber of BGP update messages sent to the neighbor.
nvrouting_rib_countNumber of IPv4 and IPv6 routes in the IP routing table for each route source.
nvrouting_rib_count_connectedNumber of IPv4 connected routes in the IP routing table.
nvrouting_rib_count_bgpNumber of IPv4 BGP routes in the IP routing table.
nvrouting_rib_count_kernelNumber of IPv4 kernel routes in the IP routing table.
nvrouting_rib_count_staticNumber of IPv4 static routes in the IP routing table.
nvrouting_rib_count_pbrNumber of IPv4 PBR routes in the IP routing table.
nvrouting_rib_count_ospfNumber of IPv4 OSPF routes in the IP routing table.
nvrouting_rib_count_connected_ipv6Number of IPv6 connected routes in the IP routing table.
nvrouting_rib_count_bgp_ipv6Number of IPv6 BGP routes in the IP routing table.
nvrouting_rib_count_kernel_ipv6Number of IPv6 kernel routes in the IP routing table.
nvrouting_rib_count_static_ipv6Number of IPv6 static routes in the IP routing table.
nvrouting_rib_count_pbr_ipv6Number of IPv6 PBR routes in the IP routing table.
nvrouting_rib_count_ospf_ipv6Number of IPv6 OSPF routes in the IP routing table.
nvrouting_rib_nhg_countNumber of next hop groups in the routing table.

Example JSON data for nvrouting_bgp_peer_state:

Example JSON data for nvrouting_rib_count_bgp_ipv6:

Example JSON data for nvrouting_bgp_peer_fsm_established_transitions:

Example JSON data for nvrouting_bgp_peer_rib_adj_in_installed:

Example JSON data for nvrouting_bgp_peer_rib_adj_in:

Example JSON data for nvrouting_bgp_peer_rib_adj_out_advertised:

Example JSON data for nvrouting_bgp_peer_rx_updates:

Example JSON data for nvrouting_bgp_peer_socket_in_queue:

Example JSON data for nvrouting_bgp_peer_socket_out_queue:

Example JSON data for nvrouting_bgp_peer_total_msgs_recvd:

Example JSON data for nvrouting_bgp_peer_total_msgs_sent:

Example JSON data for nvrouting_bgp_peer_tx_updates:

Example JSON data for nvrouting_rib_count:

Example JSON data for nvrouting_rib_nhg_count:

Software Statistics Format

When you enable systemd software statistic telemetry, the switch collects the following systemd unit statistics:

NameDescription
nvswitch_systemd_unit_main_pidThe main process ID of the unit.
nvswitch_systemd_unit_stateThe active status of the unit.
nvswitch_systemd_unit_runningThe running status of the unit.
nvswitch_systemd_unit_exe_pathThe executable path of the unit.
nvswitch_systemd_unit_restartThe systemd managed restart count of the unit.
nvswitch_systemd_unit_cpu_usage_secondsThe CPU usage of the unit (in seconds).
nvswitch_systemd_unit_memory_usage_bytesThe memory usage of the unit (in bytes).
nvswitch_systemd_unit_start_time_secondsThe start time of the unit in seconds since epoch.
nvswitch_systemd_unit_uptime_secondsThe uptime of the unit (in seconds).
nvswitch_systemd_unit_threadsThe number of threads in the unit.
nvswitch_systemd_unit_processesThe number of processes in the unit.

If you enable systemd process-level statistics, the switch collects the following metrics:

NameDescription
nvswitch_systemd_unit_process_parent_pidThe parent process ID.
nvswitch_systemd_unit_process_start_time_secondsThe start time of the process in seconds since epoch.
nvswitch_systemd_unit_process_stateThe process running status.
nvswitch_systemd_unit_process_threadsThe number of threads in the process.
nvswitch_systemd_unit_process_subprocessesThe number of child processes.
nvswitch_systemd_unit_process_context_switchesThe number of context switches based on context type since the main process was created.
nvswitch_systemd_unit_process_cpu_usage_secondsThe CPU usage of the process (user and kernel mode, including children).
nvswitch_systemd_unit_process_virtual_memory_usage_bytesThe virtual memory usage of the process (in bytes).
nvswitch_systemd_unit_process_resident_memory_usage_bytesThe resident memory usage of the process (in bytes).
nvswitch_systemd_unit_process_shared_memory_usage_bytesThe shared memory usage of the process (in bytes).

Example JSON data for nvswitch_systemd_unit_cpu_usage_seconds:

Example JSON data for nvswitch_systemd_unit_exe_path:

Example JSON data for nvswitch_systemd_unit_main_pid:

Example JSON data for nvswitch_systemd_unit_memory_usage_bytes:

Example JSON data for nvswitch_systemd_unit_start_time_seconds:

System Information Format

When you enable open telemetry with the nv set system telemetry export otlp state enabled command, the switch exports the following system information metrics to the configured OTEL collector by default:

NameDescription
node_boot_time_secondsNode boot time in seconds since the Unix epoch.
node_time_secondsSystem time in seconds since epoch (1970).
node_os_infoOperating system and image information, such as name and version.
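
Because both node_boot_time_seconds and node_time_seconds are expressed in seconds since the epoch, their difference gives the system uptime at the time of the sample. The following Python sketch shows the arithmetic; the function name and the input values are hypothetical.

```python
from datetime import timedelta


def uptime(node_time_seconds, node_boot_time_seconds):
    """Return the uptime as the difference between current time and boot time."""
    return timedelta(seconds=node_time_seconds - node_boot_time_seconds)


# Hypothetical sample values, invented for illustration.
print(uptime(1700065313, 1700000000))  # prints 18:08:33
```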

Static Label Format

Device static labels are exported in the resource metric section of OTLP data:

Example JSON data for static device label:

Interface static labels are exported as attributes in the gauge metrics for each interface.
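
As an illustration of where device labels sit in an export, the following Python sketch reads the resource attributes from an OTLP/JSON payload. It assumes the standard OTLP/JSON layout (resourceMetrics, resource, attributes); the function names, the label key, and the label value are hypothetical.

```python
def attributes_to_dict(attributes):
    """Flatten an OTLP/JSON attribute list into a plain dict."""
    flat = {}
    for attribute in attributes:
        value = attribute.get("value", {})
        # Each OTLP AnyValue carries one typed field, such as stringValue.
        flat[attribute["key"]] = next(iter(value.values()), None)
    return flat


def device_labels(export):
    """Collect the resource-level attributes from every resourceMetrics entry."""
    labels = {}
    for resource_metrics in export.get("resourceMetrics", []):
        resource = resource_metrics.get("resource", {})
        labels.update(attributes_to_dict(resource.get("attributes", [])))
    return labels


# Hypothetical export; the label key and value are invented for illustration.
sample_export = {
    "resourceMetrics": [
        {
            "resource": {
                "attributes": [
                    {"key": "data_center", "value": {"stringValue": "dc1"}}
                ]
            },
            "scopeMetrics": [],
        }
    ]
}

print(device_labels(sample_export))
```

The same attributes_to_dict helper can be applied to the attributes list of a gauge data point to read interface labels.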

Example JSON data for static interface label:

Show Telemetry Health Metrics

To show telemetry health information, run the nv show system telemetry health command:

cumulus@switch:~$ nv show system telemetry health
                                     operational
---------------------------          -----------
service-status    
  nv-telemtry-service                active                      
  platform-stats-service             active
  histogram-export-service           active
  sdk-stats-service                  active
  routing-telemtry-service           inactive
internal-metrics 
  process 
    cpu-seconds                      3020
    memory-rss-kilobytes             182812672
    runtime-heap-alloc-bytes         28617960
    runtime-total-alloc-bytes        915541979208
    runtime-total-sys-memory-bytes   151368752
    uptime-seconds                   65313
[receivers]                          otlp/global
[receivers]                          prometheus/global
processors
  [memory-limiter]                   memory_limiter/1
  [batch]                            batch/1
[exporters]                          otlp/global

Export Destination Status
=======================
    Destination         Connectivity          Export Counter       Drop Counter
    -----------         ------------          --------------       ------------
    11.0.10.2:4317      Pass                  51534586             7087
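
To track export loss over time, you can post-process the Export Destination Status rows from saved command output. The following Python sketch parses the rows shown above and computes a drop ratio per destination. It assumes that Export Counter counts only successfully exported points, so the total is the sum of both counters; that interpretation, the function names, and the dictionary keys are assumptions, not part of the NVUE API.

```python
SAMPLE = """\
Export Destination Status
=======================
    Destination         Connectivity          Export Counter       Drop Counter
    -----------         ------------          --------------       ------------
    11.0.10.2:4317      Pass                  51534586             7087
"""


def parse_destinations(text):
    """Return one dict per data row of the Export Destination Status table."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Data rows have exactly four fields, with numeric counters last.
        if len(fields) == 4 and fields[2].isdigit() and fields[3].isdigit():
            rows.append({
                "destination": fields[0],
                "connectivity": fields[1],
                "exported": int(fields[2]),
                "dropped": int(fields[3]),
            })
    return rows


def drop_ratio(row):
    """Dropped points as a fraction of exported plus dropped (assumption)."""
    total = row["exported"] + row["dropped"]
    return row["dropped"] / total if total else 0.0


destinations = parse_destinations(SAMPLE)
for row in destinations:
    print(f'{row["destination"]} ({row["connectivity"]}): '
          f'{drop_ratio(row):.6f} of points dropped')
```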

Cumulus Linux open telemetry also provides internal metrics that the collector exposes so that you can monitor its performance and behavior. Use these metrics to assess the health and efficiency of the collector.

To show information about the telemetry health internal metrics, run the nv show system telemetry health internal-metrics command:

cumulus@switch:~$ nv show system telemetry health internal-metrics
                                     operational
------------------------             -----------
process        
  cpu-seconds                        029
  memory-rss-kilobytes               182812672
  runtime-heap-alloc-bytes           28617960
  runtime-total-alloc-bytes          915541979208
  runtime-total-sys-memory-bytes     151368752
  uptime-seconds                     65313
[receivers]                          otlp/global
[receivers]                          prometheus/global
processors
  [memory-limiter]                   memory_limiter/1
  [batch]                            batch/1
[exporters]                          otlp/global

To show information about the telemetry health internal metrics process, run the nv show system telemetry health internal-metrics process command:

cumulus@switch:~$ nv show system telemetry health internal-metrics process
                                   operational
------------------------           -----------
cpu-seconds                        029
memory-rss-kilobytes               182812672
runtime-heap-alloc-bytes           28617960
runtime-total-alloc-bytes          915541979208
runtime-total-sys-memory-bytes     151368752
uptime-seconds                     65313

To show information about the telemetry health internal metrics receivers, run the nv show system telemetry health internal-metrics receivers command:

cumulus@switch:~$ nv show system telemetry health internal-metrics receivers
Receivers            Accepted Metric Points      Refused Metric Points
---------            ----------------------      ---------------------
otlp/global          4967144                     0
prometheus/global    46989135                    0
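
A nonzero Refused Metric Points value is worth investigating, because it means a receiver did not accept some of the points it was offered. As a minimal sketch, the following Python code scans saved receivers output and lists any receiver that refused points. The function name and dictionary keys are hypothetical, and the parser assumes the three-column layout shown above.

```python
SAMPLE = """\
Receivers            Accepted Metric Points      Refused Metric Points
---------            ----------------------      ---------------------
otlp/global          4967144                     0
prometheus/global    46989135                    0
"""


def parse_receivers(text):
    """Return one dict per receiver row (name, accepted, refused)."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Data rows have a name followed by two numeric counters.
        if len(fields) == 3 and fields[1].isdigit() and fields[2].isdigit():
            rows.append({
                "name": fields[0],
                "accepted": int(fields[1]),
                "refused": int(fields[2]),
            })
    return rows


receivers = parse_receivers(SAMPLE)
total_accepted = sum(r["accepted"] for r in receivers)
refusing = [r["name"] for r in receivers if r["refused"] > 0]
print(f"accepted={total_accepted} receivers_refusing_points={refusing}")
```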

To show information about the telemetry health internal metrics processors, run the nv show system telemetry health internal-metrics processors command:

cumulus@switch:~$ nv show system telemetry health internal-metrics processors
  Memory-limiter
  ==============
    memory_limiter/1
     Accepted Metric Points: 25002370
     Dropped Metric Points: 0
     Inserted Metric Points: 0
     Refused Metric Points: 0

  Batch Processor
  ===============
    batch/1
     Batch Send Size Bucket 10: 828620
     Batch Send Size Bucket 25: 828620
     Batch Send Size Bucket 50: 828620
     Batch Send Size Bucket 75: 828620
     Batch Send Size Bucket 100: 828620
...

To show information about the telemetry health internal metrics exporters, run the nv show system telemetry health internal-metrics exporters command:

cumulus@switch:~$ nv show system telemetry health internal-metrics exporters
Exporters       Enqueue Failed Metric Points   Queue Capacity   Queue Size   Send Failed Metric Points   Sent Metric Points
---------       ----------------------------   --------------   ----------   -------------------------   ------------------
otlp/global     0                              1000             0            7087                        52000844
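
The exporter counters can be combined into two quick health indicators: how full the sending queue is, and what share of points failed to send. The following Python sketch computes both from saved exporters output. It assumes that Sent Metric Points and Send Failed Metric Points are separate counters, so their sum is the number of send attempts; that assumption, the function names, and the dictionary keys are illustrative and not part of the NVUE API.

```python
SAMPLE = """\
Exporters       Enqueue Failed Metric Points   Queue Capacity   Queue Size   Send Failed Metric Points   Sent Metric Points
---------       ----------------------------   --------------   ----------   -------------------------   ------------------
otlp/global     0                              1000             0            7087                        52000844
"""

COLUMNS = ("enqueue_failed", "queue_capacity", "queue_size", "send_failed", "sent")


def parse_exporters(text):
    """Return one dict per exporter row, keyed by the names in COLUMNS."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Data rows have a name followed by five numeric columns.
        if len(fields) == 6 and all(f.isdigit() for f in fields[1:]):
            row = {"name": fields[0]}
            row.update(zip(COLUMNS, (int(f) for f in fields[1:])))
            rows.append(row)
    return rows


def queue_utilization(row):
    """Fraction of the sending queue currently in use."""
    return row["queue_size"] / row["queue_capacity"] if row["queue_capacity"] else 0.0


def send_failure_ratio(row):
    """Failed points as a fraction of sent plus failed (assumption)."""
    attempts = row["sent"] + row["send_failed"]
    return row["send_failed"] / attempts if attempts else 0.0


exporters = parse_exporters(SAMPLE)
for row in exporters:
    print(f'{row["name"]}: queue {queue_utilization(row):.1%} full, '
          f'send failure ratio {send_failure_ratio(row):.6f}')
```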