gNMI Streaming - NVIDIA Docs

The gRPC Network Management Interface (gNMI) can collect and export system resource, interface, and counter information from NVOS to your gNMI client.

Configure the gNMI Agent using NVUE CLI Commands

The gNMI server feature state can be shown, set, and unset on NVOS using the following NVUE CLI commands:

Show command:

            nvos@switch:~$ nv show system gnmi-server
             operational    applied      
-----------  -------------  -----------
state        enabled        enabled         
certificate  self-signed    self-signed 
is-running   yes                                    
version      4.13.0-3000-2
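
For scripted checks, the table printed by the show command can be turned into a dictionary. The following Python sketch assumes the column layout shown above, where each group of dashes in the rule line marks the start of a column. The function name `parse_nv_show` and the `SAMPLE` constant are illustrative and are not part of NVUE.

```python
import re

# Output copied from the show command above (prompt line omitted).
SAMPLE = """\
             operational    applied
-----------  -------------  -----------
state        enabled        enabled
certificate  self-signed    self-signed
is-running   yes
version      4.13.0-3000-2
"""


def parse_nv_show(text):
    """Parse an `nv show` table into {row_name: {column_name: value}}."""
    lines = [ln.rstrip() for ln in text.splitlines() if ln.strip()]
    # The rule line contains only dashes and spaces; each dash group
    # marks where a column starts.
    rule_idx = next(i for i, ln in enumerate(lines) if set(ln) <= {"-", " "})
    starts = [m.start() for m in re.finditer(r"-+", lines[rule_idx])]
    ends = starts[1:] + [None]

    def cells(line):
        return [line[s:e].strip() for s, e in zip(starts, ends)]

    columns = cells(lines[rule_idx - 1])[1:]  # first header cell is empty
    table = {}
    for line in lines[rule_idx + 1:]:
        key, *values = cells(line)
        table[key] = dict(zip(columns, values))
    return table


status = parse_nv_show(SAMPLE)
print(status["state"]["operational"], status["version"]["operational"])
```

Cells that the CLI leaves blank, such as the applied value of is-running, come back as empty strings.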

Set command:

            nvos@switch:~$ nv set system gnmi-server state <enabled | disabled>

Unset command:

            nvos@switch:~$ nv unset system gnmi-server state

The state is enabled by default, so the unset command restores it to enabled.

Supported Subscription Modes

NVOS supports the following gNMI subscription modes:

STREAM Mode: The client opens a long-lived subscription, and the server pushes updates for as long as it stays open, either when a value changes or at a fixed sample interval (see the stream modes below). This mode suits near-real-time monitoring.
ONCE Mode: The server sends the requested data once and then terminates the subscription. This mode is ideal when a single snapshot of the data is needed without ongoing updates.
POLL Mode: The subscription stays open, and the server sends the current data each time the client sends a poll request. The client decides when to fetch data, which provides a balance between real-time and scheduled updates.

Supported stream modes:

ON_CHANGE: Data updates are sent only when the value of the data item changes.
SAMPLE: The client receives samples of the telemetry data at a specified interval. This mode suits monitoring and analytics that need periodic updates rather than a notification for every change.

Key Parameters for STREAM SAMPLE Mode:

sample_interval (mandatory): Defines the interval at which samples are sent to the client. This parameter controls the frequency of data transmission.

suppress_redundant (optional, default false): Determines whether redundant data updates, which have not changed since the last sample, should be suppressed. This helps in reducing unnecessary data transmission and optimizing network usage.

heartbeat_interval (optional, default disabled): Specifies the interval for sending heartbeat messages to indicate that the connection is still active. Heartbeats help in monitoring the health of the connection and detecting failures.
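
The interaction of the three parameters can be illustrated with a small simulation. The Python sketch below follows the general gNMI specification semantics, in which a suppressed value is still re-sent once the heartbeat interval has elapsed; it is an illustration under that assumption and not a description of the NVOS implementation. The function name `sample_stream` is hypothetical.

```python
def sample_stream(readings, sample_interval, suppress_redundant=False,
                  heartbeat_interval=None):
    """Return the (time_s, value) updates a SAMPLE subscription would emit.

    readings[i] is the leaf value at time i * sample_interval (seconds).
    """
    sent = []
    last_value = None
    last_time = None
    for i, value in enumerate(readings):
        now = i * sample_interval
        first = last_time is None
        changed = value != last_value
        # A heartbeat is due when nothing was sent for heartbeat_interval.
        heartbeat_due = (
            heartbeat_interval is not None
            and not first
            and now - last_time >= heartbeat_interval
        )
        if first or not suppress_redundant or changed or heartbeat_due:
            sent.append((now, value))
            last_value, last_time = value, now
    return sent


# 30 s samples, redundant values suppressed, 120 s heartbeat.
updates = sample_stream([5, 5, 5, 7, 7, 7, 7, 7, 7], sample_interval=30,
                        suppress_redundant=True, heartbeat_interval=120)
print(updates)  # [(0, 5), (90, 7), (210, 7)]
```

With suppress_redundant left at its default of false, the same input produces one update per sample, nine in total.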

Models Overview

The NVOS gNMI Model is based on OpenConfig YANG models, extended with NVIDIA-specific augments where required.

It provides a consistent, vendor-neutral telemetry structure while allowing NVIDIA to expose additional InfiniBand, platform, and diagnostic data.

The gNMI YANG models consist of:

Standard OpenConfig models (baseline support)
NVIDIA Models (NVOS-specific enrichment)
Legacy NVIDIA models retained for backward compatibility

OpenConfig Supported Models

Model	Supported Data
openconfig-interfaces	Base interface configuration, state, and counters: Name, Description, AdminStatus, OperStatus, Enabled, IfIndex, LoopbackMode, and base interface counters (InPkts, OutPkts, InOctets, OutOctets, InUnicastPkts, OutUnicastPkts, InMulticastPkts, OutMulticastPkts, InBroadcastPkts, OutBroadcastPkts, InDiscards, OutDiscards, InErrors, OutErrors), plus InfiniBand-specific interface state (IBSpeed, Speed, IBSubnet, LogicalPortState, PhysicalPortState, MaintenanceState, MTU, MaxSupportedMTUs, SupportedIBSpeeds, SupportedWidths, VLCapabilities, OperationalVL) and InfiniBand port counters (SymbolErrorCounter, XmitWait, RcvErrors, RcvRemotePhyErrors, RcvSwitchRelayErrors, LocalLinkIntegrityErrors, ExcessiveBufferOverrun, LinkErrorRecovery, LinkDowned, QP1Dropped, VL15Dropped and related IB statistics).
openconfig-system	System identity, software, and resource usage: Hostname, BootTime, SoftwareVersion, Location, Contact, RoutingMAC, CPU utilization (aggregate Total/Average), and system memory usage (Physical, Used).
openconfig-platform	Chassis, ASIC, PSU, fan, storage, and other hardware inventory: Component Name, Type, Description, ModelName, PartNo, SerialNo, FirmwareVersion, OperStatus, Temperature, plus component-specific data for fans (Speed, Status), PSUs (Enabled, InputVoltage, InputCurrent, OutputVoltage, OutputCurrent, OutputPower, Status), ASICs (Name, Temperature), chassis/switch (SerialNo, ModelName, PartNo, OperStatus), storage (TotalSize), and platform health (Health Status, LastUnhealthy, UnhealthyCount).
openconfig-platform-transceiver	Optical transceiver module and channel monitoring: module presence and identity (Present, FormFactor, VendorPart, SerialNo), electrical and thermal telemetry (SupplyVoltage, LaserTemperature, module temperature thresholds – Lower, Upper), per-channel optical DOM data (InputPower, OutputPower, LaserBiasCurrent), and per-channel / host-lane status flags (RxCDRLoL, RxLOS, TxCDRLoL, TxLOS, TxFault, TxAdEqFault) and module temperature / voltage alarm flags.
openconfig-platform-healthz	Component health status and history: Status, LastUnhealthy, UnhealthyCount.

NVIDIA Models

These models extend OpenConfig to expose NVIDIA-specific telemetry that is not covered by the base OpenConfig schemas.

Model	Supported Data
nvidia-interfaces-infiniband	InfiniBand-specific interface configuration and state: IBSpeed, Speed, IBSubnet, LogicalPortState, PhysicalPortState, MaintenanceState, MTU, MaxSupportedMTUs, SupportedIBSpeeds, SupportedWidths, VLCapabilities, OperationalVL, SpeedNegotiate and related InfiniBand admin/oper fields.
nvidia-interfaces-infiniband-errors-ext	InfiniBand-specific error and status counters: ExcessiveBufferOverrun, LinkErrorRecovery, LinkDowned, LocalLinkIntegrityErrors, RcvErrors, RcvRemotePhyErrors, RcvSwitchRelayErrors, QP1Dropped, VL15Dropped and similar InfiniBand-specific port error counters.
nvidia-system-augments	NVIDIA-specific system metadata: system Location and Contact, plus other NVIDIA system-level extensions modeled as augments to the openconfig-system tree (superseding the legacy platform-general location/contact fields).
nvidia-system-events	Structured system event reporting: EventId, TypeId, Text, Resource, Severity, TimeCreated.
nvidia-if-phy-augments	Enhanced physical-layer diagnostics and BER/FEC telemetry: general PHY and BER state (TimeSinceLastClear, EffectiveErrors, ReceivedBits, SymbolErrors, RawBER, EffectiveBER, SymbolBER, ProfileFECInUse, ZeroHist), per-lane BER and error counters (per-channel RawBER and RawErrors), RS histogram bins (RSCorrectedError counters), link-down statistics (TotalEvents, IntentionalEvents, UnintentionalEvents) and reasons (Local/Remote reason code and status), recovery statistics (LastLogicRecoveryAttempts, LastSerdesEqRecoveryAttempts, TimeBetweenLastTwoRecoveries, TimeInLastLogicRecoveryEvent, TimeInLastSerdesEqRecoveryEvent, TimeSinceLastRecovery, TotalSuccessfulRecoveryEvents), PLR metrics (PLR_BW_LossPercent, PLR_CodesLoss, PLR_RcvCodes, PLR_RcvCodeErr, PLR_RcvUncorrectableCode, PLR_SyncEvents, PLR_XmitCodes, PLR_XmitRetryCodes, PLR_XmitRetryEvents, PLR_XmitRetryEventsWithinTsecMax), and InfiniBand port error and port statistic counters (PortBufferOverrunErrors, PortDLIDMappingErrors, PortInactiveDiscards, PortLocalPhysicalErrors, PortLoopingErrors, PortMalformedPacketErrors, PortNeighborMTUDiscards, PortVLMappingErrors, PortRcvData, PortRcvPkts, PortUnicastRcvPkts, PortUnicastXmitPkts, PortMulticastRcvPkts, PortMulticastXmitPkts, PortXmitData, PortXmitPkts, RQGeneralError, SyncHeaderErrorCounter).
nvidia-platform-integrated-circuit-augments	ASIC power telemetry over standard integrated-circuit model: LongTermAvgPower, ShortTermAvgPower (average power values per monitoring interval on ASIC integrated-circuit power).
nvidia-platform-storage-augments	Switch-local storage utilization: TotalSize for the logical switch storage device.
nvidia-platform-transceiver-augments	Transceiver firmware and alarm model: DataPathFirmwareFault, ModuleFirmwareFault, ModuleErrorType and generic alarm state (AlarmStatus, AlarmSeverity, AlarmThreshold) for module temperature and supply voltage, and for channel InputPower, OutputPower and LaserBiasCurrent (replacing legacy module/channel-specific alarm flags).

Legacy NVIDIA Models

NVOS exposes a set of legacy NVIDIA YANG models for backward compatibility.

These models exist only to support deprecated gNMI xpaths. All of their data is available through the models above, and they are planned for removal in a future NVOS release.

Model	Supported Data (legacy)
nvidia-platform-general-ext	Legacy platform-wide system and resource information: Contact, Location, NOSVersion, PlatformName, MemoryTotalSize, MemoryUsed, DiskTotalSize, DiskUsed, AmbientTemperature and LeakSensor Id/State.
nvidia-platform-general-ext-versions	Legacy system component firmware inventory: FWVersionBIOS, FWVersionBMC, FWVersionFPGA, FWVersionEROT and FWVersionCPLD / FWVersionSMA entries (per-id version and id).
nvidia-platform-asic	Legacy ASIC-specific telemetry model: ASICName, ASICTemp, LongTermAvgPower, ShortTermAvgPower.
nvidia-if-phy-diag	Legacy PHY diagnostic model: CableProtoCapExt, CoreToPhyLinkProtoEnabled, CoreToPhyLinkWidthEnabled, ETH-AN/IB-PHY/PD/PHY-HST/PHY-Manager FSM and link mode fields, LoopbackMode, FECModeRequest, ProfileFECInUse, EffectiveBER, RawBER, SymbolBER, EffectiveErrors, PhyReceivedBits, SymbolErrors, RS histogram bins (RS_Num_Corr_Err_Bin0–Bin15), PLR_* metrics, InfiniBand port-errors and port-statistics counters, link-down and recovery metrics (LinkDown, IntentionalLinkDownEvents, UnintentionalLinkDownEvents, LinkDownReasonCode/Status Local/Remote, TimeSinceLastClear, TimeBetweenLastTwoRecoveries, TimeInLastLogic/SerdesEqRecoveryEvent, TimeSinceLastRecovery, TotalSuccessfulRecoveryEvents, ZeroHist), and related PHY diagnostic leaves.
nvidia-platform-transceiver-diag	Legacy transceiver diagnostics model: ModuleOperStatus, DataPathFirmwareFault, ModuleFirmwareFault, ModuleErrorType, module TemperatureHigh/Low Alarm and Warning flags, VccHigh/Low Alarm and Warning flags, and channel-level flags for TxAdEqFault, TxFault, TxCDRLoL, TxLOS, RxCDRLoL, RxLOS.

YANG Model Availability

The YANG models above are available on the NVIDIA Enterprise Support Portal → Downloads → Switches and Gateways → Switch Software → QM-3 NVOS InfiniBand → More files.

NVOS YANG Package Structure

The NVOS YANG package is provided as a tar archive with the following structure:

models/
  ietf                      IETF standard base YANG models
  openconfig                OpenConfig models with NVIDIA Model augments
  nvos                      NVOS-specific OpenConfig augments kept for legacy
                            backward compatibility
  not-supported             Deviation modules that mark non-supported leaves and
                            nodes in the models above
  gnmi-supported-paths.html Reference list of all gNMI-supported paths in this
                            release
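
To check a downloaded package against the layout above, the archive can be inspected without extracting it. The following Python sketch lists the entries found directly under models/ using the standard tarfile module. It runs against a small in-memory stand-in archive whose file names are made up for illustration; for the real package, pass a file opened in binary mode instead.

```python
import io
import tarfile


def list_models_dir(fileobj):
    """Return the sorted names found directly under models/ in the archive."""
    entries = set()
    with tarfile.open(fileobj=fileobj, mode="r:*") as tar:
        for name in tar.getnames():
            # Ignore empty segments and a leading "./" if present.
            parts = [p for p in name.split("/") if p not in ("", ".")]
            if len(parts) >= 2 and parts[0] == "models":
                entries.add(parts[1])
    return sorted(entries)


def demo_archive():
    """Build a tiny in-memory stand-in for the package (hypothetical files)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name in ("models/ietf/example.yang",
                     "models/openconfig/example.yang",
                     "models/nvos/example.yang",
                     "models/not-supported/example.yang",
                     "models/gnmi-supported-paths.html"):
            tar.addfile(tarfile.TarInfo(name))  # zero-length placeholder entry
    buf.seek(0)
    return buf


found = list_models_dir(demo_archive())
print(found)
```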

gNMI Connection and Rate Limiting

The gNMI service enforces limitations on the number of active and incoming gRPC connections to ensure system stability and optimal resource usage.

Maximum Established Connections:
The gNMI server supports a maximum of 10 concurrently established gRPC connections at any given time. Once this limit is reached, new connection attempts will be rejected until at least one of the existing connections is terminated.
Source IP–Based Rate Limiting:
The gNMI server allows up to 10 concurrent TCP connections from the same source IP address. If additional connection requests are initiated from that IP while the limit is reached, those connection attempts will be dropped automatically. The new connections will only be accepted when the number of active TCP sessions from that IP drops below the configured threshold.
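
The two limits can be pictured as a simple admission check. The Python sketch below is a conceptual model only: it treats both limits as counters over open connections and does not describe how the NVOS gNMI server enforces them internally. The class name `ConnectionGate` is hypothetical, and the addresses are documentation placeholders.

```python
from collections import Counter


class ConnectionGate:
    """Toy model of a total connection limit plus a per-source-IP limit."""

    def __init__(self, max_total=10, max_per_ip=10):
        self.max_total = max_total
        self.max_per_ip = max_per_ip
        self.per_ip = Counter()

    def try_open(self, ip):
        """Admit a connection from ip if neither limit is reached."""
        if sum(self.per_ip.values()) >= self.max_total:
            return False
        if self.per_ip[ip] >= self.max_per_ip:
            return False
        self.per_ip[ip] += 1
        return True

    def close(self, ip):
        """Release one connection slot previously held by ip."""
        if self.per_ip[ip] > 0:
            self.per_ip[ip] -= 1


gate = ConnectionGate()
attempts = [gate.try_open("192.0.2.10") for _ in range(11)]
print(attempts.count(True), attempts[-1])  # 10 False

gate.close("192.0.2.10")
print(gate.try_open("192.0.2.20"))  # True: a slot was freed
```

A client that is refused should close idle sessions or wait before retrying, because a new connection is accepted only after an existing one ends.
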
To enhance the security of gNMI communications, it is strongly recommended to implement mutual TLS (mTLS) authentication together with SPIFFE (Secure Production Identity Framework For Everyone):
- Mutual TLS (mTLS): Ensures that both client and server authenticate each other using trusted X.509 certificates, thereby preventing unauthorized access and man‑in‑the‑middle attacks.
- SPIFFE Integration: Leverages SPIFFE IDs to provide consistent, identity-based authentication and authorization across services. This minimizes dependence on static credentials and simplifies certificate management.

gNMI Client Requests

A gNMI client on a host can request capabilities and data from the switch. The examples below use the gNMIc client.

The following example shows a gNMIc STREAM SAMPLE mode request for specific Interface data, with a sample interval of 30 seconds, suppress redundant flag enabled, and heartbeat interval of 120 seconds:

            gnmic -a "IP" --port 9339 --skip-verify subscribe --prefix "interfaces"  --path "/interface[name=sw1p1]"  --target nvos -u admin -p ***** --mode stream --stream-mode sample --sample-interval 30s --suppress-redundant --heartbeat-interval 120s
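
When such requests are generated by automation, it can help to build the argument list programmatically and reject invalid combinations early. The Python sketch below assembles a gnmic subscribe command using only flags that appear in the examples on this page. The function name `build_subscribe_cmd` is hypothetical, the address is a documentation placeholder, and the password is deliberately left for the caller to supply.

```python
import shlex


def build_subscribe_cmd(address, path, prefix="", mode="stream",
                        stream_mode=None, sample_interval=None,
                        suppress_redundant=False, heartbeat_interval=None,
                        updates_only=False, username="admin",
                        target="nvos", port=9339):
    """Build a gnmic subscribe argv list (no password included)."""
    if mode not in ("stream", "once", "poll"):
        raise ValueError(f"unsupported mode: {mode}")
    if stream_mode is not None and mode != "stream":
        raise ValueError("stream_mode only applies to mode='stream'")
    if stream_mode == "sample" and sample_interval is None:
        raise ValueError("sample_interval is mandatory for STREAM SAMPLE")

    cmd = ["gnmic", "-a", address, "--port", str(port), "--skip-verify",
           "subscribe", "--prefix", prefix, "--path", path,
           "--target", target, "-u", username, "--mode", mode]
    if stream_mode:
        cmd += ["--stream-mode", stream_mode]
    if sample_interval:
        cmd += ["--sample-interval", sample_interval]
    if suppress_redundant:
        cmd.append("--suppress-redundant")
    if heartbeat_interval:
        cmd += ["--heartbeat-interval", heartbeat_interval]
    if updates_only:
        cmd.append("--updates-only")
    return cmd


cmd = build_subscribe_cmd("192.0.2.10", "/interface[name=sw1p1]",
                          prefix="interfaces", stream_mode="sample",
                          sample_interval="30s", suppress_redundant=True,
                          heartbeat_interval="120s")
print(shlex.join(cmd))
```

Returning a list rather than a single string lets the caller pass it to a process launcher without shell quoting issues around keys such as [name=sw1p1].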

The following example shows a gNMIc STREAM ON-CHANGE mode request for system events, with an updates-only flag enabled:

            gnmic -a "IP" --port 9339 --skip-verify subscribe --prefix "/system-events"  --path "" --target nvos -u admin -p ***** --mode stream --stream-mode on-change --updates-only

The following example shows a gNMIc ONCE mode request and server response for IB interface MTU (-d for debug mode):

            gnmic -a "IP" --port 9339 --skip-verify subscribe --prefix "interfaces"  --path "/interface[name=sw1p1]/infiniband/state/mtu" -d --target nvos -u admin -p ***** --mode once
{
  "source": "IP",
  "subscription-name": "default-1709707931",
  "timestamp": 1709707925858795109,
  "time": "2024-03-06T08:52:05.858795109+02:00",
  "prefix": "interfaces/interface[name=sw1p1]",
  "target": "nvos",
  "updates": [
    {
      "Path": "infiniband/state/mtu",
      "values": {
        "infiniband/state/mtu": 256
      }
    }
  ]
}
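
A response in this JSON form is straightforward to post-process. The Python sketch below joins the prefix with each relative path to obtain full paths and converts the nanosecond timestamp to a UTC datetime, truncated to whole seconds. The function name `flatten_response` is hypothetical, and the input is the response shown above.

```python
import json
from datetime import datetime, timezone

RESPONSE = """
{
  "source": "IP",
  "subscription-name": "default-1709707931",
  "timestamp": 1709707925858795109,
  "time": "2024-03-06T08:52:05.858795109+02:00",
  "prefix": "interfaces/interface[name=sw1p1]",
  "target": "nvos",
  "updates": [
    {
      "Path": "infiniband/state/mtu",
      "values": {
        "infiniband/state/mtu": 256
      }
    }
  ]
}
"""


def flatten_response(raw):
    """Return (utc_time, {full_path: value}) for one gnmic JSON message."""
    msg = json.loads(raw)
    prefix = msg.get("prefix", "").rstrip("/")
    values = {}
    for update in msg.get("updates", []):
        for rel_path, value in update.get("values", {}).items():
            full = f"{prefix}/{rel_path}" if prefix else rel_path
            values[full] = value
    # The timestamp is in nanoseconds since the Unix epoch.
    when = datetime.fromtimestamp(msg["timestamp"] // 10**9, tz=timezone.utc)
    return when, values


when, values = flatten_response(RESPONSE)
print(when.isoformat(), values)
```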

The following example shows a gNMIc ONCE request for all supported paths:

            gnmic -a "IP" --port 9339 --skip-verify subscribe --prefix "/"  --path "" --target nvos -u admin -p ***** --mode once

The following example shows a gNMIc POLL mode request and server response for FAN1/1 speed:

            gnmic -a "IP" --port 9339 --skip-verify subscribe --prefix "components"  --path "component[name=FAN1/1]/fan/state/speed"  --target nvos -u admin -p *****  --format flat --mode poll
components/component[name=FAN1/1]/fan/state/speed: 33
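
Note that the key value FAN1/1 itself contains a slash, so splitting a flat path on every "/" yields the wrong elements. The Python sketch below separates a flat-format line into path and value and then splits the path while ignoring slashes inside square brackets. Both function names are hypothetical helpers, not part of gNMIc.

```python
def parse_flat_line(line):
    """Split one flat-format line into (path, value_text)."""
    path, _, value = line.partition(": ")
    return path, value


def split_path(path):
    """Split a path on '/' while ignoring slashes inside [key=value]."""
    elems, current, depth = [], "", 0
    for ch in path:
        if ch == "[":
            depth += 1
        elif ch == "]":
            depth -= 1
        if ch == "/" and depth == 0:
            if current:
                elems.append(current)
            current = ""
        else:
            current += ch
    if current:
        elems.append(current)
    return elems


path, value = parse_flat_line(
    "components/component[name=FAN1/1]/fan/state/speed: 33")
print(split_path(path))
# ['components', 'component[name=FAN1/1]', 'fan', 'state', 'speed']
print(len(path.split("/")))  # 6: the naive split breaks the key apart
```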

The following example shows a gNMIc STREAM mode request for specific system-event "text" leaf with PROTO encoding:

            gnmic -a "IP" --port 9339 --skip-verify subscribe --prefix "system-events"  --path "system-event[event-id=38]/state/text"  --target nvos -u admin -p ***** --encoding proto --format prototext --mode stream
 
sync_response:  true
 
update:  {
  timestamp:  1719295967820127958
  prefix:  {
    elem:  {
      name:  "system-events"
    }
    elem:  {
      name:  "system-event"
      key:  {
        key:  "event-id"
        value:  "38"
      }
    }
    target:  "nvos"
  }
  update:  {
    path:  {
      elem:  {
        name:  "state"
      }
      elem:  {
        name:  "text"
      }
    }
    val:  {
      string_val:  "Interface admin state is up"
    }
  }
}

A list of supported events can be found on the Event Management page.

The following example shows a grpcurl command that describes the server using the gRPC reflection service:

            docker run fullstorydev/grpcurl -H username:admin -H password:***** -insecure "IP":9339 describe
 
gnmi.gNMI is a service:
service gNMI {
  rpc Capabilities ( .gnmi.CapabilityRequest ) returns ( .gnmi.CapabilityResponse );
  rpc Get ( .gnmi.GetRequest ) returns ( .gnmi.GetResponse );
  rpc Set ( .gnmi.SetRequest ) returns ( .gnmi.SetResponse );
  rpc Subscribe ( stream .gnmi.SubscribeRequest ) returns ( stream .gnmi.SubscribeResponse );
}
grpc.reflection.v1.ServerReflection is a service:
service ServerReflection {
  rpc ServerReflectionInfo ( stream .grpc.reflection.v1.ServerReflectionRequest ) returns ( stream .grpc.reflection.v1.ServerReflectionResponse );
}
grpc.reflection.v1alpha.ServerReflection is a service:
service ServerReflection {
  rpc ServerReflectionInfo ( stream .grpc.reflection.v1alpha.ServerReflectionRequest ) returns ( stream .grpc.reflection.v1alpha.ServerReflectionResponse );
}

The following example shows a gNMIc ONCE mode request for all supported paths, with the output in flat format:

            gnmic -a "IP" --port 9339 --skip-verify subscribe --prefix "/"  --path ""  --target nvos -u admin -p ***** --mode once --format flat

The following example shows a gNMIc Capabilities request to retrieve the set of capabilities that is supported by the server:

            gnmic -a "IP" --port 9339 --skip-verify capabilities -u admin -p *****

Related Information

gNMI Streaming Commands