gRPC Interface
The NMX-T instance runs a gRPC server that allows clients to retrieve application information and subscribe to telemetry data. The full Protocol Buffers definition of the gRPC interface, nmx-telemetry.proto, can be found in the ./proto subdirectory of the package installation directory.
            
service TelemetryService {
  rpc Hello(ClientHello) returns (ServerHello);
  rpc SubscribeTelemetryData(TelemetrySubscription) returns (stream TelemetryData);
}
The gRPC interface can optionally be secured with TLS or mTLS. By default, the gRPC interface runs unsecured. The following security modes are available:
- disabled - no communication security is enforced
- tls - TLS encryption is enforced, and the client can verify the identity of the gRPC server
- mtls - mutual TLS is enforced, and the gRPC server also verifies the identity of the connecting client
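The three modes above determine how a client has to open its channel. The following is a minimal client-side sketch, assuming the Python grpcio package and PEM-encoded certificate files. The keyword names ca_cert, client_key, and client_cert are hypothetical labels chosen for this example, not NMX-T settings.

```python
from pathlib import Path

# The three mode names come from the list above; everything else is illustrative.
SECURITY_MODES = ("disabled", "tls", "mtls")


def required_credentials(mode: str) -> tuple[str, ...]:
    """Return which PEM files a client needs for the given security mode."""
    if mode not in SECURITY_MODES:
        raise ValueError(f"unknown security mode: {mode!r}")
    if mode == "disabled":
        return ()
    if mode == "tls":
        return ("ca_cert",)
    return ("ca_cert", "client_key", "client_cert")


def open_channel(target: str, mode: str = "disabled", **pem_paths: str):
    """Open a gRPC channel that matches the server's security mode.

    pem_paths maps the names from required_credentials() to file paths;
    these keyword names are hypothetical, not NMX-T settings.
    """
    import grpc  # provided by the grpcio package; imported lazily

    needed = required_credentials(mode)
    missing = [name for name in needed if name not in pem_paths]
    if missing:
        raise ValueError(f"mode {mode!r} requires: {', '.join(missing)}")
    if not needed:
        return grpc.insecure_channel(target)
    pem = {name: Path(pem_paths[name]).read_bytes() for name in needed}
    credentials = grpc.ssl_channel_credentials(
        root_certificates=pem["ca_cert"],
        private_key=pem.get("client_key"),        # None unless mode is mtls
        certificate_chain=pem.get("client_cert"),  # None unless mode is mtls
    )
    return grpc.secure_channel(target, credentials)
```

A call such as open_channel("host:port", "tls", ca_cert="ca.pem") would then return a channel for the generated TelemetryService stub; the target address and file name are placeholders.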
The gRPC interface can be enabled or disabled. By default, it is enabled.
The nmx-telemetry-grpc-interface parameter in the user_config.json file controls whether the interface is on or off.
The Hello remote procedure call is used to synchronize the client and server versions and, if needed, to enforce version matching and adjust the logic accordingly.
            
service TelemetryService {
  rpc Hello(ClientHello) returns (ServerHello);
}
    
Client parameters for the handshake:
            
message ClientHello {
  string gatewayId = 1;
  ProtoMsgMajorVersion major_version = 2;
  ProtoMsgMinorVersion minor_version = 3;
}
    
In addition to other application-specific data, the telemetry service returns the application instance and environment identifiers in its ServerHello response:
- domain_uuid - environment domain identifier, the unique identifier of the GB200 instance
- app_uuid - application instance unique identifier
- app_ver - application version string
The SubscribeTelemetryData remote procedure call enables clients to receive a stream of telemetry data collected by NMX-Telemetry.
            
service TelemetryService {
  rpc SubscribeTelemetryData(TelemetrySubscription) returns (stream TelemetryData);
}
    
The TelemetrySubscription message defines the subscription parameters.
            
message TelemetrySubscription {
  string data_type = 1;  // * | ib_counters | sys_log | gpu_counters
  string source_id = 2;
  string source_tag = 3;
}
    
Set the parameter values to select the types or sources of data to receive, or leave the values blank to subscribe to all available data.
- data_type - type of the data to subscribe to:
  - empty string or asterisk * to subscribe to all data types
  - comma-separated list of data types for a fine-grained subscription
- source_id - identifier of the data source to get data from
- source_tag - data source tag
Leave all the parameters empty to receive all telemetry data as it is collected, without any filtering or pre-selection.
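The filtering rules above can be sketched as a small helper that prepares the three TelemetrySubscription field values. This is an illustration only: the function names are hypothetical, and joining names with a bare comma (no spaces) is an assumption about what the server accepts.

```python
def build_data_type(data_types=None):
    """Build the data_type string for a TelemetrySubscription."""
    if not data_types:
        return ""  # an empty string subscribes to all data types
    names = [name.strip() for name in data_types]
    if any(not name or "," in name for name in names):
        raise ValueError("data type names must be non-empty and contain no commas")
    if "*" in names:
        return "*"  # the asterisk already selects every data type
    return ",".join(names)


def subscription_params(data_types=None, source_id="", source_tag=""):
    """Return the three TelemetrySubscription fields as a plain dict."""
    return {
        "data_type": build_data_type(data_types),
        "source_id": source_id,
        "source_tag": source_tag,
    }
```

For example, subscription_params(["ib_counters", "sys_log"]) yields a data_type of "ib_counters,sys_log" with both source fields left empty, while subscription_params() leaves every field empty and therefore requests all telemetry data. The resulting dict can be passed as keyword arguments to the generated TelemetrySubscription message class.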
The telemetry data response includes metadata fields and the actual data payload. The format of the payload may vary depending on the type of data received.
            
message TelemetryData {
  string aggregator_id = 1;
  string source_id = 2;
  string source_tag = 3;
  string data_type = 4;
  int64 timestamp = 6;
  Encoding encoding_type = 7;
  bytes message = 8;
}
    
Metadata fields describe the payload:
- aggregator_id - the unique identifier of the application domain (Oberon domain UUID)
- data_type - the name of the type of data the payload contains, for example "counters"
- source_id - the identifier of the data source: the device GUID for NVLink telemetry counters, the switch IP and port for gNMI aggregation, or the server IP for syslog message aggregation
- timestamp - the time at which the message was formed, in microseconds
- encoding_type - a hint for interpreting the payload, which could be JSON or BYTES
- message - the actual data payload, as described below
For example, a message representing an event of type nvl_packet_types_counters may have the following values:
            
aggregator_id = b954ce10-be66-4d75-a538-405ac8517c38
data_type = nvl_packet_types_counters
source_id = 0x1070fd030058c216
source_tag = nvlink
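The timestamp metadata field is expressed in microseconds. A minimal sketch of turning it into a timezone-aware datetime follows, assuming the value counts microseconds since the Unix epoch; the text above states only the unit, so the epoch is an assumption, and the function name is hypothetical.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def timestamp_to_datetime(timestamp_us: int) -> datetime:
    """Convert a TelemetryData timestamp (microseconds) to an aware UTC datetime.

    Assumption: the value counts microseconds since the Unix epoch.
    Integer timedelta arithmetic avoids float rounding of the microseconds.
    """
    return EPOCH + timedelta(microseconds=timestamp_us)
```

Adding a timedelta to a fixed epoch keeps full microsecond precision, which dividing by 1e6 and calling datetime.fromtimestamp would not guarantee.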
    
Telemetry data, including counters and events, is presented as comma-separated values (CSV) embedded in JSON.
The payload is a JSON array of objects, each consisting of:
- timestamp: The time at which the data was collected.
- fields: A comma-separated list of the data field names contained in the payload.
- values: A list of strings, each holding the comma-separated values of one record, in the same order as the field names.
A message payload of data type nvl_packet_types_counters may look like the following:
            
[
    {
        "timestamp": 100,
        "fields": "node_guid,port_guid,port_num,port_rcv_ibg1_nvl_pkts,port_rcv_ibg1_non_nvl_pkts,port_rcv_ibg2_pkts,port_xmit_ibg1_nvl_pkts,port_xmit_ibg1_non_nvl_pkts,port_xmit_ibg2_pkts",
        "values": [
            "0x1070fd0300580000,0x1070fd030058c216,9,0,0,0,0,0,0",
            "0x1070fd0300580002,0x1070fd030058c216,9,0,0,0,0,0,0"
        ]
    },
    {
        "timestamp": 200,
        "fields": "node_guid,port_guid,port_num,port_rcv_ibg1_nvl_pkts,port_rcv_ibg1_non_nvl_pkts,port_rcv_ibg2_pkts,port_xmit_ibg1_nvl_pkts,port_xmit_ibg1_non_nvl_pkts,port_xmit_ibg2_pkts",
        "values": [
            "0x1070fd0300580000,0x1070fd030058c216,9,0,0,0,0,0,0",
            "0x1070fd0300580002,0x1070fd030058c216,9,0,0,0,0,0,0"
        ]
    }
]
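A payload in this CSV-in-JSON layout can be expanded into one record per value row. The sketch below uses only the Python standard library; the function name is hypothetical, and it assumes that individual values never contain embedded commas, which the format description does not state.

```python
import json


def parse_payload(message):
    """Expand a CSV-in-JSON telemetry payload into one dict per value row."""
    records = []
    for block in json.loads(message):  # accepts str or bytes
        fields = block["fields"].split(",")
        for row in block["values"]:
            values = row.split(",")
            if len(values) != len(fields):
                # Report the mismatch instead of silently dropping columns.
                raise ValueError(
                    f"expected {len(fields)} values, got {len(values)}: {row!r}"
                )
            record = dict(zip(fields, values))
            record["timestamp"] = block["timestamp"]
            records.append(record)
    return records
```

All values are kept as strings, exactly as they appear in the payload; converting counters to integers or GUIDs to another representation is left to the caller.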
    
As another example, the data payload of the "counters" data type may look like the following:
            
[
    {
        "timestamp": 1729872473718869,
        "fields": "node_guid,port_guid,port_num,node_description,roundtrip_time_port_counters_extended",
        "values": [
            "0xb83fd20300f9b7dc,0xb83fd20300f9b7dc,1,swx-proton03-bf3-2 HCA-1,,0"
        ]
    }
]
    
A TelemetryData response that results from gNMI data aggregation consists of the following fields:
- aggregator_id: The unique identifier for the application domain (Oberon domain UUID). 
- data_type: The name of the gNMI subscription. 
- source_id: The address and port of the gNMI target from which the data is being aggregated. 
- timestamp: The time, in microseconds, when the message was formed. 
- encoding_type: A hint for interpreting the payload, which could be either JSON or PROTO. 
- message: The gNMI update response received from the aggregation target, either in its original binary form (encoded in PROTO) or as a JSON representation of the gNMI update message. 
For example, a JSON-marshalled gNMI response could look like the following:
            
{
    "update": {
        "prefix": {
            "elem": [
                {
                    "name": "interfaces"
                },
                {
                    "key": {
                        "name": "fnma1p1"
                    },
                    "name": "interface"
                }
            ],
            "target": "netq"
        },
        "timestamp": "1729513043599315230",
        "update": [
            {
                "path": {
                    "elem": [
                        {
                            "name": "state"
                        },
                        {
                            "name": "counters"
                        },
                        {
                            "name": "in-octets"
                        }
                    ]
                },
                "val": {
                    "uintVal": "353952"
                }
            }
        ]
    }
}
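When the payload is JSON-encoded, the prefix and per-update path elements can be joined into a single path string for each value. The following sketch relies only on the structure visible in the example above; the function names are hypothetical, the XPath-like rendering of keys is a presentation choice, and the prefix target is ignored.

```python
import json


def elems_to_path(elems):
    """Render gNMI path elements as an XPath-like string."""
    parts = []
    for elem in elems:
        keys = "".join(
            f"[{k}={v}]" for k, v in sorted(elem.get("key", {}).items())
        )
        parts.append(elem["name"] + keys)
    return "/".join(parts)


def flatten_notification(message):
    """Return (full_path, value) pairs from a JSON-marshalled gNMI response."""
    notification = json.loads(message)["update"]
    prefix = elems_to_path(notification.get("prefix", {}).get("elem", []))
    pairs = []
    for update in notification.get("update", []):
        path = elems_to_path(update["path"].get("elem", []))
        full_path = "/" + "/".join(p for p in (prefix, path) if p)
        # "val" is a one-entry object such as {"uintVal": "353952"}
        (value,) = update["val"].values()
        pairs.append((full_path, value))
    return pairs
```

Values are returned as they appear in the JSON, so the 64-bit counter in the example stays a string; the caller decides whether to convert it.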
    
A TelemetryData response that results from syslog collection consists of the following fields:
- aggregator_id: The unique identifier for the application domain (Oberon domain UUID). 
- data_type: The value "log_message". 
- source_id: The address and port of the log message's source. 
- source_tag: The name of the process that sent the log message. 
- timestamp: The time, in microseconds, when the message was generated. 
- encoding_type: The encoding format, either JSON or ASCII. 
- message: The syslog message, which may be in its original text form (encoded in BYTES) or a JSON-serialized OpenTelemetry message. 
Example:
            
{
    "time_unix_nano": 1731603557000000000,
    "observed_time_unix_nano": 1731596357165630000,
    "severity_number": 10,
    "severity_text": "notice",
    "body": {
        "Value": {
            "StringValue": "Nov 14 16:59:17 swx-proton04: Hey!"
        }
    },
    "attributes": [
        {
            "key": "facility",
            "value": {
                "Value": {
                    "IntValue": 1
                }
            }
        },
        {
            "key": "hostname",
            "value": {
                "Value": {
                    "StringValue": "swx-proton04"
                }
            }
        },
        {
            "key": "message",
            "value": {
                "Value": {
                    "StringValue": "Hey!"
                }
            }
        },
        {
            "key": "priority",
            "value": {
                "Value": {
                    "IntValue": 13
                }
            }
        },
        {
            "key": "appname",
            "value": {
                "Value": {
                    "StringValue": "bash"
                }
            }
        }
    ]
}
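The nested Value wrappers in a JSON-serialized log record can be flattened into a simple summary. This is a minimal sketch based solely on the structure of the example above; the function names are hypothetical, and it assumes each Value object holds a single typed entry such as StringValue or IntValue.

```python
import json


def unwrap(any_value):
    """Return the scalar inside {"Value": {"StringValue": ...}} style wrappers."""
    inner = any_value.get("Value") or {}
    for value in inner.values():
        return value  # the wrapper holds a single typed entry in the example
    return None


def summarize_log(message):
    """Flatten a JSON-serialized log record into a small summary dict."""
    record = json.loads(message)  # accepts str or bytes
    attributes = {
        attr["key"]: unwrap(attr["value"])
        for attr in record.get("attributes", [])
    }
    return {
        "time_unix_nano": record.get("time_unix_nano"),
        "severity": record.get("severity_text"),
        "body": unwrap(record.get("body", {})),
        "attributes": attributes,
    }
```

Applied to the example record, the attributes entry becomes a flat mapping of facility, hostname, message, priority, and appname to their plain values.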