NMX Telemetry (NMX-T) Documentation v1.0.0

Prometheus Metrics Endpoint

NMX Telemetry provides an HTTP endpoint for seamless integration with monitoring systems that operate in poll mode and support Prometheus, CSV, or JSON data formats. The endpoint only returns the most recent data sample, and users cannot access statistics for past time points.

curl --silent  http://0.0.0.0:9352/metrics

The Prometheus interface port is defined in Interfaces.

Security configuration for the Prometheus interface is handled in Interface Configuration.

By default, the metrics endpoint provides data in Prometheus format. It can also render data as CSV or JSON, which may be more convenient for some consumers or produce a smaller payload. The rendering format is selected by the csv and json path prefixes.

Get metrics as comma-separated values:

curl --silent  http://0.0.0.0:9352/csv/metrics

Get metrics as JSON objects:

curl --silent  http://0.0.0.0:9352/json/metrics
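As a minimal sketch of polling the three renderings from a script, the Python below builds the URL for each format and fetches it with the standard library. The helper names `metrics_url` and `fetch_metrics` are hypothetical, the host and port are copied from the curl examples above, and decoding the response as UTF-8 is an assumption.

```python
from urllib.request import urlopen

# Path prefix for each rendering format, as described above.
FORMAT_PREFIXES = {"prometheus": "", "csv": "/csv", "json": "/json"}


def metrics_url(host="0.0.0.0", port=9352, fmt="prometheus"):
    """Build the metrics URL for the requested rendering format."""
    return f"http://{host}:{port}{FORMAT_PREFIXES[fmt]}/metrics"


def fetch_metrics(fmt="prometheus", timeout=5.0, **kwargs):
    """Fetch the most recent sample as text (UTF-8 decoding is an assumption)."""
    with urlopen(metrics_url(fmt=fmt, **kwargs), timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

Because the endpoint returns only the most recent sample, a poller that needs history has to store each response itself.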

An HTTP endpoint can deliver all sampled data via the default /metrics URL.

If no URL prefix is specified, the filter file will be searched in both the cset and fset folders. If both contain files with the same name, both filters will be applied.

Counter Sets (cset)

A cset file lists tokens, one per line, used to filter the data with "type"="counters".

# List of available counters:
node_guid
port_guid
port_num
lid
link_down_counter
link_error_recovery_counter
symbol_error_counter
port_rcv_remote_physical_errors
port_rcv_errors
port_xmit_discard
port_rcv_switch_relay_errors
excessive_buffer_errors
...

Tokens are the actual name 'fragments' to be matched:

  • port$: Matches names that end with the token "port".

  • ^port: Matches names that start with the token "port".

  • ^port$: Matches names that are exactly "port".

  • port+xmit: Matches names that contain both the tokens "port" and "xmit".

  • port-support: Matches names that contain the token "port" but exclude those with the token "support".

  • -port: Excludes names that contain the token "port".

Tip

To disable counter export, insert a single-line token that doesn't match anything.


Field Sets (fset)

An fset file consists of multiple blocks, each beginning with a header line in the format [event_type_name], followed by tokens under that header. The fset file is used to filter data with "type"="events". A header can also give an event type name prefix followed by *, which applies the same tokens to all matching types. For example, to filter all ethtool events, use [ethtool_event_*].

[type_name_1]
tokens
[type_name_2]
tokens
[type_name_3]
tokens
...

Tokens are the actual name 'fragments' to be matched:

  • port$: Matches names ending with the token "port".

  • ^port: Matches names starting with the token "port".

  • ^port$: Matches names that are exactly "port".

  • port+xmit: Matches names containing both the tokens "port" and "xmit".

  • port-support: Matches names containing the token "port" but excluding those that also contain the token "support".

  • -port: Excludes all names containing the token "port".

To match multiple tokens simultaneously, use the format "tok1+tok2+tok3". Exclusive tokens are also supported: for example, the line "tok1+tok2-tok3-tok4" will filter names that match both tok1 and tok2, while excluding those that match tok3 or tok4.
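The token rules can be illustrated with a short Python sketch. This is one reading of the rules described above, not the NMX-T implementation: the function names are hypothetical, and the way the ^ and $ anchors combine with + and - is an assumption.

```python
import re


def _term_matches(term: str, name: str) -> bool:
    """Match one token, honoring the optional ^ (start) and $ (end) anchors."""
    starts = term.startswith("^")
    ends = term.endswith("$")
    core = term.lstrip("^").rstrip("$")
    if starts and ends:
        return name == core
    if starts:
        return name.startswith(core)
    if ends:
        return name.endswith(core)
    return core in name


def token_matches(expression: str, name: str) -> bool:
    """Return True if name passes a line such as 'tok1+tok2-tok3-tok4'."""
    terms = re.findall(r"([+-]?)([^+-]+)", expression)
    include = [t for sign, t in terms if sign != "-"]
    exclude = [t for sign, t in terms if sign == "-"]
    # Any excluded token rejects the name; all included tokens must match.
    if any(_term_matches(t, name) for t in exclude):
        return False
    return all(_term_matches(t, name) for t in include)
```

For example, `token_matches("port+xmit", "port_xmit_discard")` is true, while `token_matches("port-support", "port_support_flags")` is false (the name `port_support_flags` is made up for illustration).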

Meta fields are user-defined additional fields, which come in two types: aliases and new constant fields.

  • Aliases

    • Add the data from the field "exact_name" to the meta fields of the record under the new "alias_name."

    • Each field can have only one alias.

    • Aliases match only exact names and will appear in the data record, even if the field is disabled by the fset.

      Example:

meta_field_alias:exact_name=alias_name

  • Constants

    1. Add a new field called "new_field_name" with the constant data string "constant_value" to the meta fields.

    2. Field names must be unique.

      Example:

      meta_field_add:new_field_name=constant_value

The following example will export all "switch_fan" events and "CableInfo" events filtered by the token "port":

[switch_fan]

[CableInfo]
port

To find out which event type names are available, see the NVL5 Metrics Schema.

Corner Cases

  • An empty fset file will export all events.

  • Tokens written above or without an [event_type] will be ignored.

  • If the fset file cannot be opened, a warning will be displayed, and all event types will be exported.
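The block structure and corner cases above can be sketched as a small Python parser. This is an illustrative reading only: `parse_fset` and `tokens_for` are hypothetical names, skipping "#" comment lines is an assumption, meta field lines are not handled, and treating an event type that is not listed in a non-empty fset as not exported is also an assumption.

```python
def parse_fset(text):
    """Split fset text into {header: [tokens]}; tokens before any header are dropped."""
    blocks = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):  # comment handling is an assumption
            continue
        if line.startswith("[") and line.endswith("]"):
            current = line[1:-1]
            blocks.setdefault(current, [])
        elif current is not None:
            blocks[current].append(line)
    return blocks


def tokens_for(event_type, blocks):
    """Return the tokens for an event type, [] for 'export everything', None if unlisted."""
    if not blocks:  # an empty fset exports all events
        return []
    for header, tokens in blocks.items():
        if header == event_type:
            return tokens
        # A header such as [ethtool_event_*] applies to every type with that prefix.
        if header.endswith("*") and event_type.startswith(header[:-1]):
            return tokens
    return None
```

With the example above, `parse_fset` yields an empty token list for `switch_fan` (all fields exported) and `["port"]` for `CableInfo`.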

Both events and counters can be extended with aliased fields and new constant fields.

  • “meta_field_alias:exact_name=alias_name” will add a new field or counter with the name “alias_name” and copy the value from the existing field or counter “exact_name.”

  • “meta_field_add:new_name=constant_value” will add a new field or counter with the name “new_name” and the value “constant_value.”

New fields must have unique names; otherwise, they will be ignored.
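A minimal sketch of how the two directives could act on a single record, written in Python for illustration. The function name `apply_meta_fields` and the flat dictionary record are assumptions; the real service keeps meta fields alongside the record, and the one-alias-per-field rule is not enforced here.

```python
def apply_meta_fields(record: dict, directives: list) -> dict:
    """Return a copy of record extended by alias and constant directives."""
    out = dict(record)
    for line in directives:
        kind, _, spec = line.partition(":")
        left, _, right = spec.partition("=")
        if kind == "meta_field_alias":
            # left is the exact source name, right is the alias to add.
            if left in record and right not in out:
                out[right] = record[left]
        elif kind == "meta_field_add":
            # left is the new field name, right is its constant value.
            if left not in out:
                out[left] = right
    return out
```

A directive whose new name collides with an existing field is skipped, mirroring the rule that non-unique names are ignored.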

Extended Counter Sets

The HTTP server offers an optional Extended Counter Set (xcset) selection mechanism in addition to the counter set (cset) and field set (fset) filtering. The Extended Counter Set enables users to generate an output record containing data from both "counters" and "event" data records with the same index, typically the guid/port_num in the context of NMX Telemetry. To define an extended counter set, a file or group of files with the .xcset extension must be placed in the designated directory or alongside the existing field or counter sets.

Each line of the file may include:

  • Selection of a counter with an optional alias in the format “counter[=alias]”

  • Selection of a type's field with an optional alias in the format “type.field[=alias]”

  • Reference to another file to be included in the format “file.xcset”

Extended counter set files are searched in the same directory as the source xcset.

Aliases are optional, but if provided, they will be used to name the selected counter or field in the output. Empty lines and comments (starting with "#") are ignored.

Extended counter sets support rendering hints to modify the attribution and representation of metric values.

These hints are provided as a comma-separated list of key=value pairs, placed after the field selection line, following the semicolon (;) character.

counter[=alias];key[=value][,key[=value]]*

For example:

port_guid;label,hex,default=undefined
hw_port_state;lookup=printable_port_states

Supported rendering hints are the following:

| Key | Value | Description | Example |
|---|---|---|---|
| hex | n/a | Renders a numeric value in hexadecimal | port_num;hex |
| label | n/a | Marks the field as a Prometheus label | host_name;label |
| default | value | Sets the string rendered when data for the field is missing | temperature;default=unknown |
| const | value | Adds the marked field to the output as a constant value | context;const=oberon |
| lookup | name | Uses the named lookup table to replace the value when rendering | hw_port_state;lookup=printable_port_states |
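To make the line grammar concrete, the following Python sketch splits one xcset line into its selection, alias, and rendering hints. The function name and the returned dictionary layout are hypothetical. Lines that reference another .xcset file are not treated specially, and hint values containing commas are not supported.

```python
def parse_xcset_line(line: str):
    """Parse 'counter[=alias];key[=value][,key[=value]]*' into a dict, or None."""
    line = line.strip()
    if not line or line.startswith("#"):
        return None  # empty lines and comments are ignored
    selection, _, hint_text = line.partition(";")
    source, _, alias = selection.partition("=")
    hints = {}
    for item in filter(None, hint_text.split(",")):
        key, _, value = item.partition("=")
        # Flag-style hints such as 'hex' carry no value at all.
        hints[key] = value if "=" in item else None
    return {"source": source, "alias": alias or source, "hints": hints}
```

For instance, `parse_xcset_line("port_guid;label,hex,default=undefined")` returns the source `port_guid` with the hints `label`, `hex`, and `default="undefined"`.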

Extended counter sets support value replacement using lookup tables.

One or more lookup tables can be defined separately or as part of the xcset file. The location of the lookup table is the same for all xcset files.

Lookup Table Definition

| Name | Required/Optional | Type | Description |
|---|---|---|---|
| lookup | Required | Keyword | Declares a lookup element. |
| mask | Optional | Keyword | Declares the key as a bit mask. If absent, the key is matched exactly. |
| name | Required | string | Field name. |
| key | Required | unsigned long long | The original value to be replaced. |
| value | Optional | string | Replacement string for the key. If absent, the original value is shown. |

Example lookup definitions:

lookup:link_speed_active:0:  UNKNOWN
lookup:link_speed_active:1:  SDR
lookup:link_speed_active:2:  DDR

lookup:CableInfo.cable_vendor:1:Oth

lookup:mask:fastRecoveryOverFlow:1:  num_errors
lookup:mask:fastRecoveryOverFlow:32: consecutive_normal


Lookup Value Usage

| Using | Example | Description |
|---|---|---|
| implicit | CableInfo.cable_vendor=cable_vendor | Value 1 will be replaced by Oth. The lookup name must match the field name. |
| implicit | fastRecoveryOverFlow=fastRecoveryOverFlow | Value 1 will be replaced by num_errors. Value 33 will be replaced by num_errors,consecutive_normal. |
| explicit in xcset | CableInfo.cable_vendor=cable_vendor;lookup= | Disables lookup for CableInfo.cable_vendor. |
| explicit in xcset | lookup:hello:1:hello world | Value 1 will be replaced by 'hello world'. |
| | CableInfo.cable_vendor=cable_vendor;lookup=hello | |
| explicit in xcset | CableInfo.cable_vendor=cable_vendor;lookup=vendor | Value 1 will be 1. |
| | lookup:vendor:1: | |
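The exact and mask behaviors in the table can be reproduced with a short Python sketch. It is an illustration under assumptions, not the service's code: the function names are hypothetical, mask labels are joined with commas in ascending key order to match the "33" example, and a field literally named "mask" is not handled.

```python
def parse_lookups(lines):
    """Parse 'lookup:[mask:]name:key:[value]' lines into exact and mask tables."""
    exact, masks = {}, {}
    for line in lines:
        parts = line.strip().split(":")
        if len(parts) < 3 or parts[0] != "lookup":
            continue
        is_mask = parts[1] == "mask"
        fields = parts[2:] if is_mask else parts[1:]
        if len(fields) < 2:
            continue
        name, key = fields[0], int(fields[1])
        value = ":".join(fields[2:]).strip()
        (masks if is_mask else exact).setdefault(name, {})[key] = value
    return exact, masks


def render(name, raw, exact, masks):
    """Replace raw with its lookup string, or fall back to the original value."""
    if name in masks:
        labels = [label for bit, label in sorted(masks[name].items())
                  if (raw & bit) == bit and label]
        return ",".join(labels) if labels else str(raw)
    return exact.get(name, {}).get(raw, "") or str(raw)


DEFINITIONS = [
    "lookup:link_speed_active:2:  DDR",
    "lookup:CableInfo.cable_vendor:1:Oth",
    "lookup:mask:fastRecoveryOverFlow:1:  num_errors",
    "lookup:mask:fastRecoveryOverFlow:32: consecutive_normal",
    "lookup:vendor:1:",
]
EXACT, MASKS = parse_lookups(DEFINITIONS)
```

With these definitions, `render("fastRecoveryOverFlow", 33, EXACT, MASKS)` yields `num_errors,consecutive_normal` because 33 has both bit 1 and bit 32 set, and `render("vendor", 1, EXACT, MASKS)` falls back to `1` because the replacement string is empty.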

Output result in Prometheus without lookup:

hw_port_state{hca="mlx5_2"} 1 1716905830122
hw_port_state{hca="mlx5_2"} 2 1716905830122

Prometheus:

hw_port_state{hca="mlx5_2"} 1 100500
rx_bytes{hca="mlx5_2"} 100 100700
hw_port_state{hca="mlx5_2"} 2 100500
rx_bytes{hca="mlx5_2"} 150 100700

CSV:

timestamp,hca,hw_port_state,rx_bytes
100500,mlx5_2,1,100
100700,mlx5_2,2,150

JSON:

{"timestamp": 100500, "hca": "mlx5_2", "hw_port_state": 1, "rx_bytes": 100},
{"timestamp": 100700, "hca": "mlx5_2", "hw_port_state": 2, "rx_bytes": 150},

Output result with lookup:

lookup:hw_port_state:1:Active

String as label: false

Prometheus:

rx_bytes{hca="mlx5_2"} 100 100700
rx_bytes{hca="mlx5_2"} 150 100700

CSV:

timestamp,hca,hw_port_state,rx_bytes
100500,mlx5_2,Active,100
100700,mlx5_2,2,150

JSON:

{"timestamp": 100500, "hca": "mlx5_2", "hw_port_state": "Active", "rx_bytes": 100},
{"timestamp": 100700, "hca": "mlx5_2", "hw_port_state": "2", "rx_bytes": 150}

String as label: true

Prometheus:

rx_bytes{hca="mlx5_2", hw_port_state="Active"} 100 100500
rx_bytes{hca="mlx5_2", hw_port_state="2"} 150 100700

CSV:

timestamp,hca,hw_port_state,rx_bytes
100500,mlx5_2,Active,100
100700,mlx5_2,2,150

JSON:

{"timestamp": 100500, "hca": "mlx5_2", "hw_port_state": "Active", "rx_bytes": 100},
{"timestamp": 100700, "hca": "mlx5_2", "hw_port_state": "2", "rx_bytes": 150}
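One practical consequence of lookups is that a replaced column holds strings, so a consumer should not assume it is numeric. The Python sketch below parses the CSV sample shown above with the standard `csv` module. The function name is hypothetical, and converting only `timestamp` and `rx_bytes` to integers is a choice made for this sample.

```python
import csv
import io

# CSV rendering with lookup applied, copied from the sample output above.
SAMPLE = (
    "timestamp,hca,hw_port_state,rx_bytes\n"
    "100500,mlx5_2,Active,100\n"
    "100700,mlx5_2,2,150\n"
)


def parse_csv_metrics(text):
    """Turn CSV metrics text into dictionaries, keeping lookup columns as strings."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        row["timestamp"] = int(row["timestamp"])
        row["rx_bytes"] = int(row["rx_bytes"])
        # hw_port_state stays a string: it may be "Active" or an unmapped "2".
        rows.append(row)
    return rows
```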


The NMX Telemetry Prometheus endpoint offers data filtering capabilities to control the selection of metrics it outputs.

Filter operations and operands are provided as HTTP query string parameters. Multiple filters in a single HTTP request are combined using a logical AND.

The general format of the filter query parameter is "<field-name>__<operation>__<operands>", where:

  • field-name: The name of the counter or event field to which the operation applies.

  • operation: One of the operations from the list below.

  • operands: One or more operands used to evaluate the filter.

For example:

curl --silent "http://0.0.0.0:9302/metrics?guid__eq__100500"

Supported filters are:

| Operation | Description | Applies to the field of type | Example |
|---|---|---|---|
| eq | Metrics value is equal to the given operand | floating point, decimal, string | xmit_rate__eq__10000 |
| ne | Metrics value is not equal to the given operand | floating point, decimal, string | xmit_rate__ne__10000 |
| gt | Greater than the given operand | floating point, decimal, string | xmit_rate__gt__10000 |
| lt | Less than the given operand | floating point, decimal, string | xmit_rate__lt__10000 |
| ge | Greater than or equal to the given operand | floating point, decimal, string | xmit_rate__ge__10000 |
| le | Less than or equal to the given operand | floating point, decimal, string | xmit_rate__le__10000 |
| bitand | Bitwise AND operation | decimal | state__bitand__7 |
| bitor | Bitwise OR operation | decimal | state__bitor__13 |
| in | Metrics value is in the list of given values | floating point, decimal, string | state__in__1__2__3 |
| shard | Apply hashing function to get shard N out of K possible | floating point, decimal, string | port_guid__shard__1__3 |
| contains | String value contains a value (substring) | string | name__contains__mlx |
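Filter parameters can also be assembled programmatically. The Python sketch below joins the field name, operation, and operands with double underscores and combines several filters with "&", which the endpoint treats as a logical AND. The helper names are hypothetical, the base URL is taken from the curl examples, and percent-encoding the operands is an assumption about what the server accepts.

```python
from urllib.parse import quote

# Operations listed in the table above.
OPERATIONS = {"eq", "ne", "gt", "lt", "ge", "le",
              "bitand", "bitor", "in", "shard", "contains"}


def filter_param(field, operation, *operands):
    """Build one filter such as 'state__in__1__2__3'."""
    if operation not in OPERATIONS:
        raise ValueError(f"unsupported operation: {operation}")
    if not operands:
        raise ValueError("at least one operand is required")
    parts = [field, operation] + [quote(str(o), safe="") for o in operands]
    return "__".join(parts)


def filtered_url(base, *params):
    """Append filters to the base URL; multiple filters are ANDed by the server."""
    return base + "?" + "&".join(params) if params else base
```

For example, `filtered_url("http://0.0.0.0:9352/metrics", filter_param("guid", "eq", 100500))` produces a URL that should be quoted when passed to curl in a shell.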

The shard data filter is particularly useful when metrics scraping loads need to be distributed across time or consumer spaces. Several examples of sharding queries are provided.

  • Sharding of counters and events by the node GUID, serializing to csv.

    curl -v "http://0.0.0.0:9352/csv/metrics?num_shards=2&shard=0&sharding_field=node_guid"

  • Sharding, plus filtering by port number.

    curl -v "http://0.0.0.0:9352/csv/metrics?num_shards=2&shard=0&sharding_field=node_guid&port_num__eq__1"

  • Counter set explicitly selected.

    curl -v "http://0.0.0.0:9352/csv/cset/minimal?num_shards=2&shard=0&sharding_field=node_guid"

  • Fieldset, selected sharding by the port number (named as “port”).

    curl -v "http://0.0.0.0:9352/csv/fset/low_freq?num_shards=2&shard=0&sharding_field=port"
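To spread scraping across several consumers, each consumer can request its own shard. The Python sketch below generates one URL per shard using the query parameters from the examples above. The function name is hypothetical, and numbering shards from 0 to num_shards - 1 is an assumption based on the `shard=0` examples.

```python
from urllib.parse import urlencode


def shard_urls(base, num_shards, sharding_field, filters=()):
    """Return one URL per shard, optionally appending filter parameters."""
    urls = []
    for shard in range(num_shards):
        query = urlencode({
            "num_shards": num_shards,
            "shard": shard,
            "sharding_field": sharding_field,
        })
        # Filters such as 'port_num__eq__1' have no '=' and are appended as-is.
        for flt in filters:
            query += "&" + flt
        urls.append(f"{base}?{query}")
    return urls
```

Each scraper then polls only its own URL, for example `shard_urls("http://0.0.0.0:9352/csv/metrics", 2, "node_guid")[0]` for the first of two consumers.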

© Copyright 2025, NVIDIA. Last updated on Apr 23, 2025.