# Telemetry Requirements

The telemetry requirements consist of two core components on which DGX Cloud and the NVIDIA Cloud Partner (NCP) must align:

1. **Delivery Method:** *How* the NCP will deliver telemetry to DGX Cloud for ingestion
2. **Telemetry Scope:** *What* telemetry the NCP will deliver to DGX Cloud

**Delivery Method**\
The NCP shall deliver all required telemetry, including metrics and logs, in a manner that allows for ingestion into DGX Cloud systems. The preferred method is native delivery over the OpenTelemetry Protocol (OTLP), with a latency of no more than 120 seconds.
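
The 120-second latency ceiling can be checked per record. The following is a minimal sketch, not a DGX Cloud tool: it assumes latency is measured from the moment a record is emitted at the source to the moment it is ingested, which the requirement above does not specify. The function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# 120-second ceiling taken from the delivery requirement.
MAX_DELIVERY_LATENCY = timedelta(seconds=120)


def delivery_latency(emitted_at: datetime, ingested_at: datetime) -> timedelta:
    """Return the time between emission at the source and ingestion.

    Assumption: both timestamps are timezone-aware and come from
    synchronized clocks; the measurement points are illustrative.
    """
    return ingested_at - emitted_at


def within_latency_budget(emitted_at: datetime, ingested_at: datetime) -> bool:
    """True when the record arrived within the 120-second budget."""
    return delivery_latency(emitted_at, ingested_at) <= MAX_DELIVERY_LATENCY


emitted = datetime(2025, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
on_time = emitted + timedelta(seconds=45)
late = emitted + timedelta(seconds=150)

print(within_latency_budget(emitted, on_time))  # True
print(within_latency_budget(emitted, late))     # False
```

A check like this could run on a sample of records at the ingestion side to track how often the budget is exceeded.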

**Telemetry Scope**\
DGX Cloud will provide the NCP with a detailed specification document listing the required metrics and logs. Upon receipt, the NCP shall provide a formal written response detailing the following:

* Confirmation of its ability to deliver the specified metrics and logs.
* Projected timelines for delivery.
* Specific technical details, including metric names, label names, and label values.
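
The three response items above could be captured in a machine-readable form alongside the written response. This is a sketch under stated assumptions: the `MetricResponse` type, its field names, and the example metric and label names are hypothetical and do not come from the DGX Cloud specification document.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class MetricResponse:
    """One entry in a hypothetical NCP response to the specification."""

    metric_name: str          # metric name as the NCP will emit it
    can_deliver: bool         # confirmation of ability to deliver
    projected_delivery: str   # projected timeline, here an ISO 8601 date
    # Maps each label name to the label values the NCP will emit.
    labels: dict[str, list[str]] = field(default_factory=dict)


def to_json(entries: list[MetricResponse]) -> str:
    """Serialize response entries to JSON for exchange or review."""
    return json.dumps([asdict(entry) for entry in entries], indent=2)


# Placeholder values for illustration only.
example = MetricResponse(
    metric_name="example_link_rx_bytes_total",
    can_deliver=True,
    projected_delivery="2025-06-30",
    labels={"network_domain": ["east_west", "north_south"]},
)

print(to_json([example]))
```

Keeping metric names, label names, and label values in one structure makes it easier to compare the response against the specification line by line.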

**Network Telemetry**\
The NCP shall provide network telemetry across the following domains:

* North-South (Front-End) Network (client-facing and external interconnects)
* East-West (Back-End) Network (GPU-to-GPU interconnects)
* Management Network (control plane and orchestration traffic)
* NVSwitch Fabric (intra-node GPU switching, applicable only to GB200 and later clusters)
* Host Network (NIC-level and server connectivity)
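
A coverage check over the five domains above might look like the following sketch. The enum values, function names, and the `has_nvswitch_fabric` flag are assumptions made for illustration; the caller decides whether the NVSwitch Fabric domain applies to a given cluster.

```python
from enum import Enum


class NetworkDomain(Enum):
    """The five network telemetry domains; string values are hypothetical."""

    NORTH_SOUTH = "north_south"          # front-end network
    EAST_WEST = "east_west"              # back-end network
    MANAGEMENT = "management"            # control plane and orchestration
    NVSWITCH_FABRIC = "nvswitch_fabric"  # only where applicable
    HOST = "host"                        # NIC-level and server connectivity


def required_domains(has_nvswitch_fabric: bool) -> set[NetworkDomain]:
    """Return the domains a cluster must report telemetry for."""
    domains = set(NetworkDomain)
    if not has_nvswitch_fabric:
        domains.discard(NetworkDomain.NVSWITCH_FABRIC)
    return domains


def missing_domains(reported, has_nvswitch_fabric: bool) -> set[NetworkDomain]:
    """Return required domains that have no reported telemetry."""
    return required_domains(has_nvswitch_fabric) - set(reported)


reported = [NetworkDomain.NORTH_SOUTH, NetworkDomain.EAST_WEST, NetworkDomain.HOST]
missing = missing_domains(reported, has_nvswitch_fabric=False)
print(sorted(domain.value for domain in missing))  # ['management']
```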

**Logs**\
DGX Cloud requires the NCP to provide logs from a range of network technologies, including but not limited to:

1. Fabric Manager logs for the NVLink domain *(where applicable)*
2. Subnet Manager logs for the NVLink domain *(where applicable)*
3. VPC Flow logs (all ingress/egress traffic)
4. UFM Event logs
5. General switch logs
6. Switch syslogs
7. Switch kernel logs
8. BMC SEL logs
9. syslogs