To be considered for offtake, NCPs must demonstrate the ability to meet the SLAs by category and the operational requirements below.
The NCP must demonstrate API readiness and transport establishment at least 12 weeks ahead of GPU delivery, and must be able to provide Dev capacity (CPU nodes only) with the API integrated 6 weeks prior to GPU and cluster delivery.
A key request is early access to ancillary compute nodes that will serve as the Data Mover function. This allows us to pre-position data in the data center so that it is ready when the GPUs become available. Access to Data Mover compute (and target storage) should be available approximately 2 weeks ahead of GPU cluster delivery.
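The lead times above can be turned into calendar deadlines with simple date arithmetic. The following is a minimal sketch, not a prescribed tool: it assumes that "GPU delivery" and "GPU cluster delivery" refer to a single date, that the 12-week lead time applies to both API readiness and transport, and that the approximate 2-week Data Mover lead time is exactly 14 days. The function name, milestone labels, and the example delivery date are hypothetical.

```python
from datetime import date, timedelta

# Lead times in weeks before GPU cluster delivery, taken from the text above.
# Assumption: the Data Mover lead time ("~2 weeks") is treated as exactly 2 weeks.
LEAD_TIMES_WEEKS = {
    "API readiness and transport established": 12,
    "Dev capacity (CPU nodes only) with API integrated": 6,
    "Data Mover compute and target storage available": 2,
}


def readiness_milestones(gpu_delivery: date) -> dict[str, date]:
    """Return the latest acceptable date for each readiness milestone."""
    return {
        name: gpu_delivery - timedelta(weeks=weeks)
        for name, weeks in LEAD_TIMES_WEEKS.items()
    }


if __name__ == "__main__":
    # Hypothetical delivery date, for illustration only.
    for name, deadline in readiness_milestones(date(2025, 9, 1)).items():
        print(f"{deadline.isoformat()}  {name}")
```

For a delivery date of 1 September 2025, this yields deadlines of 9 June, 21 July, and 18 August 2025 respectively.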
The NCP shall deliver all required telemetry, including metrics and logs, in a form that DGX Cloud systems can ingest. The preferred method is native delivery via the OpenTelemetry Protocol (OTLP), with a latency of no more than 120 seconds.
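One way to verify the 120-second limit is to compare the timestamp at which a telemetry record was emitted with the timestamp at which it was ingested. The sketch below assumes latency is measured between those two points; the requirement does not define the measurement points, so that choice, along with the function names, is an assumption made for illustration.

```python
from datetime import datetime, timedelta, timezone

# Upper bound stated in the requirement.
MAX_TELEMETRY_LATENCY = timedelta(seconds=120)


def ingestion_latency(emitted_at: datetime, ingested_at: datetime) -> timedelta:
    """Time elapsed between a record being emitted and being ingested."""
    return ingested_at - emitted_at


def meets_latency_requirement(emitted_at: datetime, ingested_at: datetime) -> bool:
    """True if the record arrived within the 120-second limit (inclusive)."""
    return ingestion_latency(emitted_at, ingested_at) <= MAX_TELEMETRY_LATENCY


if __name__ == "__main__":
    # Hypothetical timestamps, for illustration only.
    emitted = datetime(2025, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
    on_time = emitted + timedelta(seconds=90)
    late = emitted + timedelta(seconds=150)
    print(meets_latency_requirement(emitted, on_time))
    print(meets_latency_requirement(emitted, late))
```

Both timestamps should be timezone-aware and taken from synchronized clocks; otherwise clock skew between the NCP and the ingesting system distorts the measurement.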
NVIDIA Exemplar Cloud seeks to improve performance per TCO with hardware and software recipes, references, tools, and capabilities. The latest publicly available release from https://github.com/NVIDIA/dgxc-benchmarking must be run to successful completion on one uniform hardware cluster type (always pick the latest release version from the GitHub repo). Please run all the workloads for a given release and share the results in the template below.
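To confirm which release is the latest before running the workloads, the release tag can be looked up programmatically. The sketch below assumes the repository publishes its versions as GitHub Releases and queries the GitHub REST API "latest release" endpoint, which returns the most recent non-draft, non-prerelease release. If the project tracks versions differently (for example, tags only), the release page should be checked by hand instead. The helper names are hypothetical.

```python
import json
import urllib.request

# Repository path taken from the URL in the requirement.
REPO = "NVIDIA/dgxc-benchmarking"


def latest_release_api_url(repo: str = REPO) -> str:
    """Build the GitHub REST API URL for a repository's latest release."""
    return f"https://api.github.com/repos/{repo}/releases/latest"


def parse_release_tag(payload: str) -> str:
    """Extract the release tag from the JSON body returned by the API."""
    return json.loads(payload)["tag_name"]


def fetch_latest_release_tag(repo: str = REPO) -> str:
    """Fetch the latest release tag over the network (unauthenticated)."""
    request = urllib.request.Request(
        latest_release_api_url(repo),
        headers={"Accept": "application/vnd.github+json"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return parse_release_tag(response.read().decode("utf-8"))


if __name__ == "__main__":
    print(fetch_latest_release_tag())
```

Recording the returned tag alongside the submitted results makes it clear which release the workloads were run against.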