What is NVIDIA Cloud Accelerator?
NVIDIA Cloud Accelerator (NCX) is a portfolio of open, modular infrastructure software components that cloud partners can use to build and operate NVIDIA-powered AI clouds. Drawing on lessons from NVIDIA’s own AI factories, it consists of composable building blocks across the infrastructure and platform layers, including hardware lifecycle management, health monitoring, and operational automation.
NVIDIA Cloud Functions (NVCF) is a unified API layer for scaling inference and simulation workloads across one or more Kubernetes clusters.
KAI Scheduler is a scalable Kubernetes scheduler optimized for GPU resource allocation across large-scale AI and machine learning systems.
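As a sketch of how a workload might target a non-default scheduler like this, the Pod spec below requests a GPU and names the scheduler explicitly. The scheduler name, queue label, and container image here are illustrative assumptions, not confirmed KAI Scheduler configuration.

```yaml
# Illustrative only: schedulerName and the queue label are assumptions,
# not confirmed KAI Scheduler configuration.
apiVersion: v1
kind: Pod
metadata:
  name: train-job
  labels:
    queue: team-a              # hypothetical queue label for fair-share grouping
spec:
  schedulerName: kai-scheduler # opt this Pod out of the default kube-scheduler
  containers:
    - name: trainer
      image: nvcr.io/nvidia/pytorch:24.01-py3   # example image
      resources:
        limits:
          nvidia.com/gpu: 1    # standard NVIDIA device-plugin GPU request
```

Setting `spec.schedulerName` is the standard Kubernetes mechanism for handing a Pod to an alternative scheduler; the GPU request itself uses the usual `nvidia.com/gpu` extended resource exposed by the NVIDIA device plugin.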
NVIDIA Fleet Intelligence is an agent-based managed service that provides continuous GPU health monitoring and predictive failure signals for maximum uptime and stability.
Bare-metal provisioning and secure lifecycle management for multi-tenant GPU infrastructure.
AI Cluster Runtime provides a canonical, continuously validated definition of the NVIDIA-accelerated Kubernetes runtime for reproducible AI infrastructure.
NVSentinel provides open-source, Kubernetes-native GPU monitoring and fault remediation, detecting issues early and automating recovery to keep GPU fleets productive.
The NVIDIA DOCA Platform Framework (DPF) is an orchestration system for building, deploying, and operating BlueField-accelerated infrastructure services, enabling partners to build secure, multi-tenant cloud infrastructure for AI and other modern applications.
Project GPUd is a lightweight, production-proven GPU telemetry agent. It integrates with Docker, containerd, Kubernetes, and NVIDIA ecosystems, while providing a unified view of critical metrics.
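To illustrate the kind of unified view such a telemetry agent enables, the sketch below aggregates per-GPU health records into a single summary. The JSON field names are hypothetical stand-ins, not GPUd's actual schema.

```python
import json

# Hypothetical telemetry payload shaped like per-GPU health records;
# the field names are illustrative, not GPUd's actual output format.
sample = json.loads("""
[
  {"gpu": 0, "temperature_c": 64, "ecc_errors": 0, "healthy": true},
  {"gpu": 1, "temperature_c": 91, "ecc_errors": 3, "healthy": false}
]
""")

def summarize(records, temp_limit_c=85):
    """Flag GPUs that report unhealthy status or exceed a temperature limit."""
    flagged = [r["gpu"] for r in records
               if not r["healthy"] or r["temperature_c"] > temp_limit_c]
    return {"total": len(records), "flagged": flagged}

print(summarize(sample))  # -> {'total': 2, 'flagged': [1]}
```

A fleet operator would feed many such summaries into higher-level tooling (for example, the remediation loops described above) rather than inspecting GPUs one at a time.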
The NVIDIA AI Cloud-Ready ISV Validation Initiative qualifies and validates AI infrastructure and platform software from ISVs for deployment on NVIDIA Cloud Partners (NCPs).
This guide is intended to provide NVIDIA Cloud Partners (NCPs), Cloud Service Providers (CSPs), and Independent Software Vendors (ISVs) with an infrastructure-native Northstar reference for building AI cloud services on NCP hardware platforms that can be operated with multi-tenancy and elastic resource allocations.
This document outlines a software architecture intended to help NVIDIA Cloud Partners (NCPs)—sometimes called operators—build a performant, cost-effective solution for large-scale AI inference workloads. It is intended to provide NCPs and ISVs with a Northstar definition that will best serve AI practitioners and cloud operators alike.
These are the standards and expectations for NVIDIA Cloud Partners (NCPs) operating NVIDIA GPU-accelerated AI cloud infrastructure. They cover the full operational stack, from compute and Kubernetes to storage, networking, security, telemetry, and fleet management, and expand on the NVIDIA hardware reference design and NCP Software Reference Guide.