This document is a software reference guide that helps NVIDIA Cloud Partners (NCPs), Cloud Service Providers (CSPs), and Independent Software Vendors (ISVs) build AI cloud services on NCP hardware platforms. It presents an infrastructure-native Northstar reference that supports multi-tenancy and elastic resource allocation.
As inference workloads grow in importance, organizations increasingly require fungible compute resources that can dynamically satisfy multi-tenant training and inference workloads. This shift has driven the need for a cloud-native approach to building and operating AI service stacks.
This document describes how to build such a solution using Kubernetes and modern AI Platform-as-a-Service (PaaS) offerings. The document also provides guidance on optimizing the performance of AI inference and training within a virtualized environment, where workloads can run on either shared or dedicated physical hosts.
Built on top of the NCP Hardware Reference Design, the architecture stack follows a layered approach:
The layered design enables NCPs to deliver dynamic, multi-tenant AI services competitive with hyperscale cloud service providers. The architecture is infrastructure-native; compute, storage, and networking resources are allocated in an on-demand model rather than statically provisioned.
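The difference between on-demand allocation and static provisioning can be illustrated with a minimal sketch. Everything in it is hypothetical and is not part of any NVIDIA or Kubernetes API: the `GpuPool` class, its methods, and the tenant names are assumptions chosen for illustration. It models only GPU counts, and it leaves out scheduling policy, isolation, storage, and networking.

```python
from dataclasses import dataclass, field


@dataclass
class GpuPool:
    """Hypothetical shared pool of GPUs leased to tenants on demand."""
    total: int
    leases: dict[str, int] = field(default_factory=dict)

    @property
    def free(self) -> int:
        # Capacity not currently leased to any tenant.
        return self.total - sum(self.leases.values())

    def allocate(self, tenant: str, count: int) -> bool:
        """Lease `count` GPUs to `tenant` if the pool has capacity."""
        if count <= 0 or count > self.free:
            return False
        self.leases[tenant] = self.leases.get(tenant, 0) + count
        return True

    def release(self, tenant: str, count: int) -> None:
        """Return up to `count` GPUs from `tenant` to the pool."""
        held = self.leases.get(tenant, 0)
        remaining = held - min(count, held)
        if remaining:
            self.leases[tenant] = remaining
        else:
            self.leases.pop(tenant, None)


pool = GpuPool(total=8)
pool.allocate("tenant-a", 6)   # training job takes most of the pool
pool.allocate("tenant-b", 4)   # rejected: only 2 GPUs are free
pool.release("tenant-a", 4)    # training job scales down
pool.allocate("tenant-b", 4)   # succeeds using the freed capacity
```

Under static provisioning, each tenant would hold a fixed share of the eight GPUs whether or not it was using them. In this sketch, capacity released by one tenant becomes immediately available to another.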
NCPs can work with an ecosystem of independent software vendors (ISVs) or implement open-source tools to deliver security and workload isolation. This enables the operation of a concurrent multi-tenant private cloud that leverages the performance-optimized stack outlined in this reference architecture. NVIDIA certifies the performance of third-party solutions so that NCPs can choose a partner with confidence.
The document is organized into two sections:
This document is a reference guide, not an implementation manual. It describes capabilities required at each layer and identifies where software provided by NVIDIA may be integrated.
The user personas in the table below are the roles that administer, use, or otherwise interact with a system built according to the NCP Software Reference Guide.
User Personas
Terms used in this document are defined in the table below.
Terms and Definitions