AI Platform-as-a-Service

The NCP Software Reference is intended to run a wide range of industry AI platforms so that NCPs can best support their customers' workloads. Examples include Amazon SageMaker, Red Hat® AI Inference Server, NVIDIA NIM™ and NVIDIA NeMo™, SkyPilot, Slurm (and Slinky), Run:ai, and more.

NVIDIA provides AI platform software for training and inference workloads.

Note

These platforms serve different deployment models. Run:ai is suited for shared Kubernetes environments with multi-team GPU scheduling. NVIDIA Cloud Functions provides serverless inference with dedicated multitenancy. Slurm is suited for single-tenant HPC and large-scale training. NCPs should select a platform based on their tenant requirements.
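The pairing of deployment model and platform described in the note can be written down as a small lookup table. The following is a minimal sketch only: the `DeploymentModel` enum, the `SUGGESTED_PLATFORM` table, and the `suggest_platform` helper are hypothetical names introduced for illustration and are not part of any NVIDIA API. A real selection would weigh further tenant requirements that this section does not enumerate.

```python
from enum import Enum


class DeploymentModel(Enum):
    """Deployment models named in the note (labels are illustrative)."""
    SHARED_KUBERNETES = "shared Kubernetes with multi-team GPU scheduling"
    SERVERLESS_INFERENCE = "serverless inference with dedicated multitenancy"
    SINGLE_TENANT_HPC = "single-tenant HPC and large-scale training"


# Assumption: a one-to-one mapping taken directly from the note's wording.
SUGGESTED_PLATFORM = {
    DeploymentModel.SHARED_KUBERNETES: "Run:ai",
    DeploymentModel.SERVERLESS_INFERENCE: "NVIDIA Cloud Functions",
    DeploymentModel.SINGLE_TENANT_HPC: "Slurm",
}


def suggest_platform(model: DeploymentModel) -> str:
    """Return the platform the note associates with a deployment model."""
    return SUGGESTED_PLATFORM[model]


if __name__ == "__main__":
    # Print the full mapping, one deployment model per line.
    for model in DeploymentModel:
        print(f"{model.value}: {suggest_platform(model)}")
```

Keeping the mapping in a plain dictionary makes it straightforward to extend if an NCP evaluates additional platforms for the same deployment models.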

For detailed descriptions, see Part 2: NVIDIA Software References.