DGX SuperPOD Software#

DGX SuperPOD is an integrated hardware and software solution. The included software (Figure 13) is optimized for AI from top to bottom. From the accelerated frameworks and workflow management through to system management and low-level operating system (OS) optimizations, every part of the stack is designed to maximize the performance and value of DGX SuperPOD.

_images/image16.png

Figure 13. DGX SuperPOD high-level software architecture

NVIDIA Mission Control#

NVIDIA Mission Control is standard on every DGX SuperPOD with DGX B200 systems. It streamlines AI operations, from workloads to infrastructure, by delivering operational expertise as software for AI data centers. It provides agility for inference and training workloads and full-stack intelligence for infrastructure resilience, so that enterprises can run AI with hyperscale efficiency while simplifying and accelerating AI experimentation.

NVIDIA Mission Control includes NVIDIA Base Command Manager and NVIDIA Run:ai functionality as part of an integrated software delivery that spans configuration, validation, and operations.

NVIDIA Base Command Manager#

NVIDIA Base Command Manager offers fast deployment and basic end-to-end management for heterogeneous AI and HPC clusters. It automates provisioning and administration of DGX SuperPOD from hundreds to thousands of nodes.

NVIDIA Run:ai#

NVIDIA Run:ai is a cloud-native AI workload and GPU orchestration platform. It simplifies and accelerates AI and machine learning on DGX SuperPOD through dynamic resource allocation, comprehensive AI lifecycle support, strategic resource management, and advanced scheduling. Run:ai maximizes GPU efficiency and workload capacity. Its policy engine, open architecture, and visibility into AI workloads help align resource use with business objectives.

NVIDIA NGC#

NVIDIA NGC provides software to meet the needs of data scientists, developers, and researchers with various levels of AI expertise.

Software hosted on NGC is scanned against an aggregated set of common vulnerabilities and exposures (CVEs), crypto, and private keys.

Software from the NGC catalog is tested to scale across multiple GPUs and, in some cases, across multiple nodes, so that users can make full use of their DGX SuperPOD.

NVIDIA AI Enterprise#

NVIDIA AI Enterprise is the end-to-end software platform that brings generative AI within reach of every enterprise, providing the fastest and most efficient runtime for generative AI foundation models developed with the NVIDIA DGX platform. With production-grade security, stability, and manageability, it streamlines the development of generative AI solutions. NVIDIA AI Enterprise is included with DGX SuperPOD, giving enterprise developers access to pretrained models, optimized frameworks, microservices, accelerated libraries, and enterprise support.