Appendix B
NVIDIA Enterprise Reference Architecture: NVIDIA H100 NVL and NVIDIA Spectrum Platforms
The NVIDIA H100 NVL and NVIDIA Spectrum Platforms Enterprise RA is optimized for multi-node AI or hybrid applications. This modular architecture is based on NVIDIA-Certified H100 NVL systems, each equipped with four H100 NVL GPUs. Using a four-node scalable unit (SU), the architecture scales up to 32 NVIDIA-Certified H100 NVL systems, totaling 128 H100 NVL GPUs. Fully tested systems scale to eight SUs, with the potential for larger clusters based on customer requirements. The flexible, rail-optimized, end-of-row network architecture accommodates changes in rack layout and in the number of servers per rack. Hardware support is provided through the fulfillment system partner, while software support from NVIDIA is available through a per-GPU paid subscription to NVIDIA AI Enterprise.
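As a quick sanity check of the scaling arithmetic above, the following minimal sketch computes system and GPU counts from an SU count. The helper name and constants are illustrative only, derived from the figures stated in this section (four nodes per SU, four GPUs per node):

```python
# Minimal sketch of the SU scaling arithmetic described above.
# Names are illustrative assumptions, not part of the reference architecture.

NODES_PER_SU = 4    # four-node scalable unit
GPUS_PER_NODE = 4   # each NVIDIA-Certified system carries four H100 NVL GPUs

def cluster_size(num_sus: int) -> tuple[int, int]:
    """Return (systems, GPUs) for a cluster built from `num_sus` scalable units."""
    systems = num_sus * NODES_PER_SU
    return systems, systems * GPUS_PER_NODE

# Eight SUs reproduce the fully tested maximum: 32 systems, 128 GPUs.
assert cluster_size(8) == (32, 128)
```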
Use Cases
AI Inference: Medium-sized model inference workloads
AI Training: Small model training and fine-tuning
NVIDIA H100 NVL Reference Configurations
The NVIDIA H100 NVL Tensor Core GPU is an optimized platform for LLM inference, offering high compute density, high memory bandwidth, high energy efficiency, and a unique NVLink architecture. It delivers unprecedented acceleration to power the world’s highest-performing elastic data centers for AI, data analytics, and high-performance computing (HPC) applications. The NVIDIA H100 NVL card is a dual-slot, 10.5-inch PCI Express Gen5 card based on the NVIDIA Hopper architecture. NVIDIA-Certified H100 NVL systems are based on a common system design, with flexibility to optimize the configuration to match cluster requirements. Systems are available in 2-GPU, 4-GPU, and 8-GPU configurations. This reference architecture was built using the 4-GPU pattern (2-4-3-200: CPU-GPU-NIC-bandwidth), but the 8-GPU pattern (2-8-5-200) can also be used based on specific needs.
Figure 7. 8-GPU NVIDIA H100 NVL system configuration
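To make the pattern notation concrete, the sketch below encodes the two configuration patterns named above, assuming the four fields denote CPU count, GPU count, NIC count, and per-NIC link speed in Gb/s. The class and field names are assumptions for this sketch, not NVIDIA nomenclature:

```python
# Illustrative encoding of the node configuration patterns named above.
# Assumption: in "2-4-3-200", the trailing figure is per-NIC bandwidth in Gb/s.
from dataclasses import dataclass

@dataclass(frozen=True)
class NodePattern:
    cpus: int
    gpus: int
    nics: int
    nic_gbps: int  # per-NIC link speed in Gb/s (assumed interpretation)

    @property
    def node_bandwidth_gbps(self) -> int:
        """Aggregate network bandwidth per node across all NICs."""
        return self.nics * self.nic_gbps

FOUR_GPU = NodePattern(cpus=2, gpus=4, nics=3, nic_gbps=200)   # 2-4-3-200
EIGHT_GPU = NodePattern(cpus=2, gpus=8, nics=5, nic_gbps=200)  # 2-8-5-200

for pattern in (FOUR_GPU, EIGHT_GPU):
    print(pattern, f"-> {pattern.node_bandwidth_gbps} Gb/s aggregate per node")
```

Under these assumptions, the 4-GPU pattern provides 600 Gb/s of aggregate network bandwidth per node and the 8-GPU pattern 1,000 Gb/s, which is one way to compare the two options when sizing a cluster.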