DGX SuperPOD Architecture

The DGX SuperPOD architecture combines DGX systems, InfiniBand and Ethernet networking, management nodes, and storage. Figure 2 shows the rack layout of a single scalable unit (SU). In this example, power consumption per rack exceeds 40 kW. The rack layout can be adjusted to meet local data center requirements, such as the maximum power per rack and the placement of DGX systems relative to supporting equipment, so that power and cooling distribution suit the site.

Figure 2. Complete single SU rack layout

_images/superpod-h100-arch-01.png
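To make the per-rack power trade-off concrete, the following is a minimal sketch of how a site's rack power limit changes the number of racks needed for one SU. The function name `plan_racks`, the 10,200 W per-system figure, and the example rack limits are illustrative assumptions and are not taken from this reference architecture. The 32-system SU size follows from the 4 SU, 128-node configuration this document describes. Supporting equipment is ignored.

```python
import math


def plan_racks(num_systems: int, system_watts: int, rack_limit_watts: int) -> tuple[int, int]:
    """Return (systems_per_rack, racks_needed) under a per-rack power limit.

    Integer watts are used to avoid floating-point rounding in the division.
    All inputs are hypothetical planning values supplied by the caller.
    """
    if system_watts <= 0 or num_systems <= 0:
        raise ValueError("system count and system power must be positive")
    if rack_limit_watts < system_watts:
        raise ValueError("rack limit must cover at least one system")
    systems_per_rack = rack_limit_watts // system_watts
    racks_needed = math.ceil(num_systems / systems_per_rack)
    return systems_per_rack, racks_needed


# Assumed values: 32 systems per SU, 10,200 W per system (an assumption).
for limit in (45_000, 25_000):
    per_rack, racks = plan_racks(32, 10_200, limit)
    print(f"{limit} W limit: {per_rack} systems per rack, {racks} racks")
```

With these assumed numbers, a 45 kW limit allows four systems per rack and a 25 kW limit allows two, which doubles the rack count for the same SU.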

Figure 3 shows an example management rack configuration with networking switches, management servers, storage arrays, and UFM appliances. Sizes and quantities vary depending on the models used.

Figure 3. Management rack configuration

_images/superpod-h100-arch-02.png

This reference architecture focuses on a 4 SU configuration with 128 DGX nodes. DGX SuperPOD can scale to much larger configurations, up to and beyond 64 SUs with more than 2,000 DGX H100 nodes. See Table 3 for more information.

Table 3. Larger SuperPOD component counts

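| SU Count | Node Count | GPU Count | InfiniBand Leaf Switches | InfiniBand Spine Switches | InfiniBand Core Switches | Node-Leaf Cables | Leaf-Spine Cables | Spine-Core Cables |
|---|---|---|---|---|---|---|---|---|
| 4 | 128 | 1024 | 32 | 16 | - | 1024 | 1024 | 1024 |
| 8 | 256 | 2048 | 64 | 32 | - | 2048 | 2048 | 2048 |
| 16 | 512 | 4096 | 128 | 128 | 64 | 4096 | 4096 | 4096 |
| 32 | 1024 | 8192 | 256 | 256 | 128 | 8192 | 8192 | 8192 |
| 64 | 2048 | 16384 | 512 | 512 | 256 | 16384 | 16384 | 16384 |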
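The node, GPU, leaf switch, and node-leaf cable counts in Table 3 scale linearly with the number of SUs. The sketch below reproduces those columns from the ratios implied by the 4 SU row (32 nodes per SU, 8 GPUs per node, 8 leaf switches per SU, one node-leaf cable per GPU). These ratios are inferred from the table rather than stated by it, and the names `su_counts`, `NODES_PER_SU`, `GPUS_PER_NODE`, and `LEAF_SWITCHES_PER_SU` are hypothetical. Spine and core counts are left out because they do not follow a single linear rule across the rows.

```python
# Ratios inferred from the 4 SU row of Table 3 (assumptions, not stated rules).
NODES_PER_SU = 32          # 128 nodes / 4 SUs
GPUS_PER_NODE = 8          # 1024 GPUs / 128 nodes
LEAF_SWITCHES_PER_SU = 8   # 32 leaf switches / 4 SUs


def su_counts(su_count: int) -> dict[str, int]:
    """Estimate the linearly scaling component counts for a given SU count."""
    if su_count <= 0:
        raise ValueError("su_count must be positive")
    nodes = su_count * NODES_PER_SU
    gpus = nodes * GPUS_PER_NODE
    return {
        "su": su_count,
        "nodes": nodes,
        "gpus": gpus,
        "leaf_switches": su_count * LEAF_SWITCHES_PER_SU,
        "node_leaf_cables": gpus,  # assumed: one compute-fabric cable per GPU
    }


for su in (4, 8, 16, 32, 64):
    print(su_counts(su))
```

This is only a planning aid for checking the table's arithmetic. Actual counts for a deployment depend on the fabric design chosen for that scale.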

Contact NVIDIA for more information regarding DGX SuperPOD solutions beyond four scalable units.