DGX SuperPOD Architecture
The DGX SuperPOD architecture combines DGX systems, Ethernet networking, InfiniBand networking, management nodes, and storage. Figure 4 shows the rack layout of a single SU. DGX SuperPOD with DGX RUBIN NVL8 systems uses standard racks with traditional power supplies and PDUs.
In our reference design, eight DGX RUBIN NVL8 systems fit within a single rack, and power consumption per rack is approximately 225 kW. The rack layout can be adjusted to meet local data center requirements, such as the maximum power per rack and the placement of DGX systems relative to supporting equipment for power and cooling distribution.
Figure 4 shows 72 NVIDIA DGX RUBIN NVL8 PS systems in standard racks, each with three 2U rack PDUs for maximum redundancy. Depending on your data center’s capabilities, you might need to reduce the number of DGX systems hosted in the same rack.
Figure 4 DGX RUBIN NVL8 in Racks
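The rack arithmetic above can be sketched in a few lines of Python. This is a rough planning aid, not a sizing tool: the function names are hypothetical, and the per-system power figure is an assumption obtained by dividing the approximate 225 kW rack figure evenly across eight systems, which ignores PDU and other overhead.

```python
import math

SYSTEMS_PER_SU = 72      # from the text: 72 DGX systems per SU
SYSTEMS_PER_RACK = 8     # reference design: eight systems per rack
RACK_POWER_KW = 225.0    # approximate rack power in the reference design


def systems_per_rack(rack_power_limit_kw: float) -> int:
    """Estimate how many systems fit in one rack under a local power limit.

    Assumption: rack power splits evenly across systems (about 28 kW each).
    """
    per_system_kw = RACK_POWER_KW / SYSTEMS_PER_RACK
    fit = int(rack_power_limit_kw // per_system_kw)
    # Never exceed the eight systems of the reference layout.
    return min(SYSTEMS_PER_RACK, fit)


def racks_per_su(rack_power_limit_kw: float) -> int:
    """Number of DGX racks needed to host one SU under the given limit."""
    fit = systems_per_rack(rack_power_limit_kw)
    if fit == 0:
        raise ValueError("power limit is too low to host a single system")
    return math.ceil(SYSTEMS_PER_SU / fit)


if __name__ == "__main__":
    for limit_kw in (225.0, 120.0):
        print(limit_kw, systems_per_rack(limit_kw), racks_per_su(limit_kw))
```

With the reference 225 kW limit this gives eight systems per rack and nine racks per SU. A site limited to 120 kW per rack would, under the same even-split assumption, host four systems per rack and need 18 racks.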
Figure 5 shows an example management rack configuration with networking switches, management servers, storage arrays, and UFM appliances. Sizes and quantities vary depending on the models used. This example is for one SU.
Figure 5 Management Rack Configuration with Networking Switches
This reference architecture focuses on eight SUs with 576 DGX nodes. DGX SuperPOD can scale to much larger configurations, up to and beyond 72 SUs with more than 2000 DGX RUBIN NVL8 nodes. For more information, see Table 3.
| SU Count | Node Count | GPU Count | Leaf Switches | Spine Switches | Cable Count: Node-Leaf | Cable Count: Leaf-Spine |
|---|---|---|---|---|---|---|
| 1 | 72 | 576 | 8 | 4 | 576 | 576 |
| 2 | 144 | 1152 | 16 | 8 | 1152 | 1152 |
| 4 | 288 | 2304 | 32 | 18 | 2304 | 2304 |
| 8 | 576 | 4608 | 64 | 36 | 4608 | 4608 |
| 16 | 1152 | 9216 | 128 | 64 | 9216 | 9216 |
Contact NVIDIA for information regarding DGX SuperPOD solutions for 18 SUs or more.
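As a cross-check on the table, the columns that scale linearly with SU count can be reproduced with a short Python sketch. The per-SU constants are read off the table rows rather than stated as rules in the text, and the `FabricSize` and `fabric_size` names are hypothetical. Spine switch counts are deliberately left out because the tabulated values (4, 8, 18, 36, 64) do not follow a single per-SU multiple.

```python
from dataclasses import dataclass

# Constants inferred from the table rows (assumptions, not stated rules).
NODES_PER_SU = 72    # 72 nodes in the 1-SU row
GPUS_PER_NODE = 8    # 576 GPUs / 72 nodes
LEAVES_PER_SU = 8    # 8 leaf switches in the 1-SU row


@dataclass(frozen=True)
class FabricSize:
    su_count: int
    nodes: int
    gpus: int
    leaf_switches: int
    node_leaf_cables: int
    leaf_spine_cables: int


def fabric_size(su_count: int) -> FabricSize:
    """Compute the linearly scaling table columns for a given SU count."""
    if su_count < 1:
        raise ValueError("su_count must be at least 1")
    nodes = su_count * NODES_PER_SU
    gpus = nodes * GPUS_PER_NODE
    return FabricSize(
        su_count=su_count,
        nodes=nodes,
        gpus=gpus,
        leaf_switches=su_count * LEAVES_PER_SU,
        # In every table row, both cable counts equal the GPU count.
        node_leaf_cables=gpus,
        leaf_spine_cables=gpus,
    )


if __name__ == "__main__":
    for su in (1, 2, 4, 8, 16):
        print(fabric_size(su))
```

For the eight-SU configuration this yields 576 nodes, 4608 GPUs, 64 leaf switches, and 4608 cables of each type, matching the corresponding table row.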