DGX SuperPOD Architecture

The DGX SuperPOD architecture combines DGX systems, Ethernet networking, InfiniBand networking, management nodes, and storage. Figure 2 shows the rack layout of a single SU. DGX SuperPOD with DGX B300 systems uses standard racks with traditional power supplies and PDUs.

In our reference design, four DGX B300 systems fit within a single rack, for a power consumption of approximately 56 kW per rack. The rack layout can be adjusted to meet local data center requirements for power and cooling distribution, such as the maximum power per rack and the placement of DGX systems relative to supporting equipment.
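The rack power figures above can be turned into a rough density check. The following is a minimal sketch, not a sizing tool: it assumes the ~56 kW rack figure divides evenly across the four DGX B300 systems (about 14 kW each), and the helper name `systems_per_rack` is hypothetical.

```python
import math

# Figures taken from the reference design described above.
REFERENCE_SYSTEMS_PER_RACK = 4
REFERENCE_RACK_KW = 56.0

# Assumption: rack power divides evenly across the DGX systems in the rack.
KW_PER_SYSTEM = REFERENCE_RACK_KW / REFERENCE_SYSTEMS_PER_RACK


def systems_per_rack(max_rack_kw: float) -> int:
    """Estimate how many DGX systems fit under a per-rack power limit.

    Hypothetical helper: the result is capped at the reference density of
    four systems per rack, even when the power budget would allow more.
    """
    if max_rack_kw < 0:
        raise ValueError("max_rack_kw must be non-negative")
    fit = math.floor(max_rack_kw / KW_PER_SYSTEM)
    return min(fit, REFERENCE_SYSTEMS_PER_RACK)


if __name__ == "__main__":
    # Example per-rack power limits in kW (illustrative values only).
    for limit_kw in (56, 45, 30):
        print(f"{limit_kw} kW per rack: {systems_per_rack(limit_kw)} systems")
```

Actual per-system draw depends on workload and configuration, so treat the result as a starting point for a discussion with the data center operator.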

This reference architecture focuses on EIA racks with traditional PDUs and AC power. DGX SuperPOD with DGX B300 systems is also available in denser DC busbar configurations.

Figure 2 shows 72 NVIDIA DGX B300 PS systems in standard racks, each rack with three 2U rack PDUs for maximum redundancy. Depending on your data center's capabilities, you might need to reduce the number of DGX systems hosted in each rack.

_images/image4.jpeg

Figure 2 DGX B300 in Racks
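The rack and PDU totals implied by the Figure 2 layout follow from simple arithmetic. The sketch below assumes that every compute rack keeps three PDUs even when fewer DGX systems are hosted per rack, and it counts compute racks only. The function name `compute_rack_plan` is hypothetical.

```python
import math

SYSTEMS_PER_SU = 72  # DGX B300 systems in one SU, as shown in Figure 2
PDUS_PER_RACK = 3    # 2U rack PDUs per rack, as shown in Figure 2


def compute_rack_plan(systems_per_rack: int) -> dict:
    """Return compute rack and PDU totals for one SU at a given rack density.

    Assumption: the PDU count per rack does not change with density.
    Management racks are not included.
    """
    if systems_per_rack < 1:
        raise ValueError("systems_per_rack must be at least 1")
    racks = math.ceil(SYSTEMS_PER_SU / systems_per_rack)
    return {"racks": racks, "pdus": racks * PDUS_PER_RACK}


if __name__ == "__main__":
    # Reference density (four per rack) and two reduced densities.
    for density in (4, 3, 2):
        plan = compute_rack_plan(density)
        print(f"{density} systems per rack: {plan['racks']} racks, {plan['pdus']} PDUs")
```

At the reference density of four systems per rack, this gives 18 compute racks and 54 PDUs for one SU.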

Figure 3 shows an example management rack configuration with networking switches, management servers, storage arrays, and UFM appliances. Sizes and quantities vary depending on the models used. This example is for one SU.

_images/image5.png

Figure 3 Management Equipment in Rack

This reference architecture focuses on an 8 SU configuration with 576 DGX nodes. DGX SuperPOD can scale to much larger configurations, up to and beyond 72 SU with 2000+ DGX B300 nodes. See Table 3 for more information.

Table 3 Example Compute Fabric Components for DGX SuperPOD Counts

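| SU Count | Node Count | GPU Count | Leaf Switch Count | Spine Switch Count | Node-Leaf Cable Count | Leaf-Spine Cable Count |
|----------|------------|-----------|-------------------|--------------------|-----------------------|------------------------|
| 1        | 72         | 576       | 8                 | 4                  | 576                   | 576                    |
| 2        | 144        | 1152      | 16                | 8                  | 1152                  | 1152                   |
| 4        | 288        | 2304      | 32                | 18                 | 2304                  | 2304                   |
| 8        | 576        | 4608      | 64                | 36                 | 4608                  | 4608                   |
| 18       | 1296       | 9216      | 144               | 72                 | 10368                 | 10368                  |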

Contact NVIDIA for information regarding DGX SuperPOD solutions of four scalable units or more.
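Several columns of Table 3 follow a fixed per-SU ratio in every row: node count, leaf switch count, and both cable counts. The sketch below reproduces only those columns. The ratios are read off the 1 SU row and are not an official sizing formula, spine switch counts are left out because they do not follow a single multiple in the table, and results for SU counts not listed in Table 3 are extrapolations. The names `FabricCounts` and `fabric_counts` are hypothetical.

```python
from dataclasses import dataclass

# Ratios read off the 1 SU row of Table 3 (assumed constant per SU).
NODES_PER_SU = 72
LEAF_SWITCHES_PER_SU = 8
NODE_LEAF_CABLES_PER_NODE = 8  # 576 node-leaf cables / 72 nodes


@dataclass(frozen=True)
class FabricCounts:
    su_count: int
    nodes: int
    leaf_switches: int
    node_leaf_cables: int
    leaf_spine_cables: int


def fabric_counts(su_count: int) -> FabricCounts:
    """Estimate compute fabric counts for a given number of SUs."""
    if su_count < 1:
        raise ValueError("su_count must be at least 1")
    nodes = su_count * NODES_PER_SU
    node_leaf = nodes * NODE_LEAF_CABLES_PER_NODE
    return FabricCounts(
        su_count=su_count,
        nodes=nodes,
        leaf_switches=su_count * LEAF_SWITCHES_PER_SU,
        node_leaf_cables=node_leaf,
        # Table 3 lists the same number of leaf-spine and node-leaf cables.
        leaf_spine_cables=node_leaf,
    )


if __name__ == "__main__":
    for su in (1, 2, 4, 8, 18):
        print(fabric_counts(su))
```

Keeping the ratios as named constants makes it easy to adjust the sketch if a deployment deviates from the reference design.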