Major Components

Table 7 lists the major components of the DGX SuperPOD configuration. The list is representative and must be finalized against the actual design.

Table 7. Major components of the 4 SU, 127-node DGX SuperPOD

| Category | Count | Component | Recommended Model |
|---|---|---|---|
| Racks | 38 | Rack (Legrand) | NVIDPD13 |
| Nodes | 127 | GPU nodes | DGX H100 system |
| Nodes | 4 | UFM appliance | NVIDIA Unified Fabric Manager Appliance 3.1 |
| Nodes | 5 | Management servers | Intel based x86 2 × Socket, 24 core or greater, 384 GB RAM, OS (2x480GB M.2 or SATA/SAS SSD in RAID 1), NVME 7.68 TB (raw), 4x HDR200 VPI Ports, TPM 2.0 |
| Ethernet Network | 8 | In-band management | NVIDIA SN4600C switch with Cumulus Linux |
| Ethernet Network | 8 | OOB management | NVIDIA SN2201 switch with Cumulus Linux |
| Compute InfiniBand Fabric | 48 | Fabric switches | NVIDIA Quantum QM9700 switch, 920-9B210-00FN-0M0 |
| Storage InfiniBand Fabric | 16 | Fabric switches | NVIDIA Quantum QM9700 switch, 920-9B210-00FN-0M0 |
| PDUs | 96 | Rack PDUs | Raritan PX3-5878I2R-P1Q2R1A15D5 |
| PDUs | 12 | Rack PDUs | Raritan PX3-5747V-V2 |

Associated cables and transceivers are listed in Table 8. All networking components are multi-mode fiber.

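Table 8. Estimate of cables required for a 4 SU, 127-node DGX SuperPOD

| Category | Count | Component | Connection | Recommended Model¹ |
|---|---|---|---|---|
| In-Band Ethernet Cables | 254 | 100 Gbps | DGX H100 system | Varies |
| In-Band Ethernet Cables | 32 | 100 Gbps QSFP to QSFP AOC | Management nodes | Varies |
| In-Band Ethernet Cables | 6 | 100 Gbps | ISL Cables | Varies |
| In-Band Ethernet Cables | Varies | Ethernet (perf varies) | Storage | Varies |
| In-Band Ethernet Cables | Varies | Varies | Core DC | Varies |
| OOB Ethernet Cables | 127 | 1 Gbps | DGX H100 systems | Cat5e |
| OOB Ethernet Cables | 64 | 1 Gbps | InfiniBand Switches | Cat5e |
| OOB Ethernet Cables | 11 | 1 Gbps | Management/UFM nodes | Cat5e |
| OOB Ethernet Cables | 8 | 1 Gbps | In-band Ethernet switches | Cat5e |
| OOB Ethernet Cables | Varies | 1 Gbps | Storage | Cat5e |
| OOB Ethernet Cables | 108 | 1 Gbps | PDUs | Cat5e |
| OOB Ethernet Cables | 16 | 100 Gbps | Two uplinks per OOB to in-band | Varies |
| Compute InfiniBand Cabling | 2040 | NDR Cables¹, 400 Gbps | DGX H100 systems to leaf, leaf to spine | 980-9I57X-00N010 |
| Compute InfiniBand Cabling | 2 | NDR Cables, 200 Gbps | UFM to leaf ports | 980-9I111-00H010 |
| Compute InfiniBand Cabling | 1536 | Switch OSFP Transceivers | Leaf and spine transceivers | 980-9IA2O-00NS00 |
| Compute InfiniBand Cabling | 508 | System OSFP Transceivers | Transceivers in the DGX H100 Systems | 980-9I89P-00N000 |
| Compute InfiniBand Cabling | 4 | UFM System Transceivers | UFM to leaf connections | 980-9I89R-00NS00 |
| Storage InfiniBand Cables¹ ² | 494 | NDR Cables, 400 Gbps | DGX H100 systems to leaf, leaf to spine | 980-9I57X-00N010 |
| Storage InfiniBand Cables¹ ² | 48² | NDR Cables, 200 Gbps | Storage | 980-9I111-00H010 |
| Storage InfiniBand Cables¹ ² | 4 | UFM System Transceivers | UFM to leaf connections | 980-9I51S-00NS00 |
| Storage InfiniBand Cables¹ ² | 369 | Switch Transceivers | Leaf and spine transceivers | 980-9I510-00NS00 |
| Storage InfiniBand Cables¹ ² | 254 | DGX System Transceivers | QSFP112 transceivers | 980-9I693-00NS00 |
| Storage InfiniBand Cables¹ ² | 2 | NDR Cables, 200 Gbps | UFM to leaf ports | 980-9I557-00N030 |
| Storage InfiniBand Cables¹ ² | 4 | HDR 400 Gbps to 2x200 Gbps | Slurm management | 980-9I117-00H030 |
| Storage InfiniBand Cables¹ ² | Varies | Storage Cables, NDR200 | Varies | 980-9I117-00H030 |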

¹ Part number will depend on exact cable lengths needed based on data center requirements.

² Count and cable type required depend on specific storage selected.
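
Several counts in Table 8 follow directly from the node, switch, and PDU counts in Table 7, so the two tables can be cross-checked when the bill of materials is adjusted for a real design. The following is a minimal sketch of such a check, not an official sizing tool. The per-node multipliers (two in-band Ethernet links, one OOB link, four compute OSFP transceivers, and two storage transceivers per DGX H100 system) are assumptions inferred from the ratios of the listed counts, and all dictionary keys and the `check_bom` function are hypothetical names chosen for illustration.

```python
# Counts copied from Table 7.
NODES = 127
TABLE7 = {
    "compute_ib_switches": 48,
    "storage_ib_switches": 16,
    "inband_switches": 8,
    "oob_switches": 8,
    "rack_pdus": 96 + 12,
}

# Counts copied from Table 8.
TABLE8 = {
    "inband_cables_dgx": 254,
    "oob_cables_dgx": 127,
    "oob_cables_ib_switches": 64,
    "oob_cables_inband_switches": 8,
    "oob_cables_pdus": 108,
    "oob_uplinks_100g": 16,
    "compute_system_osfp": 508,
    "storage_dgx_transceivers": 254,
}

# ASSUMPTION: per-node multipliers inferred from the table ratios,
# not stated explicitly in the tables themselves.
INBAND_LINKS_PER_NODE = 2
OOB_LINKS_PER_NODE = 1
COMPUTE_OSFP_PER_NODE = 4
STORAGE_TRANSCEIVERS_PER_NODE = 2
UPLINKS_PER_OOB_SWITCH = 2  # "Two uplinks per OOB to in-band" in Table 8


def check_bom(nodes=NODES, t7=TABLE7, t8=TABLE8):
    """Return {row: (listed, derived)} for every row whose counts disagree."""
    derived = {
        "inband_cables_dgx": nodes * INBAND_LINKS_PER_NODE,
        "oob_cables_dgx": nodes * OOB_LINKS_PER_NODE,
        "oob_cables_ib_switches": t7["compute_ib_switches"] + t7["storage_ib_switches"],
        "oob_cables_inband_switches": t7["inband_switches"],
        "oob_cables_pdus": t7["rack_pdus"],
        "oob_uplinks_100g": t7["oob_switches"] * UPLINKS_PER_OOB_SWITCH,
        "compute_system_osfp": nodes * COMPUTE_OSFP_PER_NODE,
        "storage_dgx_transceivers": nodes * STORAGE_TRANSCEIVERS_PER_NODE,
    }
    return {
        row: (t8[row], want)
        for row, want in derived.items()
        if t8[row] != want
    }


if __name__ == "__main__":
    mismatches = check_bom()
    print("mismatches:", mismatches)
```

With the counts as listed, the derived values agree with Table 8 and the returned dictionary is empty. Rows marked "Varies", and rows that depend on the storage selection or cable lengths, are left out of the check on purpose.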