Major Components
Major components for the DGX SuperPOD configuration are listed in Table 7. These components are representative of the configuration and must be finalized based on the actual design.
Table 7. Major components of the 4 SU, 127-node DGX SuperPOD
| Count | Component | Recommended Model |
|---|---|---|
| **Racks** | | |
| 38 | Rack (Legrand) | NVIDPD13 |
| **Nodes** | | |
| 127 | GPU nodes | DGX B200 system |
| 4 | UFM appliances | NVIDIA Unified Fabric Manager Appliance 3.1 |
| 5 | Management servers | Intel-based x86, 2-socket, 24 cores or greater, 384 GB RAM, OS storage (2 × 480 GB M.2 or SATA/SAS SSD in RAID 1), 7.68 TB NVMe (raw), 4 × HDR200 VPI ports, TPM 2.0 |
| **Ethernet Network** | | |
| 8 | In-band management | NVIDIA SN4600C switch with Cumulus Linux, 64 QSFP28 ports, P2C, 920-9N302-00F7-0C |
| 8 | OOB management | NVIDIA SN2201 switch with Cumulus Linux, 48 RJ45 ports, P2C, 920-9N110-00F1-0C0 |
| **Compute InfiniBand Fabric** | | |
| 48 | Fabric switches | NVIDIA Quantum QM9700 switch, 920-9B210-00FN-0M0 |
| **Storage InfiniBand Fabric** | | |
| 16 | Fabric switches | NVIDIA Quantum QM9700 switch, 920-9B210-00FN-0M0 |
| **PDUs** | | |
| 96 | Rack PDUs | Raritan PX3-5878I2R-P1Q2R1A15D5 |
| 12 | Rack PDUs | Raritan PX3-5747V-V2 |
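For early rack and power planning it can help to tally the Table 7 counts per category. The following is a minimal sketch, not part of the reference architecture; the counts are taken directly from Table 7 and the category grouping is illustrative.

```python
# Minimal sketch: tally Table 7 component counts per category.
# Counts come straight from Table 7; only the grouping structure is ours.
from collections import defaultdict

components = [
    # (category, count, component)
    ("Racks", 38, "Rack (Legrand)"),
    ("Nodes", 127, "GPU nodes (DGX B200)"),
    ("Nodes", 4, "UFM appliances"),
    ("Nodes", 5, "Management servers"),
    ("Ethernet Network", 8, "In-band management switches (SN4600C)"),
    ("Ethernet Network", 8, "OOB management switches (SN2201)"),
    ("Compute InfiniBand Fabric", 48, "QM9700 fabric switches"),
    ("Storage InfiniBand Fabric", 16, "QM9700 fabric switches"),
    ("PDUs", 96, "Rack PDUs (PX3-5878I2R)"),
    ("PDUs", 12, "Rack PDUs (PX3-5747V)"),
]

totals = defaultdict(int)
for category, count, _ in components:
    totals[category] += count

for category, total in totals.items():
    print(f"{category}: {total}")
# Note: the PDU total (108) matches the 108 PDU OOB management cables in Table 8.
```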
Associated cables and transceivers are listed in Table 8. Unless otherwise noted in the table, the optical cables and transceivers are multi-mode fiber.
Table 8. Estimate of cables required for a 4 SU, 127-node DGX SuperPOD
| Count | Component | Connection | Recommended Model |
|---|---|---|---|
| **In-Band Ethernet Cables** | | | |
| 254 | 200 Gbps QSFP56 to QSFP56 AOC | DGX B200 systems | 980-9I440-00H030 |
| 8 | 100 Gbps QSFP28 to QSFP28 AOC | Management nodes | 980-9I13N-00C030 |
| 4 | 100 Gbps QSFP28 CWDM4 single-mode 2 km transceiver | Uplink to core DC | 980-9I17Q-00CM00 |
| 6 | 100 Gbps QSFP to QSFP DAC passive copper cable | ISL cables | 980-9I620-00C00 |
| 8 | 100 Gbps QSFP28 to QSFP28 AOC | NFS storage | 980-9I13N-00C03 |
| 24 | 100 Gbps QSFP28 to QSFP28 AOC | Leaf to core cables | 980-9I13N-00C03 |
| **OOB Ethernet Cables** | | | |
| 127 | 1 Gbps | DGX B200 systems | Cat5e |
| 64 | 1 Gbps | InfiniBand switches | Cat5e |
| 8 | 1 Gbps | Management/UFM nodes | Cat5e |
| 8 | 1 Gbps | In-band Ethernet switches | Cat5e |
| 2 | 1 Gbps | UFM back-to-back | Cat5e |
| 108 | 1 Gbps | PDUs | Cat5e |
| 4 | QSFP to SFP+ adapter | UFM connections | 980-9I71G-00J000 |
| 4 | Ethernet module, SFP BaseT 1G | UFM connections | 980-9I251-00IS00 |
| 16 | 100 Gbps QSFP28 to QSFP28 AOC | Two uplinks from each OOB switch to the in-band switches | 980-9I13N-00C030 |
| Varies | 1 Gbps | Storage | Cat5e |
| **Compute InfiniBand Cabling** | | | |
| 2044 | NDR cables¹, 400 Gbps | DGX B200 systems to leaf, leaf to spine, and UFM to leaf connections | 980-9I570-00N030 |
| 1536 | Switch 2x400G OSFP finned-top multimode transceivers | Leaf and spine transceivers | 980-9I510-00NS0 |
| 508 | System 2x400G OSFP flat-top multimode transceivers | Transceivers in the DGX B200 systems | 980-9I51A-00NS00 |
| 4 | UFM system 400G OSFP multimode transceivers | UFM to leaf connections | 980-9I51S-00NS00 |
| **Storage InfiniBand Cables¹ ²** | | | |
| 498 | NDR cables, 400 Gbps | DGX B200 systems to leaf, leaf to spine, and UFM to leaf connections | 980-9I570-00N030 |
| 48 | NDR AOC cables, 2x 200 Gbps QSFP56 to QSFP56 | Storage | 980-9I117-00H030 |
| 4 | UFM system 400G OSFP multimode transceivers | UFM to leaf connections | 980-9I51S-00NS00 |
| 369 | Switch 2x400G OSFP finned-top multimode transceivers | Leaf and spine transceivers | 980-9I510-00NS0 |
| 254 | DGX system 400G QSFP112 multimode transceivers | QSFP112 transceivers in the DGX B200 systems | 980-9I693-00NS00 |
| 4 | HDR 400 Gbps to 2x 200 Gbps AOC cables | Slurm management | 980-9I117-00H030 |
| Varies | Storage cables, 400 Gbps to 2x 200 Gbps AOC | Varies | 980-9I117-00H030 |

¹ Part number depends on the exact cable lengths needed, based on data center requirements.
² Count and cable type depend on the specific storage selected.
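Many of the Table 8 quantities scale directly with the number of DGX B200 systems. The sketch below is a rough consistency check only: the counts are copied from Table 8, and the per-system ratios are inferred by dividing those counts by the 127 GPU nodes in Table 7; it is not a substitute for the port-level design.

```python
# Rough consistency check of node-attached cable/transceiver counts from Table 8.
# Per-system ratios are inferred from the table (count / 127) and are illustrative only.
NUM_DGX = 127  # GPU nodes in the 4 SU configuration (Table 7)

node_attached = {
    # description: (count from Table 8, inferred quantity per DGX system)
    "200 Gbps in-band QSFP56 AOCs": (254, 2),
    "1 Gbps OOB Cat5e cables": (127, 1),
    "Compute fabric 2x400G OSFP flat-top transceivers": (508, 4),
    "Storage fabric 400G QSFP112 transceivers": (254, 2),
}

for name, (table_count, per_system) in node_attached.items():
    expected = NUM_DGX * per_system
    status = "OK" if expected == table_count else "MISMATCH"
    print(f"{name}: table={table_count}, {per_system} x {NUM_DGX} = {expected} -> {status}")
```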