Components#
NVIDIA RTX PRO Servers#
The NVIDIA RTX PRO Server is a performance-optimized server configuration that provides enterprises with the ultimate universal data center platform to power AI factories and accelerate demanding enterprise AI, industrial AI, and visual computing workloads, from multimodal agentic AI and robotics simulation to design, scientific computing, graphics, and video.
Featuring up to eight NVIDIA RTX PRO 6000 Blackwell Server Edition GPUs, these servers extend the performance and energy efficiency of the NVIDIA Blackwell architecture to every enterprise. With the ability to accelerate a wide range of enterprise workloads, RTX PRO servers are ideal for enterprise data centers that require air-cooled, power-efficient platforms.
NVIDIA RTX PRO Servers are NVIDIA-Certified systems and are optimized for NVIDIA Spectrum-X Ethernet with ConnectX and BlueField SuperNICs to deliver performance at scale.
Figure 1. RTX PRO Server.
NVIDIA RTX PRO 6000 Blackwell Server Edition GPU#
The NVIDIA RTX PRO 6000 Blackwell Server Edition GPU (Figure 2) revolutionizes productivity with its cutting-edge Blackwell architecture, delivering robust AI performance, advanced interconnects, and enhanced energy efficiency. It is the ultimate powerhouse for next-generation AI applications and enterprise computing.
Figure 2. RTX PRO 6000 Blackwell Server Edition GPU.
Ideally suited for environments that demand cutting-edge AI performance, advanced data processing, and high-speed computing, the RTX PRO 6000 Blackwell Server Edition GPU is built on the Blackwell architecture, and delivers enhanced AI capabilities, faster data processing, and improved energy efficiency, making it the superior choice for AI-driven applications, visual computing, and complex data analysis.
A high-performance GPU designed for server environments, it features 96 GB of GDDR7 memory per GPU, with memory bandwidth reaching up to 1.6 TB/s per GPU. When configured in an 8-GPU node, the RTX PRO 6000 offers 768 GB of GDDR7 memory and an aggregate memory bandwidth of up to 12.8 TB/s, making it particularly suited for demanding applications that require high memory capacity and bandwidth.
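As a quick arithmetic check, the following minimal Python sketch reproduces the per-node aggregation from the per-GPU figures quoted above (the variable names are illustrative only):

```python
# Back-of-the-envelope aggregation for an 8-GPU RTX PRO 6000 Blackwell
# Server Edition node, using the per-GPU figures quoted above.
GPUS_PER_NODE = 8
MEM_PER_GPU_GB = 96        # GDDR7 per GPU
BW_PER_GPU_TBPS = 1.6      # memory bandwidth per GPU, TB/s

total_mem_gb = GPUS_PER_NODE * MEM_PER_GPU_GB      # 768 GB
total_bw_tbps = GPUS_PER_NODE * BW_PER_GPU_TBPS    # 12.8 TB/s
print(f"{total_mem_gb} GB GDDR7, {total_bw_tbps:.1f} TB/s aggregate bandwidth")
```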
NVIDIA-Certified Systems and 2-8-5-200 Configuration#
This NVIDIA Enterprise RA is based on a 2-8-5-200 infrastructure configuration (2 CPUs, 8 GPUs, 5 NICs at 200 Gbps each) with NVIDIA RTX PRO Servers, including NVIDIA BlueField-3 and NVIDIA Spectrum-X Ethernet networking.
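For illustration, here is a minimal sketch of how the 2-8-5-200 shorthand can be read programmatically, assuming the CPUs-GPUs-NICs-NIC-speed ordering given above and the split of the five NICs into four E/W SuperNICs plus one N/S DPU described in the networking sections below; the class and function names are hypothetical:

```python
# Illustrative encoding of the 2-8-5-200 naming used in this Enterprise RA:
# <CPUs>-<GPUs>-<NICs>-<NIC speed in Gbps>.
from dataclasses import dataclass

@dataclass
class NodeConfig:
    cpus: int
    gpus: int
    nics: int
    nic_gbps: int

def parse_config(name: str) -> NodeConfig:
    cpus, gpus, nics, nic_gbps = (int(x) for x in name.split("-"))
    return NodeConfig(cpus, gpus, nics, nic_gbps)

cfg = parse_config("2-8-5-200")
# Four E/W BlueField-3 SuperNICs plus one N/S BlueField-3 DPU per node.
east_west_nics, north_south_nics = cfg.nics - 1, 1
print(cfg, east_west_nics, north_south_nics)
```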
NVIDIA RTX PRO Servers are NVIDIA-Certified Systems with the flexibility to optimize the configuration to match cluster requirements. These RTX PRO Servers are available from NVIDIA global system partners in 2-GPU, 4-GPU, and 8-GPU configurations; this Enterprise RA focuses on the 8-GPU configuration. An example of this system design for the RTX PRO Server is shown in Figure 3.
Figure 3. Example of a 2-8-5-200 system configuration as adopted in an NVIDIA-Certified RTX PRO Server.
Elements for an NVIDIA-Certified RTX PRO Server are listed in Table 1; a minimal configuration-check sketch follows the table.
Table 1. RTX PRO Server system components
| Parameter | System Configuration |
|---|---|
| Target workloads | Small and medium language model inference and fine-tuning, image and video generative AI, traditional DL inference including recommender systems and computer vision, physical and industrial AI, single-precision (FP32 and below) high-performance computing (HPC), 3D graphics, rendering, and video workloads, virtual workstations, and ETL data processing. |
| GPU configuration | 8 GPUs balanced across CPU sockets and root ports. See the topology diagrams for details. |
| CPU | Intel Emerald Rapids, Intel Sapphire Rapids, Intel Granite Rapids, Intel Sierra Forest, AMD Genoa, AMD Turin |
| CPU sockets | Single CPU socket minimum |
| CPU speed | 2.0 GHz minimum CPU clock |
| CPU cores | Minimum 7 physical CPU cores per GPU |
| System memory (total across all CPU sockets) | Minimum 128 GB of system memory per GPU. System memory should be evenly distributed across all CPU sockets, with each memory channel populated with at least one DIMM per channel (1DPC). |
| DPU | One NVIDIA® BlueField®-3 DPU per server for the N/S network |
| PCI Express | Minimum of one Gen5 x16 link per GPU for optimal performance |
| PCIe topology | Balanced PCIe topology with GPUs spread evenly across CPU sockets and PCIe root ports. NICs and NVMe drives should be under the same PCIe switch or PCIe root complex as the GPUs. Note that a PCIe switch may not be needed for low-cost inference servers; direct attach to the CPU is best if possible. See the topology diagrams for details. |
| PCIe switches | Gen5 PCIe switches minimum, as needed (where additional link fanout is not required, direct attach is best). With Gen6 PCIe switches, higher network bandwidth of up to 400 Gbps per GPU can be supported for DL fine-tuning and image and video generative AI workloads that require higher East/West network bandwidth for best performance. |
| Compute (E/W) NIC | Four NVIDIA® BlueField®-3 SuperNICs per server, up to 400 Gbps each. For workloads such as DL fine-tuning and image and video generative AI, 400 Gbps of network bandwidth per GPU is recommended for maximum performance. |
| Local storage | Local storage recommendations are as follows: |
| Remote systems management | SMBPBI over SMBus (OOB) protocol to BMC; MCTP over USB (OOB) protocol to BMC; PLDM T5-enabled; SPDM-enabled |
| Security | TPM 2.0 module (secure boot) |
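The following is a minimal sketch, not part of the certification tooling, for checking a candidate node against the Table 1 minimums; the inventory dictionary is an illustrative placeholder for values you would gather with tools such as nvidia-smi, lspci, or dmidecode:

```python
# Check a candidate node against the Table 1 per-GPU minimums.
MINIMUMS = {
    "cores_per_gpu": 7,        # physical CPU cores per GPU
    "sysmem_per_gpu_gb": 128,  # system memory per GPU
    "cpu_clock_ghz": 2.0,      # minimum CPU clock
    "pcie_gen": 5,             # one Gen5 x16 link per GPU
}

# Example inventory values (placeholders, not measured data).
node = {"gpus": 8, "physical_cores": 64, "sysmem_gb": 1024,
        "cpu_clock_ghz": 2.4, "pcie_gen": 5}

checks = {
    "cores_per_gpu": node["physical_cores"] / node["gpus"] >= MINIMUMS["cores_per_gpu"],
    "sysmem_per_gpu_gb": node["sysmem_gb"] / node["gpus"] >= MINIMUMS["sysmem_per_gpu_gb"],
    "cpu_clock_ghz": node["cpu_clock_ghz"] >= MINIMUMS["cpu_clock_ghz"],
    "pcie_gen": node["pcie_gen"] >= MINIMUMS["pcie_gen"],
}
for name, ok in checks.items():
    print(f"{name}: {'OK' if ok else 'BELOW MINIMUM'}")
```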
Control Plane/Management Nodes#
The cluster design specified in this Enterprise RA can support up to eight control plane/management nodes. In addition to the standard compute nodes, control plane nodes are needed to run the software that manages the cluster and provides access for users. The specific components depend on the management software. Table 2 shows an example configuration for these nodes.
For example, a configuration using NVIDIA Base Command Manager, Slurm, and Kubernetes together can include seven control plane nodes in total: two for Base Command Manager (with high availability configured), two for Slurm head nodes, and three for Kubernetes control plane nodes. This uses seven of the eight supported control plane nodes.
Table 2: Control plane node components
| Component | Quantity | Description |
|---|---|---|
| CPU | 2 | |
| North/South (DPU) | 1 | NVIDIA BlueField-3 B3220 DPU with two 200G ports and a 1 Gb RJ45 management port. Other variants can be supported, as per the compute node alternatives. |
| System Memory | – | Minimum of 256 GB DDR5 |
| Boot Drive | 1 | 1 TB NVMe SSD |
| Local storage | 1 | 4 TB NVMe SSD. More may be required if image storage is required. |
| BMC | 1 | 1 Gb RJ45 management port |
Networking#
The NVIDIA networking configuration provides a framework for achieving high AI performance and scale while supporting cloud manageability and security. Drawing on NVIDIA’s expertise in AI cloud data centers, this approach helps organizations optimize network traffic flow:
East/West (Compute Network) traffic: Traffic between RTX PRO Servers within the cluster, commonly used for multi-node AI workloads, HPC collective operations, and similar workloads.
North/South (Customer and Storage Network) traffic: Traffic between RTX PRO Servers and external resources, including cloud management and orchestration systems, remote data storage nodes, and other data center or internet endpoints.
When combined with NVIDIA Spectrum-X Ethernet, RTX PRO Servers deliver strong performance for inference, data science, scientific simulation, and other modern workloads. The following sections describe recommended NVIDIA system platform configurations with their associated NVIDIA networking platforms.
Compute (Node East/West) Ethernet Networking#
To deliver the highest AI performance, NVIDIA recommends the NVIDIA BlueField-3 (BF-3) SuperNIC smart network adapters.
The BlueField-3 SuperNIC offers up to 400 Gb/s of low-latency network connectivity between GPUs in the AI cluster, featuring RDMA and RoCE acceleration along with NVIDIA® GPUDirect® and GPUDirect Storage technologies. For data centers that deploy Ethernet, BlueField-3 offers a range of advancements, including RoCE optimizations.
BlueField-3 is a central part of the NVIDIA Spectrum-X networking platform, which also features NVIDIA Spectrum-4 switches. At its core, the BlueField-3 SuperNIC is a network accelerator purpose-built to supercharge hyperscale AI workloads, ensuring fast and efficient communication between GPU servers for enhanced AI workload efficiency. With its power-efficient HHHL PCIe design, the BlueField-3 SuperNIC integrates seamlessly into RTX PRO Servers, significantly boosting performance for AI traffic on the East/West network inside the cluster.
Multi-node deployments with an RTX PRO Server platform should adhere to the per-node compute network bandwidth recommendations described below.
Total minimum compute network bandwidth: 800 Gb/s per node (4x 200 Gb/s or 2x 400 Gb/s NICs).
Total recommended compute network bandwidth: 1,600 Gb/s per node (4x 400 Gb/s NICs).
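The arithmetic behind these per-node totals, assuming the 2-8-5-200 counts (eight GPUs served by four E/W SuperNICs, i.e. the 2:1 GPU-to-NIC ratio noted below), is sketched here:

```python
# Per-node and per-GPU compute (E/W) bandwidth from NIC count and speed.
GPUS, EW_NICS = 8, 4

for label, nic_gbps in (("minimum", 200), ("recommended", 400)):
    total_gbps = EW_NICS * nic_gbps          # aggregate E/W bandwidth per node
    per_gpu_gbps = total_gbps / GPUS         # share per GPU at a 2:1 GPU:NIC ratio
    print(f"{label}: {EW_NICS} x {nic_gbps} Gb/s = {total_gbps} Gb/s "
          f"({per_gpu_gbps:.0f} Gb/s per GPU)")
```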
The BlueField-3 SuperNIC supports two operation modes (NIC and DPU modes), with NIC mode as the default. If the SuperNIC is set to DPU mode, the DPU 1 GbE out-of-band management port must be connected.
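As an illustration of how the mode-related firmware parameters can be inspected, the sketch below shells out to the mlxconfig utility from the NVIDIA Firmware Tools (MFT); the MST device path is a placeholder, and the INTERNAL_CPU_* parameter names should be confirmed against the BlueField-3 documentation for your firmware version:

```python
# Query BlueField-3 firmware configuration and print the parameters commonly
# used to distinguish DPU mode from NIC mode (treat names as assumptions).
import subprocess

DEVICE = "/dev/mst/mt41692_pciconf0"   # hypothetical MST device path

out = subprocess.run(
    ["mlxconfig", "-d", DEVICE, "query"],
    check=True, capture_output=True, text=True,
).stdout

for line in out.splitlines():
    if "INTERNAL_CPU" in line:
        print(line.strip())
```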
The NVIDIA multi-node software stack is optimized for several GPU-to-NIC ratios. NVIDIA recommends that partners provision NICs to support at most a 2:1 GPU-to-NIC ratio.
For the NICs used in NVIDIA RTX PRO Server platforms, NVIDIA recommends the following products for the compute (East/West) network, listed in Table 3.
Table 3: 2-8-5-200 recommended SuperNICs for East/West Network
| Product | PCIe Card Form Factor | Connector Required | Applicable Topologies |
|---|---|---|---|
| NVIDIA BlueField-3 B3140H E-series HHHL DPU, 400GbE (default mode)/NDR IB, single-port QSFP112 | HHHL PCIe Gen5 x16 | 75W system power supply through the PCIe x16 interface; 1G/USB management interfaces; QSFP112 connector cages | Switch in Virtual or Synthetic Mode / Switch in Base Mode |
| NVIDIA BlueField-3 B3220L E-series FHHL DPU, 200GbE (default mode)/NDR200 IB, dual-port QSFP112 | FHHL PCIe Gen5 x16 | 75W system power supply through the PCIe x16 interface; 1G/USB management interfaces; QSFP112 connector cages | Switch in Virtual or Synthetic Mode / Switch in Base Mode |
Converged (Node North/South) Ethernet Networking#
This section describes the BlueField-3 role for the North/South (N/S) Ethernet network in the NVIDIA RTX PRO server platform based on the 2-8-5-200 Enterprise RA configuration and the recommended BlueField-3 models for this infrastructure.
The NVIDIA BlueField-3 data processing unit (DPU) is a 400 Gb/s infrastructure compute platform that enables organizations to securely deploy and operate NVIDIA AI data centers at massive scale. The BlueField-3 DPU is optimized for the N/S network, while the BlueField-3 SuperNIC is optimized for the E/W AI compute fabric.
BlueField-3 offers several essential capabilities and benefits within the NVIDIA RTX PRO server platform:
Workload Orchestration: BlueField-3 serves as an optimized compute platform for the data center control plane, enabling automated provisioning and elasticity. This empowers NVIDIA accelerated compute platforms to scale resources dynamically based on fluctuating demand, ensuring efficient allocation of computing resources for transient AI workloads.
Storage Acceleration: BlueField-3 provides advanced storage acceleration features that optimize data storage access. Its innovative BlueField SNAP technology enables remote storage devices to function as if they were local, improving AI performance.
Secure Infrastructure: BlueField-3 operates in a highly secure zero-trust mode and functions independently from the host, significantly enhancing the security of the NVIDIA RTX PRO platform. In addition, BlueField-3 enables a wide range of accelerated security services, including next-generation firewall and micro-segmentation, bolstering the overall security posture of the infrastructure.
The BlueField-3 DPU integration enhances the NVIDIA RTX PRO Server platform’s resource management, performance, security, and scalability, making it well-suited for AI workloads in datacenter environments.
For example, the BlueField-3 DPU can operate in Embedded Function (ECPF) mode, where the DPU’s Arm subsystem manages network resources. In this configuration, network traffic passes through a virtual switch on the DPU before reaching the host server, providing an additional layer of control.
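As an optional illustration, the sketch below lists the virtual switch bridges and their ports from the DPU's Arm subsystem, assuming Open vSwitch is present on the BlueField OS image (as in NVIDIA's DOCA-based images) and that the script runs on the DPU itself; bridge names vary by image and are not specified by this RA:

```python
# Enumerate the Open vSwitch bridges and ports on the DPU Arm subsystem,
# which carry host traffic when BlueField-3 operates in ECPF (DPU) mode.
import subprocess

bridges = subprocess.run(
    ["ovs-vsctl", "list-br"], check=True, capture_output=True, text=True
).stdout.split()

for br in bridges:
    ports = subprocess.run(
        ["ovs-vsctl", "list-ports", br], check=True, capture_output=True, text=True
    ).stdout.split()
    print(br, "->", ports)
```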
The BlueField DPU includes an onboard Baseboard Management Controller (BMC) that simplifies platform management. This BMC allows you to provision and manage both the BlueField DPU and the NVIDIA RTX PRO Server using standard industry tools like Redfish APIs. It features an external Root-of-Trust for firmware security and connects to your datacenter management network through a 1GbE out-of-band management port.
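A minimal sketch of querying the DPU BMC through the standard Redfish service root is shown below; the BMC address and credentials are placeholders, and TLS verification is disabled only for brevity:

```python
# Walk the standard Redfish Systems collection exposed by the BlueField BMC.
import requests

BMC = "https://192.0.2.10"       # hypothetical out-of-band BMC address
AUTH = ("admin", "password")     # hypothetical credentials

# /redfish/v1/ is the DMTF Redfish service root; Systems is the standard
# ComputerSystem collection it advertises.
root = requests.get(f"{BMC}/redfish/v1/", auth=AUTH, verify=False, timeout=10).json()
systems_path = root["Systems"]["@odata.id"]

systems = requests.get(f"{BMC}{systems_path}", auth=AUTH, verify=False, timeout=10).json()
for member in systems.get("Members", []):
    sys_uri = member["@odata.id"]
    detail = requests.get(f"{BMC}{sys_uri}", auth=AUTH, verify=False, timeout=10).json()
    print(sys_uri, detail.get("Model"), detail.get("PowerState"))
```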
When planning your deployment, consider that some BlueField-3 configurations for north-south network connections may require more than 75W of power. These cards need both PCIe slot power and an external PCIe power connector (rated for at least 75W) to operate properly.
For the BlueField-3 DPUs used in the N/S network for NVIDIA RTX PRO system platforms, NVIDIA recommends the following BlueField-3 products listed in Table 4:
Table 4: RTX PRO Server recommended BlueField-3 DPUs for North/South Network
| Product | PCIe Card Form Factor | Connector Required | Applicable Topologies |
|---|---|---|---|
| NVIDIA BlueField-3 B3220 P-series FHHL DPU, 200GbE (default mode)/NDR200 IB, dual-port QSFP112 | FHHL PCIe Gen5 x16 with x16 PCIe extension option | 1G/USB management interfaces; QSFP112 connector cages | All |
| NVIDIA BlueField-3 B3240 P-series FHHL DPU, 400GbE (default mode)/NDR IB, dual-port QSFP112 | FHHL PCIe Gen5 x16 with x16 PCIe extension option | 1G/USB management interfaces; QSFP112 connector cages | All |