Overview
This NVIDIA Enterprise Reference Architecture (Enterprise RA) is a practical design guide for an NVIDIA RTX PRO AI Factory. It is based on the 2-8-5-200 infrastructure configuration (2 CPUs, 8 GPUs, and 5 NICs at 200 Gbps each) with NVIDIA RTX PRO Servers, including NVIDIA BlueField-3 and NVIDIA Spectrum-X Ethernet networking. It provides a modular architecture based on NVIDIA-Certified Systems, with each NVIDIA-Certified RTX PRO Server equipped with eight RTX PRO 6000 Blackwell Server Edition GPUs. The architecture is organized around Scalable Units (SUs), each composed of four servers, with options to expand further based on specific requirements. The flexible, rail-optimized, end-of-row network architecture allows organizations to adapt the rack layout and adjust the number of servers per rack to fit their data center environment.
This document outlines the hardware components that define this scalable and modular architecture. Specific guidance is provided for implementations with 16 RTX PRO Servers (nodes) and 128 GPUs, or with 32 RTX PRO Servers (nodes) and 256 GPUs. Hardware support is available through global systems partners who offer complete solutions based on NVIDIA Enterprise RAs.
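To illustrate how these quantities relate, the short Python sketch below derives cluster totals from the 2-8-5-200 building block and the four-server Scalable Unit. The constants and the `cluster_totals` helper are illustrative assumptions for this document only, not part of any NVIDIA tooling.

```python
# Illustrative sizing arithmetic for the 2-8-5-200 building block and the
# four-server Scalable Unit described above. The constants and helper below
# are assumptions for this sketch, not NVIDIA-provided tooling.

SERVERS_PER_SU = 4     # one Scalable Unit = four RTX PRO Servers
GPUS_PER_SERVER = 8    # eight RTX PRO 6000 Blackwell Server Edition GPUs per server
NICS_PER_SERVER = 5    # the "5" in 2-8-5-200
NIC_SPEED_GBPS = 200   # the "200" in 2-8-5-200

def cluster_totals(su_count: int) -> dict:
    """Return aggregate server, GPU, and NIC counts for su_count Scalable Units."""
    servers = su_count * SERVERS_PER_SU
    return {
        "servers": servers,
        "gpus": servers * GPUS_PER_SERVER,
        "nics": servers * NICS_PER_SERVER,
        "aggregate_nic_bandwidth_gbps": servers * NICS_PER_SERVER * NIC_SPEED_GBPS,
    }

# The two reference configurations discussed in this document:
print(cluster_totals(4))   # 16 servers, 128 GPUs
print(cluster_totals(8))   # 32 servers, 256 GPUs
```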
Use Cases
This architecture is well suited for organizations implementing the following use cases:
Agentic AI Inference – Efficiently supports inference for small and medium model sizes, powering test-time compute for reasoning AI and the range of agentic applications it enables.
Industrial and Physical AI – Speeds up development and testing for physical AI, including large-scale twin modelling, video analytics and summarization, and synthetic data generation for robotics and autonomous systems.
Visual Computing – Supports high-speed rendering, ray tracing and advanced video processing for video, design and media workflows.
Data Analytics & Scientific Simulation – Accelerates large-scale simulations, data analytics, and predictive modelling for problems requiring FP32 precision and below.
Our Sizing Guides provide guidance on the typical characteristics and supported models for various workloads.
For all use cases, this architecture is ideal for multi-user, single-tenant workloads. Specifically, the logical design and software are streamlined for ease of deployment and maintenance by tailoring the configuration to an environment where all users belong to the same enterprise and accounting and access control can be consolidated.
Similarly, Kubernetes is the modern foundation of enterprise AI work, and this Enterprise RA is architected for deploying Kubernetes and Kubernetes-dependent applications and tooling.
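As a concrete example of that orientation, the sketch below uses the Kubernetes Python client to schedule a single-GPU pod on such a cluster, requesting capacity through the `nvidia.com/gpu` resource exposed by the NVIDIA device plugin (typically installed via the NVIDIA GPU Operator). The pod name, container image tag, and namespace are placeholders; treat this as a minimal illustration of how workloads consume the cluster's GPUs, not as a prescribed deployment.

```python
# Minimal sketch: requesting one GPU for a pod via the Kubernetes Python client.
# The pod name, image tag, and namespace are placeholders; GPUs are requested
# through the nvidia.com/gpu resource exposed by the NVIDIA device plugin.
from kubernetes import client, config

config.load_kube_config()  # use config.load_incluster_config() when running in-cluster

pod = client.V1Pod(
    metadata=client.V1ObjectMeta(name="gpu-smoke-test"),
    spec=client.V1PodSpec(
        restart_policy="Never",
        containers=[
            client.V1Container(
                name="cuda-check",
                image="nvcr.io/nvidia/cuda:12.4.1-base-ubuntu22.04",  # placeholder image
                command=["nvidia-smi"],
                resources=client.V1ResourceRequirements(
                    limits={"nvidia.com/gpu": "1"}  # one RTX PRO 6000 Blackwell GPU
                ),
            )
        ],
    ),
)

client.CoreV1Api().create_namespaced_pod(namespace="default", body=pod)
```

Because the device plugin allocates GPUs as whole units, the same pattern scales from a single GPU to a full eight-GPU server simply by raising the resource limit.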