System Overview

NVIDIA DGX Spark offers a robust and versatile hardware and software environment tailored for advanced AI applications.

Hardware

This chapter provides an overview of the key hardware differences between systems based on x86_64 CPUs with discrete GPUs (dGPUs) and our ARM-based System-on-Chip (SoC) platform.

CPU Architecture

Unlike a conventional x86_64 processor paired with a discrete GPU, our ARM SoC combines the CPU, GPU, and other accelerators on one chip. The CPU is a 20-core ARM64 design that shares 128GB of LPDDR5x memory with the integrated GPU (iGPU). It uses a hybrid architecture consisting of 2 clusters.

Each cluster has 5 high-performance cores (ARM v9.2, Cortex-X925) with 2MB L2 cache each, and 5 high-efficiency cores (ARM v9.2, Cortex-A725) with 512KB L2 cache each. The high-performance cluster features 16MB L3 cache, while the high-efficiency cluster is equipped with an 8MB L3 cache.

This configuration allows for dynamic workload management, optimizing power consumption and thermal performance, and enables workload-specific optimizations that balance performance and efficiency.
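The core and cache figures above can be cross-checked with a little arithmetic. The following is a minimal sketch, not an official tool: the `Cluster` struct and the function names are hypothetical, and it assumes "2MB" and "512KB" mean 2048 KiB and 512 KiB per core.

```cpp
#include <cassert>
#include <thread>

// Hypothetical description of one cluster, using the figures from the text.
struct Cluster {
    unsigned perf_cores;   // Cortex-X925 cores
    unsigned eff_cores;    // Cortex-A725 cores
    unsigned perf_l2_kib;  // L2 per high-performance core (assumed 2 MB = 2048 KiB)
    unsigned eff_l2_kib;   // L2 per high-efficiency core (512 KiB)
};

constexpr unsigned kClusters = 2;
constexpr Cluster kCluster{5, 5, 2048, 512};

// Total CPU cores across both clusters.
constexpr unsigned total_cores() {
    return kClusters * (kCluster.perf_cores + kCluster.eff_cores);
}

// Sum of all per-core L2 caches, in KiB.
constexpr unsigned total_l2_kib() {
    return kClusters * (kCluster.perf_cores * kCluster.perf_l2_kib +
                        kCluster.eff_cores * kCluster.eff_l2_kib);
}

static_assert(total_cores() == 20, "should match the 20-core figure in the text");

// Returns true when the OS reports the same number of logical CPUs.
// hardware_concurrency() is only a hint and may return 0 if unknown.
inline bool matches_running_system() {
    return std::thread::hardware_concurrency() == total_cores();
}
```

Under these assumptions the per-core L2 caches add up to 25600 KiB (25 MiB). `matches_running_system()` is only a rough sanity check, since the reported CPU count can be reduced by containers or CPU affinity settings.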

Memory Model Differences

Another critical difference lies in the memory models. The x86_64 architecture has a comparatively strong memory model (total store order, TSO): the hardware reorders memory operations only in limited ways, so the order in which other cores observe them is highly predictable. This strictness can simplify programming but limits how freely the hardware can reorder operations for performance. Conversely, ARM architectures use a more relaxed (weakly ordered) memory model, in which loads and stores to different addresses can be reordered for efficiency. This flexibility allows for performance optimizations, but code that implicitly relies on x86_64 ordering can exhibit synchronization bugs on ARM unless it uses proper atomics, locks, or memory barriers.
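As an illustration of what managing synchronization properly can look like, here is a minimal sketch of the classic message-passing pattern using standard C++ atomics. The function name `run_message_passing` is hypothetical. A release store paired with an acquire load guarantees that the consumer sees the payload on both x86_64 and ARM.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Runs one producer/consumer handoff and returns the value the consumer saw.
int run_message_passing() {
    int payload = 0;                 // plain, non-atomic data
    std::atomic<bool> ready{false};  // flag that publishes the payload
    int observed = -1;

    std::thread producer([&] {
        payload = 42;
        // Release: all writes above become visible to a thread that
        // observes ready == true through an acquire load.
        ready.store(true, std::memory_order_release);
    });

    std::thread consumer([&] {
        // Acquire: once the flag reads true, the payload write is visible.
        while (!ready.load(std::memory_order_acquire)) {
            std::this_thread::yield();
        }
        observed = payload;
    });

    producer.join();
    consumer.join();
    return observed;
}
```

If the flag were a plain `bool`, or both sides used `std::memory_order_relaxed`, the program would be incorrect under the C++ memory model on any platform. The stronger hardware ordering of x86_64 often masks such bugs, whereas ARM is more likely to expose them.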

For more details see ARM Memory Ordering.

GPU

The GPU integration differentiates our ARM-based System-on-Chip (SoC) platform from traditional x86_64 systems with discrete GPUs.

Our ARM SoC integrates an NVIDIA Blackwell Architecture GPU, which shares the 128GB of LPDDR5x memory with the CPU. Unlike other vendors, which use a fixed carve-out for the iGPU in the shared memory, we utilize a dynamic unified memory architecture (UMA), allowing both the CPU and iGPU to access the same memory space. This integration enables efficient data sharing by eliminating the need for data copies between CPU RAM and VRAM. Integrated GPUs, such as the one in the DGX Spark, are optimized for lower power consumption, making them ideal for mobile and embedded applications.

GPU Specification

The integrated Blackwell GPU on our ARM SoC includes 5th generation Tensor Cores, 4th generation RT Cores, 1x NVENC and 1x NVDEC.

System Architecture (Memory and Buses)

Another key difference between x86_64 + dGPU systems and our ARM SoC is the memory and bus architecture.

Memory Hierarchy

In traditional x86_64 systems, memory is distinctly separated between the CPU and the GPU. The CPU accesses system RAM, while the GPU utilizes dedicated video RAM (VRAM). This separation allows each component to optimize its memory usage for specific tasks, but it also introduces the overhead of transferring data between CPU and GPU memory.

ARM-based SoCs, including the DGX Spark platform, employ a Unified Memory Architecture (UMA). In UMA, both CPU and GPU share the same physical memory space without a fixed carve-out. This design allows for data sharing between CPU and iGPU, reducing latency and eliminating the need for redundant data copies. The shared memory model enhances performance in workloads that require frequent CPU-GPU collaboration, as data can be accessed seamlessly by both processing units.

Bus Architecture

ARM SoCs like the one in the DGX Spark benefit from an integrated bus architecture with a 256-bit memory interface and 273GB/s bandwidth, facilitating efficient communication between CPU, GPU, and other components. This contrasts with the often more complex and power-intensive bus systems in x86_64 architectures, which must manage separate CPU and GPU components.
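The two figures quoted above are related by simple arithmetic: peak bandwidth equals bytes per transfer times transfers per second. The sketch below works this out under the assumption that 273GB/s is a decimal figure (10^9 bytes per second). The function names are hypothetical, and the memory data rate it derives is an inference, not a value stated in this document.

```cpp
#include <cassert>

constexpr double kBusWidthBits = 256.0;  // memory interface width from the text
constexpr double kBandwidthGBs = 273.0;  // peak bandwidth from the text (assumed decimal GB/s)

// Bytes moved per transfer across the full interface (256 bits = 32 bytes).
constexpr double bytes_per_transfer() { return kBusWidthBits / 8.0; }

// Transfer rate implied by the quoted bandwidth, in mega-transfers per second.
constexpr double implied_mega_transfers_per_s() {
    return kBandwidthGBs * 1e9 / bytes_per_transfer() / 1e6;
}

// Lower bound on the time needed to stream a buffer once at peak bandwidth.
constexpr double seconds_to_stream(double gigabytes) {
    return gigabytes / kBandwidthGBs;
}
```

With these inputs the implied rate is about 8531 MT/s, and streaming the full 128GB once takes at least roughly 0.47 seconds. Real workloads will see lower sustained bandwidth than this theoretical peak.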

Peripherals, Networking, Connectivity

The DGX Spark platform supports a range of peripherals and networking options. It includes 1x RJ-45 connector with 10 GbE, a ConnectX-7 SmartNIC, WiFi 7, and Bluetooth 5.3 with LE. Additionally, it features 4x USB Type-C ports and 1x HDMI 2.1a display connector.

Software

DGX OS is based on Ubuntu 24.04 (LTS) with added NVIDIA drivers, libraries, frameworks and tools.

For more details see DGX Spark Software Stack.