DGX Spark Software Stack#

This document outlines the software specifications for the NVIDIA DGX Spark product, providing an overview of its operating system and core software stack. The DGX Spark is designed for software developers and AI enthusiasts looking to leverage the DGX ecosystem for local AI development.

Overview#

The DGX Spark runs a base operating system derived from Ubuntu 24.04 and integrates the NVIDIA AI stack, which provides the tools and libraries needed for AI and machine learning workflows.

System Architecture#

Operating System#

  • Base OS: Ubuntu 24.04 server image with desktop packages

  • Kernel: NVIDIA Base OS kernel based on Linux v6.11, with necessary patches
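
To confirm that a running system matches the OS baseline listed above, the release and kernel versions can be read from standard Linux locations. The following is a minimal sketch, assuming the image reports `ID=ubuntu` in `/etc/os-release` and a kernel release string beginning with `6.11.`; the function names `parse_os_release` and `matches_baseline` are illustrative and not part of any NVIDIA tool.

```python
import platform
from pathlib import Path


def parse_os_release(text: str) -> dict[str, str]:
    """Parse the KEY=value lines of an os-release file into a dict."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, comments, and anything that is not KEY=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        fields[key] = value.strip().strip("\"'")
    return fields


def matches_baseline(fields: dict[str, str], kernel: str) -> bool:
    """Return True if the OS is Ubuntu 24.04 and the kernel is 6.11.x.

    Assumption: the image keeps ID=ubuntu and VERSION_ID=24.04.
    """
    return (
        fields.get("ID") == "ubuntu"
        and fields.get("VERSION_ID") == "24.04"
        and kernel.startswith("6.11.")
    )


if __name__ == "__main__":
    os_release = Path("/etc/os-release")
    if os_release.exists():
        info = parse_os_release(os_release.read_text())
        kernel = platform.release()
        print(info.get("PRETTY_NAME"), kernel)
        print("Matches baseline:", matches_baseline(info, kernel))
```

Parsing is kept separate from the file read so the check can be exercised against captured text from another machine.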

Boot and Hardware Enablement#

Boot Configuration

  • Boot Mode: UEFI (default), with USB-based boot support

  • Initial Setup: Configurable system settings on first boot:

    • Timezone, language, keyboard layout

    • Username, password, and hostname

Operational Modes

  • Desktop Mode: Standard operation with display, keyboard, and mouse

  • Headless Mode: Network-accessible via SSH or web server
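
Before relying on headless mode, it can help to verify that the SSH port on the device is reachable from the client machine. The following is a minimal sketch using only the Python standard library; the function name `is_port_open` is illustrative, and port 22 is assumed to be the SSH port (the default, which the source does not state explicitly).

```python
import socket


def is_port_open(host: str, port: int = 22, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

For example, `is_port_open("spark.local")` would check a device at the hypothetical hostname `spark.local`; substitute the hostname chosen during initial setup.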

Firmware and Updates

  • BSP Firmware: NVIDIA-supported with independent OS updates

  • Update Methods: Secure UEFI capsule updates and LVFS (Linux Vendor Firmware Service)

  • System Updates: Repository-based over-the-air (OTA) updates

Hardware Support

  • Storage: Single internal NVMe SSD (M.2 form factor)

    • Capacity: 1 TB to 4 TB

    • Features: SED-based hardware encryption

  • System Memory: 128 GB unified memory

  • GPU Driver: Latest iGPU driver optimized for AI stack

    • Launch Driver: R580.GA UDA driver

  • USB Support: USB 3.2 driver support for:

    • Baseline devices

    • HID devices

    • Webcams

Networking

  • Ethernet: Full support with MLNX_OFED drivers for CX7 (ConnectX-7 PCIe-based smart NIC)

  • Wireless: WiFi and Bluetooth drivers included

  • Bluetooth Profiles: BT HID profiles at launch, broader profiles (BT Audio, BLE) post-launch

System Recovery

  • Re-imaging: USB-based system recovery

  • Image Sources: Canonical/NVIDIA repositories

Security

  • Boot Security: dTPM (Discrete Trusted Platform Module)

    • Default: Off for first-party DGX Spark systems

    • Configurable: Can be enabled by enterprises/OEMs with signed driver/kernel

Display and Desktop Interface#

Display Capabilities#

Video Outputs

  • HDMI: 1x HDMI 4K (up to 120Hz)

  • DisplayPort: 2x DP (Alt mode) 4K (up to 120Hz)

Audio Support

  • USB Audio

  • Bluetooth Audio

Desktop Experience#

  • Interface: Regular Ubuntu desktop with NVIDIA branding

  • Pre-installed: Default icons for NVIDIA software, documentation, and how-to videos

  • Graphics: Ubuntu (Wayland) GUI desktop with preinstalled browser

  • Acceleration: Desktop and application acceleration using OpenGL/Vulkan

  • Video: Desktop video acceleration (nvenc/nvdec) for browsers and media players (VLC)

DRM Content Support

  • Browser playback in fallback resolutions

  • Enhanced copy protection (planned post-launch)

Performance and Power Management#

  • RTD3: Runtime D3 support

  • Power States: Product-defined PStates for optimized performance

  • Suspend/Resume: Basic functionality support

Software Stack#

Core AI Libraries#

NVIDIA AI Stack

  • NCCL (NVIDIA Collective Communications Library)

  • cuDNN (CUDA Deep Neural Network library)

  • TensorRT-LLM

  • TensorRT

  • All supported toolkits and math libraries

CUDA Toolkit

  • CUDA 13.0

  • Latest fully-tested CUDA Toolkit, with CUDA examples included
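
The installed CUDA Toolkit release can be checked by parsing the output of `nvcc --version`. The following is a minimal sketch, assuming `nvcc` is on `PATH` (on many installs it lives under `/usr/local/cuda/bin`, which may need to be added); the function names are illustrative.

```python
import re
import shutil
import subprocess
from typing import Optional, Tuple


def parse_nvcc_release(output: str) -> Optional[Tuple[int, int]]:
    """Extract (major, minor) from the 'release X.Y' text in nvcc's output."""
    match = re.search(r"release (\d+)\.(\d+)", output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def installed_cuda_release() -> Optional[Tuple[int, int]]:
    """Run nvcc --version and return the release, or None if unavailable."""
    nvcc = shutil.which("nvcc")
    if nvcc is None:
        return None  # nvcc is not on PATH
    result = subprocess.run(
        [nvcc, "--version"], capture_output=True, text=True, check=False
    )
    return parse_nvcc_release(result.stdout)
```

On a system matching this specification, `installed_cuda_release()` would be expected to return `(13, 0)`.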

Development Tools#

Linux Development Tools

  • build-essential

  • gdb, vim

  • Support for C, C++, Perl, Python development

GPU Development Tools

  • Nsight Systems

  • Nsight Compute

  • Nsight Graphics

  • Nsight Deep Learning Designer

  • JupyterLab extensions

  • CUDA GDB

Container and Orchestration#

Docker Support

  • NVIDIA Docker containers

  • NVIDIA Container Runtime for Docker included

  • Support for multiple containers on bare metal
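
A common smoke test for GPU access from Docker is to run `nvidia-smi` inside a throwaway container. The following is a minimal sketch, assuming the `docker` CLI is installed and the NVIDIA Container Runtime is configured so that `--gpus all` works; the `ubuntu` image and the function names are illustrative choices, not something this specification prescribes.

```python
import subprocess


def gpu_smoke_test_command(image: str = "ubuntu") -> list[str]:
    """Build a docker command that runs nvidia-smi inside a container."""
    return ["docker", "run", "--rm", "--gpus", "all", image, "nvidia-smi"]


def run_gpu_smoke_test(image: str = "ubuntu") -> bool:
    """Return True if the container started and nvidia-smi exited cleanly."""
    try:
        result = subprocess.run(gpu_smoke_test_command(image), check=False)
    except FileNotFoundError:
        return False  # docker CLI is not installed or not on PATH
    return result.returncode == 0
```

Building the command separately from running it makes it easy to log or review the exact invocation before executing it.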

Kubernetes

  • Single and stacked device support

Data Science and Analytics#

Python Data Stack (PyData)

  • cuDF

  • cuML

  • cuGraph

  • XGBoost
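
The source does not state how these libraries are packaged (system Python, virtual environment, or container), so a quick first step is to check which of them are importable from the current interpreter. The following is a minimal sketch using the standard library; `cudf`, `cuml`, `cugraph`, and `xgboost` are the usual import names for these projects, and the function name `available_modules` is illustrative.

```python
import importlib.util

PYDATA_MODULES = ["cudf", "cuml", "cugraph", "xgboost"]


def available_modules(names: list[str]) -> dict[str, bool]:
    """Map each top-level module name to whether it can be found for import."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, present in available_modules(PYDATA_MODULES).items():
        print(f"{name}: {'available' if present else 'not found'}")
```

This only locates the modules without importing them, so it runs quickly and does not initialize the GPU.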

Apache Spark / Spark RAPIDS

  • Enterprise data science support

  • RAPIDS OSS project support for CUDA 13.0

Deep Learning Frameworks

  • All Blackwell-optimized frameworks provided by NVIDIA

Compute Support

  • OpenCL support included

Additional Software Support#

  • Jetson SW: Jetson software services on SBSA CUDA

  • Omniverse: NVIDIA Omniverse support

  • GPU Driver: GSP-RM/OpenRM kernel module (default)

  • Telemetry: Device activation and census support

System Management#

Monitoring and Diagnostics#

System Monitoring

  • nvidia-smi for basic system health monitoring

  • GPU and CPU telemetry via system monitoring agents

  • MiTelemetry support for system controllers
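
For scripted health checks, `nvidia-smi` can emit selected fields as CSV through its `--query-gpu` and `--format` options. The following is a minimal sketch, assuming `nvidia-smi` is on `PATH`; the chosen fields and the function names are illustrative, and some fields may report `[N/A]` depending on the platform.

```python
import csv
import io
import subprocess

QUERY_FIELDS = ["name", "driver_version", "temperature.gpu", "utilization.gpu"]


def parse_smi_csv(text: str) -> list[dict[str, str]]:
    """Turn nvidia-smi CSV rows (no header) into one dict per GPU."""
    reader = csv.reader(io.StringIO(text), skipinitialspace=True)
    return [
        dict(zip(QUERY_FIELDS, (cell.strip() for cell in row)))
        for row in reader
        if row
    ]


def query_gpu() -> list[dict[str, str]]:
    """Run nvidia-smi and return parsed rows, or [] if it is unavailable."""
    command = [
        "nvidia-smi",
        f"--query-gpu={','.join(QUERY_FIELDS)}",
        "--format=csv,noheader,nounits",
    ]
    try:
        result = subprocess.run(
            command, capture_output=True, text=True, check=True
        )
    except (FileNotFoundError, subprocess.CalledProcessError):
        return []
    return parse_smi_csv(result.stdout)
```

Values are returned as strings because fields that are unsupported on a given platform are not numeric.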

Hardware Diagnostics

  • Hardware error recording (FDR) for error history

  • Field diagnostics software for RMA flow management

  • Manufacturing diagnostics with external SOC support for MODS

  • CPU and GPU testing capabilities

Remote Management

  • Serial console support for flashing and remote management

Security Features#

Secure Boot and TPM

  • Secure boot and TPM (Trusted Platform Module) support

  • Default: Off for TTM systems

  • Configurable: Can be enabled by enterprises/OEMs with signed driver/kernel

Firmware Security

  • Signing infrastructure for all firmware/BSP components

  • Secure firmware updates via EC (Embedded Controller) and UEFI

Performance Specifications#

Chip Features#

CPU Configuration

  • Architecture: 10P+10E (10 performance cores plus 10 efficiency cores)

  • All-cores FMax:

    • PCores: 4.075GHz

    • ECores: 2.8GHz (50% Bin)

  • Turbo/Single-core FMax:

    • PCores: 4.175GHz

    • ECores: N/A (50% Bin)

  • PCore VMax: 1.2V

Feature Support Matrix

  Feature            Status
  -----------------  -------------------
  ISP                No
  DLA                No
  Audio/Audio DSP    No external codec
  OSROOT/FTPM        No (external TPM)
  Sensor Hub         No
  dGPU attach        No
  10s                No
  DP over USB4       Enabled post-launch
  CSI                No
  eDP                No
  Soundwire          No

Reliability Specifications#

Yield and Lifetime

  • Yield/Bin Size: 50%/Typical (No corner part characterization needed)

  • Lifetime at Vmax: 25% of 5 years at 105°C TJMAX (~70°C ambient) in the maximum performance state (180 hrs/month)

  • Lifetime at Nominal Voltage: 75% of 5 years at 55°C TJMAX (35°C ambient) (540 hrs/month at Vmin or suspend)
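
The hours-per-month figures in the lifetime entries follow directly from the duty-cycle percentages. The following worked sketch assumes a 30-day (720-hour) month, which the source does not state but which reproduces both figures; the function name `duty_cycle_hours` is illustrative.

```python
HOURS_PER_MONTH = 30 * 24  # assumption: 30-day month (720 hours)
LIFETIME_MONTHS = 5 * 12   # 5-year lifetime from the specification


def duty_cycle_hours(fraction: float) -> tuple[float, float]:
    """Return (hours per month, hours over the 5-year lifetime)."""
    per_month = fraction * HOURS_PER_MONTH
    return per_month, per_month * LIFETIME_MONTHS


for label, fraction in [("Vmax", 0.25), ("Nominal", 0.75)]:
    per_month, total = duty_cycle_hours(fraction)
    print(f"{label}: {per_month:.0f} hrs/month, {total:.0f} hrs over 5 years")
# Vmax: 180 hrs/month, 10800 hrs over 5 years
# Nominal: 540 hrs/month, 32400 hrs over 5 years
```

The two duty cycles sum to 100%, so together they account for the full 43,200 hours of the 5-year lifetime under the same 30-day-month assumption.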

Reliability Metrics

  • EM (Electromigration): 1000/10yr (Commercial segment)

  • Design DPPM (intrinsic): AGING: 500 dppm total (with margins)

  • Design Target: Median 0.75yr at Vmax/105°C

  • TDDB (Time Dependent Dielectric Breakdown): 500-600