NVIDIA Documentation Center
Welcome to the NVIDIA Documentation Center where you can explore the latest
technical information and product documentation.
Cloudera Data Platform (CDP) Documentation
The integration of NVIDIA RAPIDS into the Cloudera Data Platform (CDP) provides transparent GPU acceleration of data analytics workloads using Apache Spark. This documentation describes the integration and suggested reference architectures for deployment.
Data Center Documentation
Documentation for managing and running containerized GPU applications in the data center using Kubernetes, Docker, and LXC.
NVIDIA Cloud-Native Technologies
NVIDIA cloud-native technologies enable developers to build and run GPU-accelerated containers using Docker and Kubernetes.
NVIDIA Data Center GPU Drivers
NVIDIA Data Center GPU drivers are used in Data Center GPU enterprise deployments for AI, HPC, and accelerated computing workloads. Documentation includes release notes, supported platforms, and cluster setup and deployment.
NVIDIA Data Center GPU Manager (DCGM) is a suite of tools for managing and monitoring NVIDIA Data Center GPUs in cluster environments.
NVIDIA System Management
NVIDIA System Management is a software framework for monitoring server nodes, such as NVIDIA DGX servers, in a data center.
Deep Learning Performance Documentation
GPUs accelerate machine learning operations by performing calculations in parallel. Many operations, especially those representable as matrix multiplications, will see good acceleration right out of the box. Even better performance can be achieved by tweaking operation parameters to efficiently use GPU resources. The performance documents present the tips that we think are most widely useful.
GPU Management and Deployment Documentation
This documentation should be of interest to cluster admins and support personnel of enterprise GPU deployments. It includes monitoring and management tools and application programming interfaces (APIs), in-field diagnostics and health monitoring, and cluster setup and deployment.
Documentation for InfiniBand and Ethernet networking solutions to achieve faster results and insight by accelerating HPC, AI, Big Data, Cloud, and Enterprise workloads over NVIDIA Networking.
End-to-end networking solutions with smart adapters, switches, cables, and management software that reduce latency, increase efficiency, enhance security, and simplify data center automation so applications run faster.
Networking Ethernet Software
Ethernet networking software documentation for NVIDIA Cumulus Linux, NVIDIA NetQ, and NVIDIA Cumulus VX networking solutions.
NVIDIA AI Enterprise Documentation
NVIDIA AI Enterprise is an end-to-end, cloud-native suite of AI and data analytics software, optimized, certified and supported by NVIDIA to run on VMware vSphere with NVIDIA-Certified Systems.
NVIDIA Base Command Platform Documentation
NVIDIA Base Command Platform is a world-class infrastructure solution for businesses and their data scientists who need a premium AI development experience.
NVIDIA Bright Cluster Manager Documentation
NVIDIA Bright Cluster Manager offers fast deployment and end-to-end management for heterogeneous HPC and AI server clusters at the edge, in the data center and in multi/hybrid-cloud environments. It automates provisioning and administration for clusters ranging in size from a single node to hundreds of thousands, supports CPU-based and NVIDIA GPU-accelerated systems, and orchestration with Kubernetes.
NVIDIA Clara Documentation
NVIDIA Clara is an open, scalable computing platform that enables developers to build and deploy medical imaging applications into hybrid (embedded, on-premises, or cloud) computing environments to create intelligent instruments and automate healthcare workflows.
NVIDIA Clara Holoscan
NVIDIA Clara Holoscan is a hybrid computing platform for medical devices that combines hardware systems for low-latency sensor and network connectivity, optimized libraries for data processing and AI, and core microservices to run surgical video, ultrasound, medical imaging, and other applications anywhere, from embedded to edge to cloud.
NVIDIA Clara Parabricks Pipelines
Clara Parabricks is a complete software solution for next-generation sequencing, including short- and long-read applications, supporting workflows that start with basecalling and extend through tertiary analysis. Clara Parabricks Pipelines were built to optimize acceleration, accuracy, and scalability. Users can achieve a 35-50X acceleration and 99.99 percent accuracy for variant calling when comparing against CPU-only BWA-GATK4 pipelines. It can run the full GATK4 Best Practices and is also fully configurable, letting users choose which steps, parameter settings, and versions of the pipeline to run.
NVIDIA Clara Parabricks Toolkit
The Clara Parabricks Toolkit is a technology stack of CUDA-accelerated libraries and deep learning modules, C++ and Python APIs, reference applications, and integrations with third-party applications and workflows for HPC, deep learning, and data analytics tools in genomics.
NVIDIA Clara Train Application Framework
Clara Train Application Framework is a domain-optimized, developer application framework that includes APIs for AI-assisted annotation, making any medical viewer AI-capable. It also includes a TensorFlow-based training framework with pre-trained models to kickstart AI development with techniques like transfer learning, federated learning and AutoML.
NVIDIA Clara Viz
NVIDIA Clara Viz is a platform for visualization of 2D/3D medical imaging data. The core of this platform is the Clara Viz SDK, which is designed to enable developers to incorporate high performance volumetric visualization of medical images in applications with an easy-to-use API.
NVIDIA CloudXR SDK Documentation
CloudXR is NVIDIA's solution for streaming virtual reality (VR), augmented reality (AR), and mixed reality (MR) content from any OpenVR XR application on a remote server--desktop, cloud, data center, or edge.
NVIDIA CUDA Libraries Documentation
Documentation for CUDA Libraries, including cuBLAS, cuSOLVER, cuSPARSE, cuFFT, cuRAND, nvJPEG, and NPP.
NVIDIA cuBLAS Library
The cuBLAS library is an implementation of Basic Linear Algebra Subprograms (BLAS) on the NVIDIA CUDA runtime. It enables the user to access the computational resources of NVIDIA GPUs.
NVIDIA cuFFT Library
The NVIDIA CUDA Fast Fourier Transform (cuFFT) library consists of two components: cuFFT and cuFFTW. The cuFFT library provides high performance on NVIDIA GPUs, and the cuFFTW library is a porting tool to use the Fastest Fourier Transform in the West (FFTW) on NVIDIA GPUs.
NVIDIA cuFFTDx Library
The cuFFT Device Extensions (cuFFTDx) library enables you to perform Fast Fourier Transform (FFT) calculations inside your CUDA kernel. Fusing FFT with other operations can decrease the latency and improve the performance of your application.
NVIDIA cuRAND Library
The NVIDIA CUDA Random Number Generation (cuRAND) library provides an API for simple and efficient generation of high-quality pseudorandom and quasirandom numbers.
NVIDIA cuSOLVER Library
The cuSOLVER library is a high-level package based on cuBLAS and cuSPARSE libraries. It provides Linear Algebra Package (LAPACK)-like features such as common matrix factorization and triangular solve routines for dense matrices.
NVIDIA cuSPARSE Library
The cuSPARSE library contains a set of basic linear algebra subroutines used for handling sparse matrices. It’s implemented on the NVIDIA CUDA runtime and is designed to be called from C and C++.
NVIDIA cuSPARSELt Library
The cuSPARSELt library provides high-performance, structured, matrix-dense matrix multiplication functionality. cuSPARSELt allows users to exploit the computational resources of the latest NVIDIA GPUs.
NVIDIA cuTENSOR Library
The cuTENSOR library is a first-of-its-kind, GPU-accelerated tensor linear algebra library, providing high-performance tensor contraction, reduction, and element-wise operations. cuTENSOR is used to accelerate applications in the areas of deep learning training and inference, computer vision, quantum chemistry, and computational physics.
NVIDIA NPP Library
NVIDIA Performance Primitives (NPP) is a library of functions for performing CUDA-accelerated 2D image and signal processing. This library is widely applicable for developers in these areas and is written to maximize flexibility while maintaining high performance.
The nvJPEG Library provides high-performance, GPU-accelerated JPEG encoding and decoding functionality. This library is intended for image formats commonly used in deep learning and hyperscale multimedia applications.
The nvJPEG2000 library provides high-performance, GPU-accelerated JPEG2000 decoding functionality. This library is intended for JPEG2000 formatted images commonly used in deep learning, medical imaging, remote sensing, and digital cinema applications.
NVIDIA CUDA Toolkit Documentation
The NVIDIA CUDA Toolkit provides a comprehensive development environment for C and C++ developers building GPU-accelerated applications.
Find archived online documentation for CUDA Toolkit.
NVIDIA cuDNN Documentation
The NVIDIA CUDA Deep Neural Network (cuDNN) library is a GPU-accelerated library of primitives for deep neural networks. cuDNN provides highly tuned implementations for standard routines such as forward and backward convolution, pooling, normalization, and activation layers. Deep learning researchers and framework developers worldwide rely on cuDNN for high-performance GPU acceleration.
NVIDIA cuOpt Documentation
NVIDIA cuOpt is an Operations Research optimization API using AI to help developers create complex, real-time fleet routing workflows on NVIDIA GPUs.
NVIDIA DALI Documentation
The NVIDIA Data Loading Library (DALI) is a collection of highly optimized building blocks, and an execution engine, for accelerating the pre-processing of input data for deep learning applications. DALI provides both the performance and the flexibility for accelerating different data pipelines as a single library. This single library can then be easily integrated into different deep learning training and inference applications.
NVIDIA Data Science Workbench Documentation
NVIDIA Data Science Workbench is a productivity tool for GPU-enabled workstations to improve manageability, reproducibility, and usability for data scientists, data engineers, and AI developers. Users have fast and convenient access to a plethora of data science tools and CLIs while also benefiting from easy installation and updating stack software.
NVIDIA DeepStream SDK Documentation
The NVIDIA DeepStream SDK delivers a complete streaming analytics toolkit for situational awareness through computer vision, intelligent video analytics (IVA), and multi-sensor processing.
NVIDIA DGX is the integrated software and hardware system that supports the commitment to AI research with an optimized combination of compute power, software, and deep learning performance. It is purpose-built to meet the demands of enterprise AI and data science, delivering the fastest start in AI development, effortless productivity, and revolutionary performance—for insights in hours instead of months.
NVIDIA DGX Systems
DGX Systems provides integrated hardware, software, and tools for running GPU-accelerated applications such as deep learning, AI analytics, and interactive visualization.
NVIDIA DGX Zone
The DGX Zone is for DGX users and Ops teams to find supplemental information and instructions for configuring and using DGX Systems. It includes topics beyond those covered in the User's Guides.
NVIDIA DIGITS Documentation
The NVIDIA Deep Learning GPU Training System (DIGITS) can be used to rapidly train highly accurate deep neural networks (DNNs) for image classification, segmentation, and object-detection tasks. DIGITS simplifies common deep learning tasks such as managing data, designing and training neural networks on multi-GPU systems, monitoring performance in real time with advanced visualizations, and selecting the best-performing model from the results browser for deployment.
NVIDIA DRIVE Platform Documentation
The NVIDIA DRIVE Platform provides a comprehensive software and hardware solution for the development of autonomous vehicles.
NVIDIA EGX Platform
The NVIDIA EGX platform delivers the power of accelerated AI computing to the edge with a cloud-native software stack (EGX stack), a range of validated servers and devices, Helm charts, and partners who offer EGX through their products and services.
NVIDIA Fleet Command
NVIDIA Fleet Command brings secure edge AI to enterprises of any size. Transform NVIDIA-certified servers into secure edge appliances and connect them to the cloud in minutes. From the cloud, deploy and manage applications from the NGC Catalog or your NGC Private Registry, update system software over-the-air and manage systems remotely with nothing but a browser and internet connection.
NVIDIA GameWorks Documentation
Documentation for GameWorks-related products and technologies, including libraries (NVAPI, OpenAutomate), code samples (DirectX, OpenGL), and developer tools (Nsight, NVIDIA System Profiler).
NVIDIA GPUDirect Storage (GDS) Documentation
NVIDIA GPUDirect Storage (GDS) enables the fastest data path between GPU memory and storage by avoiding copies to and from system memory, thereby increasing storage input/output (IO) bandwidth and decreasing latency and CPU utilization.
NVIDIA HPC SDK Documentation
The NVIDIA HPC SDK is a comprehensive suite of compilers, libraries, and development tools used for developing HPC applications for the NVIDIA platform.
NVIDIA Isaac Documentation
NVIDIA Isaac is a developer toolbox for accelerating the development and deployment of AI-powered robots. The SDK includes Isaac applications, GEMs (robot capabilities), a Robot Engine, and NVIDIA Isaac Sim.
NVIDIA Jetson Software Documentation
The NVIDIA JetPack SDK, which is the most comprehensive solution for building AI applications, along with L4T and L4T Multimedia, provides the Linux kernel, bootloader, NVIDIA drivers, flashing utilities, sample filesystem, and more for the Jetson platform.
The JetPack SDK is the most comprehensive solution for building AI applications. The JetPack installer can be used to flash the Jetson Developer Kit with the latest OS image and to install developer tools, libraries and APIs, samples, and documentation.
NVIDIA Jetson Linux supports development on the Jetson platform.
The L4T APIs provide additional functionality to support application development. The APIs enable flexibility by providing better control over the underlying hardware blocks.
This archives section provides access to previously released JetPack, L4T, and L4T Multimedia documentation versions.
NVIDIA LaunchPad Documentation
With NVIDIA LaunchPad, enterprises can get immediate, short-term access to NVIDIA AI running on private accelerated compute infrastructure to power critical AI initiatives.
NVIDIA Maxine Documentation
NVIDIA Maxine is a GPU-accelerated SDK with state-of-the-art AI features for developers to build virtual collaboration and content creation applications such as video conferencing and live streaming. Maxine’s AI SDKs, such as Video Effects, Audio Effects, and Augmented Reality (AR), are highly optimized and include modular features that can be chained into end-to-end pipelines to deliver the highest performance possible on GPUs, both on PCs and in data centers.
NVIDIA Modulus Documentation
NVIDIA Modulus is a Physics-Informed Neural Networks (PINNs) toolkit that enables you to get started with AI-driven physics simulations and leverage a powerful framework to implement your domain knowledge to solve complex nonlinear physics problems with real-world applications.
NVIDIA Morpheus Documentation
NVIDIA Morpheus is an open AI application framework that provides cybersecurity developers with a highly optimized AI pipeline and pre-trained AI capabilities and allows them to instantaneously inspect all IP traffic across their data center fabric.
NVIDIA NCCL Documentation
The NVIDIA Collective Communications Library (NCCL) is a library of multi-GPU collective communication primitives that are topology-aware and can be easily integrated into applications. Collective communication algorithms employ many processors working in concert to aggregate data. NCCL is not a full-blown parallel programming framework; rather, it’s a library focused on accelerating collective communication primitives.
NVIDIA NeMo Documentation
NVIDIA Neural Modules (NeMo) is a flexible, Python-based toolkit enabling data scientists and researchers to build state-of-the-art speech and language deep learning models composed of reusable building blocks that can be safely connected together for conversational AI applications.
NVIDIA NGC Documentation
NVIDIA NGC is the hub for GPU-optimized software for deep learning, machine learning, and HPC that provides containers, models, model scripts, and industry solutions so data scientists, developers and researchers can focus on building solutions and gathering insights faster.
A platform to accelerate AI, HPC and Visualization GPU workflows and thus accelerate time to solution.
The NGC Catalog is a curated set of GPU-optimized software. It consists of containers, pre-trained models, Helm charts for Kubernetes deployments and industry-specific AI toolkits with software development kits (SDKs). The content provided by NVIDIA and third-party ISVs simplifies the building, customizing and integration of GPU-optimized software into workflows, accelerating the time to solutions for users.
Deploy Assets from NGC
NVIDIA tests NGC containers running AI, ML and DL workloads on NVIDIA GPUs on leading public clouds and on-prem servers through its NVIDIA certification programs. NVIDIA certified data center and edge servers, together with public cloud platforms, enable easy deployment of any NGC asset, in environments certified for performance and scalability by NVIDIA.
The NGC private registry provides you with a secure space to store and share custom containers, models, resources and Helm charts within your enterprise. Take advantage of the deployment patterns you love from the Catalog -- but with your bespoke assets.
NVIDIA NGX Documentation
NVIDIA NGX makes it easy to integrate pre-built, AI-based features into applications with the NGX SDK, NGX Core Runtime and NGX Update Module. The NGX infrastructure updates the AI-based features on all clients that use it.
NVIDIA Nsight Developer Tools Documentation
NVIDIA Nsight Developer Tools is a comprehensive tool suite spanning desktop and mobile targets that enables developers to build, debug, profile, and develop class-leading and cutting-edge software that utilizes the latest visual computing hardware from NVIDIA.
NVIDIA Nsight Systems
NVIDIA Nsight Systems is a system-wide performance analysis tool designed to visualize an application’s algorithms, help you identify the largest opportunities to optimize, and tune to scale efficiently across any quantity or size of CPUs and GPUs, from a large server to our smallest SoC.
NVIDIA Nsight Compute
NVIDIA Nsight Compute is an interactive kernel profiler for CUDA applications. It provides detailed performance metrics and API debugging via a user interface and command-line tool.
NVIDIA Nsight Graphics
NVIDIA Nsight Graphics is a standalone developer tool that enables you to debug, profile, and export frames built with Direct3D, Vulkan, OpenGL, OpenVR, and the Oculus SDK.
NVIDIA Nsight Deep Learning Designer
NVIDIA Nsight Deep Learning Designer is a tool with an integrated development environment that helps developers efficiently design and develop deep neural networks for in-app inference.
NVIDIA Nsight Visual Studio Edition (VSE)
NVIDIA Nsight Visual Studio Edition (VSE) is an application development environment for heterogeneous platforms that brings GPU computing into Microsoft Visual Studio.
NVIDIA Nsight Visual Studio Code Edition (VSCE)
NVIDIA Nsight Visual Studio Code Edition (VSCE) is an application development environment for heterogeneous platforms that brings CUDA development for GPUs into Microsoft Visual Studio Code.
NVIDIA Nsight Integration
NVIDIA Nsight Integration is a Visual Studio extension that enables you to access the power of Nsight Compute, Nsight Graphics, and Nsight Systems from within Visual Studio.
NVIDIA Nsight Eclipse Edition
NVIDIA Nsight Eclipse Edition is a unified CPU plus GPU integrated development environment (IDE) for developing CUDA® applications on Linux and Mac OS X for the x86, POWER and ARM platforms.
NVIDIA Nsight Perf SDK
The NVIDIA Nsight Perf SDK is a toolbox for collecting and analyzing GPU performance data, directly from application code.
NVIDIA CUDA-GDB is a console-based debugging interface you can use from the command-line on your local system or any remote system on which you have Telnet or SSH access.
NVIDIA Compute Sanitizer
NVIDIA Compute Sanitizer is a functional correctness checking tool suite included in the CUDA Toolkit. This suite contains multiple tools that can perform different types of checks.
NVIDIA CUPTI (CUDA Profiling Tools Interface) is a set of APIs that enables the creation of profiling and tracing tools that target CUDA applications.
The NVIDIA Tools Extension (NVTX) library is a set of functions that a developer can use to provide additional information to tools. The additional information is used by the tool to improve analysis and visualization of data.
Legacy Developer Tools
NVIDIA Nsight Tegra, Visual Studio Edition
NVIDIA Nsight Tegra, Visual Studio Edition brings the raw development power and efficiency of Microsoft Visual Studio to Android, giving you the right tools for the job.
NVIDIA System Profiler
NVIDIA System Profiler is a system trace and multi-core CPU call stack sampling profiler, providing an interactive view of the system behavior to help you optimize the application performance on Jetson devices.
NVIDIA Perfkit is a comprehensive suite of performance tools to help debug and profile OpenGL and Direct3D applications.
The nvprof profiling tool enables you to collect and view profiling data from the command-line. Nvprof enables the collection of a timeline of CUDA-related activities on both CPU and GPU, including kernel execution, memory transfers, memory set and CUDA API calls and events or metrics for CUDA kernels.
NVIDIA Visual Profiler
The NVIDIA Visual Profiler is a graphical profiling tool that displays a timeline of your application's CPU and GPU activity. It includes an automated analysis engine to identify optimization opportunities.
CUDA-MEMCHECK is a functional correctness checking suite included in the CUDA toolkit. The memcheck tool is capable of precisely detecting and attributing out of bounds and misaligned memory access errors in CUDA applications.
NVIDIA Omniverse Documentation
NVIDIA Omniverse is a cloud-native, multi-GPU, real-time simulation and collaboration platform for 3D production pipelines based on Pixar's Universal Scene Description (USD) and NVIDIA RTX.
NVIDIA Optimized Frameworks Documentation
NVIDIA Optimized Frameworks such as Kaldi, NVIDIA Optimized Deep Learning Framework (powered by Apache MXNet), NVCaffe, PyTorch, and TensorFlow (which includes DLProf and TF-TRT) offer flexibility with designing and training custom deep neural networks (DNNs) for machine learning and AI applications.
NVIDIA RAPIDS Documentation
The RAPIDS data science framework is a collection of libraries for running end-to-end data science pipelines completely on the GPU. The interaction is designed to have a familiar look and feel to working in Python, but utilizes optimized NVIDIA CUDA primitives and high-bandwidth GPU memory under the hood.
NVIDIA Ray-Tracing Documentation
Reference documentation, examples, and tutorials for the NVIDIA OptiX ray-tracing engine, the Iray rendering system, and the Material Definition Language (MDL).
NVIDIA Iray rendering technology represents a comprehensive approach to state-of-the-art rendering for design visualization.
NVIDIA Iray Server
NVIDIA Iray Server is a network-attached rendering solution for Iray-compatible applications.
NVIDIA Material Definition Language (MDL)
NVIDIA Material Definition Language (MDL) is a domain-specific language that describes the appearance of scene elements for a rendering process.
NVIDIA IndeX is a 3D volumetric, interactive visualization SDK used by scientists and researchers to visualize and interact with massive datasets.
The NVIDIA OptiX ray-tracing engine is a programmable system designed for NVIDIA GPUs and other highly parallel architectures.
NVIDIA Riva Speech Skills Documentation
NVIDIA Riva is an SDK for building multimodal conversational systems. Riva is used for building and deploying AI applications that fuse vision, speech, sensors, and services together to achieve conversational AI use cases that are specific to a domain of expertise. It offers a complete workflow to build, train, and deploy AI systems that can use visual cues such as gestures and gaze along with speech in context.
NVIDIA TAO Toolkit Documentation
The NVIDIA TAO Toolkit eliminates the time-consuming process of building and fine-tuning DNNs from scratch for IVA applications.
NVIDIA TensorRT Documentation
NVIDIA TensorRT is an SDK for high-performance deep learning inference. It includes a deep learning inference optimizer and runtime that delivers low latency and high throughput for deep learning inference applications. The core of NVIDIA TensorRT is a C++ library that facilitates high-performance inference on NVIDIA GPUs. TensorRT takes a trained network, which consists of a network definition and a set of trained parameters, and produces a highly optimized runtime engine that performs inference for that network.
NVIDIA Transformer Engine Documentation
Transformer Engine (TE) is a library for accelerating Transformer models on NVIDIA GPUs to provide better performance with lower memory utilization in both training and inference, and an FP8 automatic-mixed-precision-like API that can be used seamlessly with your model code.
NVIDIA Triton Inference Server Documentation
NVIDIA Triton Inference Server (formerly TensorRT Inference Server) provides a cloud inferencing solution optimized for NVIDIA GPUs. The server provides an inference service via an HTTP or gRPC endpoint, allowing remote clients to request inferencing for any model being managed by the server.
NVIDIA Video Technologies Documentation
Reference documentation, APIs, and samples for NVIDIA video technology SDKs on Windows and Linux platforms.
NVIDIA Optical Flow SDK
The NVIDIA Optical Flow SDK provides a comprehensive set of APIs, samples, and documentation on Windows and Linux platforms for fully hardware-accelerated optical flow, which can be used for computing the relative motion of pixels between images.
NVIDIA Video Codec SDK
The NVIDIA Video Codec SDK provides a comprehensive set of APIs, samples, and documentation for fully hardware-accelerated video encoding, decoding, and transcoding on Windows and Linux platforms.
NVIDIA Virtual GPU (vGPU) Software Documentation
NVIDIA virtual GPU (vGPU) software is a graphics virtualization platform that extends the power of NVIDIA GPU technology to virtual desktops and apps, offering improved security, productivity, and cost-efficiency.
NVIDIA Virtual Reality Capture and Replay (VCR) SDK Documentation
The NVIDIA Virtual Reality Capture and Replay (VCR) SDK enables developers and users to accurately capture and replay VR sessions for performance testing, scene troubleshooting, and more.