Overview

NVIDIA AI Enterprise 3.0 or later

This guide describes how to deploy NVIDIA AI Enterprise on Red Hat Enterprise Linux (RHEL) with KVM virtualization. It serves as a technical reference for system prerequisites, installation, and configuration.

The sections of this guide are presented in installation order:
  • Prerequisites
  • Installing Red Hat Enterprise Linux on the Host Server
  • Initial Host Configuration
  • NVIDIA AI Enterprise Software
  • NVIDIA License System
  • Enabling KVM Virtualization for your NVIDIA AI Enterprise system
  • Creating Virtual Machines with NVIDIA AI Enterprise on KVM for RHEL
  • Setting Up NVIDIA vGPU Devices
  • Installing Podman and NVIDIA Container Toolkit
  • Installing AI and Data Science Applications and Frameworks
  • Advanced GPU Configuration (Optional)
  • Advanced Framework Configuration
  • Validation

Kernel-based Virtual Machine (KVM) is an open source virtualization technology distributed with the Linux kernel as a loadable kernel module. The KVM kernel module turns Linux into a type-1 (bare-metal) hypervisor that allows a host machine to run multiple isolated virtual machines (VMs). Because KVM is part of the mainline Linux kernel, it benefits from new kernel features, fixes, and advancements without additional engineering.
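Because KVM is delivered as a kernel module, a host can be checked for basic readiness before any installation work begins. The following is a minimal sketch, not part of the official procedure: it assumes an x86 Linux host, where hardware virtualization appears as the `vmx` (Intel VT-x) or `svm` (AMD-V) flag in `/proc/cpuinfo` and a loaded KVM module exposes `/dev/kvm`. The function names `cpu_virtualization_flags` and `kvm_status` are hypothetical helpers introduced for illustration.

```python
import os


def cpu_virtualization_flags(cpuinfo_text):
    """Return the hardware virtualization flags (vmx, svm) found in cpuinfo text."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        # On x86, /proc/cpuinfo lists CPU features on lines starting with "flags".
        if line.startswith("flags"):
            _, _, value = line.partition(":")
            flags.update(value.split())
    return flags & {"vmx", "svm"}


def kvm_status(cpuinfo_path="/proc/cpuinfo", kvm_device="/dev/kvm"):
    """Summarize whether the CPU advertises virtualization and /dev/kvm exists."""
    try:
        with open(cpuinfo_path) as f:
            virt = cpu_virtualization_flags(f.read())
    except OSError:
        # Assumption: an unreadable cpuinfo file is treated as "no flags found".
        virt = set()
    return {
        "cpu_virtualization": sorted(virt),
        "kvm_device_present": os.path.exists(kvm_device),
    }


if __name__ == "__main__":
    print(kvm_status())
```

An empty `cpu_virtualization` list usually means virtualization extensions are absent or disabled in firmware. A missing `/dev/kvm` usually means the KVM module is not loaded.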

Every VM in KVM is implemented as a regular Linux process, scheduled by the standard Linux scheduler, with dedicated virtual hardware such as a network card, graphics adapter, CPU(s), memory, and disks. Each VM can also leverage NVIDIA virtual GPU (vGPU) technology.

Implementing KVM on RHEL with NVIDIA AI Enterprise allows any business, including organizations that may initially lack AI expertise, to extend KVM’s capabilities and harness the power of AI. KVM allows users to swap resources among guests, share common libraries, optimize system performance, and deploy AI frameworks that simplify building, sharing, and deploying AI software.

KVM is secured by a combination of Security-Enhanced Linux (SELinux) and secure virtualization (sVirt), which together provide enhanced VM security and isolation. It can use any storage supported by Linux, and it also supports shared file systems so that VM images may be shared by multiple hosts. Disk images support thin provisioning, allocating storage on demand rather than all up front. KVM inherits the performance of Linux, scaling to match demand as the number of guest machines and requests increases. Combined with the NVIDIA AI Enterprise software suite, KVM gives organizations access to easy-to-use tools for every stage of the AI workflow, from data prep to training, inferencing, and deploying at scale.
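The on-demand allocation behind thin provisioning can be illustrated with a plain sparse file. This is only an analogy under stated assumptions: it does not create a KVM disk image or use any KVM tooling, it assumes a Unix-like host whose file system supports sparse files, and `sparse_file_usage` is a hypothetical helper name. The sketch compares a file's apparent size with the space actually allocated for it.

```python
import os
import tempfile


def sparse_file_usage(apparent_size):
    """Create a sparse temp file and return (apparent_bytes, allocated_bytes)."""
    with tempfile.NamedTemporaryFile() as f:
        # Set the file length without writing data; no blocks are needed yet.
        f.truncate(apparent_size)
        # Write one small chunk so that some storage is really allocated.
        f.write(b"x" * 4096)
        f.flush()
        os.fsync(f.fileno())
        st = os.stat(f.name)
        # st_blocks counts the 512-byte units allocated to the file.
        return st.st_size, st.st_blocks * 512


if __name__ == "__main__":
    apparent, allocated = sparse_file_usage(64 * 1024 * 1024)
    print(f"apparent size: {apparent} bytes, allocated: {allocated} bytes")
```

On file systems that support sparse files, the allocated figure is typically far smaller than the apparent size. A thinly provisioned disk image behaves similarly, growing only as the guest writes data.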

Tip

This guide frequently refers to Red Hat’s KVM Virtualization Guide. Consult the Red Hat guide for any topics outside the scope of this document.

© Copyright 2022-2023, NVIDIA. Last updated on Sep 11, 2023.