Creating Your First NVIDIA AI Enterprise System#

Added in version 2.0.

Installing an Operating System#

Installing Ubuntu Server 20.04 LTS or 22.04 LTS#

NVIDIA AI Enterprise is supported on Ubuntu 20.04 LTS and 22.04 LTS. Ubuntu provides two ISO types: Desktop and Live Server. The Desktop version includes a graphical user interface (GUI), while the Live Server version is command-line only. This document uses the Live Server version 20.04 (amd64 architecture) of Ubuntu; a GUI can be installed later if needed.

  1. Attach the Ubuntu ISO to your host server’s virtual media.

  2. Select your preferred language and press the enter key.

    _images/dg-first-vm-14.png
  3. Updating the installer is optional. This guide was created with Ubuntu 20.04.1 LTS, so some options may differ on newer versions of the installer.

    _images/dg-first-vm-15.png
  4. Configure the keyboard layout and press the enter key.

    _images/dg-first-vm-16.png
  5. On this screen, select your network connection type and modify it to fit your internal requirements. This guide uses DHCP for the configuration.

    _images/dg-first-vm-17.png
  6. If you have a proxy address, input it in this screen and press Done.

    _images/dg-first-vm-18.png
  7. If you have an alternative mirror address for Ubuntu, input it here. Otherwise, keep the default address and press Done.

    _images/dg-first-vm-19.png
  8. Choose to format the entire disk, then select the disk on which to install Ubuntu.

    _images/dg-first-vm-20.png
  9. Review the file system summary and select Done if satisfactory. Select Continue in the pop-up window.

    _images/dg-first-vm-21.png
  10. Configure the system with a user account, name, and password.

    _images/dg-first-vm-22.png
  11. Select Install OpenSSH server and select Done.

    _images/dg-first-vm-23.png
  12. Select any server snaps that may be required for internal use in your environment and select Done. Wait for the system to finish installing.

    _images/dg-first-vm-24.png
  13. Select Reboot Now on the Ubuntu OS screen.

    _images/dg-first-vm-25.png
  14. Disconnect the Ubuntu ISO from your host server’s virtual media.
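
After the first boot, it can be useful to confirm that the installed release is one of the two this section names. The following is a minimal sketch, assuming the standard /etc/os-release file is present; is_supported_release is a hypothetical helper introduced for illustration and is not part of any NVIDIA or Ubuntu tool.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only for the Ubuntu versions named in this guide.
is_supported_release() {
    case "$1" in
        20.04|22.04) return 0 ;;
        *) return 1 ;;
    esac
}

# /etc/os-release defines ID and VERSION_ID on Ubuntu.
# It is sourced in a subshell so its variables do not leak into the caller.
if [ -r /etc/os-release ]; then
    (
        . /etc/os-release
        if [ "${ID:-}" = "ubuntu" ] && is_supported_release "${VERSION_ID:-}"; then
            echo "Ubuntu ${VERSION_ID} is listed as supported"
        else
            echo "${ID:-unknown} ${VERSION_ID:-unknown} is not one of the Ubuntu releases listed here"
        fi
    )
fi
```

The check only compares version strings; it does not verify the kernel or any installed packages.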

Installing Red Hat Enterprise Linux 8.4#

  1. Attach the Red Hat Enterprise Linux (RHEL) ISO to your host server’s virtual media.

  2. Select your preferred language and click Continue.

    _images/rhel-keyboard.png
  3. Next, select Time & Date under the Localization column. Set the time and date as required and click Done.

    _images/rhel-date-and-time.png
  4. Next, select Software Selection under the Software column. Select Server and click Done.

    _images/rhel-software-selection.png
  5. Next, select Installation Destination under the System column. Select the VMware Virtual disk and click Done.

    _images/rhel-installation-destination.png
  6. Next, select Network & Host Name under the System column. If your system is connected to a network, it tries to obtain an IP address from a DHCP server; otherwise, you can configure the address manually. Click Done when finished.

    _images/rhel-network-host-name.png
  7. Select Root Password under the User Settings column. Create a password and click Done.

    _images/rhel-root-password.png
  8. Click Begin Installation to start the install.

    _images/rhel-begin-install.png
  9. The installation will begin as shown below.

    _images/rhel-install.png
  10. Once the installation is complete, reboot by clicking Reboot System.

    _images/rhel-reboot.png
  11. Disconnect the RHEL ISO from your host server’s virtual media.

Installing the NVIDIA Driver#

The NVIDIA driver is installed on the OS and communicates with the NVIDIA GPU to enable accelerated AI or HPC workloads. Now that Linux is installed, installing the NVIDIA AI Enterprise driver fully enables GPU operation. Before proceeding with the driver installation, confirm that Nouveau is disabled. Instructions to confirm this are located in the Ubuntu section and the RHEL section.
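
As a quick illustration of the Nouveau check, the sketch below looks for the nouveau module in the list of loaded kernel modules. It assumes lsmod is available and is not a substitute for the distribution-specific instructions referenced above; nouveau_loaded is a hypothetical helper that reads lsmod-style text on standard input.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if a line on stdin starts with the module name "nouveau".
nouveau_loaded() {
    grep '^nouveau[[:space:]]' >/dev/null
}

# lsmod prints one loaded kernel module per line, module name first.
if command -v lsmod >/dev/null 2>&1; then
    if lsmod | nouveau_loaded; then
        echo "nouveau is loaded; disable it before installing the NVIDIA driver"
    else
        echo "nouveau is not loaded"
    fi
fi
```

This only reports the current state of the running kernel. Whether Nouveau stays disabled after a reboot depends on the blacklist configuration described in the OS-specific sections.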

Data Center Driver Installation#

This driver is intended for bare-metal deployments or for GPU passthrough mode in a VM running accelerated AI or HPC workloads. Do not use this driver in vGPU settings.

Installation of the NVIDIA AI Enterprise software driver for Linux requires:

  • Compiler toolchain

  • Kernel headers
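
The two prerequisites above can be checked before running the installer. This is a minimal sketch under stated assumptions: the toolchain is taken to mean gcc and make, and kernel headers are assumed to be reachable through the usual /lib/modules/<release>/build path, which may differ between distributions. have_tool and have_kernel_headers are hypothetical helpers.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the named command is on PATH.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

# Hypothetical helper: succeeds if a build directory exists for the given kernel release.
# Assumption: headers are exposed at /lib/modules/<release>/build.
have_kernel_headers() {
    [ -d "/lib/modules/$1/build" ]
}

for tool in gcc make; do
    if have_tool "$tool"; then
        echo "$tool: found"
    else
        echo "$tool: missing"
    fi
done

if have_kernel_headers "$(uname -r)"; then
    echo "kernel headers: found"
else
    echo "kernel headers: missing"
fi
```

The script only reports what it finds; it does not install anything.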

Note

If you prefer to use a Debian package, refer to the Debian instructions.

To enable NVIDIA GPU acceleration for compute and AI workloads running in data centers:

  1. Download the NVIDIA GPU data center drivers from this location.

  2. Select Data Center/Tesla, GPU series, and Linux 64-bit to download the .run file.

    Note

    To use the .deb driver file, select Data Center/Tesla, GPU series, and then select Linux 64-bit 22.04.

  3. Log in to the system and check for updates.

    sudo apt-get update
    
  4. Install the GCC compiler and the make tool in the terminal.

    sudo apt-get install build-essential
    
  5. Copy the NVIDIA AI Enterprise Linux driver package, for example, NVIDIA-Linux-x86_64-550.90.12.run, to the host machine where you are installing the driver.

  6. Navigate to the directory containing the NVIDIA driver .run file. Then, add execute permission to the file using the chmod command.

    sudo chmod +x NVIDIA-Linux-x86_64-xxx.xx.xx.run
    
  7. From a console shell, run the driver installer as the root user, and accept the defaults.

    sudo sh ./NVIDIA-Linux-x86_64-xxx.xx.xx.run
    
  8. Reboot the system.

    sudo reboot
    
  9. After the system has rebooted, confirm that you can see your NVIDIA GPU device in the output from nvidia-smi.

    nvidia-smi
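
For a scripted version of this last check, nvidia-smi can print selected fields in CSV form. The sketch below is one possible approach and is not part of the documented procedure; count_gpus is a hypothetical helper that counts the non-empty lines it receives.

```shell
#!/bin/sh
# Hypothetical helper: counts non-empty lines on stdin (one line per GPU).
count_gpus() {
    grep -c '[^[:space:]]'
}

if command -v nvidia-smi >/dev/null 2>&1; then
    # Query the name and driver version of each GPU, one CSV line per device.
    gpu_list=$(nvidia-smi --query-gpu=name,driver_version --format=csv,noheader) || gpu_list=""
    echo "$gpu_list"
    echo "GPUs detected: $(printf '%s\n' "$gpu_list" | count_gpus)"
else
    echo "nvidia-smi not found; the driver may not be installed"
fi
```

A count of zero, or a failure from nvidia-smi itself, suggests the driver did not load and the earlier steps should be reviewed.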