Creating Your First NVIDIA AI Enterprise System

NVIDIA AI Enterprise 2.0 or later

Installing Ubuntu Server 20.04 LTS or 22.04 LTS

NVIDIA AI Enterprise is supported on the Ubuntu 20.04 LTS and 22.04 LTS operating systems. Ubuntu provides two ISO types: Desktop and Live Server. The Desktop version includes a graphical user interface (GUI), while the Live Server version operates only from the command line. This document uses the Live Server version 20.04 (amd64 architecture) of Ubuntu; a GUI can be installed later if needed.

  1. Attach the Ubuntu ISO to your host server’s virtual media.

  2. Select your preferred language and press the enter key.

    dg-first-vm-14.png


  3. Updating the installer is optional. This guide was created with Ubuntu 20.04.1 LTS; some options may differ in newer versions of the installer.

    dg-first-vm-15.png

  4. Configure the keyboard layout and press the enter key.

    dg-first-vm-16.png


  5. On this screen, select your network connection type and modify it to fit your internal requirements. This guide uses DHCP for the configuration.

    dg-first-vm-17.png


  6. If you have a proxy address, input it in this screen and press Done.

    dg-first-vm-18.png


  7. If you have an alternative mirror address for Ubuntu, input it here. Otherwise, if there is a default address, use it and press Done.

    dg-first-vm-19.png


  8. Select the option to use the entire disk, then select the disk to install Ubuntu on.

    dg-first-vm-20.png


  9. Review the file system summary and select Done if satisfactory. Select Continue in the pop-up window.

    dg-first-vm-21.png


  10. Configure the system with a user account, name, and password.

    dg-first-vm-22.png


  11. Select Install OpenSSH server and select Done.

    dg-first-vm-23.png


  12. Select any server snaps that may be required for internal use in your environment and select Done. Wait for the system to finish installing.

    dg-first-vm-24.png


  13. Select Reboot Now on the Ubuntu OS screen.

    dg-first-vm-25.png


  14. Disconnect the Ubuntu ISO from your host server’s virtual media.
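
After the first boot, the installed release can be compared against the supported versions named at the start of this section. The following is a minimal sketch, not part of the original guide: the function name `ubuntu_release_check` is hypothetical, and it assumes the standard `/etc/os-release` file with its `ID` and `VERSION_ID` fields.

```shell
#!/bin/sh
# Sketch: report whether an os-release file describes Ubuntu 20.04 or 22.04.
# $1 is the file to read; it defaults to /etc/os-release and is a parameter
# only so the check can be tried against a sample file.
ubuntu_release_check() {
    os_release="${1:-/etc/os-release}"
    if [ ! -r "$os_release" ]; then
        echo "cannot read $os_release"
        return 0
    fi
    # Source the file in a subshell so ID and VERSION_ID do not leak out.
    (
        . "$os_release"
        case "${ID:-unknown}:${VERSION_ID:-unknown}" in
            ubuntu:20.04|ubuntu:22.04)
                echo "supported: Ubuntu $VERSION_ID" ;;
            *)
                echo "not in the supported list: ${ID:-unknown} ${VERSION_ID:-unknown}" ;;
        esac
    )
}

ubuntu_release_check
```

The version list is hard-coded from the text above, so it would need updating if the supported releases change.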

Installing Red Hat Enterprise Linux 8.4

  1. Attach the Red Hat Enterprise Linux (RHEL) ISO to your host server’s virtual media.

  2. Select your preferred language and click Continue.

    rhel-keyboard.png


  3. Next, select Time & Date under the Localization column. Set the time and date as required and click Done.

    rhel-date-and-time.png


  4. Next, select Software Selection under the Software column. Select Server and click Done.

    rhel-software-selection.png


  5. Next, select Installation Destination under the System column. Select the VMware Virtual disk and click Done.

    rhel-installation-destination.png


  6. Next, select Network & Host Name under the System column. If the system is connected to a network, it attempts to obtain an IP address from a DHCP server; otherwise, configure the network manually. Click Done when finished.

    rhel-network-host-name.png


  7. Select Root Password under the User Settings column. Create a password and click Done.

    rhel-root-password.png


  8. Click Begin Installation to start the install.

    rhel-begin-install.png


  9. The installation will begin as shown below.

    rhel-install.png


  10. Once the installation is complete, reboot by clicking Reboot System.

    rhel-reboot.png


  11. Disconnect the RHEL ISO from your host server’s virtual media.

Install Steps for CLS Scenario

This section will cover the steps required to properly install, configure, and license the NVIDIA driver for CLS users. If you have a DLS, please refer to the Install Steps for DLS Scenario section.

Installing the NVIDIA Driver

Now that you have installed Linux, install the NVIDIA AI Enterprise Driver to fully enable GPU operation. Before proceeding with the NVIDIA Driver installation, confirm that Nouveau is disabled. Instructions to confirm this are located in the Ubuntu section for Ubuntu and in the RHEL section for RHEL.
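
As a quick way to see whether the Nouveau module is currently loaded, a check along the following lines may help. This is a minimal sketch rather than the procedure from the referenced sections: the function name `nouveau_status` is hypothetical, and it assumes the kernel exposes its module list in `/proc/modules`.

```shell
#!/bin/sh
# Sketch: report whether the nouveau kernel module appears in a module list.
# $1 is the list to read; it defaults to /proc/modules and is a parameter
# only so the check can be tried against a saved copy.
nouveau_status() {
    modules_file="${1:-/proc/modules}"
    # Each line of /proc/modules starts with the module name and a space.
    if grep -q '^nouveau ' "$modules_file" 2>/dev/null; then
        echo "nouveau is loaded"
    else
        echo "nouveau is not loaded"
    fi
}

nouveau_status
```

A result of "nouveau is not loaded" only describes the running kernel; it does not by itself show that the module is blacklisted for future boots.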

Downloading the NVIDIA AI Enterprise Software Driver Using NGC
Important

Before you begin, generate an API key or use an existing one.

  1. From a browser, go to https://ngc.nvidia.com/signin/email and then enter your email and password.

  2. In the top right corner, click your user account icon and select Setup.

  3. Click Get API Key to open the Setup > API Key page.

    Note

    The API Key is the mechanism used to authenticate your access to the NGC container registry.


  4. Click Generate API Key to generate your API key.

    Note

    A warning message appears to let you know that your old API key will become invalid if you create a new key.


  5. Click Confirm to generate the key.

  6. Your API key appears.

    Important

    You only need to generate an API Key once. NGC does not save your key, so store it in a secure place. (You can copy your API Key to the clipboard by clicking the copy icon to the right of the API key.) Should you lose your API Key, you can generate a new one from the NGC website. When you generate a new API Key, the old one is invalidated.


    1. Run the following commands to install the NGC CLI for either AMD64 or ARM64.

    AMD64 Linux Install: The NGC CLI binary for Linux is supported on Ubuntu 16.04 and later distributions.

    • Download, unzip, and install from the command line by moving to a directory where you have execute permissions and then running the following command:

    wget --content-disposition https://ngc.nvidia.com/downloads/ngccli_linux.zip && unzip ngccli_linux.zip && chmod u+x ngc-cli/ngc

    ARM64 Linux Install: The NGC CLI binary for ARM64 is supported on Ubuntu 18.04 and later distributions.

    • Download, unzip, and install from the command line by moving to a directory where you have execute permissions and then running the following command:

    wget --content-disposition https://ngc.nvidia.com/downloads/ngccli_arm64.zip && unzip ngccli_arm64.zip && chmod u+x ngc-cli/ngc

    Note

    NGC CLI installation instructions for Windows, ARM64 macOS, and Intel macOS are available in the NGC CLI documentation.

    Important

    The remaining installation steps below are the same for both AMD64 and ARM64.

    • Check the binary’s MD5 hash to ensure the file wasn’t corrupted during download.

    $ md5sum -c ngc.md5

    • Add the ngc-cli directory, which contains the extracted binary, to your PATH.

    $ echo "export PATH=\"\$PATH:$(pwd)/ngc-cli\"" >> ~/.bash_profile && source ~/.bash_profile

    • You must configure NGC CLI for your use so that you can run the commands. Enter the following command, including your API key when prompted.

    $ ngc config set
    Enter API key [no-apikey]. Choices: [<VALID_APIKEY>, 'no-apikey']:
    Enter CLI output format type [ascii]. Choices: [ascii, csv, json]: ascii
    Enter org [no-org]. Choices: ['no-org']:
    Enter team [no-team]. Choices: ['no-team']:
    Enter ace [no-ace]. Choices: ['no-ace']:
    Successfully saved NGC configuration to /home/$username/.ngc/config

    • Download the NVIDIA AI Enterprise Software Driver.
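
Before downloading the driver, it may help to confirm that the NGC CLI installed in the bullets above can be found by the shell. The following is a minimal sketch; the helper name `check_tool` is hypothetical, and it relies only on the shell built-in `command -v`.

```shell
#!/bin/sh
# Sketch: report whether a command name resolves on the current PATH.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found at $(command -v "$1")"
    else
        echo "$1: not found on PATH"
    fi
}

check_tool ngc
```

If `ngc` is reported as not found, the PATH change may not have taken effect in the current shell; opening a new login shell is one thing to try.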

Important

Follow the driver installation based on the operating system installed in the previous steps.

Installing the NVIDIA Driver using the .run file with Ubuntu

Installation of the NVIDIA AI Enterprise software driver for Linux requires:

  • Compiler toolchain

  • Kernel headers
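
Whether these two requirements are already met can be checked with a sketch like the one below. It is an illustration under assumptions, not part of the original procedure: the function name `check_build_prereqs` is hypothetical, and it assumes that installed kernel headers appear as a `build` directory (or symlink) under `/lib/modules/<kernel release>`. On a freshly installed system it will likely report items as missing until the toolchain has been installed.

```shell
#!/bin/sh
# Sketch: print one line per driver build prerequisite.
# $1: kernel release to check (defaults to the running kernel).
# $2: modules root (defaults to /lib/modules); a parameter only for testing.
check_build_prereqs() {
    release="${1:-$(uname -r)}"
    modules_root="${2:-/lib/modules}"
    for tool in gcc make; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "$tool: present"
        else
            echo "$tool: missing"
        fi
    done
    # A dangling build symlink also counts as missing headers.
    if [ -d "$modules_root/$release/build" ]; then
        echo "kernel headers for $release: present"
    else
        echo "kernel headers for $release: missing"
    fi
}

check_build_prereqs
```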

  1. Log in to the system and check for updates.

    $ sudo apt-get update


  2. Install the gcc compiler and the make tool in the terminal.

    $ sudo apt-get install build-essential


  3. Download the NVIDIA AI Enterprise Software Driver.

    $ ngc registry resource download-version "nvaie/vgpu_guest_driver_x_x:xxx.xx.xx"

    Note

    Where x_x:xxx.xx.xx is the current NVIDIA AI Enterprise version and driver version from NGC Enterprise Catalog.


  4. Navigate to the directory containing the NVIDIA Driver .run file. Then, add the Executable permission to the NVIDIA Driver file using the chmod command.

    $ cd vgpu_guest_driver_x_x:xxx.xx.xx
    $ sudo chmod +x NVIDIA-Linux-x86_64-xxx.xx.xx-grid.run

    Note

    Where x_x is the current NVIDIA AI Enterprise version and xxx.xx.xx is the current driver version from NGC Enterprise Catalog.


  5. From a console shell, run the driver installer as the root user, and accept defaults.

    $ sudo sh ./NVIDIA-Linux-x86_64-xxx.xx.xx-grid.run

    Note

    Where xxx.xx.xx is the current driver version from NGC Enterprise Catalog.


  6. Reboot the system.

    $ sudo reboot


  7. After the system has rebooted, confirm that you can see your NVIDIA vGPU device in the output from nvidia-smi.

    $ nvidia-smi


After installing the NVIDIA vGPU compute driver, you can license any NVIDIA AI Enterprise Software licensed products you are using.
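
For a more compact confirmation than the full nvidia-smi table, the GPU name and driver version can be queried directly. The following is a minimal sketch: the function name `gpu_summary` is hypothetical, while `--query-gpu` and `--format` are standard nvidia-smi options.

```shell
#!/bin/sh
# Sketch: print "<GPU name>, <driver version>" for each visible GPU,
# or a short message when nvidia-smi is absent or cannot query the GPU.
gpu_summary() {
    if command -v nvidia-smi >/dev/null 2>&1; then
        nvidia-smi --query-gpu=name,driver_version --format=csv,noheader \
            || echo "nvidia-smi could not query the GPU"
    else
        echo "nvidia-smi not found; the driver may not be installed"
    fi
}

gpu_summary
```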

Installing the NVIDIA Driver using the .run file with RHEL
Important

Before starting the driver installation, disable Secure Boot as shown in the Installing Red Hat Enterprise Linux 8.4 section.
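
One way to see the current Secure Boot state from a running system is the `mokutil` utility, if it is available. The sketch below is an assumption-laden illustration, not a step from the original guide: the function name `secure_boot_state` is hypothetical, and `mokutil` may need to be installed separately.

```shell
#!/bin/sh
# Sketch: print the Secure Boot state reported by mokutil, if it is installed.
secure_boot_state() {
    if ! command -v mokutil >/dev/null 2>&1; then
        echo "mokutil not installed; cannot determine Secure Boot state"
        return 0
    fi
    # On systems without EFI support, mokutil prints an explanatory message.
    mokutil --sb-state 2>&1 || true
}

secure_boot_state
```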

  1. Register the machine with Red Hat using subscription-manager with the command below.

    $ subscription-manager register


  2. Enable the EPEL repository to satisfy the external dependency for DKMS.

    $ dnf install https://dl.fedoraproject.org/pub/epel/epel-release-latest-8.noarch.rpm


  3. For RHEL 8, ensure that the system has the correct Linux kernel sources from the Red Hat repositories.

    $ dnf install -y kernel-devel-$(uname -r) kernel-headers-$(uname -r)

    Note

    The NVIDIA driver requires that the kernel headers and development packages for the running version of the kernel be installed at the time of the driver installation, as well as whenever the driver is rebuilt. For example, if your system is running kernel version 4.4.0, the 4.4.0 kernel headers and development packages must also be installed.


  4. Install additional dependencies for NVIDIA drivers.

    $ dnf install elfutils-libelf-devel.x86_64
    $ dnf install -y tar bzip2 make automake gcc gcc-c++ pciutils libglvnd-devel


  5. Update the running kernel:

    $ dnf install -y kernel kernel-core kernel-modules


  6. Confirm the system has the correct Linux kernel sources from the Red Hat repositories after update.

    $ dnf install -y kernel-devel-$(uname -r) kernel-headers-$(uname -r)


  7. Download the NVIDIA AI Enterprise Software Driver.

    $ ngc registry resource download-version "nvaie/vgpu_guest_driver_x_x:xxx.xx.xx"

    Note

    Where x_x:xxx.xx.xx is the current NVIDIA AI Enterprise version and driver version from NGC Enterprise Catalog.


  8. Navigate to the directory containing the NVIDIA Driver .run file. Then, add the Executable permission to the NVIDIA Driver file using the chmod command.

    $ sudo chmod +x NVIDIA-Linux-x86_64-xxx.xx.xx-grid.run

    Note

    Where xxx.xx.xx is the current driver version from NGC Enterprise Catalog.


  9. From the console shell, run the driver installer and accept defaults.

    $ sudo sh ./NVIDIA-Linux-x86_64-xxx.xx.xx-grid.run

    Note

    Where xxx.xx.xx is the current driver version from NGC Enterprise Catalog.

    Note

    Accept any warnings and ignore the CC version check.


  10. Reboot the system.

    $ sudo reboot


  11. After the system has rebooted, confirm that you can see your NVIDIA vGPU device in the output from nvidia-smi.

    $ nvidia-smi


After installing the NVIDIA vGPU compute driver, you can license any NVIDIA AI Enterprise Software licensed products you are using.

Licensing the NVIDIA Driver

To use an NVIDIA software licensed product, each client system to which a physical or virtual GPU is assigned must be able to obtain a license from the NVIDIA License System. A client system can be a system that is configured with NVIDIA vGPU, a system that is configured for GPU pass through, or a physical host to which a physical GPU is assigned in a bare-metal deployment.
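
Once a client has been configured for licensing, its status can usually be read from the detailed nvidia-smi query output. The following is a minimal sketch, assuming the output of `nvidia-smi -q` contains a "License Status" field; the function name `license_status` is hypothetical.

```shell
#!/bin/sh
# Sketch: read `nvidia-smi -q` style text on stdin and print any
# "License Status" lines with their leading whitespace removed.
license_status() {
    grep -i 'License Status' | sed 's/^[[:space:]]*//'
}

# Only query the driver when nvidia-smi is actually installed.
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi -q | license_status
fi
```

No output means the field was not found, which may simply indicate that the driver is not installed or that the field is named differently in the installed driver release.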

Install Steps for DLS Scenario

This section will cover the steps required to properly install, configure, and license the NVIDIA driver for DLS users. If you have a CLS, please refer to the Install Steps for CLS Scenario section.

Installing the NVIDIA Driver

Now that you have installed Linux, install the NVIDIA AI Enterprise Driver to fully enable GPU operation. Before proceeding with the NVIDIA Driver installation, confirm that Nouveau is disabled. Instructions to confirm this are located in the Ubuntu section for Ubuntu and in the RHEL section for RHEL.

Downloading the NVIDIA AI Enterprise Software Driver

Before you begin, you will need to download the NVIDIA Driver.

Important

Follow the driver installation based on the operating system installed in the previous steps.

Installing the NVIDIA Driver using the .run file with Ubuntu

Installation of the NVIDIA AI Enterprise software driver for Linux requires:

  • Compiler toolchain

  • Kernel headers

  1. Log in to the system and check for updates.

    $ sudo apt-get update


  2. Install the gcc compiler and the make tool in the terminal.

    $ sudo apt-get install build-essential


  3. Navigate to the directory containing the NVIDIA Driver .run file. Then, add the Executable permission to the NVIDIA Driver file using the chmod command.

    $ cd <driver_directory>
    $ sudo chmod +x NVIDIA-Linux-x86_64-xxx.xx.xx.run

    Note

    Where xxx.xx.xx is the current driver version from NGC Enterprise Catalog.


  4. From a console shell, run the driver installer as the root user, and accept defaults.

    $ sudo sh ./NVIDIA-Linux-x86_64-xxx.xx.xx.run

    Note

    Where xxx.xx.xx is the current driver version from NGC Enterprise Catalog.


  5. Reboot the system.

    $ sudo reboot


  6. After the system has rebooted, confirm that you can see your NVIDIA vGPU device in the output from nvidia-smi.

    $ nvidia-smi


After installing the NVIDIA vGPU compute driver, you can license any NVIDIA AI Enterprise Software licensed products you are using.
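
Because the driver in this scenario is downloaded manually, the chmod and installer steps above depend on running from the directory that actually holds the .run file. A small sketch for locating it follows; the function name `find_driver_runfile` is hypothetical, and the file name pattern is taken from the commands above.

```shell
#!/bin/sh
# Sketch: list NVIDIA driver .run files in a directory.
# $1 is the directory to search; it defaults to the current directory.
find_driver_runfile() {
    dir="${1:-.}"
    for f in "$dir"/NVIDIA-Linux-x86_64-*.run; do
        # When nothing matches, the pattern stays literal and is skipped here.
        [ -f "$f" ] && echo "$f"
    done
    return 0
}

find_driver_runfile
```

Empty output means no matching file is in the directory searched, so the installer command would fail there.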

Installing the NVIDIA Driver using the .run file with RHEL
Important

Before starting the driver installation, disable Secure Boot as shown in the Installing Red Hat Enterprise Linux 8.4 section.

  1. Register the machine with Red Hat using subscription-manager with the command below.

    $ subscription-manager register


  2. Enable the EPEL repository to satisfy the external dependency for DKMS.

    $ dnf install https://dl.fedoraproject.org/pub/epel/epel-release-latest-8.noarch.rpm


  3. For RHEL 8, ensure that the system has the correct Linux kernel sources from the Red Hat repositories.

    $ dnf install -y kernel-devel-$(uname -r) kernel-headers-$(uname -r)

    Note

    The NVIDIA driver requires that the kernel headers and development packages for the running version of the kernel be installed at the time of the driver installation, as well as whenever the driver is rebuilt. For example, if your system is running kernel version 4.4.0, the 4.4.0 kernel headers and development packages must also be installed.


  4. Install additional dependencies for NVIDIA drivers.

    $ dnf install elfutils-libelf-devel.x86_64
    $ dnf install -y tar bzip2 make automake gcc gcc-c++ pciutils libglvnd-devel


  5. Update the running kernel:

    $ dnf install -y kernel kernel-core kernel-modules


  6. Confirm the system has the correct Linux kernel sources from the Red Hat repositories after update.

    $ dnf install -y kernel-devel-$(uname -r) kernel-headers-$(uname -r)


  7. Navigate to the directory containing the NVIDIA Driver .run file. Then, add the Executable permission to the NVIDIA Driver file using the chmod command.

    $ cd <driver_directory>
    $ sudo chmod +x NVIDIA-Linux-x86_64-xxx.xx.xx.run

    Note

    Where xxx.xx.xx is the current driver version from NGC Enterprise Catalog.


  8. From the console shell, run the driver installer and accept defaults.

    $ sudo sh ./NVIDIA-Linux-x86_64-xxx.xx.xx.run

    Note

    Where xxx.xx.xx is the current driver version from NGC Enterprise Catalog.


  9. Reboot the system.

    $ sudo reboot


  10. After the system has rebooted, confirm that you can see your NVIDIA vGPU device in the output from nvidia-smi.

    $ nvidia-smi


After installing the NVIDIA vGPU compute driver, you can license any NVIDIA AI Enterprise Software licensed products you are using.

© Copyright 2022-2023, NVIDIA. Last updated on Sep 11, 2023.