Creating Your First NVIDIA AI Enterprise VM

Because C-Series vGPUs have large BAR (Base Address Register) memory settings, using them on VMware ESXi is subject to the following restrictions:

  • The guest OS must be a 64-bit OS.

  • 64-bit MMIO and EFI boot must be enabled for the VM.

  • The guest OS must be able to be installed in EFI boot mode.

  • The VM’s MMIO space must be increased to 128 GB, as explained in VMware Knowledge Base article 2142307, VMware vSphere VMDirectPath I/O: Requirements for Platforms and Devices.

Creating a Virtual Machine

These instructions walk through creating a VM from scratch that supports NVIDIA vGPU. The VM is later used as a gold master image. Use the following procedure to create and configure the VM:

  1. Browse to the host or cluster using the vSphere Web Client.

    ../_images/dg-first-vm-01.png
  2. Right-click the desired host or cluster and select New Virtual Machine.

    ../_images/dg-first-vm-02.png
  3. Select Create a new virtual machine and click Next.

    ../_images/dg-first-vm-03.png
  4. Enter a name for the virtual machine. Next, choose the location to host the virtual machine using the Select a location for the virtual machine section. Click Next to continue.

    ../_images/dg-first-vm-04.png
  5. Select a compute resource to run the VM. Click Next to continue.

    Note

    The compute resource must have an NVIDIA vGPU-enabled card installed and be correctly configured.

    ../_images/dg-first-vm-05.png
  6. Select the datastore to host the virtual machine. Click Next to continue.

    ../_images/dg-first-vm-06.png
  7. Select the compatibility for the virtual machine. This should reflect the ESXi version running on your NVIDIA-Certified System. Click Next to continue.

    ../_images/dg-first-vm-07.png
  8. Select the appropriate Ubuntu Linux OS from the Guest OS Family and Guest OS Version pull-down menus. Click Next to continue.

    ../_images/dg-first-vm-08.png
  9. On the Customize hardware page, set the virtual hardware based on your compute workload requirements. Click Next to continue.

    ../_images/dg-first-vm-09.png
  10. Review the New Virtual Machine configuration before completion. Click Finish when ready.

    ../_images/dg-first-vm-10.png
  11. The new virtual machine container is created.

  12. Configure the VM boot options for EFI. Right-click on the new VM and select Edit Settings.

    ../_images/dg-first-vm-11.png
  13. Click the VM Options tab, expand Boot Options, and change the Firmware from BIOS to EFI.

    ../_images/dg-first-vm-12.png
  14. Expand Advanced and select Edit Configuration.

    ../_images/dg-first-vm-35.png
  15. Enable 64-bit Memory Mapped I/O (MMIO) and set the size of the MMIO region to 128 GB for the virtual machine.

    ../_images/dg-first-vm-36.png
  16. Click OK to complete the VM configuration.
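
The firmware and MMIO settings from steps 13 to 15 are stored in the VM's .vmx file, so they can be double-checked from an ESXi shell. The following is a minimal sketch and not part of the original procedure: the function name check_vmx is hypothetical, and the key names are assumed to be the ones documented in the VMware Knowledge Base article referenced above.

```shell
# Report whether a .vmx file contains the settings described in this section.
# Assumption: the key names below match the VMware KB article; verify them there.
check_vmx() {
    vmx_file="$1"
    rc=0
    for setting in \
        'firmware = "efi"' \
        'pciPassthru.use64bitMMIO = "TRUE"' \
        'pciPassthru.64bitMMIOSizeGB = "128"'
    do
        # -F: match the text literally, -i: ignore case, -q: no output
        if grep -qiF "$setting" "$vmx_file"; then
            echo "ok:      $setting"
        else
            echo "missing: $setting"
            rc=1
        fi
    done
    return "$rc"
}
```

The function would be called with the path to the VM's .vmx file on the datastore, for example check_vmx /vmfs/volumes/DATASTORE/VM-NAME/VM-NAME.vmx, where DATASTORE and VM-NAME are placeholders for your own values.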

Installing Ubuntu Server 20.04 LTS (Focal Fossa)

NVIDIA AI Enterprise is supported on Ubuntu 20.04 LTS. Ubuntu provides two ISO types: Desktop and Live Server. The Desktop version includes a graphical user interface (GUI), while the Live Server version is operated from the command line only. This document uses the Live Server version of Ubuntu 20.04 (amd64 architecture); a GUI can be installed later if needed.

  1. Upload the ISO to the datastore used by your VM. Right-click the VM in vSphere Client and select Edit Settings. Mount the ISO to your VM by clicking Browse, and make sure Connect At Power On is checked. Click OK to finish.

    ../_images/dg-first-vm-13.png
  2. Power on the VM and wait for the installation screen to appear.

  3. Select your preferred language and press the enter key.

    ../_images/dg-first-vm-14.png
  4. Continue without updating as this guide is built around 20.04.

    ../_images/dg-first-vm-15.png
  5. Configure the keyboard layout and press the enter key.

    ../_images/dg-first-vm-16.png
  6. On this screen, select your network connection type and modify it to fit your internal requirements. This guide uses DHCP for the configuration.

    ../_images/dg-first-vm-17.png
  7. If you have a proxy address, input it in this screen and press Done.

    ../_images/dg-first-vm-18.png
  8. If you have an alternative mirror address for Ubuntu, input it here. Otherwise, if there is a default address, use it and press Done.

    ../_images/dg-first-vm-19.png
  9. Choose to use the entire disk, then select the disk to install to.

    ../_images/dg-first-vm-20.png
  10. Review the file system summary and select Done if satisfactory. Select Continue in the pop-up window.

    ../_images/dg-first-vm-21.png
  11. Configure the VM with a user account, name, and password.

    ../_images/dg-first-vm-22.png
  12. Select Install OpenSSH server and select Done.

    ../_images/dg-first-vm-23.png
  13. Select any server snaps that may be required for internal use in your environment and select Done. Wait for the system to finish installing.

    ../_images/dg-first-vm-24.png
  14. When the installation is complete, remove the ISO from the VM settings dialog in vCenter and then reboot (this may take several minutes to complete). Once the reboot has finished, log in using the credentials previously set. VMware Tools will be installed and managed by the Ubuntu server.

    ../_images/dg-first-vm-25.png
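
After the first login, the EFI and 64-bit requirements listed at the top of this section can be confirmed from inside the guest. This is a minimal sketch under the assumption that the guest is a standard Ubuntu install; the function name boot_mode is hypothetical and not part of any NVIDIA or Ubuntu tooling.

```shell
# Print "EFI" if the running system booted through EFI firmware, else "BIOS".
# Linux exposes /sys/firmware/efi only on EFI boots. The optional argument
# exists only so the check can be pointed at a different directory.
boot_mode() {
    efi_dir="${1:-/sys/firmware/efi}"
    if [ -d "$efi_dir" ]; then
        echo "EFI"
    else
        echo "BIOS"
    fi
}

boot_mode    # should report EFI on a VM configured as described above
uname -m     # reports the machine architecture; x86_64 on an amd64 install
```

If boot_mode reports BIOS, revisit the Firmware setting under Boot Options before continuing.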

Enabling the NVIDIA vGPU

Use the following procedure to enable vGPU support for your virtual machine. You must edit the virtual machine settings.

  1. Power down the virtual machine.

    ../_images/dg-first-vm-26.png
  2. Click on the VM in the Navigator window. Right-click the VM and select Edit Settings.

    ../_images/dg-first-vm-27.png
  3. Click on the New Device bar and select PCI device.

    ../_images/dg-first-vm-28.png
  4. Click on Add to continue.

    ../_images/dg-first-vm-29.png
  5. Select the desired GPU Profile underneath the New PCI device drop-down.

    ../_images/dg-first-vm-30.png

    Note

    NVIDIA AI Enterprise requires a C-series profile.

  6. Click OK and power on the VM.

Note

A single VM may have multiple GPUs (PCI devices) attached; however, this requires that each GPU be configured with the maximum memory allocation.
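
Once the VM is powered on again, the guest can be checked for the newly attached device before any driver is installed. The sketch below assumes the lspci utility (from the pciutils package) is available in the guest; the function name nvidia_pci_devices is hypothetical.

```shell
# Print the PCI devices whose description mentions NVIDIA.
# Reads lspci-style output on stdin so it can also be run on saved output.
# Always succeeds; an empty result means no NVIDIA device was listed.
nvidia_pci_devices() {
    grep -i 'nvidia' || true
}

# Only query the bus when lspci is actually installed.
if command -v lspci >/dev/null 2>&1; then
    lspci 2>/dev/null | nvidia_pci_devices
fi
```

An empty result suggests the PCI device was not added to the VM, or that the VM was not power-cycled after the change.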

Installing the NVIDIA Driver in the Ubuntu Virtual Machine

After you create a Linux VM on the hypervisor and boot the VM, install the NVIDIA vGPU software display driver in the VM to fully enable GPU operation.

Important

Before proceeding with the NVIDIA Driver installation, please confirm that Nouveau is disabled. Instructions to confirm this are located here.
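
As a quick check, the list of loaded kernel modules shows whether Nouveau is currently active. This is a minimal sketch and not a replacement for the referenced instructions; the function name nouveau_loaded is hypothetical.

```shell
# Succeed (exit status 0) when a module named nouveau appears in the first
# column of lsmod-style output read from stdin; fail otherwise.
nouveau_loaded() {
    awk '$1 == "nouveau" { found = 1 } END { exit (found ? 0 : 1) }'
}

# If lsmod is unavailable the input is empty and "not loaded" is reported.
if lsmod 2>/dev/null | nouveau_loaded; then
    echo "nouveau is loaded; disable it before installing the NVIDIA driver"
else
    echo "nouveau is not loaded"
fi
```

Only the first column is compared, so modules that merely list nouveau as a dependent do not trigger a match.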

Transferring the NVIDIA Driver to the Ubuntu VM

  • Two options are provided for driver installs in the VM:

    • NVIDIA-Linux-x86_64-470.63.01-grid.run

    • nvidia-linux-grid_470.63.01_amd64.deb

  • Download the WinSCP program for transferring files from a physical machine to a virtual machine.

  • On the virtual machine, run ifconfig in the terminal and find the inet field of the output to get the VM’s IP address. If ifconfig is not installed, ip addr reports the same information.

  • Input this IP and the VM login credentials into the WinSCP window and acknowledge any first-time popup windows.

  • Once connected and logged in, click and drag to transfer the driver file, for example NVIDIA-Linux-x86_64-470.63.01-grid.run, from the physical machine directory to the virtual machine.

  • Close the WinSCP session. The file has been successfully transferred.
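
Because the OpenSSH server was selected during the Ubuntu installation, the same transfer can also be done from a machine with a command-line scp client instead of WinSCP. The sketch below is an alternative, not the method described above; the function name transfer_driver and the DRY_RUN variable are hypothetical, and the user name and address in the usage note are placeholders.

```shell
# Copy a driver file to the home directory of a user on the VM.
#   $1  path to the driver file on the local machine
#   $2  destination in the form user@vm-ip
# With DRY_RUN=1 the command is printed instead of executed.
transfer_driver() {
    file="$1"
    target="$2"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "scp $file ${target}:"
    else
        # A bare "host:" destination means the remote user's home directory.
        scp "$file" "${target}:"
    fi
}
```

For example, transfer_driver NVIDIA-Linux-x86_64-470.63.01-grid.run user@192.0.2.10 would prompt for the VM password and copy the installer, where user and 192.0.2.10 stand in for your own login and the VM's IP address.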

Installing the NVIDIA Driver using the .run file

Installation of the NVIDIA AI Enterprise software driver for Linux requires:

  • Compiler toolchain

  • Kernel headers
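
Whether these prerequisites are already present can be checked before starting. This is a minimal sketch assuming a standard Ubuntu layout, in which the headers for a kernel are reachable through /lib/modules/RELEASE/build; the function name missing_prereqs is hypothetical.

```shell
# Print the name of each prerequisite that appears to be missing.
# The optional argument is a kernel release; it defaults to the running kernel.
missing_prereqs() {
    release="${1:-$(uname -r)}"
    for tool in gcc make; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
    # On Ubuntu this path is normally a symlink into the installed headers.
    [ -d "/lib/modules/$release/build" ] || echo "kernel headers for $release"
}

missing_prereqs
```

No output means that gcc, make, and headers for the running kernel were all found; anything listed is expected to be installed before the driver installer is run.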

  1. Log in to the VM and check for updates.

    $ sudo apt-get update
    
  2. Install the gcc compiler and the make tool in the terminal.

    $ sudo apt-get install build-essential
    
  3. Navigate to the directory containing the NVIDIA Driver .run file. Then, make the file executable using the chmod command.

    $ sudo chmod +x NVIDIA-Linux-x86_64-470.63.01-grid.run
    
  4. From a console shell, run the driver installer as the root user, and accept defaults.

    $ sudo sh ./NVIDIA-Linux-x86_64-470.63.01-grid.run
    
  5. Reboot the system.

    $ sudo reboot
    
  6. After the system has rebooted, confirm that you can see your NVIDIA vGPU device in the output from nvidia-smi.

    $ nvidia-smi
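
The check in the last step can also be scripted by comparing the reported driver version with the version in the installer file name. This is a minimal sketch: the function name driver_version_matches and the EXPECTED_VERSION variable are hypothetical, while the --query-gpu and --format options are standard nvidia-smi options.

```shell
EXPECTED_VERSION="470.63.01"   # taken from the installer file name above

# Succeed when stdin holds at least one non-empty line and the first field
# of every non-empty line equals the expected version passed as $1.
driver_version_matches() {
    expected="$1"
    awk -v want="$expected" '
        NF  { seen = 1; if ($1 != want) bad = 1 }
        END { exit ((seen && !bad) ? 0 : 1) }'
}

# Only query the driver when nvidia-smi is installed.
if command -v nvidia-smi >/dev/null 2>&1; then
    if nvidia-smi --query-gpu=driver_version --format=csv,noheader \
            | driver_version_matches "$EXPECTED_VERSION"; then
        echo "driver $EXPECTED_VERSION is loaded"
    else
        echo "unexpected or missing driver version"
    fi
fi
```

If nvidia-smi is not found at all, the script prints nothing, which usually means the installer did not complete.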
    

After installing the NVIDIA vGPU compute driver, you can license any NVIDIA AI Enterprise Software licensed products you are using.

Installing the NVIDIA Driver using the .deb file

This install option provides a Debian package instead of the .run file. With this install option, all dependencies are pulled automatically.

  1. Log in to the VM and navigate to the directory containing the NVIDIA Driver .deb package.

  2. From a console shell, run the driver installer as the root user.

    $ sudo apt-get install ./nvidia-linux-grid_470.63.01_amd64.deb
    
  3. Reboot the system.

    $ sudo reboot
    
  4. After the system has rebooted, confirm that you can see your NVIDIA vGPU device in the output from nvidia-smi.

    $ nvidia-smi
    

Licensing the Ubuntu VM

To use an NVIDIA vGPU software licensed product, each client system to which a physical or virtual GPU is assigned must be able to obtain a license from the NVIDIA License System. A client system can be a VM that is configured with NVIDIA vGPU, a VM that is configured for GPU pass-through, or a physical host to which a physical GPU is assigned in a bare-metal deployment.