Prerequisites#
Before you begin using the OpenFold3 NIM, ensure that the requirements described on this page are met.
Begin with a Linux distribution that supports NVIDIA Driver >=580.
The OpenFold3 NIM runs CUDA 13.0, which requires NVIDIA Driver >=580; see the compatibility matrix.
We have confirmed the installation and setup workflow below on systems (a) with Ubuntu 22.04 and (b) without NVSwitch. For systems with NVSwitch, Fabric Manager may be needed.
Install NVIDIA Drivers - minimum version: 580
Install Docker - minimum version: 23.0.1
Install the NVIDIA Container Toolkit - minimum version: 1.13.5
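If some of these components are already installed, you can confirm their versions before proceeding. A minimal check, assuming each tool is on your PATH:
# Report the installed NVIDIA driver version (should be 580 or higher)
nvidia-smi --query-gpu=driver_version --format=csv,noheader
# Report the installed Docker version (should be 23.0.1 or higher)
docker --version
# Report the installed NVIDIA Container Toolkit version (should be 1.13.5 or higher)
nvidia-ctk --version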
Known issues#
There are known issues with NVIDIA Driver 580.95.05 on Hopper GPUs with subrevision 3.
Installing NVIDIA Drivers#
To install NVIDIA drivers, you can either install interactively on a local machine or use the command line on a remote machine.
Option 1: Interactive Installation (Local Machine)#
Visit the NVIDIA Drivers download page and use the dropdown menu to select your GPU and operating system to download the appropriate driver.
Option 2: Command Line Installation (Remote Machine/SSH)#
If you’re working on a remote machine over SSH, you can download and install the driver using command line tools:
Determine your GPU model and OS version:
# Check GPU model
lspci | grep -i nvidia
# Example output:
# 00:1e.0 3D controller: NVIDIA Corporation GH100 [H100 PCIe] (rev a1)
# Check OS version
cat /etc/os-release
# Example output for Ubuntu:
# NAME="Ubuntu"
# VERSION="22.04.3 LTS (Jammy Jellyfish)"
# ID=ubuntu
# VERSION_ID="22.04"
Find the driver 580+ download link:
a. On your local machine (with browser), visit the NVIDIA Drivers download page
b. Select your GPU information:
Product Type: Tesla (for datacenter GPUs like H100, A100) or GeForce (for consumer GPUs)
Product Series: H100, A100, or your specific GPU series
Operating System: Linux 64-bit
Download Type: Production Branch
Language: English (US)
c. Click Search to find driver version 580 or higher
d. On the results page, right-click the Download button and select Copy Link Address
e. The link will look like:
https://us.download.nvidia.com/tesla/580.95.05/NVIDIA-Linux-x86_64-580.95.05.run
or, for repository installation:
https://us.download.nvidia.com/tesla/580.95.05/nvidia-driver-local-repo-ubuntu2404-580.95.05_1.0-1_amd64.deb
Note
For Ubuntu/Debian, use the .deb package. For RHEL/CentOS, use the .rpm package. For other distributions, use the .run installer.
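If you use the .run installer, the general pattern is to download it and run it as root. This is a minimal sketch, assuming the 580.95.05 installer filename shown above; the repository-based (.deb/.rpm) installation is the path covered by the rest of this guide:
# Download the standalone .run installer
wget https://us.download.nvidia.com/tesla/580.95.05/NVIDIA-Linux-x86_64-580.95.05.run
# Run the installer as root and follow the interactive prompts
sudo sh NVIDIA-Linux-x86_64-580.95.05.run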
Direct driver URLs for common configurations:
For Ubuntu 24.04 (Noble):
# H100/A100 driver 580.95.05
https://us.download.nvidia.com/tesla/580.95.05/nvidia-driver-local-repo-ubuntu2404-580.95.05_1.0-1_amd64.deb
For Ubuntu 22.04 (Jammy):
# H100/A100 driver 580.95.05
https://us.download.nvidia.com/tesla/580.95.05/nvidia-driver-local-repo-ubuntu2204-580.95.05_1.0-1_amd64.deb
For RHEL 8/Rocky Linux 8:
# H100/A100 driver 580.95.05
https://us.download.nvidia.com/tesla/580.95.05/nvidia-driver-local-repo-rhel8-580.95.05-1.0-1.x86_64.rpm
Important
Always check the NVIDIA Driver Downloads page for the latest driver version compatible with your GPU and OS.
Note
The following commands are for Ubuntu 24.04. If you’re using Ubuntu 22.04 or other versions, replace ubuntu2404 in the URLs and paths with your version (e.g., ubuntu2204 for Ubuntu 22.04).
Use wget to download the driver on your remote machine:
# Example for Ubuntu 24.04 with driver version 580.95.05
wget https://us.download.nvidia.com/tesla/580.95.05/nvidia-driver-local-repo-ubuntu2404-580.95.05_1.0-1_amd64.deb
Note
Replace the URL with the appropriate driver version and distribution for your system. Use the URL you copied from step 2 or select from the common configurations listed above.
Install the local repository using dpkg (for Ubuntu/Debian):
sudo dpkg -i nvidia-driver-local-repo-ubuntu2404-580.95.05_1.0-1_amd64.deb
For RHEL/CentOS/Rocky Linux:
sudo rpm -i nvidia-driver-local-repo-rhel8-580.95.05-1.0-1.x86_64.rpm
Update package lists and install the driver:
For Ubuntu/Debian:
# Copy the GPG key
sudo cp /var/nvidia-driver-local-repo-ubuntu2404-580.95.05/nvidia-driver-local-*-keyring.gpg /usr/share/keyrings/
# Update package cache
sudo apt-get update
# Install the driver
sudo apt-get install -y cuda-drivers
For RHEL/CentOS/Rocky Linux:
# Update package cache
sudo dnf clean all
sudo dnf makecache
# Install the driver
sudo dnf install -y cuda-drivers
After installation, reboot to load the new driver:
sudo reboot
After reboot, verify the driver is installed correctly:
nvidia-smi
You should see output showing your GPU(s) and driver version 580 or higher with CUDA version 13.0 or higher.
Example expected output:
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 580.95.05    Driver Version: 580.95.05    CUDA Version: 13.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA H100 PCIe    Off  | 00001E:00:00.0  Off |                    0  |
| N/A   30C    P0    68W / 350W |      0MiB / 81559MiB |      0%      Default |
|                               |                      |             Disabled |
+-------------------------------+----------------------+----------------------+
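If you prefer a scripted check over reading the table, the following sketch compares the reported driver major version against the 580 minimum:
# Read the driver version from the first GPU and compare its major version to 580
DRIVER_VERSION=$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n1)
echo "Installed driver: ${DRIVER_VERSION}"
[ "${DRIVER_VERSION%%.*}" -ge 580 ] && echo "Driver meets the 580 minimum" || echo "Driver is older than 580"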
Troubleshooting Driver Installation#
Driver version mismatch: If nvidia-smi shows an older driver version, ensure you’ve rebooted after installation.
CUDA version: The driver must support CUDA 13.0 or higher. Check the CUDA version in the nvidia-smi output.
Secure Boot: If you have Secure Boot enabled, you may need to sign the NVIDIA kernel modules or disable Secure Boot in your BIOS (a quick way to check the current state is shown below).
Previous driver versions: If you have older NVIDIA drivers installed, you may need to remove them first:
sudo apt-get remove --purge nvidia-*
sudo apt-get autoremove
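To check the Secure Boot state referenced above, you can use mokutil (available in the mokutil package on most distributions):
# Query the current Secure Boot state; output is "SecureBoot enabled" or "SecureBoot disabled"
mokutil --sb-state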
Verifying GPU Access#
Verify your container runtime supports NVIDIA GPUs by running:
docker run --rm --gpus all ubuntu nvidia-smi
Example output:
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 580.82.07    Driver Version: 580.82.07    CUDA Version: 13.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:01:00.0 Off |                  N/A |
| 41%   30C    P8     1W / 260W |   2244MiB / 11264MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
+-----------------------------------------------------------------------------+
Note
For more information on enumerating multi-GPU systems, refer to the NVIDIA Container Toolkit’s GPU Enumeration Docs
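For example, on a multi-GPU system you can expose only specific GPUs to the container. A short sketch using Docker's device syntax:
# Expose only GPU 0 to the container
docker run --rm --gpus '"device=0"' ubuntu nvidia-smi
# Expose GPUs 0 and 1 to the container
docker run --rm --gpus '"device=0,1"' ubuntu nvidia-smi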
NGC (NVIDIA GPU Cloud) Account#
Log in to Docker with your NGC API key using:
docker login nvcr.io --username='$oauthtoken' --password=${NGC_API_KEY}
NGC CLI Tool#
Download the NGC CLI tool for your OS.
Important
Use NGC CLI version 3.41.1 or newer. Here is the command to install this on AMD64 Linux in your home directory:
wget --content-disposition https://api.ngc.nvidia.com/v2/resources/nvidia/ngc-apps/ngc_cli/versions/3.41.3/files/ngccli_linux.zip -O ~/ngccli_linux.zip && \
unzip ~/ngccli_linux.zip -d ~/ngc && \
chmod u+x ~/ngc/ngc-cli/ngc && \
echo "export PATH=\"\$PATH:~/ngc/ngc-cli\"" >> ~/.bash_profile && source ~/.bash_profile
Set up your NGC CLI Tool locally (You’ll need your API key for this!):
ngc config set
Note
After you enter your API key, you may see multiple options for the org and team. Select as desired or hit enter to accept the default.
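To confirm the CLI is installed and configured, you can check its version and the configuration currently in effect:
# Confirm the NGC CLI version (should be 3.41.1 or newer)
ngc --version
# Show the org, team, and other settings currently configured
ngc config current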
Log in to NGC
You'll need to log in to NGC via Docker and set the NGC_API_KEY environment variable to pull images:
docker login nvcr.io
Username: $oauthtoken
Password: <Enter your NGC key here>
Then, set the relevant environment variables in your shell. You will need to set the NGC_API_KEY variable:
export NGC_API_KEY=<Enter your NGC key here>
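The export above applies only to the current shell session. Optionally, you can persist the variable in your shell profile; this is a sketch assuming a bash shell (replace the placeholder with your key):
# Persist the NGC API key for future sessions
echo "export NGC_API_KEY=<Enter your NGC key here>" >> ~/.bashrc
source ~/.bashrc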
Set up your NIM cache
The NIM needs a directory on your system called the NIM cache, where it can
(a) download the model artifact (checkpoints and TRT engines)
(b) read the model artifact if it has been previously downloaded
The NIM cache directory must
(i) reside on a disk with at least 15GB of storage
(ii) have permissions that allow the NIM to read, write, and execute
The NIM cache directory can be set up as follows, if your home directory ‘~’ is on a disk with enough storage:
## Create the NIM cache directory in a location with sufficient storage
mkdir -p ~/.cache/nim
## Set the NIM cache directory permissions to allow all users (a) to read, write, and execute (rwx)
sudo chmod -R a+rwx ~/.cache/nim
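You can then confirm that the directory exists with the expected permissions and that the disk it resides on has at least 15GB free:
# Check free space on the disk that holds the NIM cache (needs at least 15GB)
df -h ~/.cache/nim
# Confirm the directory permissions (should show drwxrwxrwx)
ls -ld ~/.cache/nim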
Now, you should be able to pull the container and download the model using the environment variables. To get started, see the quickstart guide.