Prerequisites

Before you begin using the OpenFold3 NIM, ensure that the requirements described on this page are met.

We have confirmed the installation and setup workflow below on:

  • (a) systems with Ubuntu 22.04 / 24.04 and amd64 (x86_64) architectures

  • (b) systems with Ubuntu 24.04 and arm64 (aarch64) architectures

  • (c) systems without NVSwitch. For systems with NVSwitch, fabricmanager may be needed.

Known issues

NGC (NVIDIA GPU Cloud) Account

  1. Create an account on NGC

  2. Generate an API Key

  3. Log in to the NVIDIA Container Registry, using your NGC API key as the password

    • NVIDIA docker images will be used to verify the NVIDIA Driver, CUDA, Docker, and NVIDIA Container Toolkit stack

    docker login nvcr.io --username='$oauthtoken'
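
For scripted setups, the login step can also be run without an interactive password prompt. The following is a minimal sketch, assuming the API key has been exported in an environment variable named NGC_API_KEY (a name chosen here for illustration) and using Docker's --password-stdin option. The function name ngc_docker_login is hypothetical.

```shell
# Sketch: non-interactive login to nvcr.io.
# Assumption: the NGC API key is stored in the environment variable NGC_API_KEY.
ngc_docker_login() {
    if [ -z "${NGC_API_KEY:-}" ]; then
        echo "NGC_API_KEY is not set" >&2
        return 1
    fi
    # The username is the literal string $oauthtoken; single quotes keep the
    # shell from expanding it as a variable.
    printf '%s' "$NGC_API_KEY" | docker login nvcr.io --username '$oauthtoken' --password-stdin
}
```

Call ngc_docker_login after exporting the key. Reading the key from standard input keeps it out of the shell history and the process list.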
    

NGC CLI Tool

  1. Download the NGC CLI tool for your OS.

    Important

    Use NGC CLI version 3.41.1 or newer. The following command installs version 3.41.3 on AMD64 Linux in your home directory:

    wget --content-disposition https://api.ngc.nvidia.com/v2/resources/nvidia/ngc-apps/ngc_cli/versions/3.41.3/files/ngccli_linux.zip -O ~/ngccli_linux.zip && \
    unzip ~/ngccli_linux.zip -d ~/ngc && \
    chmod u+x ~/ngc/ngc-cli/ngc && \
    echo "export PATH=\"\$PATH:~/ngc/ngc-cli\"" >> ~/.bash_profile && source ~/.bash_profile
    
  2. Set up your NGC CLI Tool locally (You’ll need your API key for this!):

    ngc config set
    

    Note

    After you enter your API key, you may see multiple options for the org and team. Select the ones you want, or press Enter to accept the default.
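
To confirm that the installed CLI meets the minimum version stated above, a comparison along the following lines may help. This is a sketch: it assumes that `ngc --version` prints a line containing a dotted version number and that GNU sort (with -V) is available, as it is on Ubuntu. The function names version_ge and check_ngc_cli_version are illustrative.

```shell
# Return 0 if version $1 is greater than or equal to version $2.
# Relies on GNU sort's -V (version sort) option.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Compare the installed NGC CLI against a minimum version (default 3.41.1).
# Assumption: `ngc --version` output contains a dotted version number.
check_ngc_cli_version() {
    local min="${1:-3.41.1}" installed
    installed=$(ngc --version 2>/dev/null | grep -oE '[0-9]+(\.[0-9]+)+' | head -n 1)
    if [ -z "$installed" ]; then
        echo "could not determine the NGC CLI version" >&2
        return 1
    fi
    version_ge "$installed" "$min"
}
```

Running check_ngc_cli_version returns a zero exit status when the installed version is 3.41.1 or newer.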

Set up your NIM cache

The NIM needs a directory on your system called the NIM cache, where it can:

  • Download the model artifact (checkpoints and TRT engines)

  • Read the model artifact if it has been previously downloaded

The NIM cache directory must:

  • Reside on a disk with at least 15 GB of storage

  • Have a permission state that allows the NIM to read, write, and execute

If your home directory (~) is on a disk with enough storage, you can set up the NIM cache directory as follows.

## Create the NIM cache directory in a location with sufficient storage
mkdir -p ~/.cache/nim

## Set the NIM cache directory permissions to allow all (a) users to read, write, and execute (rwx)
sudo chmod -R a+rwx ~/.cache/nim
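
The two requirements above (free space and permissions) can be checked before pulling the container. The following is a sketch under stated assumptions: check_nim_cache is a hypothetical helper, the 15 GB threshold is treated as 15 × 1024 × 1024 KiB, and the permission test only covers the user running the script, not the user inside the NIM container.

```shell
# Check that a directory exists, is readable, writable, and executable by the
# current user, and sits on a filesystem with enough free space.
# Usage: check_nim_cache DIR [MIN_GB]   (MIN_GB defaults to 15)
check_nim_cache() {
    local dir="$1" min_gb="${2:-15}" avail_kb
    [ -d "$dir" ] || { echo "missing directory: $dir" >&2; return 1; }
    [ -r "$dir" ] && [ -w "$dir" ] && [ -x "$dir" ] || {
        echo "insufficient permissions on: $dir" >&2
        return 1
    }
    # df -Pk reports available space in 1024-byte blocks in column 4, line 2.
    avail_kb=$(df -Pk "$dir" | awk 'NR==2 {print $4}')
    [ "$avail_kb" -ge $((min_gb * 1024 * 1024)) ] || {
        echo "less than ${min_gb} GB free on the filesystem holding: $dir" >&2
        return 1
    }
}
```

For example, `check_nim_cache ~/.cache/nim` exits with status 0 only when all three conditions hold.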

You should now be able to pull the NIM container; refer to Getting Started. You won’t be able to run the NIM until you have installed the NVIDIA Driver, CUDA, Docker, and the NVIDIA Container Toolkit.

Installing the NVIDIA Driver, CUDA, Docker, and NVIDIA Container Toolkit Stack

Collect system information

To collect the operating system and hardware information needed to select a driver installation method, follow the instructions below.

  1. Determine the OS version on your system:

    # Check OS version
    cat /etc/os-release
    # Example output for Ubuntu:
    # NAME="Ubuntu"
    # VERSION="24.04.3 LTS (Noble Numbat)"
    # ID=ubuntu
    # VERSION_ID="24.04"
    
    # Set OS version as environment variable for use in subsequent commands
    export OS_VERSION=$( . /etc/os-release && echo "$VERSION_ID" | tr -d '.' )
    echo "OS Version: $OS_VERSION"
    # Example output for Ubuntu 24.04:
    # OS Version: 2404
    
  2. Determine the GPU model on your system:

    # Check GPU model
    nvidia-smi | grep -i "NVIDIA" | awk '{print $3, $4}'
    # Example output:
    # 580.105.08 Driver
    # H100 PCIe
    

    If you see a message like Command 'nvidia-smi' not found, try to determine the GPU model with the command below. This command may be less informative for newer devices.

    # Check GPU model
    lspci | grep -i "3D controller"
    # Example output:
    # 01:00.0 3D controller: NVIDIA Corporation GH100 [H100 SXM5 80GB] (rev a1)
    
  3. Determine the CPU architecture on your system:

    # Set CPU arch as environment variable, on Ubuntu/Debian system
    export CPU_ARCH=$(dpkg --print-architecture)
    echo "CPU_ARCH: ${CPU_ARCH}"
    # Example output:
    # amd64
    
    # Set CPU arch as environment variable, on a non-Ubuntu/Debian system
    export CPU_ARCH=$(uname -m)
    echo "CPU_ARCH: ${CPU_ARCH}"
    # Example output:
    # x86_64
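
As an alternative to parsing the nvidia-smi table in step 2, the tool's query interface prints one product name per GPU. The sketch below wraps both approaches from step 2 in a single hypothetical function, detect_gpu, which falls back to lspci when nvidia-smi is not installed.

```shell
# Print the GPU model(s) on this system.
# Prefers nvidia-smi's query interface (one product name per GPU);
# falls back to the lspci command from step 2 if nvidia-smi is unavailable.
detect_gpu() {
    if command -v nvidia-smi >/dev/null 2>&1; then
        nvidia-smi --query-gpu=name --format=csv,noheader
    else
        lspci | grep -i "3D controller"
    fi
}
```

Calling detect_gpu prints the model names, which can then be used when selecting a driver installation method.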
    

Installation for amd64 systems

If the CPU architecture identified in the previous step is amd64 or x86_64, complete the steps at NVIDIA Driver, CUDA, Docker, and NVIDIA Container Toolkit Installation for amd64 Systems.

Installation for arm64 DGX systems

If the CPU architecture identified in the previous step is arm64 or aarch64 (for example, a GB200 system), complete the steps at NVIDIA Driver, CUDA, Docker, and NVIDIA Container Toolkit Installation for arm64 / aarch64 DGX Systems. Currently, the OpenFold3 NIM is supported only on arm64 devices that require the DGX software stack.
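
The choice between the two installation sections, together with the validated platforms listed at the top of this page, can be sketched as follows. The function names and the labels amd64 and arm64-dgx are illustrative only, and the check assumes an Ubuntu system whose version is given in the OS_VERSION format used earlier (for example, 2404).

```shell
# Map a CPU architecture string to the matching installation section.
# The printed labels are illustrative, not official names.
install_track() {
    case "$1" in
        amd64|x86_64)  echo "amd64" ;;
        arm64|aarch64) echo "arm64-dgx" ;;
        *)             echo "unsupported"; return 1 ;;
    esac
}

# Return 0 if the Ubuntu version / architecture pair is one of the
# combinations this page lists as validated.
# Usage: is_validated_platform OS_VERSION CPU_ARCH   (e.g. 2404 amd64)
is_validated_platform() {
    case "$(install_track "$2"):$1" in
        amd64:2204|amd64:2404) return 0 ;;
        arm64-dgx:2404)        return 0 ;;
        *)                     return 1 ;;
    esac
}
```

For example, `install_track "$CPU_ARCH"` prints the section to follow, and `is_validated_platform "$OS_VERSION" "$CPU_ARCH"` reports whether the combination matches one that was confirmed to work.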

Troubleshooting Driver Installation

Driver version mismatch: If nvidia-smi shows an older driver version, ensure you’ve rebooted after installation.

CUDA version mismatch: The driver must support CUDA 13.0 or higher. Check the CUDA version in the nvidia-smi output, which reports the highest CUDA version that the installed driver supports. If your system shows CUDA 12.x, you need to install CUDA Toolkit 13.0 or higher.
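
The CUDA version check can be scripted. The sketch below assumes the nvidia-smi header contains text of the form "CUDA Version: 13.0" and that GNU sort (with -V) is available. The function names parse_cuda_version and driver_cuda_ok are hypothetical.

```shell
# Extract the X.Y value that follows "CUDA Version:" from text on stdin.
parse_cuda_version() {
    sed -n 's/.*CUDA Version: *\([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p' | head -n 1
}

# Read nvidia-smi output on stdin and return 0 if the reported CUDA version
# is at least the minimum given as $1 (default 13.0).
driver_cuda_ok() {
    local min="${1:-13.0}" found
    found=$(parse_cuda_version)
    [ -n "$found" ] || return 1
    # With version sort, the minimum sorts first only if found >= min.
    [ "$(printf '%s\n%s\n' "$found" "$min" | sort -V | head -n 1)" = "$min" ]
}
```

A typical call is `nvidia-smi | driver_cuda_ok 13.0`, which exits with a non-zero status when the driver reports CUDA 12.x.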

To install or update CUDA Toolkit:

  1. Follow the download instructions as described in NVIDIA CUDA Downloads

  2. Select your operating system and architecture

  3. Follow the installation instructions for CUDA Toolkit 13.0 or later

Secure Boot: If you have Secure Boot enabled, you may need to sign the NVIDIA kernel modules or disable Secure Boot in your BIOS.
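
To find out whether Secure Boot is enabled before installing the driver, the mokutil utility can be queried. This is a sketch that assumes mokutil is installed (it is packaged for Ubuntu) and that `mokutil --sb-state` prints "SecureBoot enabled" or "SecureBoot disabled". The function name secure_boot_state is illustrative.

```shell
# Print "enabled", "disabled", or "unknown" depending on the Secure Boot state.
secure_boot_state() {
    if ! command -v mokutil >/dev/null 2>&1; then
        echo "unknown"
        return 0
    fi
    case "$(mokutil --sb-state 2>/dev/null)" in
        *"SecureBoot enabled"*)  echo "enabled" ;;
        *"SecureBoot disabled"*) echo "disabled" ;;
        *)                       echo "unknown" ;;
    esac
}
```

If secure_boot_state prints enabled, plan for signing the NVIDIA kernel modules or disabling Secure Boot as described above.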

Previous driver versions: If you have older NVIDIA drivers installed, you may need to remove them first:

sudo apt-get remove --purge 'nvidia-*'
sudo apt-get autoremove
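
Because purging removes every matching package, it can be useful to preview the result first. The following sketch uses apt-get's --simulate option, which prints the actions without performing them and does not require root. The function name preview_nvidia_purge is hypothetical.

```shell
# Show what the purge would remove, without changing the system.
preview_nvidia_purge() {
    # The pattern is quoted so that apt-get, not the shell, expands it.
    apt-get --simulate remove --purge 'nvidia-*'
}
```

Review the list printed by preview_nvidia_purge, then run the removal commands above once it matches your expectations.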