ESMFold NIM

Important

NVIDIA NIM is currently in limited availability. Sign up here to be notified when the latest NIMs are available for download.

ESMFold is a protein structure prediction deep learning model used for drug discovery. ESMFold uses Meta AI’s ESM2 protein language model and can estimate the 3D structure of a protein based on a single amino acid sequence, without requiring examples of several similar sequences.

The ESMFold model provides:

  • Ultrafast, 3D protein structure prediction based on ESM-2 embeddings without multiple sequence alignment (MSA).

  • Significantly faster inference times for protein structure prediction, while remaining nearly as accurate as alignment-based methods.

  • Learns protein sequences through unsupervised pretraining and can predict the structure of a single protein sequence without requiring many homologous sequences as input.

[Figure: Example output.]

Note

A more detailed description of the model can be found in the Model Card.

More information can be found at NGC Collections.

Model Specific Requirements

The following are specific requirements for ESMFold NIM.

Important

If you have not done so already, please refer to the NVIDIA NIM documentation for the necessary hardware, operating system, and software prerequisites.

Hardware

  • A100 80 GB

  • A6000 48 GB

Note

Expected to be broadly compatible with all NVIDIA GPUs with at least 24 GB of memory.

Software

  • Minimum Driver version: 535.104.05

Once the above requirements have been met, you will use the Quickstart Guide to pull the NIM container and model, perform a health check and then run inference.

Quickstart Guide

The following Quickstart Guide is provided to get the ESMFold NIM up and running quickly. You will download the ESMFold model from NGC and then the model weights from Facebook AI Research (FAIR).

Refer to the Detailed Instructions section for additional information if needed.

Note

This page assumes Prerequisite Software (Docker, NGC CLI, NGC registry access) is installed and set up.

ESMFold requires the download of a Triton model repository from NGC followed by the model weights from FAIR.

Warning

DISCLAIMER: Each user is responsible for checking the content of models and the applicable licenses and determining if suitable for the intended use.

  1. Pull the ESMFold NIM container.

    docker pull nvcr.io/nvidia/nim/bionemo_esmfold_nim:24.03.01
    
  2. Pull the ESMFold model from NGC.

    mkdir -p ~/esmfold-nim/{weights,models}
    ngc registry model download-version nvidia/nim/bionemo-esmfold:protein-folding_noarchx1_bf16_24.03 --dest ~/esmfold-nim/models/
    
  3. Download the three model weight files.

    1. Download the folding module weights.

      curl -L https://dl.fbaipublicfiles.com/fair-esm/models/esmfold_3B_v1.pt --output ~/esmfold-nim/weights/esmfold_3B_v1.pt
      
    2. Download the ESM2 3B language model weights.

      curl -L https://dl.fbaipublicfiles.com/fair-esm/models/esm2_t36_3B_UR50D.pt --output ~/esmfold-nim/weights/esm2_t36_3B_UR50D.pt
      
    3. Download the contact regression model weights.

      curl -L https://dl.fbaipublicfiles.com/fair-esm/regression/esm2_t36_3B_UR50D-contact-regression.pt --output ~/esmfold-nim/weights/esm2_t36_3B_UR50D-contact-regression.pt
      
  4. Run the NIM container.

    CUDA_VISIBLE_DEVICES=0 && docker run --rm -it --runtime=nvidia \
    -e CUDA_VISIBLE_DEVICES=${CUDA_VISIBLE_DEVICES} \
    --shm-size=2G --name bionemo-esmfold \
    --ulimit memlock=-1 --ulimit stack=67108864 \
    -v ~/esmfold-nim/models/bionemo-esmfold_vprotein-folding_noarchx1_bf16_24.03/:/config/models \
    -v ~/esmfold-nim/weights/:/esm_models \
    -e MODEL_PATH=/config/models \
    -p 8008:8008 \
    nvcr.io/nvidia/nim/bionemo_esmfold_nim:24.03.01
    
  5. Wait until the HTTP health check returns true before proceeding. This may take a couple of minutes.

    curl localhost:8008/health/ready
    ...
    true
    
  6. Request inference from the local NIM instance. (A Python alternative is sketched just after this list.)

    curl -X 'POST' \
       'http://localhost:8008/protein-structure/esmfold/predict' \
       -H 'accept: application/json' \
       -H 'Content-Type: application/json' \
       -d '{
       "sequence": "MAGEGDQQDAAHNMGNHLPLLPAESEEEDEMEVEDQDSKEAKKPNIINFDTSLPTSHTYLGADMEEFHGRTLHDDDSCQVIPVLPQVMMILIPGQTLPLQLFHPQEVSMVRNLIQKDRTFAVLAYSNVQEREAQFGTTAEIYAYREEQDFGIEIVKVKAIGRQRFKVLELRTQSDGIQQAKVQILPECVLPSTMSAVQLESLNKCQIFPSKPVSREDQCSYKWWQKYQKRKFHCANLTSWPRWLYSLYDAETLMDRIKKQLREWDENLKDDSLPSNPIDFSYRVAACLPIDDVLRIQLLKIGSAIQRLRCELDIMNKCTSLCCKQCQETEITTKNEIFSLSLCGPMAAYVNPHGYVHETLTVYKACNLNLIGRPSTEHSWFPGYAWTVAQCKICASHIGWKFTATKKDMSPQKFWGLTRSALLPTIPDTEDEISPDKVILCL"
       }'
    
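If you prefer to call the endpoint from Python instead of curl, the following minimal sketch sends the same request. It assumes the requests package is installed; the endpoint URL and payload are taken from the curl example in step 6, and the truncated sequence is a placeholder for the full amino acid sequence shown there.

    import requests  # assumption: the requests package is installed

    # Same endpoint and payload as the curl example in step 6.
    url = "http://localhost:8008/protein-structure/esmfold/predict"
    payload = {"sequence": "MAGEGDQQD..."}  # placeholder: substitute the full amino acid sequence

    response = requests.post(url, json=payload, timeout=600)
    response.raise_for_status()

    # Save the raw JSON response for later inspection.
    with open("output.json", "w") as f:
        f.write(response.text)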

Available Models

Version                                          Contents                  Artifact Size   Precision   Memory Footprint
nvidia/nim/protein-folding_noarchx1_bf16_24.03   Triton Model Repository   1.06 MB         BF16        11 GB
esmfold_3B_v1.pt                                 Model Weights             2.6 GB          FP32        N/A
esm2_t36_3B_UR50D.pt                             Model Weights             5.3 GB          FP32        N/A
esm2_t36_3B_UR50D-contact-regression.pt          Model Weights             6.7 KB          FP32        N/A

Important

If BF16 is not supported on the GPU, the model will load in FP32 precision.

Detailed Instructions

This section provides additional details outside of the scope of the Quickstart Guide.

First, we will define bash variables that we will reuse throughout these instructions:

MODEL_NAME="bionemo-esmfold"
MODEL_DIRECTORY=~/esmfold-nim/models
WEIGHTS_DIRECTORY=~/esmfold-nim/weights
mkdir -p ${MODEL_DIRECTORY}
mkdir -p ${WEIGHTS_DIRECTORY}

Pull Container Image

  1. Container image tags follow the YY.MM versioning used by other container images on NGC. You may see different values under “Tags:”; these docs were written against the latest tag available at the time.

    ngc registry image info nvcr.io/nvidia/nim/bionemo_esmfold_nim
    
    Image Repository Information
       Name: bionemo_esmfold_nim
       Display Name: NVIDIA NIM for ESMFold
       Short Description: ESMFold predicts the 3D structure of a protein from its amino acid sequence.
       Built By: NVIDIA
       Publisher: META
       Multinode Support: False
       Multi-Arch Support: False
       Logo: https://assets.ngc.nvidia.com/products/catalog/images/esmfold.jpg
       Labels: NVIDIA NIM, Drug Discovery, Life Sciences, NVIDIA AI Enterprise Supported, Triton Inference Server
       Public: No
       Last Updated: May 02, 2024
       Latest Image Size: 8.8 GB
       Signed Tag?: False
       Latest Tag: 24.03.01
       Tags:
           24.03.01
    
  2. Pull the container image using either docker or the NGC CLI.

    docker pull nvcr.io/nvidia/nim/bionemo_esmfold_nim:24.03.01
    
    ngc registry image pull nvcr.io/nvidia/nim/bionemo_esmfold_nim:24.03.01
    

Pull the ESMFold Model

  1. Model tags follow the versioning of repository:version. The model is called bionemo-esmfold, and the version follows the naming pattern protein-folding_noarchx1_bf16_YY.MM.x. Additional versions are available and can be listed by running the following NGC CLI command:

    ngc registry model list nvidia/nim/bionemo-esmfold:*
    
  2. Download the model.

    ngc registry model download-version nvidia/nim/bionemo-esmfold:protein-folding_noarchx1_bf16_24.03 --dest ${MODEL_DIRECTORY}
    
  3. Download the model weights. There are three model checkpoint downloads.

    1. Download the folding module.

      curl -L https://dl.fbaipublicfiles.com/fair-esm/models/esmfold_3B_v1.pt --output ${WEIGHTS_DIRECTORY}/esmfold_3B_v1.pt
      
    2. Download the ESM2 3B language model.

      curl -L https://dl.fbaipublicfiles.com/fair-esm/models/esm2_t36_3B_UR50D.pt --output ${WEIGHTS_DIRECTORY}/esm2_t36_3B_UR50D.pt
      
    3. Download the contact regression model.

      curl -L https://dl.fbaipublicfiles.com/fair-esm/regression/esm2_t36_3B_UR50D-contact-regression.pt --output ${WEIGHTS_DIRECTORY}/esm2_t36_3B_UR50D-contact-regression.pt
      

    Warning

    DISCLAIMER: Each user is responsible for checking the content of models and the applicable licenses and determining if suitable for the intended use.

Launch the Microservice

Launch the ESMFold container. Start-up may take a couple of minutes until the service is available.

Note

In this example, we’re hosting the inference endpoint and health check on port 8008. After you start the Docker command below, you may open another terminal session on the same host and proceed to the next step.

# If you have multiple GPUs, you can use `CUDA_VISIBLE_DEVICES` to select the GPU indices in a comma-separated list, e.g., to use 4 GPUs set CUDA_VISIBLE_DEVICES=0,1,2,3
CUDA_VISIBLE_DEVICES=0 && docker run --rm -it --runtime=nvidia \
     --name ${MODEL_NAME} \
     -e CUDA_VISIBLE_DEVICES=${CUDA_VISIBLE_DEVICES} \
     --shm-size=2G \
     --ulimit memlock=-1 --ulimit stack=67108864 \
     -v ${MODEL_DIRECTORY}/bionemo-esmfold_vprotein-folding_noarchx1_bf16_24.03:/config/models \
     -v ${WEIGHTS_DIRECTORY}:/esm_models \
     -e MODEL_PATH=/config/models \
     -p 8008:8008 \
     nvcr.io/nvidia/nim/bionemo_esmfold_nim:24.03.01

Health and Liveness Checks

The container exposes a health endpoint for integration into existing systems such as Kubernetes. This endpoint only returns an HTTP 200 OK status code if the service is ready. Run this in a new terminal. Remember, it may take a few minutes to load the models to the GPU and initialize the service completely.

curl localhost:8008/health/ready
...
true
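
If you want a launch script to block until the service is ready, the following minimal Python sketch polls the same readiness endpoint. It assumes the requests package is installed; the 15-minute deadline is an arbitrary choice for illustration, not a documented limit.

import time

import requests  # assumption: the requests package is installed

# Poll the readiness endpoint until the service reports ready or the deadline passes.
url = "http://localhost:8008/health/ready"
deadline = time.time() + 15 * 60  # generous allowance for model loading (arbitrary)

while time.time() < deadline:
    try:
        if requests.get(url, timeout=5).status_code == 200:
            print("ESMFold NIM is ready.")
            break
    except requests.ConnectionError:
        pass  # the container may still be starting up
    time.sleep(10)
else:
    raise RuntimeError("Timed out waiting for the ESMFold NIM readiness check.")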

Run Inference

Execute the ESMFold protein structure prediction inference request using curl.

curl -X 'POST' \
'http://localhost:8008/protein-structure/esmfold/predict' \
-H 'accept: application/json' \
-H 'Content-Type: application/json' \
-d '{
"sequence": "MAGEGDQQDAAHNMGNHLPLLPAESEEEDEMEVEDQDSKEAKKPNIINFDTSLPTSHTYLGADMEEFHGRTLHDDDSCQVIPVLPQVMMILIPGQTLPLQLFHPQEVSMVRNLIQKDRTFAVLAYSNVQEREAQFGTTAEIYAYREEQDFGIEIVKVKAIGRQRFKVLELRTQSDGIQQAKVQILPECVLPSTMSAVQLESLNKCQIFPSKPVSREDQCSYKWWQKYQKRKFHCANLTSWPRWLYSLYDAETLMDRIKKQLREWDENLKDDSLPSNPIDFSYRVAACLPIDDVLRIQLLKIGSAIQRLRCELDIMNKCTSLCCKQCQETEITTKNEIFSLSLCGPMAAYVNPHGYVHETLTVYKACNLNLIGRPSTEHSWFPGYAWTVAQCKICASHIGWKFTATKKDMSPQKFWGLTRSALLPTIPDTEDEISPDKVILCL"
}' > output.json

(Optional) The file output.json contains JSON-formatted content with the predicted protein structure, which is not directly human-readable. Use the following command to produce a human-readable version:

$ cat output.json | sed 's/\\n/\n/g' | head -n 45
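
The same unescaping can also be done with a short Python sketch, shown below. It does not assume a particular response schema; it simply replaces the escaped newline sequences in the raw file and writes the result to a separate text file.

# Mirror the sed command above in Python.
with open("output.json") as f:
    raw = f.read()

# Replace escaped newline sequences with real newlines for readability.
readable = raw.replace("\\n", "\n")

with open("output_readable.txt", "w") as f:
    f.write(readable)

# Preview the first 45 lines, like `head -n 45` above.
print("\n".join(readable.splitlines()[:45]))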

Stopping the Container

When you’re done testing the endpoint, you can bring down the container by running docker stop ${MODEL_NAME} in a new terminal.