NVIDIA NIM is currently in limited availability. Sign up to be notified when the latest NIMs are available to download.

ESMFold is a protein structure prediction deep learning model used for drug discovery. ESMFold uses Meta AI’s ESM2 protein language model and can estimate the 3D structure of a protein based on a single amino acid sequence, without requiring examples of several similar sequences.

The ESMFold model provides:

  • Ultrafast, 3D protein structure prediction based on ESM-2 embeddings without multiple sequence alignment (MSA).

  • Significantly faster inference than alignment-based methods for protein structure prediction, with nearly the same accuracy.

  • Learns protein sequences through unsupervised pretraining and can predict the structure of a single protein sequence without requiring many homologous sequences as input.


Example output.


A more detailed description of the model can be found in the Model Card.

More information can be found at NGC Collections.

Model Specific Requirements

The following are specific requirements for ESMFold NIM.


If you have not already done so, please refer to the NVIDIA NIM documentation for the necessary hardware, operating system, and software prerequisites.


  • A100 80 GB

  • A6000 48 GB


Expected to be broadly compatible with all NVIDIA GPUs with at least 24 GB of memory.


  • Minimum Driver version: 535.104.05
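As a quick sanity check against the driver requirement above, the following sketch compares an installed driver version with the minimum. The helper name `version_ge` is hypothetical and not part of any NIM tooling, it assumes GNU `sort -V` is available, and the `nvidia-smi` query only runs when that tool is on the PATH.

```shell
# Hypothetical helper: succeeds if version $1 is greater than or equal to version $2.
# Assumes GNU coreutils `sort -V` for version-aware ordering.
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

MIN_DRIVER="535.104.05"

# Only query the driver when nvidia-smi is installed on this host.
if command -v nvidia-smi >/dev/null 2>&1; then
  driver="$(nvidia-smi --query-gpu=driver_version --format=csv,noheader 2>/dev/null | head -n 1)" || true
  if [ -n "$driver" ] && version_ge "$driver" "$MIN_DRIVER"; then
    echo "Driver ${driver} meets the minimum (${MIN_DRIVER})."
  else
    echo "Driver '${driver}' is missing or older than the minimum (${MIN_DRIVER})." >&2
  fi
fi
```

The comparison sorts the two versions and checks that the minimum comes first, so dotted versions with different digit counts are ordered numerically rather than as plain strings.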

Once the above requirements have been met, you will use the Quickstart Guide to pull the NIM container and model, perform a health check and then run inference.

Quickstart Guide

The following Quickstart Guide gets the ESMFold NIM up and running quickly. You will download the ESMFold model from NGC and then the model weights from Facebook AI Research (FAIR).

Please refer to the detailed instruction section for additional information if needed.


This page assumes Prerequisite Software (Docker, NGC CLI, NGC registry access) is installed and set up.

ESMFold requires the download of a Triton model repository from NGC followed by the model weights from FAIR.


DISCLAIMER: Each user is responsible for checking the content of models and the applicable licenses and determining if suitable for the intended use.

  1. Pull the ESMFold NIM container.

    docker pull
  2. Pull the ESMFold model from NGC.

    mkdir -p ~/esmfold-nim/{weights,models}
    ngc registry model download-version nvidia/nim/bionemo-esmfold:protein-folding_noarchx1_bf16_24.03 --dest ~/esmfold-nim/models/
  3. Download the three model weights files.

    1. Download the folding module weights.

      curl -L --output ~/esmfold-nim/weights/
    2. Download the ESM2 3B language model weights.

      curl -L --output ~/esmfold-nim/weights/
    3. Download the contact regression model weights.

      curl -L --output ~/esmfold-nim/weights/
  4. Run the NIM container.

    CUDA_VISIBLE_DEVICES=0 && docker run --rm -it --runtime=nvidia \
    --shm-size=2G --name bionemo-esmfold \
    --ulimit memlock=-1 --ulimit stack=67108864 \
    -v ~/esmfold-nim/models/bionemo-esmfold_vprotein-folding_noarchx1_bf16_24.03/:/config/models \
    -v ~/esmfold-nim/weights/:/esm_models \
    -e MODEL_PATH=/config/models \
    -p 8008:8008 \
  5. Wait until the HTTP health check returns true before proceeding. This may take a couple of minutes.

    curl localhost:8008/health/ready
  6. Request inference from the local NIM instance.

    curl -X 'POST' \
       'http://localhost:8008/protein-structure/esmfold/predict' \
       -H 'accept: application/json' \
       -H 'Content-Type: application/json' \
       -d '{
       }'

Available Models

    Model                      Artifact Size    Memory Footprint
    Triton Model Repository    1.06 MB          11 GB
    Model Weights              2.6 GB
    Model Weights              5.3 GB
    Model Weights              6.7 KB


If BF16 is not supported on the GPU, the model will load in FP32 precision.

Detailed Instructions

This section provides additional details outside of the scope of the Quickstart Guide.

First, we will define bash variables that we will reuse throughout these instructions:

    mkdir -p ${MODEL_DIRECTORY}
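The variable definitions themselves do not appear above. A minimal sketch follows, assuming values that mirror the paths and container name used in the Quickstart Guide; these values are assumptions, so adjust them to your environment.

```shell
# Assumed values that mirror the Quickstart Guide; change them to suit your setup.
export MODEL_NAME="bionemo-esmfold"                      # container name passed to docker run --name
export MODEL_DIRECTORY="${HOME}/esmfold-nim/models"      # destination for the Triton model repository from NGC
export WEIGHTS_DIRECTORY="${HOME}/esmfold-nim/weights"   # destination for the model weights from FAIR

# Create both directories up front so the later download steps have a destination.
mkdir -p "${MODEL_DIRECTORY}" "${WEIGHTS_DIRECTORY}"
```

Exporting the variables keeps them available to any scripts started from the same shell session.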

Pull Container Image

  1. Container image tags follow the versioning of YY.MM, similar to other container images on NGC. You may see different values under “Tags:”. These docs were written based on the latest available at the time.

    ngc registry image info
    Image Repository Information
       Name: bionemo_esmfold_nim
       Display Name: NVIDIA NIM for ESMFold
       Short Description: ESMFold predicts the 3D structure of a protein from its amino acid sequence.
       Built By: NVIDIA
       Publisher: META
       Multinode Support: False
       Multi-Arch Support: False
       Logo:
       Labels: NVIDIA NIM, Drug Discovery, Life Sciences, NVIDIA AI Enterprise Supported, Triton Inference Server
       Public: No
       Last Updated: May 02, 2024
       Latest Image Size: 8.8 GB
       Signed Tag?: False
       Latest Tag: 24.03.01
       Tags:
           24.03.01
  2. Pull the container image

    docker pull
    ngc registry image pull

Pull the ESMFold Model

  1. Model tags follow the form repository:version. The model is called bionemo-esmfold and the version follows the naming pattern protein-folding_noarchx1_bf16_YY.MM.x. To see the other available versions, run the following NGC CLI command:

    ngc registry model list nvidia/nim/bionemo-esmfold:*
  2. Download the model.

    ngc registry model download-version nvidia/nim/bionemo-esmfold:protein-folding_noarchx1_bf16_24.03 --dest ${MODEL_DIRECTORY}
  3. Download the model weights. There are three model checkpoint downloads.

    1. Download the folding module.

      curl -L --output ${WEIGHTS_DIRECTORY}/
    2. Download the ESM2 3B language model.

      curl -L --output ${WEIGHTS_DIRECTORY}/
    3. Download the contact regression model.

      curl -L --output ${WEIGHTS_DIRECTORY}/


    DISCLAIMER: Each user is responsible for checking the content of models and the applicable licenses and determining if suitable for the intended use.
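After the downloads finish, it can help to confirm that all three weight files are present before launching the container. The sketch below is one way to do that. The function names `count_weight_files` and `check_weights` are hypothetical, and the check only counts regular files; it does not validate their contents.

```shell
# Hypothetical helper: print the number of regular files directly inside a directory.
count_weight_files() {
  find "$1" -maxdepth 1 -type f | wc -l | tr -d ' '
}

# Hypothetical helper: succeed only if the directory holds at least the expected number of files.
# Usage: check_weights DIRECTORY [EXPECTED_COUNT]
check_weights() {
  local dir="$1"
  local expected="${2:-3}"   # three checkpoints: folding module, ESM2 3B, contact regression
  local found
  found="$(count_weight_files "$dir")"
  if [ "$found" -ge "$expected" ]; then
    echo "Found ${found} file(s) in ${dir}."
  else
    echo "Expected at least ${expected} file(s) in ${dir}, found ${found}." >&2
    return 1
  fi
}
```

For example, `check_weights "${WEIGHTS_DIRECTORY}" && ls -lh "${WEIGHTS_DIRECTORY}"` lists the files so their sizes can be compared with the Available Models table.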

Launch the Microservice

Launch the ESMFold container. Start-up may take a couple of minutes until the service is available.


In this example, we’re hosting the inference endpoint and the health check on port 8008. After you start the Docker command below, you may open another terminal session on the same host and proceed to the next step.

    # If you have multiple GPUs, you can use `CUDA_VISIBLE_DEVICES` to select the GPU indices in a comma-separated list. E.g to use 4 GPUs, set to CUDA_VISIBLE_DEVICES=0,1,2,3
    CUDA_VISIBLE_DEVICES=0 && docker run --rm -it --runtime=nvidia \
         --name ${MODEL_NAME} \
         --shm-size=2G \
         --ulimit memlock=-1 --ulimit stack=67108864 \
         -v ${MODEL_DIRECTORY}/bionemo-esmfold_vprotein-folding_noarchx1_bf16_24.03:/config/models \
         -v ${WEIGHTS_DIRECTORY}:/esm_models \
         -e MODEL_PATH=/config/models \
         -p 8008:8008 \

Health and Liveness Checks

The container exposes a health endpoint for integration into existing systems such as Kubernetes. This endpoint only returns an HTTP 200 OK status code if the service is ready. Run this in a new terminal. Remember, it may take a few minutes to load the models to the GPU and initialize the service completely.

curl localhost:8008/health/ready
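Because start-up can take several minutes, a script may want to poll the endpoint rather than call it once. The following is a minimal sketch of such a loop. The function name `wait_for_ready` and its default of 60 attempts spaced 5 seconds apart are assumptions, not part of the NIM tooling.

```shell
# Hypothetical helper: poll a URL until it answers HTTP 200 or the attempts run out.
# Usage: wait_for_ready URL [MAX_ATTEMPTS] [DELAY_SECONDS]
wait_for_ready() {
  local url="$1" max_attempts="${2:-60}" delay="${3:-5}"
  local attempt=1 status
  while [ "$attempt" -le "$max_attempts" ]; do
    # -w '%{http_code}' prints only the status code; curl reports 000 if it cannot connect.
    status="$(curl -s -o /dev/null -w '%{http_code}' "$url" || true)"
    if [ "$status" = "200" ]; then
      echo "Service ready after ${attempt} attempt(s)."
      return 0
    fi
    sleep "$delay"
    attempt=$((attempt + 1))
  done
  echo "Service not ready after ${max_attempts} attempt(s)." >&2
  return 1
}
```

Calling `wait_for_ready "http://localhost:8008/health/ready"` with the defaults waits up to roughly five minutes and returns a non-zero status if the service never reports ready, which makes it easy to chain with `&&` before the inference request.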

Run Inference

Send the ESMFold protein structure prediction request using curl.

    curl -X 'POST' \
    'http://localhost:8008/protein-structure/esmfold/predict' \
    -H 'accept: application/json' \
    -H 'Content-Type: application/json' \
    -d '{
    }' > output.json

(Optional) The file output.json contains the predicted protein structure as JSON-formatted content with escaped newlines, which makes it hard to read directly. Use the following command to produce a human-readable version:

$ cat output.json | sed 's/\\n/\n/g' | head -n 45
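The same transformation can be wrapped in a small function so that the readable text can be saved to a file or piped elsewhere. This is only a sketch: the function name `unescape_newlines` and the output file name in the usage comment are hypothetical, and the `\n` in the replacement relies on GNU `sed`.

```shell
# Hypothetical helper: turn literal "\n" escape sequences into real line breaks.
# Reads the named files, or standard input when no file is given.
unescape_newlines() {
  sed 's/\\n/\n/g' "$@"
}

# Example usage (file names are illustrative):
#   unescape_newlines output.json > output_readable.txt
```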

Stopping the Container

When you’re done testing the endpoint, you can bring down the container by running docker stop ${MODEL_NAME} in a new terminal.