Example Requests

OpenFold2 NIM provides the following endpoint:

  • biology/openfold/openfold2/predict-structure-from-msa-and-template: Perform structure prediction from an input MSA and templates.

Usage

Below, we give real examples of requests that should run when the NIM is correctly configured.

Note: We do not recommend using curl to submit requests to this endpoint, because the inputs contain characters that require careful escaping in bash. We recommend using the Python requests module instead.
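As a rough illustration of the escaping problem, the sketch below serializes a made-up two-record a3m fragment with Python's standard json module. The fragment and the variable names are hypothetical; the nesting of the alignments field follows the example request on this page.

```python
import json

# Hypothetical two-record a3m fragment; real alignments are much longer.
a3m_fragment = ">query\nGGSKENEIS\n>hit_1\n--SKENEL-\n"

payload = {
    "sequence": "GGSKENEIS",
    "alignments": {"uniref90": {"a3m": {"alignment": a3m_fragment, "format": "a3m"}}},
}

# json.dumps escapes the embedded newlines, so the body is one line of valid JSON.
body = json.dumps(payload)
print(body)

# Parsing the body restores the original multi-line a3m text unchanged.
restored = json.loads(body)["alignments"]["uniref90"]["a3m"]["alignment"]
print(restored == a3m_fragment)
```

Multi-line strings like this would need manual quoting on a bash command line, whereas the serializer handles them without extra work.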

Predict Structure

The endpoint accepts requests whose data is formatted as described in the OpenAPI specification.

Here is an example request using the Python requests module.

  • The field sequence is required.

  • The field input_id is optional but recommended. It serves as a unique identifier for tracking requests and correlating responses. Use formats like "7WBN_A" (PDB ID + chain) or custom names like "my_protein_001".

  • The field alignments is optional, but omission can result in inaccurate structures. If included, the MSA content must be in a3m format.

  • The field selected_models is optional, but included here to reduce runtime, since the default specifies 5 models.

  • Use one of the following options if you want to include templates:

    • Set use_templates=True, and populate the field templates with the content of one or more hhr files containing HHSearch results. With these settings, template structures are read from the database of structure files contained in the NGC model artifact. Template features will not be used if use_templates=False.

    • Set use_templates=True, and populate the field explicit_templates with the content of one or more structure files in mmCIF format. Template features will not be used if use_templates=False.
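Because the MSA content must be in a3m format, it can help to sanity-check an alignment before sending it. The sketch below is one possible check, not part of the NIM: the helper names are hypothetical, and the rule that the first record equals the query is an assumption drawn from the example on this page, not a documented requirement. It relies on the a3m convention that lowercase letters mark insertions relative to the query.

```python
def read_a3m(text):
    """Parse a3m text into a list of (header, sequence) pairs."""
    records, header, chunks = [], None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def match_columns(aligned):
    """Drop lowercase letters, which a3m uses for insertions relative to the query."""
    return "".join(c for c in aligned if not c.islower())


def check_a3m(text, query):
    """Return a list of problems; an empty list means the checks passed."""
    records = read_a3m(text)
    if not records:
        return ["no records found"]
    problems = []
    # Assumption: the first record is the query, as in the example on this page.
    if match_columns(records[0][1]) != query:
        problems.append("first record does not match the query sequence")
    for header, aligned in records[1:]:
        if len(match_columns(aligned)) != len(query):
            problems.append(f"{header}: aligned length differs from the query")
    return problems


# Tiny made-up alignments for illustration only.
query = "ACDE"
good = ">query\nACDE\n>hit_1\nA-DE\n>hit_2\nAcC-E\n"
bad = ">query\nACDE\n>hit_1\nA-D\n"
print(check_a3m(good, query))
print(check_a3m(bad, query))
```

Returning a list of problems instead of raising on the first one makes it easier to report every malformed record in a large alignment at once.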

The following workflow is recommended when predicting structures for a given protein sequence.

  1. Use other tools to produce multiple sequence alignments (MSAs) in a3m format.

  2. Use other tools, and potentially the MSA results from step 1, to generate one of the following:

    • The template search results in an hhr file, which you can supply using the templates field

    • The structural templates in mmCIF format, which you can supply using the explicit_templates field

  3. Load the a3m files and the hhr or mmCIF files in a Python script, and format the request as specified in the OpenAPI specification, included at the end of this documentation.
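The loading and formatting step of this workflow can be sketched as follows. This is a minimal sketch under stated assumptions: build_request is a hypothetical helper, the file names are placeholders, and the field layout simply mirrors the example request on this page. The hhr path is left out because the layout of the templates field is not shown here; consult the OpenAPI specification for it.

```python
import tempfile
from pathlib import Path


def build_request(sequence, a3m_paths, mmcif_paths=(), input_id=None, selected_models=None):
    """Assemble a request body from files on disk.

    a3m_paths maps an alignment name (for example "uniref90") to an a3m file path.
    """
    data = {"sequence": sequence}
    if input_id is not None:
        data["input_id"] = input_id
    if selected_models is not None:
        data["selected_models"] = list(selected_models)
    if a3m_paths:
        data["alignments"] = {
            name: {"a3m": {"alignment": Path(path).read_text(), "format": "a3m"}}
            for name, path in a3m_paths.items()
        }
    if mmcif_paths:
        # Templates are only used when use_templates is set.
        data["use_templates"] = True
        data["explicit_templates"] = [
            {
                "structure": Path(path).read_text(),
                "format": "mmcif",
                "name": Path(path).stem,
                "source": "user_provided",
            }
            for path in mmcif_paths
        ]
    return data


# Demo with throwaway files; replace these with your own a3m and mmCIF files.
with tempfile.TemporaryDirectory() as tmp:
    a3m_file = Path(tmp) / "uniref90.a3m"
    a3m_file.write_text(">query\nACDE\n>hit_1\nA-DE\n")
    cif_file = Path(tmp) / "demo_template.cif"
    cif_file.write_text("data_demo\n#\n")
    request_body = build_request(
        "ACDE",
        {"uniref90": a3m_file},
        mmcif_paths=[cif_file],
        input_id="my_protein_001",
        selected_models=[1, 2],
    )

print(sorted(request_body))
```

The returned dictionary can be passed to json.dumps and posted in the same way as the hand-written request on this page.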

import json

import requests

# --------------------------------
# parameters
# --------------------------------
url = "http://localhost:8000/biology/openfold/openfold2/predict-structure-from-msa-and-template"
headers = {"Content-Type": "application/json"}
sequence = (
    "GGSKENEISHHAKEIERLQKEIERHKQSIKKLKQSEQSNPPPNP"
    "EGTRQARRNRRRRWRERQRQKENEISHHAKEIERLQKEIERHKQSIKKLKQSEC"
)
# UniRef90 alignment in a3m format: the query plus 10 UniRef90 hits
uniref90_alignment_in_a3m_trunc10 = \
""">BQXYMDHSRWGGVPIWVK
GGSKENEISHHAKEIERLQKEIERHKQSIKKLKQSEQSNPPPNPEGTRQARRNRRRRWRERQRQKENEISHHAKEIERLQKEIERHKQSIKKLKQSEC
>UniRef90_A0A221IUG4
--------------------------QTVKLVKRLYQSNPPPNPEGTRQARRNRRRRWRERQRQ----------------------------------
>UniRef90_A7KWE0
---------------------------TVRLIKQLYQSNPPPNPEGTRQARRNRRRRWRERQRQ----------------------------------
>UniRef90_A0A2I6UE91
---------------------------TVKLIKEIYQSNPPPNPEGTRQARRNRRRRWRERQRQ----------------------------------
>UniRef90_D6NY33
---------------------------AVRLIKQIYQSNPPPNPEGTRQARRNRRRRWRERQRQ----------------------------------
>UniRef90_A0A221IUJ5
--------------------------QTVKLIKRLYQSNPPPNPEGTRQARRNRRRRWREKQRQ----------------------------------
>UniRef90_D6NYR3
---------------------------TVRLVKQLYQSNPPPNPEGTRQARRNRRRRWRERQRQ----------------------------------
>UniRef90_I6Y2C4
---------------------------TVRLIKRIYQSNPPPNPEGTRQARRNRRRRWRERQRQIQN-------------------------------
>UniRef90_A0A161CVP3
--------------------------QTIRLIKLLYQSNPPPNPEGTRQARRNRRRRWRERQRQ----------------------------------
>UniRef90_Q6EFX9
--------------------------QTVRLIKLLYQSNPPPNPEGTRQARRNRRRRWRERQRQ----------------------------------
>UniRef90_A0A2I6UAR5
--------------------------ETVKIIKYLYQSNPPPNPEGTRQARRNRRRRWRERQRQ----------------------------------
"""
small_bfd_alignment_in_a3m = \
""">BQXYMDHSRWGGVPIWVK
GGSKENEISHHAKEIERLQKEIERHKQSIKKLKQSEQSNPPPNPEGTRQARRNRRRRWRERQRQKENEISHHAKEIERLQKEIERHKQSIKKLKQSEC
>A0A076V4A1_9HIV1
------------------------------------QSNPPPNHEGTRQARRNRRRRWRERQRQ----------------------------------
"""

# Example mmCIF template string (very small, minimal but valid)
mmcif_template = """data_demo
#
loop_
_atom_site.group_PDB
_atom_site.id
_atom_site.type_symbol
_atom_site.label_atom_id
_atom_site.label_comp_id
_atom_site.label_asym_id
ATOM 1 N N ASN A
"""

# --------------------------------
# assemble request content
# --------------------------------
data = { 
    "sequence": sequence,
    "input_id": "example_protein_001",  # Optional: unique identifier for request tracking
    "selected_models": [1, 2],
    "alignments": {
        "uniref90": {
            "a3m": {
                "alignment": uniref90_alignment_in_a3m_trunc10, 
                "format": "a3m",
            }
        },
        "small_bfd": {
            "a3m": {
                "alignment": small_bfd_alignment_in_a3m,
                "format": "a3m",
            }
        }
    },
    "use_templates": True,  # Python boolean; json.dumps serializes it as true
    "explicit_templates": [
        {
            "structure": mmcif_template,
            "format": "mmcif",
            "name": "demo_template",
            "source": "user_provided"
        }
    ],
}

# --------------------------------
# post-to-server
# --------------------------------
response = requests.post(
    url=url, 
    data=json.dumps(data), 
    headers=headers,
    timeout=300,
)
# Check if the request was successful
if response.ok:
    print("Request succeeded:", response.json())
else:
    print("Request failed:", response.status_code, response.text)

The runtime for structure prediction depends on both the sequence length and the number of sequences in the multiple sequence alignment. On an NVIDIA H100 80GB HBM3 device, this example should complete in under 30 seconds.
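Since the number of sequences in the alignment affects runtime, one option is to cap the alignment depth before sending the request. The sketch below shows a hypothetical helper, truncate_a3m, that keeps only the first records of an a3m string. It is an illustration, not part of the NIM, and a shallower alignment may reduce prediction accuracy.

```python
def truncate_a3m(text, max_sequences):
    """Keep the first max_sequences records of an a3m string.

    The query record counts toward max_sequences.
    """
    kept, count = [], 0
    for line in text.splitlines():
        if line.startswith(">"):
            count += 1
            if count > max_sequences:
                break
        if count:  # ignore anything that appears before the first header
            kept.append(line)
    return "\n".join(kept) + "\n" if kept else ""


# Made-up alignment with a query and two hits.
msa = ">query\nACDE\n>hit_1\nA-DE\n>hit_2\nAC-E\n"
shallow = truncate_a3m(msa, 2)
print(shallow)
```

Keeping records in their original order preserves the query as the first entry, which matches the layout of the alignments in the example request.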