Example Requests

OpenFold2 NIM provides the following endpoint:

  • biology/openfold/openfold2/predict-structure-from-msa-and-template: Perform structure prediction from an input MSA and templates.

Note

Starting with version 2.0.0, only mmCIF-based templates are supported. For information on migrating from HHR-based templates, refer to the Migration Guide.

Using the OpenFold2 NIM

This section provides working example requests that should succeed when the NIM is correctly configured.

Note: We do not recommend using curl to submit requests to this endpoint, because the inputs contain characters that require careful escaping in bash. Instead, we recommend using the Python requests module.
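To illustrate the escaping concern, the following minimal sketch serializes a multi-line a3m string with Python's standard json module, which escapes newlines and quotes automatically. The toy alignment is invented for illustration and is not part of the NIM examples.

```python
import json

# Toy two-record a3m alignment (hypothetical content, for illustration only).
a3m_text = ">query\nGGSK\n>hit_1\n-GSK\n"

# json.dumps escapes the newlines, so the body is a single valid JSON line.
body = json.dumps({"alignment": a3m_text, "format": "a3m"})

# Decoding the body recovers the original multi-line string exactly.
round_trip = json.loads(body)["alignment"]
print(round_trip == a3m_text)
```

Writing the same body by hand inside a bash command line would require escaping every newline and quote character manually.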

Predict Structure

The endpoint accepts requests whose data is formatted according to the OpenAPI Specification.

Here is an example request using the Python requests module.

  • The field sequence is required.

  • The field input_id is optional but recommended. It serves as a unique identifier for tracking requests and correlating responses. Use formats like "7WBN_A" (PDB ID + chain) or custom names like "my_protein_001".

  • The field alignments is optional, but omission can result in inaccurate structures. If included, the MSA content must be in a3m format.

  • The field selected_models is optional, but included here to reduce runtime, since the default specifies 5 models.

  • To include templates, set use_templates=True and populate the explicit_templates field with the content of one or more structure files in mmCIF format. Template features will not be used if use_templates=False.
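Because the alignments field expects a3m content, it can help to sanity-check an alignment before sending it. The sketch below is one possible check and is not part of the NIM. It assumes that the first record is the query sequence, as in the example alignments on this page, and it relies on the a3m convention that lowercase letters mark insertions and do not occupy an alignment column. The function names parse_a3m and check_a3m are hypothetical.

```python
def parse_a3m(text):
    """Return a list of (header, sequence) records from a3m-formatted text."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def check_a3m(text, query):
    """Return a list of problems found; an empty list means the checks passed."""
    records = parse_a3m(text)
    if not records:
        return ["no records found"]
    problems = []
    # Assumption: the first record is the query, as in the example alignments.
    if records[0][1] != query:
        problems.append("first record does not match the query sequence")
    for header, seq in records:
        # Lowercase letters mark insertions in a3m and do not occupy a column.
        columns = sum(1 for ch in seq if not ch.islower())
        if columns != len(query):
            problems.append(f"{header}: {columns} columns, expected {len(query)}")
    return problems
```

For example, check_a3m(alignment_text, sequence) returns an empty list when every record spans the same number of columns as the query.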

The following workflow is recommended when predicting structures for a given protein sequence:

  1. Use MSA tools (for example, HHblits, MMseqs2, or ColabFold) to produce multiple sequence alignments (MSAs) in a3m format.

  2. Obtain structural templates in mmCIF format from sources such as:

    • PDB database downloads

    • Experimental structures

    • Computational models, such as AlphaFold predictions

  3. Load the a3m and mmCIF files in a Python script, and format the request as specified in the OpenAPI Specification.
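Step 3 of the workflow above can be sketched as a small helper that reads the files from disk and assembles the request body. The field names mirror the example request on this page. The helper name build_request, the use of the file stem as the template name, and the choice of database labels are assumptions made for illustration; the OpenAPI Specification remains the authority on accepted fields and values.

```python
from pathlib import Path


def build_request(sequence, a3m_paths, template_paths=(), input_id=None,
                  selected_models=None):
    """Assemble the request body from a3m and mmCIF files on disk.

    a3m_paths maps a database label (for example "uniref90") to an a3m file.
    """
    payload = {"sequence": sequence}
    if input_id is not None:
        payload["input_id"] = input_id
    if selected_models is not None:
        payload["selected_models"] = list(selected_models)
    if a3m_paths:
        payload["alignments"] = {
            label: {"a3m": {"alignment": Path(path).read_text(), "format": "a3m"}}
            for label, path in a3m_paths.items()
        }
    # Assumption: each template is named after its file; adjust as needed.
    templates = [
        {
            "structure": Path(path).read_text(),
            "format": "mmcif",
            "name": Path(path).stem,
            "source": "user_provided",
        }
        for path in template_paths
    ]
    # Templates are only enabled when at least one file was supplied.
    payload["use_templates"] = bool(templates)
    if templates:
        payload["explicit_templates"] = templates
    return payload
```

The returned dictionary can be serialized with json.dumps and posted to the endpoint in the same way as the hand-built request on this page.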

For detailed guidance on migrating from HHR-based templates to mmCIF format, refer to the Migration Guide.

import json
import requests

# --------------------------------
# parameters
# --------------------------------
url = "http://localhost:8000/biology/openfold/openfold2/predict-structure-from-msa-and-template"
headers = {"Content-Type": "application/json"}
sequence = (
    "GGSKENEISHHAKEIERLQKEIERHKQSIKKLKQSEQSNPPPNP"
    "EGTRQARRNRRRRWRERQRQKENEISHHAKEIERLQKEIERHKQSIKKLKQSEC"
)
uniref90_alignment_in_a3m_trunc10 = \
""">BQXYMDHSRWGGVPIWVK
GGSKENEISHHAKEIERLQKEIERHKQSIKKLKQSEQSNPPPNPEGTRQARRNRRRRWRERQRQKENEISHHAKEIERLQKEIERHKQSIKKLKQSEC
>UniRef90_A0A221IUG4
--------------------------QTVKLVKRLYQSNPPPNPEGTRQARRNRRRRWRERQRQ----------------------------------
>UniRef90_A7KWE0
---------------------------TVRLIKQLYQSNPPPNPEGTRQARRNRRRRWRERQRQ----------------------------------
>UniRef90_A0A2I6UE91
---------------------------TVKLIKEIYQSNPPPNPEGTRQARRNRRRRWRERQRQ----------------------------------
>UniRef90_D6NY33
---------------------------AVRLIKQIYQSNPPPNPEGTRQARRNRRRRWRERQRQ----------------------------------
>UniRef90_A0A221IUJ5
--------------------------QTVKLIKRLYQSNPPPNPEGTRQARRNRRRRWREKQRQ----------------------------------
>UniRef90_D6NYR3
---------------------------TVRLVKQLYQSNPPPNPEGTRQARRNRRRRWRERQRQ----------------------------------
>UniRef90_I6Y2C4
---------------------------TVRLIKRIYQSNPPPNPEGTRQARRNRRRRWRERQRQIQN-------------------------------
>UniRef90_A0A161CVP3
--------------------------QTIRLIKLLYQSNPPPNPEGTRQARRNRRRRWRERQRQ----------------------------------
>UniRef90_Q6EFX9
--------------------------QTVRLIKLLYQSNPPPNPEGTRQARRNRRRRWRERQRQ----------------------------------
>UniRef90_A0A2I6UAR5
--------------------------ETVKIIKYLYQSNPPPNPEGTRQARRNRRRRWRERQRQ----------------------------------
"""
small_bfd_alignment_in_a3m = \
""">BQXYMDHSRWGGVPIWVK
GGSKENEISHHAKEIERLQKEIERHKQSIKKLKQSEQSNPPPNPEGTRQARRNRRRRWRERQRQKENEISHHAKEIERLQKEIERHKQSIKKLKQSEC
>A0A076V4A1_9HIV1
------------------------------------QSNPPPNHEGTRQARRNRRRRWRERQRQ----------------------------------
"""

# Example mmCIF template string (very small, minimal but valid)
mmcif_template = """data_SIMPLE
#
_entry.id SIMPLE
#
_audit_conform.dict_name       mmcif_pdbx.dic
_audit_conform.dict_version    5.391
_audit_conform.dict_location   http://mmcif.pdb.org/dictionaries/ascii/mmcif_pdbx.dic
#
loop_
_chem_comp.id
_chem_comp.type
GLY 'L-peptide linking'
SER 'L-peptide linking'
LYS 'L-peptide linking'
GLU 'L-peptide linking'
ASN 'L-peptide linking'
ILE 'L-peptide linking'
#
loop_
_entity_poly_seq.entity_id
_entity_poly_seq.num
_entity_poly_seq.mon_id
1 1 GLY
1 2 GLY
1 3 SER
1 4 LYS
1 5 GLU
1 6 ASN
1 7 GLU
1 8 ILE
#
loop_
_struct_asym.id
_struct_asym.entity_id
A 1
#
loop_
_atom_site.group_PDB
_atom_site.id
_atom_site.type_symbol
_atom_site.label_atom_id
_atom_site.label_alt_id
_atom_site.label_comp_id
_atom_site.label_asym_id
_atom_site.label_entity_id
_atom_site.label_seq_id
_atom_site.pdbx_PDB_ins_code
_atom_site.Cartn_x
_atom_site.Cartn_y
_atom_site.Cartn_z
_atom_site.occupancy
_atom_site.B_iso_or_equiv
_atom_site.pdbx_formal_charge
_atom_site.auth_seq_id
_atom_site.auth_comp_id
_atom_site.auth_asym_id
_atom_site.auth_atom_id
_atom_site.pdbx_PDB_model_num
ATOM 1  N N    . GLY A 1 1 ? 0.000  0.000  0.000  1.00 20.00 ? 1 GLY A N  1
ATOM 2  C CA   . GLY A 1 1 ? 1.458  0.000  0.000  1.00 20.00 ? 1 GLY A CA 1
ATOM 3  C C    . GLY A 1 1 ? 2.009  1.420  0.000  1.00 20.00 ? 1 GLY A C  1
ATOM 4  O O    . GLY A 1 1 ? 1.251  2.389  0.000  1.00 20.00 ? 1 GLY A O  1
ATOM 5  N N    . GLY A 1 2 ? 3.200  1.650  0.000  1.00 20.00 ? 2 GLY A N  1
ATOM 6  C CA   . GLY A 1 2 ? 3.658  3.030  0.000  1.00 20.00 ? 2 GLY A CA 1
ATOM 7  C C    . GLY A 1 2 ? 5.116  3.260  0.000  1.00 20.00 ? 2 GLY A C  1
ATOM 8  O O    . GLY A 1 2 ? 5.874  2.291  0.000  1.00 20.00 ? 2 GLY A O  1
ATOM 9  N N    . SER A 1 3 ? 5.574  4.640  0.000  1.00 20.00 ? 3 SER A N  1
ATOM 10 C CA   . SER A 1 3 ? 7.032  4.870  0.000  1.00 20.00 ? 3 SER A CA 1
ATOM 11 C C    . SER A 1 3 ? 7.490  6.250  0.000  1.00 20.00 ? 3 SER A C  1
ATOM 12 O O    . SER A 1 3 ? 6.732  7.219  0.000  1.00 20.00 ? 3 SER A O  1
ATOM 13 C CB   . SER A 1 3 ? 7.532  4.370  1.200  1.00 20.00 ? 3 SER A CB 1
ATOM 14 O OG   . SER A 1 3 ? 8.932  4.600  1.200  1.00 20.00 ? 3 SER A OG 1
ATOM 15 N N    . LYS A 1 4 ? 8.681  6.480  0.000  1.00 20.00 ? 4 LYS A N  1
ATOM 16 C CA   . LYS A 1 4 ? 9.139  7.860  0.000  1.00 20.00 ? 4 LYS A CA 1
ATOM 17 C C    . LYS A 1 4 ? 10.597 8.090  0.000  1.00 20.00 ? 4 LYS A C  1
ATOM 18 O O    . LYS A 1 4 ? 11.355 7.121  0.000  1.00 20.00 ? 4 LYS A O  1
ATOM 19 C CB   . LYS A 1 4 ? 8.639  8.360  1.200  1.00 20.00 ? 4 LYS A CB 1
ATOM 20 C CG   . LYS A 1 4 ? 9.139  9.740  1.200  1.00 20.00 ? 4 LYS A CG 1
ATOM 21 C CD   . LYS A 1 4 ? 8.639 10.240  2.400  1.00 20.00 ? 4 LYS A CD 1
ATOM 22 C CE   . LYS A 1 4 ? 9.139 11.620  2.400  1.00 20.00 ? 4 LYS A CE 1
ATOM 23 N NZ   . LYS A 1 4 ? 8.639 12.120  3.600  1.00 20.00 ? 4 LYS A NZ 1
ATOM 24 N N    . GLU A 1 5 ? 11.055 9.470  0.000  1.00 20.00 ? 5 GLU A N  1
ATOM 25 C CA   . GLU A 1 5 ? 12.513 9.700  0.000  1.00 20.00 ? 5 GLU A CA 1
ATOM 26 C C    . GLU A 1 5 ? 12.971 11.080 0.000  1.00 20.00 ? 5 GLU A C  1
ATOM 27 O O    . GLU A 1 5 ? 12.213 12.049 0.000  1.00 20.00 ? 5 GLU A O  1
ATOM 28 C CB   . GLU A 1 5 ? 13.013 9.200  1.200  1.00 20.00 ? 5 GLU A CB 1
ATOM 29 C CG   . GLU A 1 5 ? 14.471 9.430  1.200  1.00 20.00 ? 5 GLU A CG 1
ATOM 30 C CD   . GLU A 1 5 ? 14.929 10.810 1.200  1.00 20.00 ? 5 GLU A CD 1
ATOM 31 O OE1  . GLU A 1 5 ? 14.171 11.779 1.200  1.00 20.00 ? 5 GLU A OE1 1
ATOM 32 O OE2  . GLU A 1 5 ? 16.129 10.940 1.200  1.00 20.00 ? 5 GLU A OE2 1
ATOM 33 N N    . ASN A 1 6 ? 14.162 11.310 0.000  1.00 20.00 ? 6 ASN A N  1
ATOM 34 C CA   . ASN A 1 6 ? 14.620 12.690 0.000  1.00 20.00 ? 6 ASN A CA 1
ATOM 35 C C    . ASN A 1 6 ? 16.078 12.920 0.000  1.00 20.00 ? 6 ASN A C  1
ATOM 36 O O    . ASN A 1 6 ? 16.836 11.951 0.000  1.00 20.00 ? 6 ASN A O  1
ATOM 37 C CB   . ASN A 1 6 ? 14.120 13.190 1.200  1.00 20.00 ? 6 ASN A CB 1
ATOM 38 C CG   . ASN A 1 6 ? 14.578 14.570 1.200  1.00 20.00 ? 6 ASN A CG 1
ATOM 39 O OD1  . ASN A 1 6 ? 15.736 14.800 1.200  1.00 20.00 ? 6 ASN A OD1 1
ATOM 40 N ND2  . ASN A 1 6 ? 13.820 15.539 1.200  1.00 20.00 ? 6 ASN A ND2 1
ATOM 41 N N    . GLU A 1 7 ? 16.536 14.300 0.000  1.00 20.00 ? 7 GLU A N  1
ATOM 42 C CA   . GLU A 1 7 ? 17.994 14.530 0.000  1.00 20.00 ? 7 GLU A CA 1
ATOM 43 C C    . GLU A 1 7 ? 18.452 15.910 0.000  1.00 20.00 ? 7 GLU A C  1
ATOM 44 O O    . GLU A 1 7 ? 17.694 16.879 0.000  1.00 20.00 ? 7 GLU A O  1
ATOM 45 N N    . ILE A 1 8 ? 19.643 16.140 0.000  1.00 20.00 ? 8 ILE A N  1
ATOM 46 C CA   . ILE A 1 8 ? 20.101 17.520 0.000  1.00 20.00 ? 8 ILE A CA 1
ATOM 47 C C    . ILE A 1 8 ? 21.559 17.750 0.000  1.00 20.00 ? 8 ILE A C  1
ATOM 48 O O    . ILE A 1 8 ? 22.317 16.781 0.000  1.00 20.00 ? 8 ILE A O  1
ATOM 49 C CB   . ILE A 1 8 ? 19.601 18.020 1.200  1.00 20.00 ? 8 ILE A CB 1
ATOM 50 C CG1  . ILE A 1 8 ? 18.101 17.790 1.200  1.00 20.00 ? 8 ILE A CG1 1
ATOM 51 C CG2  . ILE A 1 8 ? 20.101 19.400 1.200  1.00 20.00 ? 8 ILE A CG2 1
ATOM 52 C CD1  . ILE A 1 8 ? 17.601 18.290 2.400  1.00 20.00 ? 8 ILE A CD1 1
#
"""

# --------------------------------
# assemble request content
# --------------------------------
data = { 
    "sequence": sequence,
    "input_id": "example_protein_001",  # Optional: unique identifier for request tracking
    "selected_models": [1, 2],
    "alignments": {
        "uniref90": {
            "a3m": {
                "alignment": uniref90_alignment_in_a3m_trunc10, 
                "format": "a3m",
            }
        },
        "small_bfd": {
            "a3m": {
                "alignment": small_bfd_alignment_in_a3m,
                "format": "a3m",
            }
        }
    },
    "use_templates": True,
    "explicit_templates": [
        {
            "structure": mmcif_template,
            "format": "mmcif",
            "name": "demo_template",
            "source": "user_provided"
        }
    ],
}

# --------------------------------
# post-to-server
# --------------------------------
response = requests.post(
    url=url, 
    data=json.dumps(data), 
    headers=headers,
    timeout=300,
)
# Check if the request was successful
if response.ok:
    print("Request succeeded:", response.json())
else:
    print("Request failed:", response.status_code, response.text)

The runtime for structure prediction depends on both the sequence length and the number of sequences in the multiple sequence alignment. On an NVIDIA H100 80GB HBM3 device, this example should complete in under 30 seconds.
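Since runtime grows with the number of aligned sequences, one way to shorten a trial run is to cap the depth of an alignment before submitting it. The sketch below keeps the first record plus a limited number of further records. The function name truncate_a3m is hypothetical, the first record is assumed to be the query, and a shallower alignment may reduce prediction accuracy, so treat this as a convenience for experimentation rather than a recommended setting.

```python
def truncate_a3m(text, max_hits):
    """Keep the first record (assumed to be the query) plus at most max_hits more."""
    kept = []
    seen = 0  # number of record headers encountered so far
    for line in text.splitlines():
        if line.startswith(">"):
            seen += 1
        # Skip anything before the first header and anything past the limit.
        if 0 < seen <= max_hits + 1:
            kept.append(line)
    return "\n".join(kept) + "\n" if kept else ""
```

For instance, truncate_a3m(alignment_text, 10) keeps the query and the first ten hits.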