Example Requests#
OpenFold2 NIM provides the following endpoint:
biology/openfold/openfold2/predict-structure-from-msa-and-template: Perform structure prediction from an input MSA and templates.
Usage#
Below, we give real examples of requests that should run when the NIM is correctly configured.
Note: We do not recommend using curl to submit requests to this endpoint, because the inputs contain characters that require careful escaping in bash. We recommend interacting with this endpoint using the Python requests module.
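To illustrate the escaping concern, the following minimal sketch serializes an a3m string with the standard-library json module. The a3m content is a made-up toy alignment, and the nesting of the alignments field follows the example request on this page. It shows only that json.dumps handles newlines and quotes, which would otherwise need manual escaping on a bash command line.

```python
import json

# Toy a3m content (hypothetical): records are separated by newlines and
# header lines start with ">", both of which are awkward in a shell string.
a3m_text = ">query\nGGSKENEISH\n>hit_1\n--SKENEI--\n"

payload = {
    "sequence": "GGSKENEISH",
    "alignments": {
        "uniref90": {"a3m": {"alignment": a3m_text, "format": "a3m"}},
    },
}

# json.dumps turns each real newline into the two-character escape \n,
# so the request body is a single line of valid JSON.
body = json.dumps(payload)

# Decoding the body gives back the original payload unchanged.
round_trip = json.loads(body)
```

The resulting body string can be passed as the data argument of requests.post, as in the example request on this page.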
Predict Structure#
The endpoint accepts requests where the data is formatted as described in the OpenAPI specification.
Here is an example request using the Python requests module.
- The field sequence is required.
- The field input_id is optional but recommended. It serves as a unique identifier for tracking requests and correlating responses. Use formats like "7WBN_A" (PDB ID + chain) or custom names like "my_protein_001".
- The field alignments is optional, but omitting it can result in inaccurate structures. If included, the MSA content must be in a3m format.
- The field selected_models is optional, but it is included here to reduce runtime, since the default specifies 5 models.
- Use one of the following options if you want to include templates:
  - Set use_templates=True, and populate the templates field with the content of one or more hhr files, which contain HHSearch results. With these settings, template structures are read from the database of structure files contained in the NGC model artifact.
  - Set use_templates=True, and populate the explicit_templates field with the content of one or more structure files in mmCIF format.
  Template features are not used if use_templates=False.
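Because the MSA content must be in a3m format, a quick local check before sending a request can catch malformed input. The sketch below is not part of the NIM; the function names are hypothetical. It relies on the a3m convention that lowercase letters mark insertions, so each record, once lowercase letters are ignored, should have as many columns as the query sequence.

```python
def a3m_row_lengths(a3m_text):
    """Return (header, column_count) for each record in an a3m string."""
    records = []
    header, chunks = None, []
    for line in a3m_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    # Lowercase letters are insertions and do not occupy an alignment column.
    return [(h, sum(1 for c in s if not c.islower())) for h, s in records]


def check_a3m(a3m_text, sequence):
    """Return a list of problems; an empty list means the basic checks passed."""
    rows = a3m_row_lengths(a3m_text)
    if not rows:
        return ["no records found"]
    problems = []
    for header, length in rows:
        if length != len(sequence):
            problems.append(
                f"{header}: {length} columns, expected {len(sequence)}"
            )
    return problems
```

This covers only column counts. It does not verify residue characters or that the first record matches the query, so treat it as a sanity check rather than a full validator.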
The following workflow is recommended when predicting structures for a given protein sequence.
1. Use other tools to produce multiple-sequence alignments (MSAs), and to put them in the a3m format.
2. Use other tools, and potentially the MSA results from step 1, to generate one of the following:
   - The template search results in an hhr file.
   - The structural templates in mmCIF format, which you can supply using the explicit_templates field.
3. Load the a3m files and the hhr or mmCIF files in a Python script, and format the request as specified in the OpenAPI specification, included at the end of this documentation.
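The loading step of this workflow can be sketched as a small helper that reads a3m and mmCIF files from disk and assembles the request body. The function name build_request and any file paths passed to it are hypothetical. The field names and nesting mirror the example request on this page and have not been checked against the OpenAPI specification here. The hhr option is left out because the layout of the templates field is not shown on this page.

```python
from pathlib import Path


def build_request(sequence, a3m_paths, mmcif_paths=(), input_id=None):
    """Assemble a request payload from a3m and mmCIF files on disk.

    a3m_paths maps a database name (for example "uniref90") to a file path.
    mmcif_paths is an optional sequence of mmCIF template file paths.
    """
    data = {"sequence": sequence, "alignments": {}}
    if input_id is not None:
        data["input_id"] = input_id
    for db_name, path in a3m_paths.items():
        data["alignments"][db_name] = {
            "a3m": {"alignment": Path(path).read_text(), "format": "a3m"}
        }
    if mmcif_paths:
        # Templates are only sent when at least one mmCIF file is given.
        data["use_templates"] = True
        data["explicit_templates"] = [
            {
                "structure": Path(p).read_text(),
                "format": "mmcif",
                "name": Path(p).stem,  # file name without extension
                "source": "user_provided",
            }
            for p in mmcif_paths
        ]
    return data
```

The returned dictionary can be serialized with json.dumps and posted with requests.post in the same way as the full example on this page.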
import os
import requests
import json
# --------------------------------
# parameters
# --------------------------------
url = "http://localhost:8000/biology/openfold/openfold2/predict-structure-from-msa-and-template"
headers = {"Content-Type": "application/json"}
sequence = (
    "GGSKENEISHHAKEIERLQKEIERHKQSIKKLKQSEQSNPPPNP"
    "EGTRQARRNRRRRWRERQRQKENEISHHAKEIERLQKEIERHKQSIKKLKQSEC"
)
uniref90_alignment_in_a3m_trunc10=\
""">BQXYMDHSRWGGVPIWVK
GGSKENEISHHAKEIERLQKEIERHKQSIKKLKQSEQSNPPPNPEGTRQARRNRRRRWRERQRQKENEISHHAKEIERLQKEIERHKQSIKKLKQSEC
>UniRef90_A0A221IUG4
--------------------------QTVKLVKRLYQSNPPPNPEGTRQARRNRRRRWRERQRQ----------------------------------
>UniRef90_A7KWE0
---------------------------TVRLIKQLYQSNPPPNPEGTRQARRNRRRRWRERQRQ----------------------------------
>UniRef90_A0A2I6UE91
---------------------------TVKLIKEIYQSNPPPNPEGTRQARRNRRRRWRERQRQ----------------------------------
>UniRef90_D6NY33
---------------------------AVRLIKQIYQSNPPPNPEGTRQARRNRRRRWRERQRQ----------------------------------
>UniRef90_A0A221IUJ5
--------------------------QTVKLIKRLYQSNPPPNPEGTRQARRNRRRRWREKQRQ----------------------------------
>UniRef90_D6NYR3
---------------------------TVRLVKQLYQSNPPPNPEGTRQARRNRRRRWRERQRQ----------------------------------
>UniRef90_I6Y2C4
---------------------------TVRLIKRIYQSNPPPNPEGTRQARRNRRRRWRERQRQIQN-------------------------------
>UniRef90_A0A161CVP3
--------------------------QTIRLIKLLYQSNPPPNPEGTRQARRNRRRRWRERQRQ----------------------------------
>UniRef90_Q6EFX9
--------------------------QTVRLIKLLYQSNPPPNPEGTRQARRNRRRRWRERQRQ----------------------------------
>UniRef90_A0A2I6UAR5
--------------------------ETVKIIKYLYQSNPPPNPEGTRQARRNRRRRWRERQRQ----------------------------------
"""
small_bfd_alignment_in_a3m = \
""">BQXYMDHSRWGGVPIWVK
GGSKENEISHHAKEIERLQKEIERHKQSIKKLKQSEQSNPPPNPEGTRQARRNRRRRWRERQRQKENEISHHAKEIERLQKEIERHKQSIKKLKQSEC
>A0A076V4A1_9HIV1
------------------------------------QSNPPPNHEGTRQARRNRRRRWRERQRQ----------------------------------
"""
# Example mmCIF template string (very small, minimal but valid)
mmcif_template = """data_demo
#
loop_
_atom_site.group_PDB
_atom_site.id
_atom_site.type_symbol
_atom_site.label_atom_id
_atom_site.label_comp_id
_atom_site.label_asym_id
ATOM 1 N N ASN A
"""
# --------------------------------
# assemble request content
# --------------------------------
data = {
    "sequence": sequence,
    "input_id": "example_protein_001",  # Optional: unique identifier for request tracking
    "selected_models": [1, 2],
    "alignments": {
        "uniref90": {
            "a3m": {
                "alignment": uniref90_alignment_in_a3m_trunc10,
                "format": "a3m",
            }
        },
        "small_bfd": {
            "a3m": {
                "alignment": small_bfd_alignment_in_a3m,
                "format": "a3m",
            }
        },
    },
    "use_templates": True,  # Python boolean; json.dumps serializes it as true
    "explicit_templates": [
        {
            "structure": mmcif_template,
            "format": "mmcif",
            "name": "demo_template",
            "source": "user_provided",
        }
    ],
}
# --------------------------------
# post-to-server
# --------------------------------
response = requests.post(
    url=url,
    data=json.dumps(data),
    headers=headers,
    timeout=300,
)
# Check if the request was successful
if response.ok:
    print("Request succeeded:", response.json())
else:
    print("Request failed:", response.status_code, response.text)
The runtime for structure prediction depends on both the sequence length
and the number of sequences in the multiple-sequence alignment. On an NVIDIA H100 80GB HBM3
device, this example should complete in under 30 seconds.
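Since runtime grows with the number of sequences in the alignment, one way to shorten a run is to cap the MSA depth before sending it. The helper below is a hypothetical sketch, not something the NIM provides. It assumes the query is the first record of the a3m text, as in the example above, and keeps only the first max_sequences records. Shallower alignments may reduce prediction accuracy, and this page does not quantify that trade-off.

```python
def truncate_a3m(a3m_text, max_sequences):
    """Keep the first max_sequences records of an a3m string (query included)."""
    kept, count = [], 0
    for line in a3m_text.splitlines():
        if line.startswith(">"):
            count += 1
            if count > max_sequences:
                break
        # Lines before the first header are dropped.
        if count > 0:
            kept.append(line)
    return "\n".join(kept) + "\n" if kept else ""
```

For example, truncate_a3m(uniref90_alignment, 11) would keep the query plus the first ten hits, where uniref90_alignment is a placeholder name for an a3m string you have loaded.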