Example Requests#
OpenFold2 NIM provides the following endpoint:
biology/openfold/openfold2/predict-structure-from-msa-and-template: Perform structure prediction from an input MSA and templates.
Usage#
Below are working examples of requests that should run when the NIM is correctly configured.
Note: We do not recommend submitting requests to this endpoint with curl, because the inputs contain characters that require careful escaping in bash. We recommend interacting with this endpoint using the Python requests module.
Predict Structure#
The endpoint accepts requests whose data is formatted as described in the OpenAPI Specification.
Here is an example request using the Python requests module.
- The field sequence is required.
- The field input_id is optional but recommended. It serves as a unique identifier for tracking requests and correlating responses. Use formats like "7WBN_A" (PDB ID + chain) or custom names like "my_protein_001".
- The field alignments is optional, but omitting it can result in inaccurate structures. If included, the MSA content must be in a3m format.
- The field selected_models is optional, but it is included here to reduce runtime, since the default specifies 5 models.

Use one of the following options if you want to include templates:
- Set use_templates=True, and populate the field templates with the content of one or more hhr files, which contain HHSearch results. With these settings, template structures are read from the database of structure files contained in the NGC model artifact. Template features are not used if use_templates=False.
- Set use_templates=True, and populate the field explicit_templates with the content of one or more structure files in mmCIF format. Template features are not used if use_templates=False.
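The field rules above can be sketched as a small helper that assembles the request body. This is a minimal illustration, not part of the NIM: the function name build_request is hypothetical, and the keys of each explicit_templates entry are assumed to follow the example request in this section. It sets use_templates explicitly so the behavior does not depend on a server default.

```python
from typing import Any, Dict, List, Optional


def build_request(
    sequence: str,
    input_id: Optional[str] = None,
    alignments: Optional[Dict[str, Any]] = None,
    selected_models: Optional[List[int]] = None,
    explicit_templates: Optional[List[Dict[str, str]]] = None,
) -> Dict[str, Any]:
    """Assemble a request body, adding only the optional fields that were given.

    Hypothetical helper: field names come from this page, the function does not.
    """
    if not sequence:
        raise ValueError("sequence is required")

    payload: Dict[str, Any] = {"sequence": sequence}
    if input_id is not None:
        payload["input_id"] = input_id
    if alignments is not None:
        payload["alignments"] = alignments
    if selected_models is not None:
        payload["selected_models"] = selected_models

    # Template features are only used when use_templates is True.
    payload["use_templates"] = bool(explicit_templates)
    if explicit_templates:
        payload["explicit_templates"] = explicit_templates
    return payload


payload = build_request(
    "GGSKENEISHH",               # shortened sequence, for illustration only
    input_id="my_protein_001",
    selected_models=[1, 2],
)
```

The helper covers only the explicit_templates option. Supplying hhr content through the templates field would need a similar branch whose exact shape should be taken from the OpenAPI specification.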
The following workflow is recommended when predicting structures for a given protein sequence.
1. Use other tools to produce multiple sequence alignments, and put them in the a3m format.
2. Use other tools, and potentially the MSA results from step 1, to generate one of the following:
   - The template search results in an hhr file
   - The structural templates in mmCIF format that you can supply using the explicit_templates field
3. Load the a3m files and the hhr or mmCIF files in a Python script, and format the request as specified in the OpenAPI specification, included at the end of this documentation.
import os
import requests
import json
# --------------------------------
# parameters
# --------------------------------
url = "http://localhost:8000/biology/openfold/openfold2/predict-structure-from-msa-and-template"
headers = {"Content-Type": "application/json"}
sequence = (
    "GGSKENEISHHAKEIERLQKEIERHKQSIKKLKQSEQSNPPPNP"
    "EGTRQARRNRRRRWRERQRQKENEISHHAKEIERLQKEIERHKQSIKKLKQSEC"
)
uniref90_alignment_in_a3m_trunc10=\
""">BQXYMDHSRWGGVPIWVK
GGSKENEISHHAKEIERLQKEIERHKQSIKKLKQSEQSNPPPNPEGTRQARRNRRRRWRERQRQKENEISHHAKEIERLQKEIERHKQSIKKLKQSEC
>UniRef90_A0A221IUG4
--------------------------QTVKLVKRLYQSNPPPNPEGTRQARRNRRRRWRERQRQ----------------------------------
>UniRef90_A7KWE0
---------------------------TVRLIKQLYQSNPPPNPEGTRQARRNRRRRWRERQRQ----------------------------------
>UniRef90_A0A2I6UE91
---------------------------TVKLIKEIYQSNPPPNPEGTRQARRNRRRRWRERQRQ----------------------------------
>UniRef90_D6NY33
---------------------------AVRLIKQIYQSNPPPNPEGTRQARRNRRRRWRERQRQ----------------------------------
>UniRef90_A0A221IUJ5
--------------------------QTVKLIKRLYQSNPPPNPEGTRQARRNRRRRWREKQRQ----------------------------------
>UniRef90_D6NYR3
---------------------------TVRLVKQLYQSNPPPNPEGTRQARRNRRRRWRERQRQ----------------------------------
>UniRef90_I6Y2C4
---------------------------TVRLIKRIYQSNPPPNPEGTRQARRNRRRRWRERQRQIQN-------------------------------
>UniRef90_A0A161CVP3
--------------------------QTIRLIKLLYQSNPPPNPEGTRQARRNRRRRWRERQRQ----------------------------------
>UniRef90_Q6EFX9
--------------------------QTVRLIKLLYQSNPPPNPEGTRQARRNRRRRWRERQRQ----------------------------------
>UniRef90_A0A2I6UAR5
--------------------------ETVKIIKYLYQSNPPPNPEGTRQARRNRRRRWRERQRQ----------------------------------
"""
small_bfd_alignment_in_a3m = \
""">BQXYMDHSRWGGVPIWVK
GGSKENEISHHAKEIERLQKEIERHKQSIKKLKQSEQSNPPPNPEGTRQARRNRRRRWRERQRQKENEISHHAKEIERLQKEIERHKQSIKKLKQSEC
>A0A076V4A1_9HIV1
------------------------------------QSNPPPNHEGTRQARRNRRRRWRERQRQ----------------------------------
"""
# Example mmCIF template string (very small, minimal but valid)
mmcif_template = """data_demo
#
loop_
_atom_site.group_PDB
_atom_site.id
_atom_site.type_symbol
_atom_site.label_atom_id
_atom_site.label_comp_id
_atom_site.label_asym_id
ATOM 1 N N ASN A
"""
# --------------------------------
# assemble request content
# --------------------------------
data = {
    "sequence": sequence,
    "input_id": "example_protein_001",  # Optional: unique identifier for request tracking
    "selected_models": [1, 2],  # Run 2 models instead of the default 5 to reduce runtime
    "alignments": {
        "uniref90": {
            "a3m": {
                "alignment": uniref90_alignment_in_a3m_trunc10,
                "format": "a3m",
            }
        },
        "small_bfd": {
            "a3m": {
                "alignment": small_bfd_alignment_in_a3m,
                "format": "a3m",
            }
        },
    },
    "use_templates": True,  # Python boolean; json.dumps serializes it as JSON true
    "explicit_templates": [
        {
            "structure": mmcif_template,
            "format": "mmcif",
            "name": "demo_template",
            "source": "user_provided",
        }
    ],
}
# --------------------------------
# post-to-server
# --------------------------------
response = requests.post(
    url=url,
    data=json.dumps(data),
    headers=headers,
    timeout=300,
)
# Check if the request was successful
if response.ok:
    print("Request succeeded:", response.json())
else:
    print("Request failed:", response.status_code, response.text)
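The example above embeds the alignments as string literals, whereas the recommended workflow loads them from files. The following is a minimal sketch of that loading step. The helper name load_alignments and the file name uniref90_hits.a3m are hypothetical, and the nested structure is assumed to match the alignments field of the example request. The demo writes a throwaway file so that the snippet runs on its own.

```python
import tempfile
from pathlib import Path
from typing import Any, Dict


def load_alignments(a3m_paths: Dict[str, str]) -> Dict[str, Any]:
    """Map database names (e.g. "uniref90") to the nested alignment structure.

    Hypothetical helper; the output mirrors the "alignments" field used in
    the example request on this page.
    """
    alignments: Dict[str, Any] = {}
    for db_name, path in a3m_paths.items():
        text = Path(path).read_text()
        # Light sanity check: a3m content starts with a ">" header line.
        if not text.lstrip().startswith(">"):
            raise ValueError(f"{path} does not look like an a3m file")
        alignments[db_name] = {"a3m": {"alignment": text, "format": "a3m"}}
    return alignments


# Demo with a throwaway file; real use would point at files produced by an MSA tool.
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = Path(tmp_dir) / "uniref90_hits.a3m"  # hypothetical file name
    demo_path.write_text(">query\nGGSKENEISHH\n")
    alignments = load_alignments({"uniref90": str(demo_path)})
```

The resulting dictionary can be assigned to the alignments key of the request data in place of the inline strings.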
The runtime for structure prediction depends on both the sequence length
and the number of sequences in the multiple sequence alignment. On an NVIDIA H100 80GB HBM3
device, this example should complete in under 30 seconds.
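Because runtime grows with the number of aligned sequences, it can help to count the records in an a3m string and, if needed, keep only the first few. The sketch below is an illustration under stated assumptions: the function names are hypothetical, the query is assumed to be the first record (as in the example alignments above), and the effect of truncation on prediction accuracy is not characterized here.

```python
from typing import List


def split_a3m_records(a3m_text: str) -> List[str]:
    """Split a3m text into records, each a header line plus its sequence lines."""
    records: List[str] = []
    for line in a3m_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        if line.startswith(">"):
            records.append(line)  # a header starts a new record
        elif records:
            records[-1] += "\n" + line  # sequence line belongs to the last header
    return records


def truncate_a3m(a3m_text: str, max_records: int) -> str:
    """Keep the first max_records records; the query is assumed to come first."""
    kept = split_a3m_records(a3m_text)[:max_records]
    return "\n".join(kept) + "\n" if kept else ""


demo = ">query\nACDE\n>hit1\nAC-E\n>hit2\n-CDE\n"
record_count = len(split_a3m_records(demo))
shorter = truncate_a3m(demo, 2)  # keeps the query and the first hit
```

A truncated alignment has less depth, so it trades alignment information for a shorter runtime.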