Boltz-2 NIM API Reference#

This documentation contains the API reference for the Boltz-2 NIM.

OpenAPI Specification#

You can download the complete OpenAPI Specification for the Boltz-2 NIM here.

Predict a Protein Structure#

Endpoint path: /biology/mit/boltz2/predict

Request type: POST

Input Parameters#

  • polymers (list[Polymer]): Required. A list of polymers (DNA, RNA, or protein). Minimum 1, maximum 5 polymers allowed. Each polymer contains:

    • id (string): Optional. Unique identifier for the polymer chain. Can be either a single letter (A-Z) or a PDB-style ID (4 alphanumeric characters).

    • molecule_type (string): Required. Type of molecule. One of “dna”, “rna”, or “protein”.

    • sequence (string): Required. The sequence of the polymer. Must be 1-4,096 characters and contain only valid characters for the molecule type.

    • cyclic (boolean): Optional (default: false). Whether the polymer forms a cyclic structure.

    • msa (dictionary): Optional. Multiple Sequence Alignments for protein molecules only. This nested dictionary contains a mapping of database names to alignment formats and records (for example {"database_1" : {"A3M" : {"format" : "A3M", "alignment" : ">1\nMVNIDIAIAMAI", "rank" : 0}}}).

    • modifications (list[Modification]): Optional. Chemical modifications to specific residues in the sequence. Each modification contains:

      • ccd (string): Required. Chemical Component Dictionary (CCD) ID of the modification (1-3 alphanumeric characters).

      • position (int): Required. The 1-based index of the residue to modify.

  • ligands (list[Ligand]): Optional (default: empty list). A list of ligands. Maximum 5 ligands allowed. Each ligand contains:

    • id (string): Optional. A chain ID for the ligand.

    • ccd (string): Optional. Chemical Component Dictionary (CCD) code for the ligand (1-3 alphanumeric characters).

    • smiles (string): Optional. SMILES string representation of the ligand. Either CCD or SMILES must be provided, but not both.

  • constraints (list[Union[Pocket, Bond]]): Optional (default: empty list). Constraints to apply to the prediction. Each constraint is one of:

    • Bond Constraint: Specifies atomic bonds between molecules

      • constraint_type: “bond”

      • atoms (list[Atom]): List of atoms involved in the bond. Each atom contains:

        • id (string): Optional. Chain identifier for the atom.

        • residue_index (int): Required. Index of the residue containing the atom.

        • atom_name (string): Required. Name of the specific atom.

    • Pocket Constraint: Defines binding site interactions

      • constraint_type: “pocket”

      • binder (string): Required. The ID of the binding molecule.

      • contacts (list[Contact]): Required. List of contacts defining the pocket. Each contact contains:

        • id (string): Optional. Chain identifier for the contact.

        • residue_index (int): Required. Index of the residue in the binding site.

  • recycling_steps (int): Optional (default: 3). The number of recycling steps to use for prediction. Range: 1-6.

  • sampling_steps (int): Optional (default: 50). The number of sampling steps to use for prediction. Range: 10-1,000.

  • diffusion_samples (int): Optional (default: 1). The number of diffusion samples to use for prediction. Range: 1-5.

  • step_scale (float): Optional (default: 1.638). The step size is related to the temperature at which the diffusion process samples the distribution. Lower values increase diversity among samples (recommended between 1 and 2). Range: 0.5-5.0.

  • without_potentials (boolean): Optional (default: false). When true, returns results computed without potentials.

  • output_format (string): Optional (default: “mmcif”). The output format of the returned structure. Currently only “mmcif” is supported.

  • concatenate_msas (boolean): Optional (default: false). Concatenate Multiple Sequence Alignments for a polymer into one alignment.
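
The documented limits above can be checked on the client before a request is sent. The following is a minimal sketch rather than part of the NIM: the helper name `validate_request` and the `RANGES` table are hypothetical, the bounds are copied from the parameter list above, and they are assumed to be inclusive. The service performs its own validation regardless.

```python
# Hypothetical client-side check; bounds copied from the parameter list above
# and assumed to be inclusive.
RANGES = {
    "recycling_steps": (1, 6),
    "sampling_steps": (10, 1000),
    "diffusion_samples": (1, 5),
    "step_scale": (0.5, 5.0),
}


def validate_request(payload: dict) -> list[str]:
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    if not 1 <= len(payload.get("polymers", [])) <= 5:
        problems.append("polymers: expected 1-5 entries")
    if len(payload.get("ligands", [])) > 5:
        problems.append("ligands: at most 5 entries")
    for name, (low, high) in RANGES.items():
        # Parameters that are absent fall back to the server-side defaults.
        if name in payload and not low <= payload[name] <= high:
            problems.append(f"{name}: {payload[name]} is outside {low}-{high}")
    return problems


if __name__ == "__main__":
    payload = {
        "polymers": [{"id": "A", "molecule_type": "protein", "sequence": "MVNIDIAIAMAI"}],
        "sampling_steps": 5,  # deliberately below the documented minimum of 10
    }
    print(validate_request(payload))
```

Returning a list of messages instead of raising on the first problem lets a caller report every out-of-range value in one pass.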

Outputs#

  • structures (list[Structure]): The predicted protein structures. Each structure contains:

    • structure (string): The contents of a single structural prediction in the specified format.

    • format (string): The format of the structure record (currently “mmcif”).

    • name (string): Optional name for the structure.

    • source (string): Optional source file for the structure.

  • confidence_scores (list[float]): Confidence scores for each predicted structure.

  • metrics (dictionary): Runtime metrics for the request, useful for debugging and measuring performance.
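
When several diffusion samples are requested, one way to choose among the returned structures is to take the one with the highest confidence score. The sketch below assumes that `confidence_scores` lines up one-to-one with `structures` and that a higher score means higher confidence; neither is stated explicitly above. The helper name `best_structure`, the output file name, and the stand-in response are hypothetical.

```python
from pathlib import Path


def best_structure(result: dict) -> tuple[dict, float]:
    """Return the (structure, score) pair with the highest confidence score.

    Assumes scores align one-to-one with structures and that higher is better.
    """
    structures = result.get("structures", [])
    scores = result.get("confidence_scores", [])
    if not structures or len(structures) != len(scores):
        raise ValueError("expected non-empty, equally sized structures and confidence_scores")
    index = max(range(len(scores)), key=scores.__getitem__)
    return structures[index], scores[index]


if __name__ == "__main__":
    # Stand-in for response.json(); the mmCIF strings are placeholders, not real output.
    result = {
        "structures": [
            {"structure": "data_sample_0\n#", "format": "mmcif"},
            {"structure": "data_sample_1\n#", "format": "mmcif"},
        ],
        "confidence_scores": [0.71, 0.84],
    }
    structure, score = best_structure(result)
    Path("prediction.cif").write_text(structure["structure"])
    print(f"Wrote prediction.cif (confidence {score})")
```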

Example#

#!/bin/bash

# Create JSON payload for protein structure prediction
JSON='{
"polymers": [
    {
    "id": "A",
    "molecule_type": "protein",
    "sequence": "MALWMRLLPLLALLALWGPDPAAAFVNQHLCGSHLVEALYLVCGERGFFYTPKTRREAEDLQVGQVELGGGPGAGSLQPLALEGSLQKRGIVEQCCTSICSLYQLENYCN"
    }
],
"recycling_steps": 3,
"sampling_steps": 50,
"diffusion_samples": 1,
"step_scale": 1.638,
"output_format": "mmcif"
}'

# Make request
echo "Making request..."
curl -s -X POST \
-H "Content-Type: application/json" \
-d "$JSON" \
http://localhost:8000/biology/mit/boltz2/predict

import requests
import json


if __name__ == "__main__":
    sequence = "MALWMRLLPLLALLALWGPDPAAAFVNQHLCGSHLVEALYLVCGERGFFYTPKTRREAEDLQVGQVELGGGPGAGSLQPLALEGSLQKRGIVEQCCTSICSLYQLENYCN"  # Replace with your sequence of interest
    headers = {
        "content-type": "application/json"
    }
    data = {
        "polymers": [
            {
                "id": "A",
                "molecule_type": "protein",
                "sequence": sequence
            }
        ],
        "recycling_steps": 3,
        "sampling_steps": 50,
        "diffusion_samples": 1,
        "step_scale": 1.638,
        "output_format": "mmcif"
    }
    print("Making request...")
    response = requests.post("http://localhost:8000/biology/mit/boltz2/predict", headers=headers, data=json.dumps(data))
    response.raise_for_status()  # Fail early on an HTTP error instead of parsing an error body
    result = response.json()
    print("Structure prediction completed")
    # Access the first predicted structure
    if result.get("structures"):
        structure = result["structures"][0]
        print(f"Structure format: {structure['format']}")
        print(f"Confidence score: {result['confidence_scores'][0]}")
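
The request above does not include an alignment. A sketch of attaching one to a protein polymer follows, using the nested `msa` layout from the Input Parameters section (database name, then format, then the record). The helper name `msa_entry` is hypothetical, and the database name `database_1` and the one-sequence A3M text are taken from the example in the parameter description rather than from a real search.

```python
import json


def msa_entry(database: str, a3m_text: str, rank: int = 0) -> dict:
    """Wrap A3M text in the nested database -> format -> record layout."""
    return {database: {"A3M": {"format": "A3M", "alignment": a3m_text, "rank": rank}}}


if __name__ == "__main__":
    sequence = "MVNIDIAIAMAI"
    polymer = {
        "id": "A",
        "molecule_type": "protein",  # the msa field applies to protein polymers only
        "sequence": sequence,
        "msa": msa_entry("database_1", f">1\n{sequence}"),
    }
    # Print the request body that would be sent to the predict endpoint.
    print(json.dumps({"polymers": [polymer]}, indent=2))
```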

Predict a Protein-Ligand Complex#

Example With Ligand#

#!/bin/bash

# Create JSON payload for protein-ligand complex prediction
JSON='{
"polymers": [
    {
    "id": "A",
    "molecule_type": "protein",
    "sequence": "MALWMRLLPLLALLALWGPDPAAAFVNQHLCGSHLVEALYLVCGERGFFYTPKTRREAEDLQVGQVELGGGPGAGSLQPLALEGSLQKRGIVEQCCTSICSLYQLENYCN"
    }
],
"ligands": [
    {
    "id": "B1",
    "smiles": "CC(=O)OC1=CC=CC=C1C(=O)O"
    }
],
"recycling_steps": 3,
"sampling_steps": 50,
"output_format": "mmcif"
}'

# Make request
echo "Making request..."
curl -s -X POST \
-H "Content-Type: application/json" \
-d "$JSON" \
http://localhost:8000/biology/mit/boltz2/predict

import requests
import json


if __name__ == "__main__":
    sequence = "MALWMRLLPLLALLALWGPDPAAAFVNQHLCGSHLVEALYLVCGERGFFYTPKTRREAEDLQVGQVELGGGPGAGSLQPLALEGSLQKRGIVEQCCTSICSLYQLENYCN"
    headers = {
        "content-type": "application/json"
    }
    data = {
        "polymers": [
            {
                "id": "A",
                "molecule_type": "protein",
                "sequence": sequence
            }
        ],
        "ligands": [
            {
                "id": "B1",
                "smiles": "CC(=O)OC1=CC=CC=C1C(=O)O"  # Aspirin
            }
        ],
        "recycling_steps": 3,
        "sampling_steps": 50,
        "output_format": "mmcif"
    }
    print("Making request...")
    response = requests.post("http://localhost:8000/biology/mit/boltz2/predict", headers=headers, data=json.dumps(data))
    response.raise_for_status()  # Fail early on an HTTP error instead of parsing an error body
    result = response.json()
    print("Protein-ligand complex prediction completed")
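
Each ligand must carry either a CCD code or a SMILES string, but not both. A small builder can enforce that rule before the payload is assembled. This is a sketch only; the helper name `make_ligand` is hypothetical and not part of any NIM client library.

```python
from typing import Optional


def make_ligand(ligand_id: str, *, ccd: Optional[str] = None, smiles: Optional[str] = None) -> dict:
    """Build a ligand entry, enforcing that exactly one of ccd or smiles is set."""
    if (ccd is None) == (smiles is None):
        raise ValueError("provide exactly one of ccd or smiles")
    ligand = {"id": ligand_id}
    if ccd is not None:
        ligand["ccd"] = ccd
    else:
        ligand["smiles"] = smiles
    return ligand


if __name__ == "__main__":
    print(make_ligand("L1", smiles="CC(=O)OC1=CC=CC=C1C(=O)O"))  # aspirin, as in the example above
    print(make_ligand("L2", ccd="ATP"))
```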

Comprehensive Example with All Features#

Example Showing All Possible Input Parameters#

#!/bin/bash

# Comprehensive example with all possible fields
JSON='{
"polymers": [
    {
    "id": "A",
    "molecule_type": "protein",
    "sequence": "MALWMRLLPLLALLALWGPDPAAAFVNQHLCGSHLVEALYLVCGERGFFYTPKTRREAEDLQVGQVELGGGPGAGSLQPLALEGSLQKRGIVEQCCTSICSLYQLENYCN",
    "cyclic": false,
    "modifications": [
        {
        "ccd": "SEP",
        "position": 15
        }
    ]
    }
],
"ligands": [
    {
    "id": "L1",
    "smiles": "CC(=O)OC1=CC=CC=C1C(=O)O"
    }
],
"constraints": [
    {
    "constraint_type": "pocket",
    "binder": "L1",
    "contacts": [
        {
        "id": "A",
        "residue_index": 25
        }
    ]
    }
],
"recycling_steps": 3,
"sampling_steps": 50,
"diffusion_samples": 1,
"step_scale": 1.638,
"without_potentials": false,
"output_format": "mmcif",
"concatenate_msas": false
}'

echo "Making comprehensive prediction request..."
curl -s -X POST \
-H "Content-Type: application/json" \
-d "$JSON" \
http://localhost:8000/biology/mit/boltz2/predict | jq '.'

import requests
import json


if __name__ == "__main__":
    # Comprehensive example with all possible fields
    headers = {
        "content-type": "application/json"
    }
    data = {
        # Required: At least one polymer
        "polymers": [
            {
                "id": "A",                          # Chain identifier
                "molecule_type": "protein",         # "dna", "rna", or "protein"
                "sequence": "MALWMRLLPLLALLALWGPDPAAAFVNQHLCGSHLVEALYLVCGERGFFYTPKTRREAEDLQVGQVELGGGPGAGSLQPLALEGSLQKRGIVEQCCTSICSLYQLENYCN",
                "cyclic": False,                    # Whether polymer is cyclic
                "modifications": [                  # Chemical modifications
                    {
                        "ccd": "SEP",               # CCD code for modification
                        "position": 15              # 1-based position to modify
                    }
                ]
                # Note: msa field would go here for proteins with MSA data
            },
            {
                "id": "B",
                "molecule_type": "dna",
                "sequence": "ATCGATCGATCG",
                "cyclic": False
            }
        ],

        # Optional: Ligands
        "ligands": [
            {
                "id": "L1",
                "smiles": "CC(=O)OC1=CC=CC=C1C(=O)O"  # Using SMILES
            },
            {
                "id": "L2",
                "ccd": "ATP"                        # Using CCD code instead
            }
        ],

        # Optional: Constraints
        "constraints": [
            {
                "constraint_type": "pocket",
                "binder": "L1",
                "contacts": [
                    {
                        "id": "A",
                        "residue_index": 25
                    }
                ]
            }
        ],

        # Optional: Prediction parameters (all with defaults shown)
        "recycling_steps": 3,                      # 1-6, controls model iterations
        "sampling_steps": 50,                      # 10-1000, diffusion sampling steps
        "diffusion_samples": 1,                    # 1-5, number of samples to generate
        "step_scale": 1.638,                       # 0.5-5.0, controls sampling temperature
        "without_potentials": False,               # Set to True to get results without potentials
        "output_format": "mmcif",                  # Output format (currently only mmcif)
        "concatenate_msas": False                  # Whether to concatenate MSAs
    }

    print("Making comprehensive prediction request...")
    response = requests.post("http://localhost:8000/biology/mit/boltz2/predict",
                           headers=headers, data=json.dumps(data))

    if response.status_code == 200:
        result = response.json()
        print("Prediction completed successfully!")
        print(f"Number of structures returned: {len(result['structures'])}")
        print(f"Confidence scores: {result['confidence_scores']}")
        print(f"Runtime metrics: {result.get('metrics', {})}")

        # Access first structure
        if result['structures']:
            structure = result['structures'][0]
            print(f"Structure format: {structure['format']}")
            print(f"Structure length: {len(structure['structure'])} characters")
    else:
        print(f"Request failed with status {response.status_code}")
        print(response.text)
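
The comprehensive example uses a pocket constraint only. A bond constraint follows the field layout listed under Input Parameters: a `constraint_type` of "bond" and a list of atoms, each with a chain `id`, a `residue_index`, and an `atom_name`. The sketch below assumes a bond is described by exactly two atoms. The helper name `bond_constraint` is hypothetical, and the residue indices and atom names are placeholders for illustration, not a validated bond for the sequences above.

```python
import json


def bond_constraint(atom_a: tuple[str, int, str], atom_b: tuple[str, int, str]) -> dict:
    """Build a bond constraint from two (chain_id, residue_index, atom_name) tuples."""
    return {
        "constraint_type": "bond",
        "atoms": [
            {"id": chain, "residue_index": index, "atom_name": name}
            for chain, index, name in (atom_a, atom_b)
        ],
    }


if __name__ == "__main__":
    # Placeholder atoms: substitute real chain IDs, residue indices, and atom names.
    constraint = bond_constraint(("A", 31, "SG"), ("L1", 1, "C1"))
    print(json.dumps({"constraints": [constraint]}, indent=2))
```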

Check Readiness#

Endpoint path: /v1/health/ready

Request type: GET

Input Parameters#

None.

Outputs#

The endpoint returns a JSON response with a value that indicates whether the microservice is ready. When the NIM is ready, the HTTP status code of the response is 200.

Example#

#!/bin/bash
URL=${NIM_URL:-"http://localhost:8000/v1/health/ready"}
curl -s -w "\nStatus code: %{http_code}\n" -H "Content-Type: application/json" "$URL"

import requests
import os

if __name__ == "__main__":
    url = os.environ.get("NIM_URL", "http://localhost:8000/v1/health/ready")
    headers = {
        "content-type": "application/json"
    }
    try:
        response = requests.get(url, headers=headers)
        print(f"NIM readiness check returned {response.status_code}")
        assert response.status_code == 200, f"Unexpected status code: {response.status_code}"
    except Exception as e:
        print(f"Health query failed: {e}")
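
A single readiness check can fail while the service is still starting, so a script may prefer to poll until the endpoint returns 200. The following is a minimal sketch: the helper names `is_ready` and `wait_until_ready`, the attempt count, and the delay are hypothetical choices, and it uses the standard library's `urllib` instead of `requests` so that it runs without extra packages.

```python
import time
import urllib.request
from typing import Callable


def is_ready(url: str = "http://localhost:8000/v1/health/ready") -> bool:
    """Return True when the readiness endpoint answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=5) as response:
            return response.status == 200
    except OSError:  # URLError, HTTPError, and socket timeouts all derive from OSError
        return False


def wait_until_ready(check: Callable[[], bool] = is_ready, attempts: int = 30, delay_s: float = 10.0) -> bool:
    """Call check() up to `attempts` times, sleeping delay_s between failed tries."""
    for attempt in range(attempts):
        if check():
            return True
        if attempt < attempts - 1:
            time.sleep(delay_s)
    return False


if __name__ == "__main__":
    print("NIM ready" if wait_until_ready() else "NIM did not become ready in time")
```

Passing the check as a parameter keeps the retry loop independent of the HTTP client, so it can be exercised without a running service.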