Getting Started#

Prerequisites#

Before you begin, ensure you have completed the setup described in the Prerequisites and Support Matrix, including:

  • Compatible hardware and software

  • NGC account and API key

  • Docker authentication with NGC

  • Python 3 with the requests module (for running examples)

Starting the NIM Container#

  1. Ensure you have logged in to Docker and set your NGC_API_KEY environment variable as described in the Prerequisites.

  2. Create a local cache directory for the NIM.

export LOCAL_NIM_CACHE=~/.cache/nim
mkdir -p "$LOCAL_NIM_CACHE"
sudo chmod 0777 -R "$LOCAL_NIM_CACHE"

  3. Start the NIM container:

docker run -it --rm \
  --runtime=nvidia \
  --gpus all \
  -e NGC_API_KEY \
  -v "$LOCAL_NIM_CACHE:/opt/nim/.cache" \
  -p 8000:8000 \
  nvcr.io/nim/colabfold/msa-search:2

Note

The first time you start the container, it downloads approximately 1.4 TB of database files to the local cache, so the cache directory needs at least that much free space. The download can take several hours, depending on your internet connection speed. Subsequent starts reuse the cached data.
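Because the initial download is so large, it can help to check free disk space before starting the container. The following is a minimal sketch, not part of the NIM tooling: the helper names `free_bytes` and `has_room` are hypothetical, and it assumes "1.4 TB" means decimal terabytes and that the cache lives at the `LOCAL_NIM_CACHE` path used above.

```python
import os
import shutil
from pathlib import Path

# Assumption: "approximately 1.4 TB" is read as 1.4 * 10**12 bytes (decimal terabytes).
REQUIRED_BYTES = 1_400_000_000_000


def free_bytes(path: str) -> int:
    """Return the free bytes on the filesystem that holds (or will hold) `path`."""
    p = Path(path).expanduser()
    # Walk up to the nearest existing directory so the check also works before mkdir.
    while not p.exists():
        p = p.parent
    return shutil.disk_usage(p).free


def has_room(path: str, required: int = REQUIRED_BYTES) -> bool:
    """Return True if the filesystem holding `path` has at least `required` free bytes."""
    return free_bytes(path) >= required


if __name__ == "__main__":
    cache = os.environ.get("LOCAL_NIM_CACHE", "~/.cache/nim")
    gib = free_bytes(cache) / 2**30
    status = "enough" if has_room(cache) else "NOT enough"
    print(f"{cache}: {gib:.0f} GiB free, {status} for the initial download")
```

The check is only a rough guard: the stated size is approximate, and other processes can consume space while the download runs.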

  4. Confirm that the service is ready to respond to inference requests:

curl http://localhost:8000/v1/health/ready
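Since the first start can take hours, a script may prefer to poll the readiness endpoint rather than check it once by hand. The sketch below uses only the Python standard library. The names `probe_ready` and `wait_until_ready`, the timeout, and the polling interval are illustrative choices, and it assumes that an HTTP 200 response from the endpoint above means the service is ready.

```python
import time
import urllib.request
from typing import Callable

READY_URL = "http://localhost:8000/v1/health/ready"


def probe_ready(url: str = READY_URL) -> bool:
    """Return True if the endpoint answers with HTTP 200 (assumed to mean ready)."""
    try:
        with urllib.request.urlopen(url, timeout=5) as resp:
            return resp.status == 200
    except OSError:
        # Covers connection refused, timeouts, and urllib's URLError/HTTPError
        # (both are OSError subclasses): treat all of them as "not ready yet".
        return False


def wait_until_ready(
    probe: Callable[[], bool] = probe_ready,
    timeout_s: float = 6 * 3600,  # illustrative upper bound, not a documented value
    interval_s: float = 30.0,
) -> bool:
    """Call `probe` until it returns True or `timeout_s` elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        if probe():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)
```

Calling `wait_until_ready()` with no arguments polls the local endpoint every 30 seconds. Passing a custom `probe` makes the loop easy to exercise without a running container.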

MSA Search Examples#

The following examples demonstrate how to use the MSA Search NIM to perform multiple sequence alignment on protein sequences.

For comprehensive descriptions of all available API parameters, see the API Reference. For information on assessing model performance, see Performance.

Python Client Example#

The following example shows how to search for similar sequences and generate a multiple sequence alignment.

  1. Save the following Python example to a file named nim_client.py:

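#!/usr/bin/env python3
"""Query the MSA Search NIM and preview the returned alignments."""

from textwrap import indent

import requests

url = "http://localhost:8000/biology/colabfold/msa-search/predict"

# Submit one protein sequence and request the alignment in two formats.
response = requests.post(
    url,
    json={
        "sequence": "SGSMKTAISLPDETFDRVSRRASELGMSRSEFFTKAAQR",
        "e_value": 0.0001,
        "iterations": 1,
        "output_alignment_formats": ["a3m", "fasta"],
    },
)
# Stop on an HTTP error instead of trying to parse an error body.
response.raise_for_status()

print(response.text[:100], "...")
result = response.json()
print("Response keys:", list(result.keys()), "\n")
print("Alignments by database:\n")
# Alignments are grouped by database, then by output format.
for db, formats in result["alignments"].items():
    print(" ", db)
    for fmt, alignment_obj in formats.items():
        aln = alignment_obj["alignment"]
        print("   ", alignment_obj["format"], "lines:", len(aln.split("\n")))
        # Preview the first 300 characters of the alignment.
        print(indent(aln[:300] + "...", "        | "))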

  2. Execute the example:

chmod +x nim_client.py
./nim_client.py

For each database, the example prints the alignment format, the number of lines in the alignment, and a preview of the alignment content.
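A line count says little about alignment depth, so a sequence count per database can be a useful follow-up. The sketch below assumes the response layout used by the client above (`alignments` keyed by database, then by format, each entry holding `format` and `alignment`). The helper names and the `example_db` key in the mock payload are hypothetical. It relies on the fact that both A3M and FASTA start every record with a `>` header line.

```python
from typing import List, Tuple


def count_sequences(alignment: str) -> int:
    """Count records in A3M or FASTA text; each record starts with a '>' header."""
    return sum(1 for line in alignment.splitlines() if line.startswith(">"))


def summarize(response: dict) -> List[Tuple[str, str, int]]:
    """Return (database, format, sequence count) for every alignment in the response."""
    rows = []
    for db, formats in response["alignments"].items():
        for alignment_obj in formats.values():
            rows.append(
                (db, alignment_obj["format"], count_sequences(alignment_obj["alignment"]))
            )
    return rows


if __name__ == "__main__":
    # Mock payload for illustration only; "example_db" is not a real database name.
    mock = {
        "alignments": {
            "example_db": {
                "a3m": {"format": "a3m", "alignment": ">query\nSGSM\n>hit1\nSG-M\n"},
            }
        }
    }
    for db, fmt, n in summarize(mock):
        print(f"{db} [{fmt}]: {n} sequences")
```

In the client script, `summarize(result)` could be called on the parsed JSON in place of the mock payload.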

Shell Client Example#

You can also use curl to send requests directly:

curl http://localhost:8000/biology/colabfold/msa-search/predict \
  -H "Content-Type: application/json" \
  -d '{
    "sequence": "SGSMKTAISLPDETFDRVSRRASELGMSRSEFFTKAAQR",
    "e_value": 0.0001,
    "iterations": 1,
    "output_alignment_formats": ["a3m", "fasta"]
  }' | sed 's/\\n/\n/g' | head -n 25 && echo "...trimmed..."

Next Steps#

  • Performance - View benchmarking results and learn how to run your own performance tests

  • Configuration - Configure environment variables, GPU selection, and volume mounting

  • Optimization and Scaling - Learn about scaling strategies for production deployments

  • API Reference - Reference the comprehensive API documentation with all available parameters