Scalable Molecular Docking (Latest)

Advanced Usage

This example creates a simple Bash script that runs inference on two local input files, then dumps the generated poses into the output folder.

  1. Create a new blank file in the same folder, name it diffdock.sh, and copy the content below into it.

#!/bin/bash
# Script: diffdock.sh - Run inference using local files as input
# Usage: ./diffdock.sh [receptor].pdb [ligand].sdf

protein_file=$1
ligand_file=$2

protein_bytes=`grep -E ^ATOM $protein_file | sed -z 's/\n/\\\n/g'`
ligand_bytes=`sed -z 's/\n/\\\n/g' $ligand_file`
ligand_format=`basename $ligand_file | awk -F. '{print $NF}'`

echo "{
    \"ligand\": \"${ligand_bytes}\",
    \"ligand_file_type\": \"${ligand_format}\",
    \"protein\": \"${protein_bytes}\",
    \"num_poses\": 10,
    \"time_divisions\": 20,
    \"steps\": 18,
    \"save_trajectory\": false,
    \"is_staged\": false
}" > diffdock.json

curl --header "Content-Type: application/json" \
    --request POST \
    --data @diffdock.json \
    --output output.json \
    http://localhost:8000/molecular-docking/diffdock/generate


  2. Make the script executable.

chmod +x diffdock.sh


  3. Download the input files from the RCSB database and launch the inference.

curl -o 8G43.pdb https://files.rcsb.org/download/8G43.pdb
curl -o ZU6.sdf https://files.rcsb.org/ligands/download/ZU6_ideal.sdf
./diffdock.sh 8G43.pdb ZU6.sdf


  4. Dump the output using the Python script created in Getting Started.

python3 dump_output.py
ls output


  5. Example output:

rank01_confidence_0.57.sdf  rank06_confidence_-0.25.sdf
rank02_confidence_0.57.sdf  rank07_confidence_-0.69.sdf
rank03_confidence_0.55.sdf  rank08_confidence_-1.31.sdf
rank04_confidence_0.41.sdf  rank09_confidence_-1.90.sdf
rank05_confidence_0.38.sdf  rank10_confidence_-2.07.sdf
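
The request body that diffdock.sh assembles with sed and echo can also be built with a JSON library, which takes care of escaping newlines and quotes. The following is a minimal Python sketch under that assumption, not part of the NIM itself. The function name `build_payload` is hypothetical; the field names and default values are copied from the script above.

```python
import json


def build_payload(protein_text, ligand_text, ligand_file_type,
                  num_poses=10, time_divisions=20, steps=18):
    """Build the request body used by diffdock.sh as a Python dict.

    Only ATOM records of the PDB text are kept, mirroring `grep -E ^ATOM`.
    """
    atom_lines = [line for line in protein_text.splitlines()
                  if line.startswith("ATOM")]
    return {
        "ligand": ligand_text,
        "ligand_file_type": ligand_file_type,
        "protein": "".join(line + "\n" for line in atom_lines),
        "num_poses": num_poses,
        "time_divisions": time_divisions,
        "steps": steps,
        "save_trajectory": False,
        "is_staged": False,
    }


# Tiny illustrative input (not a real structure).
pdb_text = "HEADER    TEST\nATOM      1  N   MET A   1\nHETATM    2  O   HOH A   2\n"
payload = build_payload(pdb_text, "CCO", "txt")
body = json.dumps(payload)  # newlines are escaped as \n by the JSON encoder
print(body)
```

Writing `body` to diffdock.json and sending it with the curl command from the script should be equivalent to running the script, provided the input files contain nothing the sed-based escaping would handle differently.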


DiffDock NIM supports a batch-docking mode, which docks a group of ligand molecules against the same protein receptor in a single inference request when a multi-molecule SDF file is submitted. This is much more efficient than running multiple inference requests one by one. The example below runs batch-docking using a protein PDB file and five ligand SDF files downloaded from RCSB.
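
Since batch-docking depends on the submitted SDF containing several molecules, it can help to check the molecule count before sending a request. In the SDF format, each molecule record ends with a line containing `$$$$`. A minimal sketch of such a check follows; the helper name `count_sdf_molecules` is hypothetical.

```python
def count_sdf_molecules(sdf_text):
    """Count molecule records in SDF text by their `$$$$` terminator lines."""
    return sum(1 for line in sdf_text.splitlines() if line.strip() == "$$$$")


# Two placeholder records; real records contain a full molblock.
example = "mol_a\n...\nM  END\n$$$$\nmol_b\n...\nM  END\n$$$$\n"
print(count_sdf_molecules(example))  # prints 2
```

Applied to a combined file, the count is expected to match the number of ligand files that were concatenated, assuming each downloaded file ends with its own `$$$$` line.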

  1. Prepare the SDF input file with multiple ligand molecules. Create a new blank file, name it make-multiligand.sh, and copy the content below into it.

#!/bin/bash
# Script: make-multiligand.sh
# Usage: ./make-multiligand.sh [Ligand1_CCD_ID] [Ligand2_CCD_ID] ...
# Example: ./make-multiligand.sh COM Q4H QPK R4W SIN

ligand_files=""
for lig in "$@"
do
    ligand_file=${lig}.sdf
    echo "Download ligand file:${ligand_file}"
    curl -o $ligand_file "https://files.rcsb.org/ligands/download/${lig}_ideal.sdf"
    # Separate file names with a space so that cat receives one argument per file
    ligand_files="${ligand_files} ${ligand_file}"
done

# Combine ligand files into a single SDF file
cat $ligand_files > multi_ligands.sdf


  2. Run the commands below to generate multi_ligands.sdf as input.

chmod +x make-multiligand.sh
./make-multiligand.sh COM Q4H QPK R4W SIN


  3. Download the protein PDB file and launch the inference.

curl -o 7RWO.pdb "https://files.rcsb.org/download/7RWO.pdb"
./diffdock.sh 7RWO.pdb multi_ligands.sdf


  4. Dump the result. An example of the output is shown below.

python3 dump_output.py
ls output/*

diffdock-output/ligand0:
rank01_confidence_-0.74.sdf  rank05_confidence_-1.15.sdf  rank09_confidence_-1.55.sdf
rank02_confidence_-0.92.sdf  rank06_confidence_-1.25.sdf  rank10_confidence_-1.93.sdf
rank03_confidence_-0.93.sdf  rank07_confidence_-1.46.sdf
rank04_confidence_-1.04.sdf  rank08_confidence_-1.46.sdf

diffdock-output/ligand1:
rank01_confidence_-0.25.sdf  rank05_confidence_-0.55.sdf  rank09_confidence_-0.72.sdf
rank02_confidence_-0.28.sdf  rank06_confidence_-0.55.sdf  rank10_confidence_-0.77.sdf
rank03_confidence_-0.34.sdf  rank07_confidence_-0.56.sdf
rank04_confidence_-0.49.sdf  rank08_confidence_-0.57.sdf
...
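
The file names in the listing above encode each pose's rank and confidence score, so poses can be sorted or filtered without opening the SDF files. The sketch below assumes the naming pattern `rankNN_confidence_X.sdf` seen in the example output; the helper name `parse_pose_name` is hypothetical.

```python
import re

# Pattern inferred from the example listing, e.g. rank01_confidence_-0.74.sdf
POSE_NAME = re.compile(r"^rank(\d+)_confidence_(-?\d+(?:\.\d+)?)\.sdf$")


def parse_pose_name(filename):
    """Return (rank, confidence) from a pose file name, or None if it does not match."""
    match = POSE_NAME.match(filename)
    if match is None:
        return None
    return int(match.group(1)), float(match.group(2))


names = ["rank02_confidence_-0.92.sdf", "rank01_confidence_-0.74.sdf"]
poses = sorted(parse_pose_name(n) for n in names)
print(poses)  # [(1, -0.74), (2, -0.92)]
```

On real output, the names could come from `os.listdir` on each per-ligand directory, which would allow picking the top-ranked pose for every ligand in a batch.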


Besides the SDF format for ligand molecules, DiffDock also supports SMILES text strings as input. DiffDock uses RDKit to generate random molecular conformers from the SMILES information. To conduct batch-docking, use a plain text file as the ligand input, with one SMILES string per line, each representing a molecule.

  1. Create a new blank file, name it ligands.txt, and copy the content below into it.

Cc1cc(F)c(NC(=O)NCCC(C)(C)C)cc1Nc1ccc2ncn(C)c(=O)c2c1F
COc1cccc(NC(=O)c2ccc(C)c(Nc3nc(-c4cccnc4)nc4c3cnn4C)c2)c1
Cc1nn(C)c(C)c1CCOc1cc(F)ccc1-c1ccc2n[nH]c(CN(C)C)c2c1
Cc1c(C(=O)c2cccc3ccccc23)c2cccc3c2n1[C@H](CN1CCOCC1)CO3


  2. Run the command below to invoke the DiffDock model. The script generates an input JSON file and writes the inference result, in JSON format, to output.json.

./diffdock.sh 8G43.pdb ligands.txt


  3. Dump the result and check the output folder.

$ python3 dump_output.py
$ ls output/*

diffdock-output/ligand0:
rank01_confidence_-0.98.sdf  rank05_confidence_-1.30.sdf  rank09_confidence_-1.77.sdf
rank02_confidence_-1.00.sdf  rank06_confidence_-1.36.sdf  rank10_confidence_-2.27.sdf
rank03_confidence_-1.03.sdf  rank07_confidence_-1.58.sdf
rank04_confidence_-1.21.sdf  rank08_confidence_-1.61.sdf

diffdock-output/ligand1:
rank01_confidence_-0.15.sdf  rank05_confidence_-1.25.sdf  rank09_confidence_-1.55.sdf
rank02_confidence_-0.54.sdf  rank06_confidence_-1.29.sdf  rank10_confidence_-1.66.sdf
rank03_confidence_-0.91.sdf  rank07_confidence_-1.38.sdf
rank04_confidence_-1.03.sdf  rank08_confidence_-1.39.sdf
...
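
Because diffdock.sh derives `ligand_file_type` from the file extension, a ligands.txt input is sent with the type `txt` and one SMILES string per line. The sketch below shows one way to prepare that ligand field in Python. The helper name `read_smiles_lines` is hypothetical, and dropping blank lines is a conservative assumption, since this page does not state how the service treats them.

```python
def read_smiles_lines(text):
    """Return the non-empty, whitespace-stripped SMILES lines from a text block."""
    return [line.strip() for line in text.splitlines() if line.strip()]


# Two simple molecules with a stray blank line between them.
text = "CCO\n\nc1ccccc1\n"
smiles = read_smiles_lines(text)
ligand_field = "\n".join(smiles)  # value for the "ligand" key of the request
print(len(smiles), repr(ligand_field))
```

The number of entries in `smiles` indicates how many `ligandN` folders to expect after dumping the output, assuming every SMILES string is docked successfully.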


Last updated on Jul 25, 2024.