RFdiffusion (Latest)

Benchmarking

Accuracy benchmarking for RFdiffusion involves generating long sequences. However, even for a fixed scaffolding input, hundreds of configurations of the same length are possible. Therefore, we currently test only whether the sequences generated in the NIM container under fixed random number conditions match those generated by the public version on GitHub.

The sequences generated by the GitHub version have undergone manual verification beforehand. This test assesses NIM’s ability to faithfully reproduce these sequences. It’s important to note that accuracy can vary depending on the GPU microarchitecture. When executed on the same architecture, the outputs match exactly. When evaluated across different GPU architectures, we observe a Root Mean Square Error (RMSE) of less than 0.64 Ångströms between atoms in the reference dataset and the generated dataset.
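The two acceptance criteria described above can be sketched as follows. This is a minimal illustration, not the logic of the packaged benchmark script: the function names `rmse` and `outputs_agree` are hypothetical, and the sketch assumes that atoms are listed in the same order in both structures and that both sets of coordinates already share one reference frame (no superposition is performed).

```python
import math

# Threshold quoted in the text for runs on different GPU architectures.
CROSS_ARCH_RMSE_LIMIT = 0.64  # Ångströms


def rmse(reference, generated):
    """Root mean square error over paired atoms given as (x, y, z) tuples in Å."""
    if len(reference) != len(generated) or not reference:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (x1, y1, z1), (x2, y2, z2) in zip(reference, generated):
        # Squared Euclidean distance between the two positions of one atom.
        total += (x1 - x2) ** 2 + (y1 - y2) ** 2 + (z1 - z2) ** 2
    return math.sqrt(total / len(reference))


def outputs_agree(reference, generated, same_architecture):
    """Exact match on the same architecture, RMSE below the limit otherwise."""
    if same_architecture:
        return reference == generated
    return rmse(reference, generated) < CROSS_ARCH_RMSE_LIMIT


# Toy coordinates (hypothetical, for illustration only).
reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
shifted = [(x + 0.3, y, z) for (x, y, z) in reference]

print(outputs_agree(reference, reference, same_architecture=True))
print(outputs_agree(reference, shifted, same_architecture=False))
```

A uniform 0.3 Å shift passes the cross-architecture check but would fail the exact-match check applied to runs on the same architecture.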

The run time of RFdiffusion depends on several factors, including the number of atoms in the inputs, the number and length of chains, and how many chains need to be generated.

To represent overall performance effectively, we measure RFdiffusion’s main performance characteristic as the average number of generated amino acids per second per diffusion step. We calculate this by normalizing the measured run time (in milliseconds) by the length of the generated sequence (number of amino acids) and the number of diffusion steps, which gives the time per amino acid per step; the reported throughput is the reciprocal of that value. This provides an intuitive performance metric. For instance, to estimate the time needed to generate a protein binder of a specific length using a certain number of diffusion steps, multiply the protein length by the number of steps and divide the result by the throughput. This calculation gives an estimate of the time needed to generate one protein structure.

Our current measurements show varying performance across different GPU architectures. For the Ampere architecture, we observe up to 127 amino acids generated per second per step. The Ada Lovelace architecture demonstrates improved performance with up to 225 amino acids per second per step. The Hopper architecture shows the highest performance, reaching up to 350 amino acids per second per step.
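As a worked example of this metric, the sketch below converts a measured run time into throughput and uses the peak figures quoted above to estimate generation time. The function names and the example workload (a 100-residue binder with 50 diffusion steps) are illustrative assumptions. Because the quoted figures are "up to" values, the resulting times are best read as optimistic lower bounds.

```python
# Peak throughput from the text, in amino acids per second per diffusion step.
PEAK_THROUGHPUT = {
    "Ampere": 127.0,
    "Ada Lovelace": 225.0,
    "Hopper": 350.0,
}


def throughput(num_residues, num_steps, elapsed_ms):
    """Amino acids per second per step, derived from a measured run time in ms."""
    return num_residues * num_steps / (elapsed_ms / 1000.0)


def estimate_seconds(num_residues, num_steps, residues_per_second_per_step):
    """Estimated time to generate one structure, in seconds."""
    return num_residues * num_steps / residues_per_second_per_step


# Hypothetical workload: 100 residues, 50 diffusion steps.
for architecture, rate in PEAK_THROUGHPUT.items():
    seconds = estimate_seconds(100, 50, rate)
    print(f"{architecture}: about {seconds:.1f} s per structure")
```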

This NIM comes with a simple benchmarking script that can measure both accuracy and performance. It is useful for confirming that the neural network produces the same results on a set of known proteins.

The script is already packaged in the NIM’s Docker image. You can view and study the benchmark using the following command:

docker run --entrypoint cat nvcr.io/nim/ipd/rfdiffusion:1.0.0 /opt/nim/benchmarking.py

To execute the benchmark, follow this sequence:

  1. Make sure the NIM is running as described in the Quickstart Guide.

  2. The benchmark script automatically downloads the test dataset. To save time and bandwidth, we recommend providing a local cache directory so that the script can reuse data that has already been downloaded. Execute the following command to set up the cache directory.

export LOCAL_NIM_CACHE=~/.cache/nim

  3. Execute the benchmark.

docker run -it --net host -v "$LOCAL_NIM_CACHE":/opt/nim/.cache --entrypoint "" \
    nvcr.io/nim/ipd/rfdiffusion:1.0.0 \
    /opt/nim/benchmarking.py --benchmark-type both
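For scripted runs, the same invocation can be assembled as an argument list. The sketch below only builds and prints the command. It uses exactly the flags shown above, and the helper name `build_benchmark_command` is hypothetical. Note that `-it` requests an interactive terminal, so launching this command from a non-interactive job may need that flag adjusted.

```python
import shlex
from pathlib import Path

IMAGE = "nvcr.io/nim/ipd/rfdiffusion:1.0.0"


def build_benchmark_command(cache_dir):
    """Return the docker argv for the benchmark, mounting cache_dir as the NIM cache."""
    # Expand "~" here because no shell will do it for us in an argv list.
    host_cache = Path(cache_dir).expanduser()
    return [
        "docker", "run", "-it", "--net", "host",
        "-v", f"{host_cache}:/opt/nim/.cache",
        "--entrypoint", "",  # empty entrypoint, as in the command above
        IMAGE,
        "/opt/nim/benchmarking.py", "--benchmark-type", "both",
    ]


command = build_benchmark_command("~/.cache/nim")
# shlex.join quotes each argument so the printed line can be pasted into a shell.
print(shlex.join(command))
```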

© Copyright 2024, NVIDIA Corporation. Last updated on Sep 24, 2024.