Configuring the NIM at Runtime#
The AlphaFold2-Multimer NIM supports a number of runtime customizations that allow for:
Tuning the performance of the NIM.
Taking advantage of your unique hardware.
Modifying runtime variables (such as the location of the NIM cache).
Below, we detail the various AlphaFold2-Multimer NIM-specific runtime configuration options.
Brief Review: Starting the AlphaFold2-Multimer NIM#
Let’s first review how to start the NIM:
export LOCAL_NIM_CACHE=~/.cache/nim
docker run -it --rm --name alphafold2-multimer --runtime=nvidia \
-e CUDA_VISIBLE_DEVICES=0 \
-e NGC_CLI_API_KEY \
-v $LOCAL_NIM_CACHE:/opt/nim/.cache \
-p 8000:8000 \
nvcr.io/nim/deepmind/alphafold2-multimer:1.0.0
Remember:
- -p defines the port used to interact with and query the service endpoints of the NIM.
- -e options define environment variables, which are passed into the NIM container at runtime.
- --rm stops and removes the container when it terminates, e.g. via CTRL+C or other termination signals.
- -it allows interacting with the container terminal directly from the CLI, e.g. to send CTRL+C.
Note
If the docker run command fails due to file permission errors after downloading the AlphaFold2 model, run the following command: sudo chmod -R 777 $LOCAL_NIM_CACHE. Afterward, rerun the NIM using the docker run command.
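Once the container reports that the server is ready, you can verify from the host that the NIM is reachable. The readiness endpoint below follows the common NIM convention and is shown as an illustrative check; adjust the path if your deployment differs:
curl http://localhost:8000/v1/health/ready   ## returns a ready status once the service is up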
Configuring the NIM Cache Location#
Running the NIM container automatically downloads the AlphaFold2 model to a default path in the container filesystem: /opt/nim/.cache. However, the location of the NIM model cache can be changed both on the host filesystem and within the NIM container. This is useful when a larger drive is needed to store the MSA databases, which are responsible for a large fraction of the AlphaFold2 model’s size (~613 GB), or when a user wants to customize their workspace within the NIM container to add additional functionality (e.g. modified entrypoints or mounted plug-ins) to the NIM.
To change the NIM cache location on the host filesystem, change the source mount location in the NIM startup command, which we refer to as the environment variable LOCAL_NIM_CACHE:
export LOCAL_NIM_CACHE=/mount/largedisk/nim/.cache
docker run -it --rm --name alphafold2-multimer --runtime=nvidia \
-e CUDA_VISIBLE_DEVICES=0 \
-e NGC_CLI_API_KEY \
-v $LOCAL_NIM_CACHE:/opt/nim/.cache \
-p 8000:8000 \
nvcr.io/nim/deepmind/alphafold2-multimer:1.0.0
This will only affect the location outside the container, i.e. on the host running your container. To change the location of the cache mounted inside the container, you must:
1. Make sure to mount the new destination path, i.e. -v $LOCAL_NIM_CACHE:$NIM_CACHE_PATH.
2. Provide the new path in the NIM_CACHE_PATH environment variable.
Here’s an example where we move the cache within the container to /workdir/cache/af2/:
export LOCAL_NIM_CACHE=/mount/largedisk/nim/.cache
docker run -it --rm --name alphafold2-multimer --runtime=nvidia \
-e CUDA_VISIBLE_DEVICES=0 \
-e NGC_CLI_API_KEY \
-e NIM_CACHE_PATH=/workdir/cache/af2 \ ## Note: we must set this environment variable...
-v $LOCAL_NIM_CACHE:/workdir/cache/af2 \ ## and provide the destination mount.
-p 8000:8000 \
nvcr.io/nim/deepmind/alphafold2-multimer:1.0.0
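To sanity-check the relocated cache, you can list the mount from both the host and the running container. These commands are illustrative; the exact contents depend on which model files have already been downloaded:
ls -lh $LOCAL_NIM_CACHE                                    ## host-side view of the cache
docker exec alphafold2-multimer ls /workdir/cache/af2      ## in-container view of the same mount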
Air-Gapped NIM Deployment#
To download, detect, and verify the content of the cached model, the NIM needs internet connectivity to NGC. This is not possible when running the NIM in an offline or air-gapped environment.
To work around these constraints, configure the NIM to ignore model download and validation errors on startup, and mount a locally downloaded AF2 model to the offline NIM server. The workflow is as follows:
1. Download the model to a local disk that can be mounted to the offline host running the NIM.
2. Mount the cached model to your offline host when running the NIM.
3. Set NIM_IGNORE_MODEL_DOWNLOAD_FAIL=true so that the container attempts model download and detection, but fails gracefully and does not terminate.
To download the model to your local disk, run the following command:
ngc registry model download-version nim/deepmind/alphafold2-data:1.1.0
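By default, the NGC CLI downloads the model version into a subdirectory of the current working directory. The directory name below and the exact layout the NIM expects inside its cache are assumptions, so treat this as a sketch and point LOCAL_NIM_CACHE at wherever the downloaded data actually resides:
export LOCAL_NIM_CACHE=$(pwd)/alphafold2-data_v1.1.0   ## illustrative path; verify the actual download location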
Start the NIM container and mount the cached model from the path specified in $LOCAL_NIM_CACHE:
docker run -it --rm --name alphafold2-multimer --runtime=nvidia \
-e CUDA_VISIBLE_DEVICES=0 \
-e NGC_CLI_API_KEY \
-e NIM_IGNORE_MODEL_DOWNLOAD_FAIL=true \
-v $LOCAL_NIM_CACHE:/opt/nim/.cache \
-p 8000:8000 \
nvcr.io/nim/deepmind/alphafold2-multimer:1.0.0
Using Alternative Ports for NIM Queries / Requests#
If you have other HTTP or gRPC servers running (for example, other NIMs), you may need to free up port 8000 for those servers by using another port for your NIM. This can be done by:
1. Changing the exposed port mapping to an alternative port, i.e. -p $HOST_PORT:$CONTAINER_PORT.
2. Updating the NIM_HTTP_API_PORT environment variable to the new $CONTAINER_PORT.
Here’s an example where we set our NIM to run (symmetrically) on port 7979:
export LOCAL_NIM_CACHE=/mount/largedisk/nim/.cache
docker run -it --rm --name alphafold2-multimer --runtime=nvidia \
-e CUDA_VISIBLE_DEVICES=0 \
-e NGC_CLI_API_KEY \
-e NIM_HTTP_API_PORT=7979 \ ## We must set the NIM_HTTP_API_PORT environment variable...
-e NIM_CACHE_PATH=/workdir/cache/af2 \
-v $LOCAL_NIM_CACHE:/workdir/cache/af2 \
-p 7979:7979 \ ## as well as forward the port to host.
nvcr.io/nim/deepmind/alphafold2-multimer:1.0.0
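Requests are then issued against the new host port. As an illustrative check, assuming the standard NIM readiness path used earlier:
curl http://localhost:7979/v1/health/ready   ## note the new port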
Tuning MSA Runners and Threads#
The AlphaFold2-Multimer NIM has been optimized in many ways. One such optimization is the ability to tune the MSA process for better performance on your machine, based on the number of available CPU cores, the number of available GPUs, and the speed of your available SSD. (Note: We do not recommend running the AlphaFold2-Multimer NIM on a machine that uses a hard disk drive for storage, as this will significantly decrease the performance of the NIM.)
There are two dimensions of tunable parallelism for MSA within the AlphaFold2-Multimer NIM, set by two different environment variables:
- NIM_PARALLEL_MSA_RUNNERS: Controls the coarse-grained parallelism of the MSA process.
- NIM_PARALLEL_THREADS_PER_MSA: Controls the number of threads used by each MSA process.
Tip
The product of NIM_PARALLEL_MSA_RUNNERS and NIM_PARALLEL_THREADS_PER_MSA should be ≥ the number of CPU cores available to the NIM.
For example, on a machine with 48 CPU cores, an appropriate setting would be:
NIM_PARALLEL_MSA_RUNNERS=4
NIM_PARALLEL_THREADS_PER_MSA=12
This would allocate 4 runners, each with 12 threads.
Note
NVIDIA does not recommend using more than 5 process runners, and the NIM performs best when the number of threads per MSA is a factor of the number of threads in your CPU. The default values of NIM_PARALLEL_MSA_RUNNERS=3 and NIM_PARALLEL_THREADS_PER_MSA=8 work for most machines.
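If you prefer to derive a starting point from the machine itself, a minimal shell sketch along these lines follows the guidance above (assuming 4 runners and a core count that divides evenly; tune from there):
CORES=$(nproc)                                                                 ## number of CPU cores available
export NIM_PARALLEL_MSA_RUNNERS=4                                              ## stays within the recommended maximum of 5 runners
export NIM_PARALLEL_THREADS_PER_MSA=$(( CORES / NIM_PARALLEL_MSA_RUNNERS ))    ## e.g. 48 cores -> 12 threads per runner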
To modify these variables at runtime, pass them as environment variables in the NIM startup command. For example, on a 64-core machine searching against many MSA databases, we might increase NIM_PARALLEL_MSA_RUNNERS to 6 and reduce NIM_PARALLEL_THREADS_PER_MSA to 10 so that the workload fits the machine well:
docker run -it --rm --name alphafold2-multimer --runtime=nvidia \
-e CUDA_VISIBLE_DEVICES=0 \
-e NGC_CLI_API_KEY \
-e NIM_PARALLEL_MSA_RUNNERS=6 \ ## We set the number of parallel runners to 6, e.g. because we are searching against many MSA databases...
-e NIM_PARALLEL_THREADS_PER_MSA=10 \ ## and the number of threads per MSA to 10, to distribute alignment workloads.
-v $LOCAL_NIM_CACHE:/opt/nim/.cache \
-p 8000:8000 \
nvcr.io/nim/deepmind/alphafold2-multimer:1.0.0