# Configure NVIDIA Earth-2 Correction Diffusion NIM at Runtime
This page describes how to configure the NVIDIA Earth-2 Correction Diffusion (CorrDiff) NIM at runtime.
## GPU Selection
Passing `--gpus all` to `docker run` is acceptable in homogeneous environments with one or more of the same GPU. In some environments, it is beneficial to run the container on specific GPUs. Expose specific GPUs inside the container by using either:

- The `--gpus` flag, for example `--gpus '"device=1"'`.
- The environment variable `NVIDIA_VISIBLE_DEVICES`, for example `-e NVIDIA_VISIBLE_DEVICES=1`.

The device IDs to use as inputs are listed in the output of `nvidia-smi -L`:
```text
GPU 0: Tesla H100 (UUID: GPU-b404a1a1-d532-5b5c-20bc-b34e37f3ac46)
GPU 1: NVIDIA GeForce RTX 3080 (UUID: GPU-e52373a4-9bd1-4a61-87cd-f9771a4aadb2)
```
See the NVIDIA Container Toolkit documentation for more instructions.
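As a concrete illustration, the following sketch pins the NIM to GPU 1 using each method. The image name `nvcr.io/nim/nvidia/corrdiff:latest` is a placeholder, not a confirmed registry path; substitute the actual CorrDiff NIM image from NGC.

```bash
# Expose only GPU 1 inside the container via the --gpus flag.
# NOTE: the image name below is a placeholder.
docker run --rm \
    --gpus '"device=1"' \
    -e NGC_API_KEY=$NGC_API_KEY \
    -p 8000:8000 \
    nvcr.io/nim/nvidia/corrdiff:latest

# Equivalent selection using the NVIDIA Container Runtime and the
# NVIDIA_VISIBLE_DEVICES environment variable instead of --gpus.
docker run --rm \
    --runtime=nvidia \
    -e NVIDIA_VISIBLE_DEVICES=1 \
    -e NGC_API_KEY=$NGC_API_KEY \
    -p 8000:8000 \
    nvcr.io/nim/nvidia/corrdiff:latest
```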
## Model Profiles
The CorrDiff NIM provides the following model profiles:
### CorrDiff US GEFS HRRR

`NIM_MODEL_PROFILE`: `bf8e1ed158c1bf27d2e36fc4936a3d2989948a3f4e4e80e2b0e7a7124661911c`
The Correction Diffusion (CorrDiff) US GEFS-HRRR model downscales several surface and atmospheric variables from 25-km resolution forecast data produced by the Global Ensemble Forecast System (GEFS) and predicts 3-km resolution High-Resolution Rapid Refresh (HRRR) data.
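The profile can be pinned explicitly at launch through the `NIM_MODEL_PROFILE` environment variable, as in the following sketch; the image name is again a placeholder.

```bash
# Launch the NIM with the US GEFS-HRRR profile selected explicitly.
# NOTE: the image name is a placeholder; use the CorrDiff NIM image from NGC.
docker run --rm --gpus all \
    -e NGC_API_KEY=$NGC_API_KEY \
    -e NIM_MODEL_PROFILE=bf8e1ed158c1bf27d2e36fc4936a3d2989948a3f4e4e80e2b0e7a7124661911c \
    -p 8000:8000 \
    nvcr.io/nim/nvidia/corrdiff:latest
```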
## Environment Variables
The CorrDiff NIM supports a few customizations that are read at container startup. The following variables can be used to change the NIM's behavior.
| Variable | Default | Description |
|---|---|---|
| `NGC_API_KEY` | None | Your NGC API key with read access to the model registry for the model profile you are using. |
| `NIM_MODEL_PROFILE` | `bf8e1….1911c` | The model package to load into the NIM on launch. This is downloaded from NGC, assuming that you have the correct permissions. |
| `NIM_HTTP_API_PORT` | `8000` | Publish the NIM service to the specified port inside the container. Make sure to adjust the port passed to the `-p/--publish` flag of `docker run` accordingly. |
| `NIM_DISABLE_MODEL_DOWNLOAD` | None | Disable model download on container startup. |
| `EARTH2NIM_TARGET_BATCHSIZE` | `8` | The target sample batch size that the NIM initially splits a request into. This is then dynamically batched across model instances. You might need to lower this for GPUs with less VRAM. The preferred batch sizes are 4, 8, 12, and 16. |
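Putting these together, a launch that overrides the HTTP port and lowers the target batch size for a lower-VRAM GPU might look like the following sketch. The image name is a placeholder, and the `/v1/health/ready` endpoint is an assumption based on the common NIM health-check convention.

```bash
# Run the NIM on port 8080 with a smaller target batch size.
# NOTE: the image name is a placeholder.
docker run --rm --gpus all \
    -e NGC_API_KEY=$NGC_API_KEY \
    -e NIM_HTTP_API_PORT=8080 \
    -e EARTH2NIM_TARGET_BATCHSIZE=4 \
    -p 8080:8080 \
    nvcr.io/nim/nvidia/corrdiff:latest

# Assumed health-check endpoint, following the common NIM convention.
curl http://localhost:8080/v1/health/ready
```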
## Mounted Volumes
The following paths inside the container can be mounted to enhance the runtime behavior of the NIM:
| Container Path | Required | Description | Example |
|---|---|---|---|
| | No | This is the directory within which models are downloaded inside the container. This directory must be accessible from inside the container. This can be achieved by adding the option … | |
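As an illustration, mounting a host directory over the model cache might look like the following sketch. Both the host path and the in-container cache path `/opt/nim/.cache` are assumptions, since the exact container path is elided in the table above; check the table in the published documentation for the actual path.

```bash
# Mount a host directory as the model cache so downloads persist across runs.
# NOTE: both paths and the image name are illustrative assumptions.
mkdir -p ~/.cache/corrdiff-nim
docker run --rm --gpus all \
    -e NGC_API_KEY=$NGC_API_KEY \
    -v ~/.cache/corrdiff-nim:/opt/nim/.cache \
    -p 8000:8000 \
    nvcr.io/nim/nvidia/corrdiff:latest
```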