Advanced Features for DoMINO-Automotive-Aero NIM#

Use this documentation to learn about the advanced features of the DoMINO-Automotive-Aero NIM.

Inference with Custom Checkpoints#

By default, when launching the NIM container, a pre-trained model checkpoint is automatically pulled from the NGC registry and used to instantiate the model. However, users may wish to run inference with their own custom checkpoints—such as those trained on proprietary datasets or fine-tuned for specific applications.

To use a custom checkpoint, simply mount your checkpoint file into the container at runtime using Docker’s volume mounting feature.

export NGC_API_KEY=<NGC API Key>

docker run --rm --runtime=nvidia --gpus 1 --shm-size 2g \
    -p 8000:8000 \
    -e NGC_API_KEY \
    -v <PATH_TO_CUSTOM_CHECKPOINT>:/opt/nim/custom_checkpoint/checkpoint.pt \
    -t nvcr.io/nim/nvidia/domino-automotive-aero:2.0.0

Replace <PATH_TO_CUSTOM_CHECKPOINT> with the full path to your custom .pt checkpoint file.

How it works:
When the NIM container starts, it checks for the presence of a checkpoint at /opt/nim/custom_checkpoint/checkpoint.pt. If this file exists, the container uses your custom checkpoint for model initialization and skips downloading the default checkpoint from NGC.

This allows you to easily run inference with your own trained or fine-tuned models.
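
Before mounting a checkpoint, it can be useful to confirm that the file deserializes cleanly. The snippet below is a minimal sketch, assuming PyTorch is installed locally and that the checkpoint was saved with torch.save; the path is a placeholder.

import torch

checkpoint_path = "/path/to/custom_checkpoint.pt"  # placeholder: your custom .pt file

# Load on CPU only, to confirm the file is a readable checkpoint before mounting it into the container
state = torch.load(checkpoint_path, map_location="cpu")
print(type(state))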

Multi-GPU Support#

The NIM container is built to take full advantage of systems with multiple GPUs. When multiple GPUs are available, the inference server automatically distributes incoming requests across the available devices: each request is assigned to a specific GPU, so different requests run concurrently on different devices. This enables efficient scaling for high-throughput or multi-user scenarios, making full use of the hardware in your environment without requiring manual device management.

Start your container with the --gpus all option to enable the use of all available GPUs on your system.

docker run --rm --runtime=nvidia --gpus all --shm-size 2g \
    -p 8000:8000 \
    -e NGC_API_KEY \
    -t nvcr.io/nim/nvidia/domino-automotive-aero:2.0.0
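
Because the server distributes requests across GPUs, a multi-GPU deployment can be exercised simply by sending several requests concurrently. The snippet below is a minimal sketch using Python's concurrent.futures and httpx; the STL file name, stream velocities, and parameter values are illustrative placeholders.

import concurrent.futures

import httpx

url = "http://localhost:8000/v1/infer"
stl_file_path = "drivaer_112_single_solid.stl"  # placeholder geometry

def run_inference(velocity):
    # Each request is independent; the server assigns it to an available GPU
    with open(stl_file_path, "rb") as stl_file:
        files = {"design_stl": (stl_file_path, stl_file)}
        data = {
            "stream_velocity": str(velocity),
            "stencil_size": "1",
            "point_cloud_size": "500000",
        }
        return httpx.post(url, files=files, data=data, timeout=300.0)

# Submit several requests at once; with multiple GPUs they can run in parallel
with concurrent.futures.ThreadPoolExecutor(max_workers=4) as pool:
    responses = list(pool.map(run_inference, [20.0, 25.0, 30.0, 35.0]))

print([r.status_code for r in responses])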

Flexible Batched Inference#

The NIM supports flexible batched inference, allowing efficient processing of large numbers of query points. During inference, the set of query points is automatically divided into batches, and each batch is processed sequentially until all points have been evaluated. The batch size is configurable via the batch_size parameter of the inference endpoints; selecting an appropriate value balances GPU memory usage against throughput, so you can tune it to the available hardware. An illustrative request is shown below.
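
For example, a smaller batch_size reduces peak GPU memory usage at the cost of more sequential batches, while a larger value can improve throughput if memory allows. The request below is a minimal sketch; the batch_size of 256000 and the other parameter values are illustrative, and the geometry file is a placeholder.

import httpx

url = "http://localhost:8000/v1/infer"
stl_file_path = "drivaer_112_single_solid.stl"  # placeholder geometry

data = {
    "stream_velocity": "30.0",
    "stencil_size": "1",
    "point_cloud_size": "500000",
    "batch_size": "256000",  # illustrative value; reduce if you hit out-of-memory errors
}
with open(stl_file_path, "rb") as stl_file:
    files = {"design_stl": (stl_file_path, stl_file)}
    r = httpx.post(url, files=files, data=data, timeout=300.0)
print(r.status_code)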

Inference on Custom Volume Point Clouds#

By default, for volume predictions, the NIM samples query points within the computational domain using a uniform random distribution. However, users may wish to supply a different set of query points, such as a custom point cloud with higher density near the car surface or a point cloud derived from simulation mesh nodes. To enable this, instead of specifying the point_cloud_size parameter, provide the point_cloud parameter, which should point to a .npy file containing the desired point cloud coordinates with shape (N, 3). This gives greater flexibility and control over where predictions are made, supporting advanced workflows and custom analysis requirements. Below is sample client code showing how a custom point cloud can be used.

import io

import httpx
import numpy

url = "http://localhost:8000/v1/infer"
point_cloud_path = "random_points.npy"
stl_file_path = "drivaer_112_single_solid.stl"

# Inference parameters; point_cloud_size is omitted because a custom point cloud is uploaded instead
data = {
    "stream_velocity": "30.0",
    "stencil_size": "1",
    "batch_size": "128000",
}

# Upload both the geometry (STL) and the custom query points (.npy file with shape (N, 3))
with open(stl_file_path, "rb") as stl_file, open(point_cloud_path, "rb") as pc_file:
    files = {
        "design_stl": (stl_file_path, stl_file),
        "point_cloud": ("point_cloud.npy", pc_file),
    }
    r = httpx.post(url, files=files, data=data, timeout=120.0)

if r.status_code != 200:
    raise Exception(r.content)

# The response body is a NumPy archive; unpack it into a dictionary of output arrays
with numpy.load(io.BytesIO(r.content)) as output_data:
    output_dict = {key: output_data[key] for key in output_data.keys()}
print(output_dict.keys())
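
The random_points.npy file used above is assumed to already exist. A minimal sketch for generating one is shown below, assuming an axis-aligned bounding box for the sampling region; adjust the bounds and point count to your computational domain, or save coordinates extracted from your own simulation mesh instead.

import numpy

# Placeholder bounds of the sampling region, in the same units as the STL geometry
bounds_min = numpy.array([-3.0, -2.0, 0.0])
bounds_max = numpy.array([7.0, 2.0, 2.5])

n_points = 500_000
# Uniformly sample points inside the box; the array must have shape (N, 3)
points = numpy.random.uniform(bounds_min, bounds_max, size=(n_points, 3)).astype(numpy.float32)
numpy.save("random_points.npy", points)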