Loading Models

There are several ways to load a model into AIAA.

Loading from NGC

AIAA allows you to load the model directly from NVIDIA GPU Cloud (NGC).

A list of available pre-trained models can be found here. You can also use the NGC CLI to list them, as shown below.
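
The following command queries the nvidia/med registry for the Clara models (this requires the NGC CLI; see the Attention note below):

# List the Clara pre-trained models available on NGC
ngc registry model list nvidia/med/clara_*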

The following example loads the clara_ct_seg_spleen_amp pre-trained model.

curl -X PUT "http://127.0.0.1/admin/model/clara_ct_seg_spleen_amp" \
     -H "accept: application/json" \
     -H "Content-Type: application/json" \
     -d '{"path":"nvidia/med/clara_ct_seg_spleen_amp","version":"1"}'

You can also download the model from NGC first and then load it from the local copy.

ngc registry model download-version nvidia/med/clara_ct_seg_spleen_amp:1

curl -X PUT "http://127.0.0.1/admin/model/clara_ct_seg_spleen_amp" \
     -F "config=@clara_ct_seg_spleen_amp_v1/config/config_aiaa.json;type=application/json" \
     -F "data=@clara_ct_seg_spleen_amp_v1/models/model.trt.pb"

Attention

Follow NGC CLI installation to set up the NGC CLI first.

Loading from MMAR

If you have already downloaded the MMAR to local disk, you can load it from there as follows.

curl -X PUT "http://127.0.0.1/admin/model/clara_ct_seg_spleen_amp" \
     -F "data=@clara_ct_seg_spleen_amp.with_models.tgz"

Loading TensorFlow Model

If you have trained a TensorFlow (TF) model and zipped the checkpoint files into an archive (e.g., zip, tar, gz), you can load it into AIAA as follows.

# Zip the checkpoint files
zip model.zip \
    model.ckpt.data-00000-of-00001 \
    model.ckpt.index \
    model.ckpt.meta

curl -X PUT "http://127.0.0.1/admin/model/clara_ct_seg_spleen_amp" \
     -F "config=@config_aiaa.json;type=application/json" \
     -F "data=@model.zip"

Note

If you upload TF checkpoints to AIAA, they will be automatically converted to a TF-TRT optimized model.

Loading TF-TRT Model

If you have a model.trt.pb file (TF-TRT format), you can load it into AIAA as follows.

curl -X PUT "http://127.0.0.1/admin/model/clara_ct_seg_spleen_amp" \
     -F "config=@config_aiaa.json;type=application/json" \
     -F "data=@model.trt.pb"

Hint

To get a TF-TRT model, see https://docs.nvidia.com/deeplearning/frameworks/tf-trt-user-guide/index.html. Note that this model is classified as “tensorflow_graphdef” in TRTIS.

Tip

If you are using Clara to train your models, you can also use export.sh to convert your model to a TF-TRT model.

Loading PyTorch Model

If you have model.pt (in TorchScript format), you can load the model into AIAA as follows.

# Import Model
curl -X PUT "http://127.0.0.1/admin/model/segmentation_2d_brain" \
     -F "config=@config_aiaa.json;type=application/json" \
     -F "data=@model.pt"

Attention

Follow Convert PyTorch trained network to convert your PyTorch model to TorchScript format. PyTorch models are available only if you are running the AIAA server with the TRTIS engine (the default).

Attention

Before running inference or using clients, make sure you can see your models at http://127.0.0.1/v1/models. If a model is missing, follow the instructions in the Q&A to debug.
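
For example:

curl -X GET "http://127.0.0.1/v1/models" -H "accept: application/json"

The response should list every model you have loaded; if one is missing, revisit the corresponding load step above.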