Getting Started

Installation

Follow the Clara installation instructions to install the Docker image.

Additionally, you need to create and mount a folder for the AIAA server to save all the models, logs and configurations.

To create a folder to store the data:

mkdir aiaa_experiments
# change the permission because AIAA is running as non-root
chmod 777 aiaa_experiments

Attention

The system requirements of AIAA depend on how many models you want to load in the server. If your models are big and you don’t have enough system RAM/GPU memory, please load one model at a time.

Running AIAA

To run AIAA with the Triton backend, we need to run two containers. We recommend using docker-compose to start AIAA.

Note

Follow the docker-compose installation guide to install docker-compose, and check the Docker GPU support prerequisites.

Create a file called docker-compose.yml and paste the following contents:

version: "3.8"
services:
  clara-train-sdk:
    image: nvcr.io/nvidia/clara-train-sdk:v4.0
    command: >
      sh -c "chmod 777 /workspace &&
        start_aiaa.sh --workspace /workspace --engine TRITON --triton_ip tritonserver \
          --triton_proto ${TRITON_PROTO} \
          --triton_start_timeout ${TRITON_START_TIMEOUT} \
          --triton_model_timeout ${TRITON_MODEL_TIMEOUT} \
          --triton_verbose ${TRITON_VERBOSE}"
    ports:
      - "${AIAA_PORT}:5000"
    volumes:
      - ${AIAA_WORKSPACE}:/workspace
    networks:
      - aiaa
    shm_size: 1gb
    ulimits:
      memlock: -1
      stack: 67108864
    depends_on:
      - tritonserver
    logging:
      driver: json-file
  tritonserver:
    image: nvcr.io/nvidia/tritonserver:21.02-py3
    command: >
      sh -c "chmod 777 /triton_models &&
        /opt/tritonserver/bin/tritonserver \
          --model-store /triton_models \
          --model-control-mode="poll" \
          --repository-poll-secs=5 \
          --log-verbose ${TRITON_VERBOSE}"
    volumes:
      - ${AIAA_WORKSPACE}/triton_models:/triton_models
    networks:
      - aiaa
    shm_size: 1gb
    ulimits:
      memlock: -1
      stack: 67108864
    restart: unless-stopped
    logging:
      driver: json-file
    deploy:
      resources:
        reservations:
          devices:
          - driver: nvidia
            device_ids: ['0']
            capabilities: [gpu]
networks:
  aiaa:

Create a file called docker-compose.env and paste the following contents:

AIAA_PORT=5000
AIAA_WORKSPACE=<YOUR WORKSPACE>
TRITON_START_TIMEOUT=120
TRITON_MODEL_TIMEOUT=30
TRITON_PROTO=grpc
TRITON_VERBOSE=0

Replace <YOUR WORKSPACE> with the absolute path on your system of the folder you want to use as the AIAA workspace (the folder you created in the Installation section).
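As a minimal sketch of this step, the env file can be generated by a small shell helper so that AIAA_WORKSPACE is always written as an absolute path. The function name write_aiaa_env is hypothetical and not part of Clara; the values are the ones listed in docker-compose.env above.

```shell
#!/bin/sh
# Hypothetical helper (not part of Clara): write docker-compose.env
# for a given workspace folder, resolving it to an absolute path.
write_aiaa_env() {
    workspace_dir=$1
    env_file=${2:-docker-compose.env}

    # Resolve the folder to an absolute path; fail if it does not exist.
    abs_path=$(cd "$workspace_dir" 2>/dev/null && pwd) || {
        echo "workspace folder not found: $workspace_dir" >&2
        return 1
    }

    # Same keys and values as the docker-compose.env shown above.
    cat > "$env_file" <<EOF
AIAA_PORT=5000
AIAA_WORKSPACE=$abs_path
TRITON_START_TIMEOUT=120
TRITON_MODEL_TIMEOUT=30
TRITON_PROTO=grpc
TRITON_VERBOSE=0
EOF
}

# Example usage, assuming the folder created during installation:
#   write_aiaa_env aiaa_experiments
```

Resolving the path in the helper avoids accidentally mounting a relative path, which docker-compose would interpret relative to the compose file rather than your current directory.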

Then start AIAA with the following command:

docker-compose --env-file docker-compose.env -p aiaa_triton up --remove-orphans -d

To stop AIAA:

docker-compose --env-file docker-compose.env -p aiaa_triton down --remove-orphans

To follow the logs:

docker-compose --env-file docker-compose.env -p aiaa_triton logs -f -t

To start only one service:

docker-compose --env-file docker-compose.env -p aiaa_triton up [the service name]
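The commands above all share the same --env-file and -p flags. A small wrapper, sketched here, avoids retyping them. The names aiaa_compose and COMPOSE_BIN are hypothetical and introduced only for illustration; the service names come from the docker-compose.yml above.

```shell
#!/bin/sh
# Hypothetical wrapper: run docker-compose with the env file and
# project name used throughout this guide, forwarding all other arguments.
# COMPOSE_BIN is an assumed override for the compose executable.
aiaa_compose() {
    "${COMPOSE_BIN:-docker-compose}" \
        --env-file docker-compose.env -p aiaa_triton "$@"
}

# Example usage:
#   aiaa_compose up --remove-orphans -d     # start both services
#   aiaa_compose up tritonserver            # start only the Triton service
#   aiaa_compose logs -f -t                 # follow the logs
#   aiaa_compose down --remove-orphans      # stop everything
```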

You can type docker ps to check whether AIAA and Triton are running.
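The check can also be scripted. The sketch below reads container names from standard input and reports whether each of the two services from docker-compose.yml appears among them. The function name check_aiaa_containers is hypothetical, and matching on the service name is an assumption about how docker-compose names its containers.

```shell
#!/bin/sh
# Hypothetical helper: read container names (one per line) on stdin and
# report whether each compose service appears among them.
# Assumes container names contain the service name, as docker-compose
# normally generates them (for example aiaa_triton_tritonserver_1).
check_aiaa_containers() {
    names=$(cat)
    result=0
    for service in clara-train-sdk tritonserver; do
        if printf '%s\n' "$names" | grep -q "$service"; then
            echo "$service: running"
        else
            echo "$service: not running"
            result=1
        fi
    done
    return $result
}

# Example usage (docker ps lists only running containers by default):
#   docker ps --format '{{.Names}}' | check_aiaa_containers
```

The function returns a non-zero status when either service is missing, so it can be used in a startup script before sending requests to AIAA.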

Note

Set “AIAA_PORT” to the port you want to use on the host machine.

Note

To change which GPU is used, modify the “device_ids” entry under the “deploy” section of the “tritonserver” service.

AIAA Start Options

The following options are available when starting the AI-Assisted Annotation Server.

| Option | Default Value | Example | Description |
| --- | --- | --- | --- |
| BASIC | | | |
| workspace | | | Workspace location for the AIAA server to save all the configurations, logs and models. It is required when starting AIAA |
| debug | | --debug | Enable debug-level logging for AIAA server logs |
| admin_hosts | * | --admin_hosts 10.20.1.23,122.34.1.3.4 | Restrict the client hosts that can send /admin requests to the AIAA server (e.g. for managing models). * (star) allows any host; otherwise provide a list of client IP addresses/prefixes to allow |
| Native/Triton | | | |
| engine | TRITON | --engine AIAA | The inference engine of the AIAA server. Choose from AIAA or TRITON. AIAA means to directly use the framework to do inference |
| triton_ip | localhost | --triton_ip 10.18.2.13 | Triton server IP address |
| triton_http_port | 8000 | --triton_http_port 9000 | Triton server HTTP service port |
| triton_grpc_port | 8001 | --triton_grpc_port 9001 | Triton server gRPC service port |
| triton_metrics_port | 8002 | --triton_metrics_port 9002 | Triton server metrics service port |
| triton_proto | http | | Protocol to communicate with the Triton server (http or grpc) |
| triton_shmem | no | --triton_shmem cuda | Whether to use shared memory communication between AIAA and Triton (no, system, or cuda) |
| triton_model_path | <workspace dir>/triton_models | | Triton models path |
| triton_start_timeout | 120 | | Wait time in seconds for AIAA to make sure the Triton server is up and running |
| triton_model_timeout | 30 | | Wait time in seconds for AIAA to make sure the model is loaded and up for serving in Triton |
| triton_verbose | 0 | | Set the Triton verbose logging level. Zero (0) disables verbose logging and values >= 1 enable verbose logging |
| Session | | | |
| sessions_default_expiry | 3600 | --sessions_default_expiry 300 | The minimum session expiry in seconds |
| FineTune | | | |
| fine_tune | false | --fine_tune true | If set to true, AIAA automatically fine-tunes the models based on the samples directory (Model Fine-Tune Setup) |
| fine_tune_hour | 0 | --fine_tune_hour 1 | The scheduled hour (0~23) of each day at which to fine-tune all the models |
| OTHERS | | | |
| client_id | host-80 | --client_id aiaa-instance-1 | If you are deploying the AIAA server on multiple nodes/AWS with a shared disk, you might need this to identify each AIAA server |

Note

Native mode means that AIAA uses PyTorch to run the inference directly.
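To illustrate how these options combine, the sketch below assembles a start_aiaa.sh command line for native mode with scheduled fine-tuning. Only flags listed in the options above are used. The function name aiaa_native_command is hypothetical, and the range check simply mirrors the documented 0~23 range for fine_tune_hour.

```shell
#!/bin/sh
# Hypothetical helper: print a start_aiaa.sh command line for native mode
# (--engine AIAA) with fine-tuning enabled at a given hour of the day.
aiaa_native_command() {
    workspace=$1
    fine_tune_hour=${2:-0}

    # fine_tune_hour must be a non-negative integer...
    case $fine_tune_hour in
        ''|*[!0-9]*)
            echo "fine_tune_hour must be an integer" >&2
            return 1
            ;;
    esac
    # ...and a valid hour of the day (0~23).
    if [ "$fine_tune_hour" -gt 23 ]; then
        echo "fine_tune_hour must be between 0 and 23" >&2
        return 1
    fi

    echo "start_aiaa.sh --workspace $workspace --engine AIAA --fine_tune true --fine_tune_hour $fine_tune_hour"
}

# Example usage: print the command, then run it inside the Clara container.
#   aiaa_native_command /workspace 1
```

Printing the command instead of executing it keeps the helper usable outside the container, for example to paste the result into the command section of a compose file.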

Troubleshooting

Workspace

AIAA Server uses the workspace directory for saving configs, models, logs, etc.

The following files and folders are found in the workspace directory. INTERNAL means that the file or directory is managed by the AIAA server, and we suggest that you do not change it.

| File/Folder | Description |
| --- | --- |
| aiaa-run-config.json | INTERNAL - AIAA server setup is stored in this file |
| aiaa-models-config.json | INTERNAL - AIAA server supports automatic loading of models, and the corresponding configs are saved in this file |
| downloads/ | INTERNAL - Temporary downloads from NGC happen here; the temporary data is removed after a model is successfully imported into the AIAA server |
| logs/ | INTERNAL - AIAA server logs are stored here |
| models/ | INTERNAL - The models uploaded when running with the AIAA backend are stored here |
| triton_models/ | INTERNAL - All serving Triton models are stored here unless the user provides a different triton_model_path when starting the AIAA server |
| lib/ | EXTERNAL - Put your custom transforms/inferences/inference pipelines here and use them in config_aiaa.json |
| samples/{model}/ | EXTERNAL - Add your training samples here to trigger incremental training of a model with the new samples (Model Fine-Tune Setup) |
| mmars/{model}/ | INTERNAL/EXTERNAL - Put your MMAR here to trigger incremental training of a model with new samples. The AIAA server also archives any imported MMARs here |

Note

We suggest you create a new folder to be the workspace so that if you want to delete these files you can just remove that workspace folder.
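When troubleshooting, it can help to see which of the entries described above actually exist in a workspace. The following is a minimal sketch; the function name inspect_workspace is hypothetical, and a missing entry is not necessarily an error, since some folders may only appear once the corresponding feature is used.

```shell
#!/bin/sh
# Hypothetical helper: report which of the documented workspace
# files and folders exist under the given workspace directory.
inspect_workspace() {
    workspace=$1
    for entry in aiaa-run-config.json aiaa-models-config.json \
                 downloads logs models triton_models lib samples mmars; do
        if [ -e "$workspace/$entry" ]; then
            echo "found    $entry"
        else
            echo "missing  $entry"
        fi
    done
}

# Example usage, assuming the folder created during installation:
#   inspect_workspace aiaa_experiments
```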

Logs

Once the server is up and running, you can watch or pull the server logs in the browser.

Note

All the URLs with 0.0.0.0 or 127.0.0.1 are only accessible if your browser and server are on the same machine.

The $AIAA_PORT is the port specified in “docker-compose.env”.

You can use ip a to find your server’s IP address and view the logs remotely at http://[server_ip]:$AIAA_PORT/logs.
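For scripted access, the logs URL can be built from the server address and port. This is a sketch only: the function name aiaa_logs_url is hypothetical, the fallback port 5000 is the value from docker-compose.env, and fetching the page with curl instead of a browser is an assumption.

```shell
#!/bin/sh
# Hypothetical helper: build the AIAA logs URL for a given server address.
# The port comes from the second argument, then $AIAA_PORT, then 5000
# (the value used in docker-compose.env in this guide).
aiaa_logs_url() {
    server_ip=$1
    port=${2:-${AIAA_PORT:-5000}}
    echo "http://${server_ip}:${port}/logs"
}

# Example usage (replace the address with your server's IP):
#   curl -s "$(aiaa_logs_url 192.168.1.10)"
```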