Getting Started
Please follow the instructions in Clara installation to install the Docker image.
Additionally, you need to create and mount a folder for the AIAA server to save all the models, logs and configurations.
To create a folder to store the data:
mkdir aiaa_experiments
# change the permission because AIAA is running as non-root
chmod 777 aiaa_experiments
The system requirements of AIAA depend on how many models you want to load in the server. If your models are big and you don’t have enough system RAM/GPU memory, please load one model at a time.
In this release, the Triton server is moved outside the Clara-Train docker image. To run AIAA with the Triton backend, we need to run two containers. We recommend using docker-compose to start AIAA.
Please follow the docker-compose installation guide to install docker-compose, and also check the Docker GPU support prerequisites.
Create a file called docker-compose.yml and paste the following contents:
version: "3.8"
services:
  clara-train-sdk:
    image: nvcr.io/nvidia/clara-train-sdk:v4.0
    command: >
      sh -c "chmod 777 /workspace &&
             start_aiaa.sh --workspace /workspace --engine TRITON --triton_ip tritonserver \
             --triton_proto ${TRITON_PROTO} \
             --triton_start_timeout ${TRITON_START_TIMEOUT} \
             --triton_model_timeout ${TRITON_MODEL_TIMEOUT} \
             --triton_verbose ${TRITON_VERBOSE}"
    ports:
      - "${AIAA_PORT}:5000"
    volumes:
      - ${AIAA_WORKSPACE}:/workspace
    networks:
      - aiaa
    shm_size: 1gb
    ulimits:
      memlock: -1
      stack: 67108864
    depends_on:
      - tritonserver
    logging:
      driver: json-file
  tritonserver:
    image: nvcr.io/nvidia/tritonserver:21.02-py3
    command: >
      sh -c "chmod 777 /triton_models &&
             /opt/tritonserver/bin/tritonserver \
             --model-store /triton_models \
             --model-control-mode=poll \
             --repository-poll-secs=5 \
             --log-verbose ${TRITON_VERBOSE}"
    volumes:
      - ${AIAA_WORKSPACE}/triton_models:/triton_models
    networks:
      - aiaa
    shm_size: 1gb
    ulimits:
      memlock: -1
      stack: 67108864
    restart: unless-stopped
    logging:
      driver: json-file
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              device_ids: ['0']
              capabilities: [gpu]
networks:
  aiaa:
Create a file called docker-compose.env and paste the following contents:
AIAA_PORT=5000
AIAA_WORKSPACE=<YOUR WORKSPACE>
TRITON_START_TIMEOUT=120
TRITON_MODEL_TIMEOUT=30
TRITON_PROTO=grpc
TRITON_VERBOSE=0
Please change <YOUR WORKSPACE> to the absolute path on your system that you want to use as the AIAA workspace (the folder you created above).
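The steps above (create the workspace folder, fill the absolute path into docker-compose.env) can also be scripted. This is a minimal sketch, assuming the workspace lives under the current directory and uses the default values shown above:

```shell
#!/bin/sh
# Create the workspace folder, including the triton_models subfolder
# that docker-compose.yml mounts into the Triton container.
WORKSPACE="$PWD/aiaa_experiments"
mkdir -p "$WORKSPACE/triton_models"
# AIAA runs as non-root inside the container, so relax the permissions.
chmod 777 "$WORKSPACE" "$WORKSPACE/triton_models"

# Write docker-compose.env with the absolute workspace path filled in.
cat > docker-compose.env <<EOF
AIAA_PORT=5000
AIAA_WORKSPACE=$WORKSPACE
TRITON_START_TIMEOUT=120
TRITON_MODEL_TIMEOUT=30
TRITON_PROTO=grpc
TRITON_VERBOSE=0
EOF
```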
Then use the following command to start both services:
docker-compose --env-file docker-compose.env -p aiaa_triton up --remove-orphans -d
And this is the command to stop them:
docker-compose --env-file docker-compose.env -p aiaa_triton down --remove-orphans
To check the logs:
docker-compose --env-file docker-compose.env -p aiaa_triton logs -f -t
To start only one service:
docker-compose --env-file docker-compose.env -p aiaa_triton up [the service name]
You can run docker ps
to check whether AIAA and Triton are running.
You should see output like:
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
fdb56eeb6ce7 nvcr.io/nvidia/clara-train-sdk:v4.0 "/usr/local/bin/nvid…" 4 seconds ago Up 2 seconds 6006/tcp, 0.0.0.0:5000->5000/tcp, 8888/tcp aiaa_triton_clara-train-sdk_1
da8a2cd331e4 nvcr.io/nvidia/tritonserver:21.02-py3 "/opt/tritonserver/n…" 5 seconds ago Up 3 seconds aiaa_triton_tritonserver_1
You can modify “AIAA_PORT” in docker-compose.env to the port you want to use on the host machine.
You can modify “device_ids” under the “deploy” section of the “tritonserver” service to change the GPU IDs that you want to use.
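For example, a sketch of the reservation that gives Triton two GPUs (the device IDs depend on your machine; check nvidia-smi):

```yaml
deploy:
  resources:
    reservations:
      devices:
        - driver: nvidia
          device_ids: ['0', '1']   # GPUs made visible to Triton
          capabilities: [gpu]
```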
Following are the options available while starting the AI-Assisted Annotation Server.
Option | Default Value | Example | Description
---|---|---|---
BASIC | | |
workspace | | | Workspace location for the AIAA server to save all configurations, logs, and models. Required when starting AIAA.
debug | | | Enable debug-level logging for AIAA server logs
admin_hosts | * | | Restrict the client hosts that can send /admin requests to the AIAA server (e.g. for managing models). * (star) allows any; otherwise provide a list of client IP addresses/prefixes to allow
Native/Triton | | |
engine | TRITON | | The inference engine of the AIAA server. Choose from AIAA or TRITON. AIAA means the framework is used directly to do inference.
triton_ip | localhost | | Triton server IP
triton_http_port | 8000 | | Triton server HTTP service port
triton_grpc_port | 8001 | | Triton server gRPC service port
triton_metrics_port | 8002 | | Triton server metrics service port
triton_proto | http | | Protocol used to communicate with the Triton server (http or grpc)
triton_shmem | no | | Whether to use shared-memory communication between AIAA and Triton (no, system, or cuda)
triton_model_path | <workspace dir>/triton_models | | Triton models path
triton_start_timeout | 120 | | Wait time in seconds for AIAA to make sure the Triton server is up and running
triton_model_timeout | 30 | | Wait time in seconds for AIAA to make sure a model is loaded and up for serving in Triton
triton_verbose | 0 | | Triton verbose logging level. Zero (0) disables verbose logging; values >= 1 enable it.
Session | | |
sessions_default_expiry | 3600 | --sessions_default_expiry 300 | The minimum session expiry in seconds
FineTune | | |
fine_tune | false | | If set to true, fine-tunes the models in AIAA automatically based on the samples directory (Model Fine-Tune Setup)
fine_tune_hour | 0 | | The scheduled hour (0~23) each day at which to fine-tune all the models
OTHERS | | |
client_id | host-80 | | If you are deploying the AIAA server on multiple nodes/AWS with a shared disk, you might need this to identify each AIAA server
Native mode means that AIAA uses PyTorch to run the inference directly.
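If you only need native mode, the Triton pieces of docker-compose.yml are not required. As a sketch (not the full file), the clara-train-sdk command could be reduced to the following, with the tritonserver service, the depends_on entry, and the TRITON_* variables removed:

```yaml
command: >
  sh -c "chmod 777 /workspace &&
         start_aiaa.sh --workspace /workspace --engine AIAA"
```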
Workspace
AIAA Server uses the workspace directory for saving configs, models, logs, etc.
Following are a few files/folders from the workspace directory, explained below. Note that INTERNAL means that the file or directory is managed by the AIAA server, and we suggest not changing it.
File/Folder | Description
---|---
aiaa-run-config.json | Options the AIAA server was started with (INTERNAL)
aiaa-models-config.json | Configurations of the models loaded in AIAA (INTERNAL)
downloads/ | Temporary location for downloaded files (INTERNAL)
logs/ | AIAA server log files
models/ | Saved model files
triton_models/ | Model repository served by Triton (mounted into the Triton container)
lib/ | Additional libraries used by the models
samples/{model}/ | Samples used to fine-tune the model
mmars/{model}/ | MMAR of the model
We suggest creating a new folder as the workspace so that, when you want to delete these files, you can simply remove that folder.
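Cleanup is then a single command. Note that this permanently deletes every model, log, and configuration the server has saved (stop the containers first):

```shell
# Remove the entire AIAA workspace created earlier in this guide
rm -rf aiaa_experiments
```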
Logs
Once the server is up and running, you can watch or pull the server logs in the browser.
http://127.0.0.1:$AIAA_PORT/logs (server will fetch last 100 lines from the current log file)
http://127.0.0.1:$AIAA_PORT/logs?lines=-1 to fetch everything from the current log file.
URLs with 0.0.0.0 or 127.0.0.1 are only accessible if your browser and the server are on the same machine.
The $AIAA_PORT is the port specified in “docker-compose.env”.
You can use ip a to find your server’s IP address and check the logs remotely at http://[server_ip]:$AIAA_PORT/logs.