Getting Started
Prerequisites
Please follow the instructions in Clara installation to install the docker image and start the container.
Additionally, you should mount a shared disk/folder for saving all the models, logs, and configurations so that the AIAA server can persist them. This lets you start/stop the container at any time without losing the models/configurations for AIAA.
For example:
export NVIDIA_RUNTIME="--runtime=nvidia -e NVIDIA_VISIBLE_DEVICES=0"
export OPTIONS="--shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864"
export SOURCE_DIR=<source dir to store>
export MOUNT_DIR=/aiaa-experiments
export LOCAL_PORT=<the port you want to use>
export REMOTE_PORT=80
export DOCKER_IMAGE="nvcr.io/clara-train-sdk:<version here>"
docker run $NVIDIA_RUNTIME $OPTIONS -it --rm \
--ipc=host \
-p $LOCAL_PORT:$REMOTE_PORT \
-v $SOURCE_DIR:$MOUNT_DIR \
$DOCKER_IMAGE \
/bin/bash
Attention
The system requirements of AIAA depend on how many models you want to load in the server. If your models are large and you do not have enough system RAM/GPU memory, load one model at a time.
Attention
If running on the TRTIS backend, by default AIAA will put one model instance on every GPU that is visible inside the docker. Please use -e NVIDIA_VISIBLE_DEVICES=<ids of the GPUs you want to use> or specify an instance group in the TRTIS config for fine-grained resource control.
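For example, restricting the container (and hence TRTIS) to the first two GPUs could be done by adjusting the NVIDIA_RUNTIME variable from the docker run setup above; the device IDs here are illustrative:

```shell
# Expose only GPUs 0 and 1 inside the container; by default TRTIS will then
# place one model instance on each of these two GPUs.
export NVIDIA_RUNTIME="--runtime=nvidia -e NVIDIA_VISIBLE_DEVICES=0,1"
```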
Running AIAA
The simple command to start the AI-Assisted Annotation server inside the container is: start_aas.sh. The following options are available when starting the AI-Assisted Annotation server.
Option | Default Value | Example | Description
---|---|---|---
**BASIC** | | |
workspace | /var/nvidia/aiaa | | Workspace location for the AIAA server to save all configurations, logs, and models. It is always recommended to use a shared workspace when starting AIAA
debug | false | | Enable debug-level logging for AIAA server logs
auto_reload | false | | The AIAA server will auto-reload models whenever the configuration is updated externally. Enable this option when multiple AIAA servers are installed and share a common workspace
admin_hosts | * | | Restrict which client hosts can send /admin requests to the AIAA server (e.g. for managing models). * (star) allows any; otherwise provide a list of allowed client IP addresses/prefixes
**SSL Support** | | |
ssl | false | | Run the AIAA server in SSL mode
ssl_cert_file | | | SSL certificate file path
ssl_pkey_file | | | SSL key file path
**TF/TRTIS** | | |
engine | TRTIS | | Use this option to run the AIAA server with TensorFlow (AIAA) or with TRTIS as the backend inference engine
trtis_ip | localhost | | If anything other than 127.0.0.1 or localhost is specified, AIAA will not start a local TRTIS server for inference; otherwise AIAA will start a local instance of the TRTIS server
trtis_port | 8001 | | TRTIS server port (http: 8000; grpc: 8001; monitor: 8002)
trtis_proto | grpc | | TRTIS server protocol (http or grpc)
trtis_model_path | <workspace dir>/trtis_models | | TRTIS models path
trtis_start_timeout | 120 | | Wait time in seconds for AIAA to make sure the TRTIS server is up and running
trtis_model_timeout | 30 | | Wait time in seconds for AIAA to make sure a model is loaded and up for serving in TRTIS
**FineTune** | | |
fine_tune | false | | If set to true, fine-tunes the models in AIAA automatically based on the samples directory (Model Fine-Tune Setup)
fine_tune_hour | 0 | | The scheduled hour (0-23) of each day at which to fine-tune all the models
**OTHERS** | | |
monitor | true | | Whether API instrumentation is allowed. If enabled, you can view the monitoring dashboard by visiting http://host:port/dashboard
client_id | host-80 | | If you are deploying AIAA servers on multiple nodes/AWS with a shared disk, you might need this to identify each AIAA server
use_cupy | false | | Use CuPy for a performance boost when you have large GPU memory and more than one GPU
Note
TRTIS stands for the NVIDIA TensorRT Inference Server, which provides a cloud inferencing solution optimized for NVIDIA GPUs. More information can be found on its GitHub page.
Examples
# Run with a workspace (the AIAA server will be running at: http://127.0.0.1/)
start_aas.sh --workspace /aiaa-experiments/aiaa-1

# Run AIAA with the TF inference engine
start_aas.sh --workspace /aiaa-experiments/aiaa-1 --engine AIAA

# Run AIAA with the TRTIS inference engine (TRTIS will be started at 8001 on localhost)
start_aas.sh --workspace /aiaa-experiments/aiaa-1 --engine TRTIS

# Run AIAA connected to a remote TRTIS inference engine (debug only; some options may not work)
start_aas.sh --workspace /aiaa-experiments/aiaa-1 --engine TRTIS --trtis_ip 10.20.11.34 --trtis_model_path /shared/trtis_models
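The SSL options from the table above combine in the same way; a hedged sketch, where the certificate and key paths are placeholders for files you provide yourself:

```shell
# Run AIAA in SSL mode (certificate/key paths below are placeholders)
start_aas.sh --workspace /aiaa-experiments/aiaa-1 \
    --ssl true \
    --ssl_cert_file /aiaa-experiments/certs/server.crt \
    --ssl_pkey_file /aiaa-experiments/certs/server.key
```

Remember that with SSL enabled the container's HTTPS port (443) is the one to map to the host.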
Stop AIAA
The simple command to stop the AI-Assisted Annotation server inside the container is: stop_aas.sh
Troubleshooting
Workspace
The AIAA server uses a workspace directory (if not specified, the default is inside the docker at the path /var/nvidia/aiaa) for saving configs, models (frozen), logs, etc. You can shut down and run the docker multiple times if you mount an external workspace path when running the docker. See the advanced options for running the docker.
The files/folders in the workspace directory are listed below. Note that INTERNAL means the file or directory is managed by the AIAA server, and we suggest not changing it.
File/Folder | Description
---|---
aiaa_server_config.json |
aiaa_server_dashboard.db |
downloads/ |
logs/ |
models/ |
trtis_models/ |
lib/ |
samples/{model}/ |
mmars/{model}/ |
Note
We suggest creating a new folder as the workspace so that when you want to delete these files you can simply remove that workspace folder.
Logs
Once the server is up and running, you can watch or pull the server logs in the browser:
http://127.0.0.1/logs (the server fetches the last 100 lines of the current log file)
http://127.0.0.1/logs?lines=-1 fetches everything from the current log file.
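As a sketch, the two endpoints above can be parameterized with a small helper; the function name and defaults here are ours, not part of AIAA:

```python
from urllib.parse import urlencode
from urllib.request import urlopen  # used only in the commented fetch below


def logs_url(host="127.0.0.1", port=80, lines=100):
    """Build the AIAA /logs URL; lines=-1 asks for the whole current log file."""
    return f"http://{host}:{port}/logs?{urlencode({'lines': lines})}"


# Against a running server you could then do:
#   text = urlopen(logs_url(lines=-1)).read().decode()
```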
Tip
You don't have to be in a bash shell to see the log files.
Monitor
To track API usage, profiling, etc., developers can log in and access the monitoring dashboard for the AIAA server.
Tip
Login Username: admin; Default password: admin
Note
All URLs with 0.0.0.0 or 127.0.0.1 are only accessible when your browser and the server are on the same machine.
You can use ip a to find your server's IP address and check it remotely at http://[server_ip]:[port]/logs.
When you run docker, make sure the docker ports (e.g. http-port: 80) are mapped to the host machine for external access. You can do that with the docker option -p [LOCAL_PORT]:[REMOTE_PORT]. The local port is the host port you want to use, while the remote port is the port which AIAA listens to inside the docker container (80 for HTTP, 443 for HTTPS).
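Tying this back to the variables from the docker run example at the top of this page, a mapping that exposes AIAA's HTTP port on host port 8080 might look like this (the host port is illustrative):

```shell
# Host port you will browse to from other machines
export LOCAL_PORT=8080
# Port AIAA listens on inside the container (80 for HTTP)
export REMOTE_PORT=80
# docker run ... -p $LOCAL_PORT:$REMOTE_PORT ...  then maps 8080 -> 80,
# so the logs become reachable at http://[server_ip]:8080/logs
```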