Remote Client
The remote client allows you to create experiments using the command line instead of relying on API calls.
Note
Datasets provided in these examples are subject to the Dataset License.
Installation
$ pip3 install nvidia-transfer-learning-client==5.3.1.dev0
CLI Specs
User authentication is based on the NGC API key and can be done with the following commands:
BASE_URL=https://api-ea4.tao.ngc.nvidia.com/api/v1
NGC_API_KEY=zZYtczM5amdtdDcwNjk0cnA2bGU2bXQ3bnQ6NmQ4NjNhMDItMTdmZS00Y2QxLWI2ZjktNmE5M2YxZTc0OGyS
$ tao-client login --ngc-api-key $NGC_API_KEY --ngc-org-name ea-tlt
After authentication, the command line syntax is:
$ tao-client <network> <action> <args>
For example:
$ tao-client dino experiment-run-action --action train --id 042559ec-ab3e-438d-9c94-2cab38f76efc --specs '<json_loadable_specs_string_from_get_spec_action>'
Note
You can always use the --help argument to retrieve command usage information.
To list supported networks:
$ tao-client --help
To list supported Dino actions:
$ tao-client dino --help
Object Detection Use Case Example with CLI
Creating the training dataset
This returns a UUID representing the train dataset ID, used by later steps such as dataset_convert and train. CLI arguments:

* type - one of TAO's supported dataset types
* format - one of the formats supported for the chosen type
* cloud_details - dictionary of required cloud storage values, such as the bucket name and access credentials with write permissions

TRAIN_DATASET_ID=$(tao-client dino dataset-create --dataset_type object_detection --dataset_format coco --cloud_details '{
  "cloud_type": "self_hosted",
  "cloud_file_type": "file",
  "cloud_specific_details": {
    "url": "https://tao-detection-synthetic-dataset-dev.s3.us-west-2.amazonaws.com/tao_od_synthetic_train_coco.tar.gz"
  }
}')
echo $TRAIN_DATASET_ID

# Cloud details example for the AWS CSP:
# {
#   "cloud_type": "aws",
#   "cloud_specific_details": {
#     "cloud_region": "us-west-1",
#     "cloud_bucket_name": "bucket_name",
#     "access_key": "access_key",
#     "secret_key": "secret_key"
#   }
# }
To monitor the status of the train dataset download, run (pull) the get-metadata action:
TRAIN_DATASET_PULL_STATUS=$(tao-client dino get-metadata --id $TRAIN_DATASET_ID --job_type dataset)
echo $TRAIN_DATASET_PULL_STATUS
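The returned metadata is JSON, so jq can pick out just the field you care about instead of printing the whole document. The sketch below uses an invented sample payload and assumes a top-level `status` field; the real document comes from the get-metadata call above and its exact schema may differ:

```shell
# Hypothetical get-metadata payload; in practice this is the output of
# `tao-client dino get-metadata --id $TRAIN_DATASET_ID --job_type dataset`.
TRAIN_DATASET_PULL_STATUS='{"id": "042559ec-ab3e-438d-9c94-2cab38f76efc", "status": "pull_complete"}'

# Extract only the status field instead of eyeballing the full JSON
STATUS=$(echo "$TRAIN_DATASET_PULL_STATUS" | jq -r '.status')
echo "$STATUS"
```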
Creating the validation dataset
This returns a UUID representing the eval dataset ID, used by later steps such as dataset_convert and evaluate. The CLI arguments are the same as for the train dataset.

EVAL_DATASET_ID=$(tao-client dino dataset-create --dataset_type object_detection --dataset_format coco --cloud_details '{
  "cloud_type": "self_hosted",
  "cloud_file_type": "file",
  "cloud_specific_details": {
    "url": "https://tao-detection-synthetic-dataset-dev.s3.us-west-2.amazonaws.com/tao_od_synthetic_val_coco.tar.gz"
  }
}')
echo $EVAL_DATASET_ID
To monitor the status of the validation dataset download, run (pull) the get-metadata action:
EVAL_DATASET_PULL_STATUS=$(tao-client dino get-metadata --id $EVAL_DATASET_ID --job_type dataset)
echo $EVAL_DATASET_PULL_STATUS
Finding a base experiment
The following command lists the base experiments available for use. Pick the one that corresponds to Dino and use it when creating the experiment below.
BASE_EXP_RESPONSE=$(tao-client dino list-base-experiments --filter_params '{"network_arch": "dino"}')

# Post-processing to convert the bash output into a JSON string
BASE_EXP_RESPONSE="${BASE_EXP_RESPONSE:1:-1}"
BASE_EXP_RESPONSE=$(echo "$BASE_EXP_RESPONSE" | sed "s/'/\"/g")
BASE_EXP_RESPONSE=$(echo "$BASE_EXP_RESPONSE" | sed -e "s/None/null/g" -e "s/True/true/g" -e "s/False/false/g")
BASE_EXP_RESPONSE=$(echo "$BASE_EXP_RESPONSE" | sed 's/}, {/},\n{/g')
BASE_EXP_RESPONSE="[$BASE_EXP_RESPONSE]"

BASE_EXPERIMENT_ID=$(echo "$BASE_EXP_RESPONSE" | jq . | jq -r '[.[] | select(.network_arch == "dino") | select(.ngc_path | endswith("pretrained_dino_nvimagenet:resnet50"))][0] | .id')
echo $BASE_EXPERIMENT_ID
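The sed pipeline above turns Python-repr-style output into parseable JSON (single quotes to double quotes, `None`/`True`/`False` to JSON literals). A minimal standalone demonstration of the same transformation, using an invented sample record rather than real client output:

```shell
# A made-up record in Python repr style, shaped like one list-base-experiments entry
SAMPLE="{'id': 'abc123', 'public': True, 'description': None}"

# Same substitutions as in the pipeline above
SAMPLE=$(echo "$SAMPLE" | sed "s/'/\"/g")
SAMPLE=$(echo "$SAMPLE" | sed -e "s/None/null/g" -e "s/True/true/g" -e "s/False/false/g")

# The result now parses as JSON
ID=$(echo "$SAMPLE" | jq -r '.id')
PUBLIC=$(echo "$SAMPLE" | jq -r '.public')
echo "$ID $PUBLIC"
```

Note this textual conversion is fragile if field values themselves contain quotes or the literal words being replaced; it works here because the client's output is simple.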
Creating an experiment
This returns a UUID representing the experiment ID, used by all later steps such as train, evaluate, and inference. CLI arguments for creating an experiment:

* network_arch - one of TAO's supported network architectures
* encryption_key - encryption key for loading the base experiment
* cloud_type - aws or azure
* cloud_details - dictionary of required cloud storage values, such as the bucket name and access credentials with write permissions
Note
Modify the cloud_details keys based on your cloud bucket credentials.
EXPERIMENT_ID=$(tao-client dino experiment-create --network_arch dino --encryption_key nvidia_tlt --cloud_details '{
  "cloud_type": "aws",
  "cloud_specific_details": {
    "cloud_region": "us-west-1",
    "cloud_bucket_name": "bucket_name",
    "access_key": "access_key",
    "secret_key": "secret_key"
  }
}')
echo $EXPERIMENT_ID
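A malformed cloud_details payload is easier to catch locally than from a failed service call. One possible sketch, checking with `jq -e` (which exits nonzero when an expression is false or null) that the keys used in the example above are present; the required key set is an assumption based on that example:

```shell
# The cloud_details payload from the example above (placeholder credentials)
CLOUD_DETAILS='{ "cloud_type": "aws", "cloud_specific_details": { "cloud_region": "us-west-1", "cloud_bucket_name": "bucket_name", "access_key": "access_key", "secret_key": "secret_key" } }'

# jq -e exits nonzero if any path is missing or null, so bad payloads fail fast
if echo "$CLOUD_DETAILS" | jq -e '.cloud_type and .cloud_specific_details.cloud_bucket_name and .cloud_specific_details.access_key' > /dev/null; then
  VALID=yes
else
  VALID=no
fi
echo "$VALID"
```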
Assign datasets and a base experiment to the experiment
UPDATE_INFO=$(cat <<EOF
{
  "base_experiment": ["$BASE_EXPERIMENT_ID"],
  "train_datasets": ["$TRAIN_DATASET_ID"],
  "eval_dataset": "$EVAL_DATASET_ID",
  "inference_dataset": "$EVAL_DATASET_ID",
  "calibration_dataset": "$TRAIN_DATASET_ID"
}
EOF
)

EXPERIMENT_METADATA=$(tao-client dino patch-artifact-metadata --id $EXPERIMENT_ID --job_type experiment --update_info "$UPDATE_INFO")
echo $EXPERIMENT_METADATA | jq
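The heredoc interpolates shell variables straight into JSON, which breaks if a value ever needs escaping. An alternative sketch that builds the same document with `jq -n --arg`, which quotes values safely; the IDs here are invented placeholders standing in for the values returned by the earlier create steps:

```shell
# Placeholder IDs for this standalone sketch; in practice use the values
# captured from dataset-create, experiment-create, and list-base-experiments.
BASE_EXPERIMENT_ID="base-0001"
TRAIN_DATASET_ID="train-0001"
EVAL_DATASET_ID="eval-0001"

# jq -n constructs the JSON document and escapes each value
UPDATE_INFO=$(jq -n \
  --arg base "$BASE_EXPERIMENT_ID" \
  --arg train "$TRAIN_DATASET_ID" \
  --arg ev "$EVAL_DATASET_ID" \
  '{
    base_experiment: [$base],
    train_datasets: [$train],
    eval_dataset: $ev,
    inference_dataset: $ev,
    calibration_dataset: $train
  }')

CHECK=$(echo "$UPDATE_INFO" | jq -r '.eval_dataset')
echo "$CHECK"
```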
Key-value pairs:

* base_experiment: The base experiment ID from the "Finding a base experiment" step above
* train_datasets: The train dataset IDs
* eval_dataset: The eval dataset ID
* inference_dataset: The test dataset ID
* calibration_dataset: The train dataset ID
* docker_env_vars: Key-value pairs of MLOps settings: wandbApiKey, clearMlWebHost, clearMlApiHost, clearMlFilesHost, clearMlApiAccessKey, clearMlApiSecretKey
Training an experiment
Get specs:

Returns a JSON-loadable string of specs to be used in the train step.

TRAIN_SPECS=$(tao-client dino get-spec --action train --job_type experiment --id $EXPERIMENT_ID)
echo $TRAIN_SPECS | jq

Modify specs:

Modify the specs from the previous step, if necessary.

TRAIN_SPECS=$(echo $TRAIN_SPECS | jq -r '.train.num_epochs=10')
TRAIN_SPECS=$(echo $TRAIN_SPECS | jq -r '.train.num_gpus=2')
echo $TRAIN_SPECS | jq

Run the train action:

CLI arguments for running the train action:

* action - action to be executed
* specs - spec dictionary of the action to be executed

TRAIN_ID=$(tao-client dino experiment-run-action --action train --id $EXPERIMENT_ID --specs "$TRAIN_SPECS")
echo $TRAIN_ID

Check the status of the training job:

To monitor the status of the training job, run (pull) the get-action-status action. CLI arguments for getting action metadata:

* id - ID of the experiment
* job - job for which action metadata is to be retrieved
* job_type - experiment

tao-client dino get-action-status --job_type experiment --id $EXPERIMENT_ID --job $TRAIN_ID | jq
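Rather than re-running get-action-status by hand, the check can be wrapped in a polling loop. The sketch below stubs the status call with a fake function so it runs standalone; in practice, replace fake_get_status with the real `tao-client dino get-action-status --job_type experiment --id $EXPERIMENT_ID --job $TRAIN_ID` invocation, and note that the `status` field name and the "Done"/"Error" values are assumptions, not the service's documented states:

```shell
# Stand-in for the real get-action-status call: reports "Running" on the
# first two polls and "Done" on the third, so the loop terminates.
fake_get_status() {
  if [ "$1" -lt 3 ]; then
    echo '{"status": "Running"}'
  else
    echo '{"status": "Done"}'
  fi
}

POLL_COUNT=0
STATUS=""
while [ "$STATUS" != "Done" ] && [ "$STATUS" != "Error" ]; do
  POLL_COUNT=$((POLL_COUNT + 1))
  STATUS=$(fake_get_status "$POLL_COUNT" | jq -r '.status')
  sleep 0   # use a real interval against the live service, e.g. `sleep 30`
done
echo "Final status: $STATUS"
```

The same loop applies unchanged to the evaluate and inference jobs below by swapping in their job IDs.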
Evaluating an experiment
Get specs:

Returns a JSON-loadable string of specs to be used in the evaluate step.

EVALUATE_SPECS=$(tao-client dino get-spec --action evaluate --job_type experiment --id $EXPERIMENT_ID)
echo $EVALUATE_SPECS | jq

Modify specs:

Modify the specs from the previous step, if necessary.

Run the evaluate action:

CLI arguments for running the evaluate action:

* parent_job_id - ID of the parent job, if any
* action - action to be executed
* specs - spec dictionary of the action to be executed

EVALUATE_ID=$(tao-client dino experiment-run-action --action evaluate --id $EXPERIMENT_ID --parent_job_id $TRAIN_ID --specs "$EVALUATE_SPECS")
echo $EVALUATE_ID

Check the status of the evaluate job:

To monitor the status of the evaluation job, run (pull) the get-action-status action. CLI arguments for getting action metadata:

* id - ID of the experiment
* job - job for which action metadata is to be retrieved
* job_type - experiment

tao-client dino get-action-status --job_type experiment --id $EXPERIMENT_ID --job $EVALUATE_ID | jq
Inference for an experiment
Get specs:

Returns a JSON-loadable string of specs to be used in the inference step.

INFERENCE_SPECS=$(tao-client dino get-spec --action inference --job_type experiment --id $EXPERIMENT_ID)
echo $INFERENCE_SPECS | jq

Modify specs:

Modify the specs from the previous step, if necessary.

Run the inference action:

CLI arguments for running the inference action:

* parent_job_id - ID of the parent job, if any
* action - action to be executed
* specs - spec dictionary of the action to be executed

INFERENCE_ID=$(tao-client dino experiment-run-action --action inference --id $EXPERIMENT_ID --parent_job_id $TRAIN_ID --specs "$INFERENCE_SPECS")
echo $INFERENCE_ID

Check the status of the inference job:

To monitor the status of the inference job, run (pull) the get-action-status action. CLI arguments for getting action metadata:

* id - ID of the experiment
* job - job for which action metadata is to be retrieved
* job_type - experiment

tao-client dino get-action-status --job_type experiment --id $EXPERIMENT_ID --job $INFERENCE_ID | jq
AutoML
AutoML is a TAO Toolkit API service that automatically selects deep learning hyperparameters for a chosen model and dataset.
See the AutoML docs for more details.