AutoML user guide

Specify search space

The mechanisms behind AutoML remain the same in Clara Train 4.1 as in previous versions, although the components must be available in Clara Train 4.1. See AutoML search space definition for Clara Train for additional details.

Configuring the “search” section

You make a component’s init args searchable by adding a “search” section to the component, which is defined as a list.

Each item in the “search” list specifies the search ranges for one or more args:

  • domain - the search domain of the args. Currently lr, net, transform.

  • type - data type of the search. Currently float, enum.

  • args - list of arg names. They must be existing args in the component’s init args.

  • targets - the search range candidates. Its format depends on the “type”.

Float type

For the “float” type, “targets” is a list of two numbers - the min and max of the range. If multiple args are specified in “args”, the same search result (a single float number) is applied to all of them.
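As a sketch, a “float” search entry might look like the following. The arg name “learning_rate” and the range values here are illustrative assumptions, not taken from a real component:

```json
"search": [
    {
        "domain": "lr",
        "type": "float",
        "args": ["learning_rate"],
        "targets": [0.0001, 0.01]
    }
]
```

Here the search result is one float drawn from the range [0.0001, 0.01] and assigned to “learning_rate”.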

Enum type

For the “enum” type, “targets” is a list of choices, and each choice is a list of values, one for each arg in the args list.

Examine the following example:

"search": [
    {
        "args": ["if_use_psp", "final_activation"],
        "domain": "net",
        "type": "enum",
        "targets": [[true, "softmax"], [false, "sigmoid"], [true, "sigmoid"]]
    }
]

Two args are specified in “args” (“if_use_psp” and “final_activation”). There are three target choices:

  • Choice 0: [true, “softmax”]

  • Choice 1: [false, “sigmoid”]

  • Choice 2: [true, “sigmoid”]

If the search result is choice 2, then true is assigned to “if_use_psp”, and “sigmoid” is assigned to “final_activation”.

This supports the use case of args being related and needing to be searched together.
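The choice-to-arg mapping can be pictured with a small bash sketch (illustrative only, not Clara Train code): the i-th value of the chosen target is assigned to the i-th name in “args”.

```shell
#!/usr/bin/env bash
# Illustrative only: apply enum choice 2 ([true, "sigmoid"])
# from the example above to its two args.
args=("if_use_psp" "final_activation")
choice=("true" "sigmoid")

# Pair up each arg name with the corresponding value of the choice.
for i in "${!args[@]}"; do
    echo "${args[$i]} = ${choice[$i]}"
done
```

Running this prints `if_use_psp = true` and `final_activation = sigmoid`, mirroring the assignment described above.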

Command Line Interface (CLI) for AutoML

To start Clara Train based AutoML, simply run the AutoML script in the “commands” folder of the MMAR. It is a very simple shell script:

#!/usr/bin/env bash

my_dir="$(dirname "$0")"
. $my_dir/

echo "MMAR_ROOT set to $MMAR_ROOT"


# Data list containing all data
python -u -m medl.apps.automl.train \
   -m $MMAR_ROOT \
   --set \
   run_id=a \
   workers=0:1

The most important details to note are the settings of run_id and workers. The script sets their default values, but you can override them by specifying them explicitly on the command line.

Specify run_id

As described above, run_id represents one AutoML experiment. Each experiment must have a unique run_id. To specify a run_id, simply append the following to the command line when running the script:

run_id=<your_unique_run_id>
Specify workers

You must define how many workers to use and assign GPU devices to each worker. The syntax is this:

workers=<worker1_gpus>:<worker2_gpus>:...
For each worker, you specify a list of GPU device IDs, separated by commas. Worker specs are separated by colons.
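To make the syntax concrete, this small bash sketch (illustrative only, not part of Clara Train) splits a hypothetical spec “0,1:2,3” into per-worker GPU lists:

```shell
#!/usr/bin/env bash
# Illustrative only: a spec of two workers, worker 1 on GPUs 0,1
# and worker 2 on GPUs 2,3. Colons separate workers; commas
# separate the GPU device IDs within one worker.
spec="0,1:2,3"

# Split on colons into one entry per worker.
IFS=':' read -r -a workers <<< "$spec"
for i in "${!workers[@]}"; do
    echo "worker $((i + 1)) uses GPUs: ${workers[$i]}"
done
```

This prints `worker 1 uses GPUs: 0,1` and `worker 2 uses GPUs: 2,3`, matching the comma/colon structure described above.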

Examples for running AutoML

To run AutoML with run ID “test1” and two workers assigned to GPUs 0 and 1 respectively, append: run_id=test1 workers=0:1

AutoML worker names

Workers are named like:

W<workerId>

where workerId is an integer starting from 1 (e.g. W1, W2, etc.).


Worker names are used as a prefix to jobs’ MMAR names.

How to configure workers efficiently for AutoML?

When multiple GPUs are available, how can they be used efficiently? Should each job be executed with multiple GPUs, or should each job be assigned a single GPU? The answer is: it depends.

If multiple recommendations are produced each time by the controller, it might be more efficient to run each job with a single GPU. You still keep all GPUs busy since all jobs run in parallel, and you avoid the cross-device synchronization overhead of multi-GPU training (in the case of Horovod).

However, if the controller always produces a single recommendation each time based on the previous job score, then there would be no parallel job execution. In this case, you should arrange to run the job with multiple GPUs. Note that there may be limitations to assigning multiple GPUs to multiple workers, so a single worker with multiple GPUs may be optimal.

If the controller is implemented in a phased approach, first producing multiple recommendations and later single recommendations, it can be tricky to configure the workers optimally.

Custom name for config_automl.json

AutoML supports user-specified names for the AutoML config file via the --automlconf command line option, as highlighted in this example:

python -u -m medl.apps.automl.train \
    -m $MMAR_ROOT \
    --automlconf my_custom_config_automl.json \
    --set \
    run_id=a \
    workers=0:1 \
    traceout=both \
    trainconf=config_train_for_automl.json


my_custom_config_automl.json must be in the MMAR’s “config” folder!

When AutoML is started, the file name of the AutoML config file that is used will be printed. Make sure it is what you specified.