AutoML search space definition for Clara Train

A training config (typically named config_train.json) consists of the following definitions:

Components in Clara Train

A component defines the configuration of a Python object. Many components are used in a training config: model, loss, optimizer, transforms, metrics, image pipelines, etc.

The general format of a component config is:

{
    "name/path": "Class name or path",
    "args": {
        Component init args
    },
    attributes
},

To fully describe a component, you must specify the following:

Class information

Python objects are instantiated from classes. Specify the class either through the “name” element or, for BYOC (bring your own components), through the “path” element.

Args

The args section specifies the values of init args of the Python object. For example, for the following transform component, the args are fields and magnitude:

{
   "name": "ScaleIntensityOscillation",
   "args": {
     "fields": "image",
     "magnitude": 0.10
   }
}

Attributes

Additional attributes can be specified. Currently, the only supported attribute is “disabled”. When a component is disabled, the component is ignored in training. When not specified, the default value of this attribute is false.

An example of the “disabled” attribute is highlighted in the example below for the ScaleIntensityRange transform.

Note

Attributes are not part of the class definition.
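
To make the effect of the attribute concrete, here is a minimal Python sketch. It is not Clara Train's actual loader: the helper active_components is hypothetical, and the transform args are abbreviated. It drops every component config whose “disabled” attribute is true and treats a missing attribute as false.

```python
def active_components(components):
    """Return the component configs that are not disabled.

    Hypothetical helper for illustration only. A missing "disabled"
    key counts as false, matching the default described above.
    """
    return [c for c in components if not c.get("disabled", False)]


transforms = [
    # args abbreviated for illustration
    {"name": "ScaleIntensityRange", "args": {"fields": "image"}, "disabled": True},
    {"name": "ScaleIntensityOscillation",
     "args": {"fields": "image", "magnitude": 0.10}},
]
enabled_names = [c["name"] for c in active_components(transforms)]
```

Note that “disabled” sits next to “args”, not inside it, because attributes are not init args of the class.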

First-level parameters

In addition to components, the training config also specifies parameters that control the behavior of the training. Examples of such parameters include epochs and num_training_epoch_per_valid.

Defining the search space

You can define a search space for the args and attributes of any component by adding a “search” key to the component definition. For example, to search on the probability arg of the RandomAxisFlip transform component:

{
 "name": "RandomAxisFlip",
 "args": {
   "fields": [
     "image",
     "label"
   ],
   "probability": 0.0
 },
 "search": [
   {
     "domain": "transform",
     "args": ["probability"],
     "type": "float",
     "targets": [0.0, 1.0]
   }
 ]
}
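
As a rough illustration of what one candidate drawn from this search definition could look like, the Python sketch below samples a value for each float search arg. It assumes that, for “type”: “float”, “targets” is a [low, high] range; the function sample_float_search is hypothetical and is not part of Clara Train.

```python
import copy
import random


def sample_float_search(component, rng):
    """Return a copy of `component` with each float search arg sampled.

    Assumption: for "type": "float", "targets" is a [low, high] range.
    """
    result = copy.deepcopy(component)
    for spec in result.get("search", []):
        if spec["type"] != "float":
            continue
        low, high = spec["targets"]
        for arg in spec["args"]:
            result["args"][arg] = rng.uniform(low, high)
    return result


flip = {
    "name": "RandomAxisFlip",
    "args": {"fields": ["image", "label"], "probability": 0.0},
    "search": [
        {"domain": "transform", "args": ["probability"],
         "type": "float", "targets": [0.0, 1.0]}
    ],
}
candidate = sample_float_search(flip, random.Random(0))
```

The original config is left untouched, so the value 0.0 in “args” acts only as the default when no search is applied.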

To search attributes, use the @ notation:

"search": [
  {
    "domain": "transform",
    "args": ["@disabled"],
    "type": "enum",
    "targets": [[true], [false]]
  }
]

Notice the @ sign in front of “disabled”. This means that the search is on the “disabled” attribute, not the component’s init args.
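
The distinction between init args and attributes can be sketched in Python as follows. The function set_search_value is a hypothetical helper, not a Clara Train API: it writes a value into “args” unless the key starts with @, in which case it sets the attribute next to “args”.

```python
def set_search_value(component, key, value):
    """Write `value` to an init arg, or to an attribute if `key` starts with "@"."""
    if key.startswith("@"):
        component[key[1:]] = value  # attribute, stored beside "args"
    else:
        component.setdefault("args", {})[key] = value  # init arg
    return component


flip = {
    "name": "RandomAxisFlip",
    "args": {"fields": ["image", "label"], "probability": 0.0},
}
set_search_value(flip, "probability", 0.5)
set_search_value(flip, "@disabled", True)
```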

Arg alias and reference

You can assign an alias to any search arg (init arg or attribute) with the # notation, and then reuse the value of that search arg in another component by referring to the alias.

Example 1 - simple alias

In this example, you define the search arg probability in component RandomAxisFlip, assign the alias prob to it (by adding “#prob” after the arg name probability), and then use its value in the component RandomRotate3D via the alias:

{
 "name": "RandomAxisFlip",
 "args": {
   "fields": ["image", "label"],
   "probability": 0.0
 },
 "search": [
   {
     "domain": "transform",
     "args": ["probability#prob"],
     "type": "float",
     "targets": [0.0, 1.0]
   }
 ]
},
{
 "name": "RandomRotate3D",
 "args": {
   "fields": ["image", "label"],
   "probability": 0.0
 },
 "apply": {
   "probability": "prob"
 }
}

The effect is that the probability used for RandomAxisFlip and RandomRotate3D will be the same value in your runs.

Tip

You can apply the same alias in any number of other components.
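
A minimal Python sketch of the alias mechanism, under the assumption that it behaves as described above: split_alias and apply_aliases are hypothetical helpers, and the value 1.0 stands in for whatever value a given run selects.

```python
def split_alias(search_arg):
    """Split "probability#prob" into ("probability", "prob"); alias may be None."""
    name, _, alias = search_arg.partition("#")
    return name, (alias or None)


def apply_aliases(component, alias_values):
    """Copy aliased values into `component` according to its "apply" section."""
    for key, alias in component.get("apply", {}).items():
        value = alias_values[alias]
        if key.startswith("@"):
            component[key[1:]] = value  # attribute such as "disabled"
        else:
            component["args"][key] = value  # init arg
    return component


name, alias = split_alias("probability#prob")
alias_values = {alias: 1.0}  # stand-in for the value chosen in one run

rotate = {
    "name": "RandomRotate3D",
    "args": {"fields": ["image", "label"], "probability": 0.0},
    "apply": {"probability": "prob"},
}
apply_aliases(rotate, alias_values)
```

Because every component looks up the same alias table, any number of components can share one searched value.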

Example 2 - disable/enable two components together

In this example, the alias “d” is used for “@disabled” to make the “disabled” attributes for RandomAxisFlip and RandomRotate3D be the same:

{
 "name": "RandomAxisFlip",
 "args": {
   "fields": ["image", "label"],
   "probability": 0.0
 },
 "search": [
   {
     "domain": "transform",
     "args": ["@disabled#d"],
     "type": "enum",
     "targets": [[true], [false]]
   }
 ]
},
{
 "name": "RandomRotate3D",
 "args": {
   "fields": ["image", "label"],
   "probability": 0.0
 },
 "apply": {
   "@disabled": "d"
 }
}

Example 3 - use first-level search params to disable/enable multiple components

All first-level params can be searched. Moreover, you can define any number of additional first-level search params, as long as their names do not conflict with existing ones. All first-level params can be used in “apply”.

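In this example, the two transforms RandomAxisFlip and RandomRotate3D are made mutually exclusive. Each target assigns opposite values to the first-level search params d1 and d2, so exactly one of the two transforms is enabled in any run: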

"search": [
        {
            "domain": "transform",
            "type": "enum",
            "args": ["d1", "d2"],
            "targets": [[true, false], [false, true]]
        }
],
…

    {
      "name": "RandomAxisFlip",
      "args": {
        "fields": ["image", "label"],
        "probability": 0.0
      },
      "apply": {
        "@disabled": "d1"
      }
    },
    {
      "name": "RandomRotate3D",
      "args": {
        "fields": ["image", "label"],
        "probability": 0.0
      },
      "apply": {
        "@disabled": "d2"
      }
    }
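
To see why this search definition keeps the two transforms mutually exclusive, the Python sketch below expands an enum search with several args into its candidates. It assumes each entry of “targets” lists one value per entry of “args”, in the same order; enum_candidates is a hypothetical helper.

```python
def enum_candidates(spec):
    """Pair each target with the search args, giving one dict per candidate."""
    return [dict(zip(spec["args"], target)) for target in spec["targets"]]


spec = {
    "domain": "transform",
    "type": "enum",
    "args": ["d1", "d2"],
    "targets": [[True, False], [False, True]],
}
candidates = enum_candidates(spec)
```

Since [true, true] and [false, false] are not listed as targets, no candidate enables or disables both transforms at once.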

Example 4 - try different optimizers

By using the technique in Example 3, you can use AutoML to search over optimizers.

To do this, define “optimizer” as a list of optimizer choices. Clara Train accepts either a list or a dict (a single optimizer). When you use a list, make sure that exactly one optimizer is enabled; otherwise training will not start:

"search": [
        {
            "domain": "transform",
            "type": "enum",
            "args": ["d1", "d2"],
            "targets": [[true, false], [false, true]]
        }
],
…

    "optimizer": [
        {
            "name": "NovoGrad",
            "disabled": true,
            "apply": {
                "@disabled": "d1"
            }
        },
        {
            "name": "Adam",
            "disabled": false,
            "apply": {
                "@disabled": "d2"
            }
        }
    ],
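
The "exactly one enabled optimizer" rule can be checked ahead of time with a small Python sketch such as the one below. The function select_optimizer is hypothetical and does not reproduce Clara Train's own validation; the list shows the “disabled” values after one candidate (d1 = true, d2 = false) has been applied.

```python
def select_optimizer(optimizer_config):
    """Return the single enabled optimizer from a dict or a list of dicts."""
    if isinstance(optimizer_config, dict):
        return optimizer_config  # single-optimizer form
    enabled = [o for o in optimizer_config if not o.get("disabled", False)]
    if len(enabled) != 1:
        raise ValueError(
            f"expected exactly one enabled optimizer, found {len(enabled)}"
        )
    return enabled[0]


optimizers = [
    {"name": "NovoGrad", "disabled": True},
    {"name": "Adam", "disabled": False},
]
chosen = select_optimizer(optimizers)
```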

Tip

You can do this for model, loss, and LR policy as well.