6.1. Operators

Operators are the parts of a pipeline definition that do the actual work. At the heart of any operator declaration is its container declaration, specifically the image it declares. The image property names the container image that Clara Deploy SDK will fetch, deploy, and execute as part of the pipeline execution.

As part of the container declaration, tag and command can also be declared. The tag property determines which version of a given image Clara Deploy SDK selects from its container image repository. The command property is an array (you can tell by the [ and ] characters); it holds an ordered list of strings that is passed to the container and used as the command it executes upon starting.
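As a rough illustration of how a command array can be written, the fragment below expresses the same copy in two ways. It is a sketch that assumes each array element reaches the container as one argument, in the manner of Docker's exec form. The operator names are hypothetical, and input and output declarations are left out for brevity.

```yaml
# Sketch only: operator names are hypothetical; input/output omitted.
operators:
  # Direct form: the program and each of its arguments are separate elements.
  - name: copy-direct
    container:
      image: clara/examples/centos
      tag: centos7
      command: ['cp', '-r', '/in/.', '/out']
  # Shell form: sh -c expects the whole script as ONE element.
  - name: copy-via-shell
    container:
      image: clara/examples/centos
      tag: centos7
      command: ['sh', '-c', 'cp -r /in/. /out']
```

The shell form is only needed when the command relies on shell features such as && or variable expansion.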

Operators can also define sets of input and output volumes. These are mapped into the operator’s container at execution time. Logically, the container can treat them as local folders. Physically, they are storage managed by Clara Deploy SDK. The underlying mechanism and details are unimportant. The important part is that operators can be sequenced by declaring that the input of one operator is the output of another. When this happens, Clara Deploy SDK sequences the operator execution, ensuring the output of the first is complete and available before executing the second. Additionally, when an operator defines input from multiple upstream operators, Clara Deploy SDK ensures that all of those upstream operators have completed before starting its execution.

Let’s take a look at an example.

# More complex and informative example of an operator definition.
api-version: 0.3.0
name: we-love-operators
operators:
  # Operator 1
  - name: i-am-first
    container:
      image: clara/examples/centos
      tag: centos7
      # sh -c takes the whole script as a single string.
      command: ['sh', '-c', 'cp -r /in/. /out']
    input:
      - path: /in
    output:
      - path: /out
  # Operator 2
  - name: last-i-am
    container:
      image: clara/examples/ubuntu
      tag: 18.04
      # sh -c takes the whole script as a single string.
      command: ['sh', '-c', 'cp -r /my-in/. /my-out']
    input:
      # The input::from property tells Clara Deploy SDK that this operator
      # depends on the operator referenced by name.
      - from: i-am-first
        path: /my-in
    output:
      - path: /my-out

The first thing you should notice is that the operators property has more than a single operator declared. In this case, there are two, aptly named i-am-first and last-i-am.

First, let’s look at i-am-first. The operator declares a container with the required image property, and the optional tag and command properties. In this case, the operator is simply copying everything in its /in folder to its /out folder.

Notice that the /in folder is actually a Clara Deploy SDK managed folder, made available to the operator’s container because of the operator’s input declaration. In this case, only the path property is declared. This maps the pipeline’s input to the operator container’s /in folder as a read-only mount. The container execution can then make use of the contents of the folder in any way its author sees fit.

Next, notice that the /out folder is declared as part of the operator’s output property. This means that it too is Clara Deploy SDK managed storage, and Clara Deploy SDK will map the folder as a writable mount. It also means it is available to other operators to read from!

Looking at the second declared operator, last-i-am, notice that it too declares input and output values. The interesting part here is that last-i-am declares an input which is from i-am-first. When an input is declared like this, the second operator is determined to be dependent on the first operator because it declares that it requires data from it.

During the execution of the pipeline, Clara Deploy SDK will map the pipeline’s input as read-only to the i-am-first operator container’s /in folder, and it will map another storage resource as writable to the i-am-first operator container’s /out folder. It will then run the operator.

When the i-am-first operator completes, Clara Deploy SDK will map the storage resource it used for the i-am-first operator container’s /out folder to the last-i-am operator container’s /my-in folder as a read-only mount. Next, Clara Deploy SDK will map another storage resource to the last-i-am operator container’s /my-out folder as a writable mount. Then it will run the last-i-am operator’s container.
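The same from mechanism covers the fan-in case mentioned earlier, where one operator reads from several upstream operators. The following is a minimal sketch, assuming the input list accepts more than one from entry in the same form as the example above. The pipeline name, operator names, and mount paths are made up for illustration.

```yaml
# Sketch only: pipeline name, operator names, and paths are hypothetical.
api-version: 0.3.0
name: fan-in-sketch
operators:
  # Both upstream operators declare only a path, so each reads the pipeline's input.
  - name: left-branch
    container:
      image: clara/examples/centos
      tag: centos7
      command: ['sh', '-c', 'cp -r /in/. /out']
    input:
      - path: /in
    output:
      - path: /out
  - name: right-branch
    container:
      image: clara/examples/centos
      tag: centos7
      command: ['sh', '-c', 'cp -r /in/. /out']
    input:
      - path: /in
    output:
      - path: /out
  # Two from entries: this operator depends on both upstream operators.
  - name: merge
    container:
      image: clara/examples/centos
      tag: centos7
      command: ['sh', '-c', 'cp -r /left/. /out && cp -r /right/. /out']
    input:
      - from: left-branch
        path: /left
      - from: right-branch
        path: /right
    output:
      - path: /out
```

Under that assumption, left-branch and right-branch do not depend on each other, and merge starts only after both have completed, with their outputs mounted read-only at /left and /right.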