This section documents ETL management operations with ais etl.
As with global rebalance, dSort, and download, all ETL management commands can also be executed via
ais job and ais show, the commands that, by definition, support all AIS xactions, including AIS-ETL.
In the ais etl namespace, the commands include:
For background on AIS-ETL, getting started, working examples, and tutorials, please refer to:
Top-level ETL commands include init, stop, show, and more:
Additionally, use --help to display help for any specific command.
AIStore provides two ways to initialize an ETL using the CLI:
This method uses a YAML file that defines how your ETL should be initialized and run.
Note: CLI parameters take precedence over the spec file.
Use this option if you need full control over the ETL container’s deployment (for example, advanced init containers or health checks), or if you’re not using the AIS ETL framework.
You can define multiple ETLs in a single YAML file by separating them with the standard YAML document separator ---.
Example:
You may override fields in the spec using CLI flags such as --name, --comm-type, etc.
However, if your YAML file contains multiple ETL definitions, override flags cannot be used and will result in an error.
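The multi-document rule above can be illustrated with a minimal Python sketch. This is not how the ais CLI is implemented: the function names and the `name` key in the sample spec are assumptions made for illustration, and the splitter only recognizes a bare --- line rather than the full YAML grammar.

```python
def split_yaml_documents(text: str) -> list[str]:
    """Split text on lines that consist solely of the YAML separator '---'."""
    docs, current = [], []
    for line in text.splitlines():
        if line.rstrip() == "---":
            docs.append("\n".join(current))
            current = []
        else:
            current.append(line)
    docs.append("\n".join(current))
    # Drop empty documents (e.g. a leading separator or trailing blank lines).
    return [d for d in docs if d.strip()]


def check_overrides(spec_text: str, overrides: dict[str, str]) -> int:
    """Return the number of ETL definitions in the spec.

    Raises ValueError when override flags are combined with a spec that
    defines more than one ETL, mirroring the rule described above.
    """
    count = len(split_yaml_documents(spec_text))
    if count > 1 and overrides:
        raise ValueError(
            f"spec defines {count} ETLs; override flags apply only to a single ETL"
        )
    return count


# Hypothetical two-ETL spec; the 'name' key is a placeholder, not the real schema.
SPEC = "name: etl-one\n---\nname: etl-two\n"

if __name__ == "__main__":
    print(check_overrides(SPEC, {}))  # two definitions, no overrides: accepted
    try:
        check_overrides(SPEC, {"name": "renamed"})
    except ValueError as err:
        print("rejected:", err)
```

With a single definition in the file, the same check passes even when overrides are supplied; only the combination of several definitions and override flags is rejected.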
In such cases, you should either:
To view all currently initialized ETLs in the AIStore cluster, use either of the following commands:
or the equivalent:
This will display all available ETLs along with their current status (initializing, running, stopped, etc.).
To view detailed information about one or more ETL jobs and their configuration, use:
This command displays detailed attributes of each ETL, including:
Note: You can also use the alias ais show etl <ETL_NAME> [<ETL_NAME> ...] for the same functionality.
Use this command to view errors encountered during ETL processing—either during inline transformations or offline (bucket-to-bucket) jobs.
To list errors from inline object transformations:
Example Output:
To list errors from a specific offline ETL job, include the job ID:
Example Output:
Here, <your-custom-error> refers to the error raised from within your custom transform function (e.g., in Python).
Use the following command to view logs for a specific ETL container:
<ETL_NAME>: Name of the ETL.
[TARGET_ID] (optional): Retrieve logs from a specific target node. If omitted, logs from all targets will be aggregated.

Stops a running ETL and tears down its underlying Kubernetes resources.
You can also stop ETLs from a specification file:
A specification file may define multiple ETLs, separated by ---.
More info: ETL Pod Lifecycle
Restarts a previously stopped ETL by recreating its associated containers on each target.
You can also start ETLs from a specification file:
A specification file may define multiple ETLs, separated by ---.
More info: ETL Pod Lifecycle
Remove (delete) ETL jobs.
You can also remove ETLs from a specification file:
A specification file may define multiple ETLs, separated by ---.
More info: ETL Pod Lifecycle
Use inline transformation to process an object on-the-fly with a registered ETL. The transformed output is streamed directly to the client.
Output:
Output:
Use runtime arguments for customizable transformations. The argument is passed as a query parameter (etl_args) and must be handled by the ETL web server.
Output:
Learn more: Inline ETL Transformation
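The runtime-argument mechanism described above leaves the handling of etl_args to the ETL web server. A minimal sketch of that server-side step, using only the Python standard library, might look as follows. The request path, the "upper" argument value, and both function names are hypothetical; only the etl_args parameter name comes from this document.

```python
from urllib.parse import parse_qs, urlsplit


def extract_etl_args(request_url: str, default: str = "") -> str:
    """Return the value of the 'etl_args' query parameter, or a default."""
    query = urlsplit(request_url).query
    # parse_qs percent-decodes values and omits blank ones by default.
    values = parse_qs(query).get("etl_args")
    return values[0] if values else default


def transform(data: bytes, etl_args: str) -> bytes:
    """Hypothetical transform: uppercase the payload when etl_args == 'upper'."""
    if etl_args == "upper":
        return data.upper()
    return data


if __name__ == "__main__":
    # Hypothetical request path; a real ETL server defines its own routes.
    args = extract_etl_args("/bucket/object.txt?etl_args=upper")
    print(transform(b"hello", args))
```

Returning a default when the parameter is absent lets the same ETL serve requests with and without runtime arguments.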
For operations on selected objects, use ais object and its subcommands.
In particular, notice two highlighted subcommands:
To transform or copy a single object, you can use ais object etl (or ais object cp) or their respective aliases interchangeably, as shown below.
This command applies the ETL to the source object and stores the transformed result at the destination location.
<ETL_NAME> is the name of the registered ETL
cp indicates copy-and-transform
<SOURCE_OBJECT> is the full AIS URL of the object to transform
<DESTINATION> is either a specific object or a destination bucket (preserving source name)

For details and performance, see technical blog: Single-Object Transformation.
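The <DESTINATION> rule above (a specific object, or a bucket that preserves the source name) can be sketched in Python. This is an illustration of the stated rule, not the CLI's actual resolution logic; the function names are hypothetical, and the treatment of a trailing slash as "bucket only" is an assumption.

```python
def split_object_url(url: str) -> tuple[str, str, str]:
    """Split 'scheme://bucket/object' into (scheme, bucket, object_name)."""
    scheme, sep, rest = url.partition("://")
    if not sep or not scheme:
        raise ValueError(f"not a bucket/object URL: {url!r}")
    bucket, _, object_name = rest.partition("/")
    if not bucket:
        raise ValueError(f"missing bucket name: {url!r}")
    return scheme, bucket, object_name


def resolve_destination(source_url: str, destination_url: str) -> str:
    """Return the full destination object URL.

    If the destination names only a bucket, the source object name is
    preserved; otherwise the destination object name is used as given.
    """
    _, _, src_object = split_object_url(source_url)
    if not src_object:
        raise ValueError("source must name an object, not just a bucket")
    dst_scheme, dst_bucket, dst_object = split_object_url(destination_url)
    return f"{dst_scheme}://{dst_bucket}/{dst_object or src_object}"


if __name__ == "__main__":
    print(resolve_destination("ais://src/image.jpg", "ais://dst"))
    print(resolve_destination("ais://src/image.jpg", "ais://dst/renamed.jpg"))
```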
Use offline transformation to process entire buckets or a selected set of objects. The result is saved in a new destination bucket.
Here’s the command’s help as of v3.30:
Output:
Learn more: Offline ETL Transformation