`backends.xenna.executor`#

Module Contents#

Classes#

XennaExecutor

Executor that runs pipelines using Cosmos-Xenna. This executor provides integration between the nemo-curator pipeline framework and the Cosmos-Xenna execution engine for distributed processing.

API#

class backends.xenna.executor.XennaExecutor( config: dict[str, Any] | None = None, ignore_head_node: bool = False, )#

Bases: nemo_curator.backends.base.BaseExecutor

Executor that runs pipelines using Cosmos-Xenna. This executor provides integration between the nemo-curator pipeline framework and the Cosmos-Xenna execution engine for distributed processing.

Initialization

Initialize the executor.

Args: config (dict[str, Any], optional): Configuration dictionary with options like: - logging_interval: Seconds between status logs (default: 60) - ignore_failures: Whether to continue on failures (default: False) - max_workers_per_stage: Max workers per stage (default: None) - execution_mode: ‘streaming’ or ‘batch’ (default: ‘streaming’) - cpu_allocation_percentage: CPU allocation ratio (default: 0.95) - autoscale_interval_s: Auto-scaling interval (default: 180) ignore_head_node (bool, optional): Whether to ignore the head node (default: False) Not supported by XennaExecutor. Raises an error if set to True.

execute( stages: list[nemo_curator.stages.base.ProcessingStage], initial_tasks: list[nemo_curator.tasks.Task] | None = None, ) → list[nemo_curator.tasks.Task]#

Execute the pipeline using Cosmos-Xenna.

Args: stages (list[ProcessingStage]): The stages to run initial_tasks (list[Task], optional): The initial tasks to run. Empty list of Task is used if not provided.

Returns: list[Task]: List of output tasks from the pipeline

backends.xenna.executor#

Module Contents#

Classes#

API#

`backends.xenna.executor`#