tasks.tasks#
Module Contents#
Classes#
Task | Abstract base class for tasks in the pipeline. A task represents a batch of data to be processed. Different modalities (text, audio, video) can implement their own task types. Attributes: task_id: Unique identifier for this task. dataset_name: Name of the dataset this task belongs to. dataframe_attribute: Name of the attribute that contains the dataframe data, used for input/output validation. _stage_perf: List of performance stats for the stages this task has passed through.
Data#
API#
- tasks.tasks.EmptyTask#
‘_EmptyTask(…)’
- tasks.tasks.T#
‘TypeVar(…)’
- class tasks.tasks.Task#
Bases: abc.ABC, typing.Generic[tasks.tasks.T]
Abstract base class for tasks in the pipeline. A task represents a batch of data to be processed. Different modalities (text, audio, video) can implement their own task types.
Attributes:
task_id: Unique identifier for this task.
dataset_name: Name of the dataset this task belongs to.
dataframe_attribute: Name of the attribute that contains the dataframe data, used for input/output validation.
_stage_perf: List of performance stats for the stages this task has passed through.
- add_stage_perf(perf_stats: nemo_curator.utils.performance_utils.StagePerfStats)
Add performance stats for a stage.
- data: tasks.tasks.T#
None
- dataset_name: str#
None
- abstract property num_items: int#
Get the number of items in this task.
- task_id: str#
None
- abstractmethod validate() → bool#
Validate the task data.
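The interface documented above can be illustrated with a minimal, standalone sketch. It does not import the library: `SketchTask` is a hypothetical stand-in that mirrors the documented members (`task_id`, `dataset_name`, `data`, `num_items`, `validate()`, `add_stage_perf()`), and `LineBatchTask` is a hypothetical text-modality subclass. The dataclass layout, the plain-list storage behind `_stage_perf`, and the dictionary used in place of `StagePerfStats` are assumptions made for illustration only.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import Any, Generic, TypeVar

T = TypeVar("T")


@dataclass
class SketchTask(ABC, Generic[T]):
    """Hypothetical stand-in mirroring the documented Task interface."""

    task_id: str
    dataset_name: str
    data: T
    # Assumption: perf stats accumulate in a plain list. The documented
    # element type is StagePerfStats; Any is used to stay self-contained.
    _stage_perf: list[Any] = field(default_factory=list)

    def add_stage_perf(self, perf_stats: Any) -> None:
        """Record performance stats for one stage."""
        self._stage_perf.append(perf_stats)

    @property
    @abstractmethod
    def num_items(self) -> int:
        """Number of items in this task."""

    @abstractmethod
    def validate(self) -> bool:
        """Validate the task data."""


@dataclass
class LineBatchTask(SketchTask[list[str]]):
    """Hypothetical task whose data is a batch of text lines."""

    @property
    def num_items(self) -> int:
        return len(self.data)

    def validate(self) -> bool:
        # Valid when every item is a non-empty string.
        return all(isinstance(line, str) and len(line) > 0 for line in self.data)


task = LineBatchTask(task_id="batch-0", dataset_name="demo", data=["first", "second"])
# A dict stands in for a StagePerfStats instance here.
task.add_stage_perf({"stage": "tokenize", "seconds": 0.12})
print(task.num_items, task.validate())
```

Because `num_items` and `validate()` are abstract, instantiating the base class directly raises `TypeError`; each modality supplies its own definitions, while `add_stage_perf()` is inherited unchanged.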