nemo_curator.stages.deduplication.id_generator
Module Contents
Classes
Functions
Data
CURATOR_ID_GENERATOR_ACTOR_NAME
API
Bases: IdGeneratorBase
Ray actor version of IdGenerator.
Function used by create_id_generator_actor to make sure the actor is started.
Base IdGenerator class without Ray decorator for testing and direct use.
batch_registry
classmethod
Create an id generator actor.
Parameters:
filepath
Path to the ID generator state JSON file to load. If None, a new actor is created.
storage_options
Storage options to pass to fsspec.open.
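The load-or-create behaviour described for filepath can be illustrated with a small standalone sketch. Everything in it is an assumption made for illustration, not the nemo_curator implementation: the class SketchIdGenerator, its register_batch, to_disk and from_disk methods, the JSON layout, and the helper create_sketch_generator are all hypothetical. It also uses the built-in open instead of fsspec.open and does not start a Ray actor.

```python
import json
import os
import tempfile


class SketchIdGenerator:
    """Hypothetical stand-in for an ID generator; not the nemo_curator class."""

    def __init__(self, next_id=0, batch_registry=None):
        self.next_id = next_id
        # Maps a batch key to the inclusive (first_id, last_id) range assigned to it.
        self.batch_registry = dict(batch_registry or {})

    def register_batch(self, key, count):
        """Reserve `count` consecutive IDs for `key` and return the first one."""
        if key in self.batch_registry:
            # A batch that was already registered keeps its original range.
            return self.batch_registry[key][0]
        first = self.next_id
        self.batch_registry[key] = (first, first + count - 1)
        self.next_id += count
        return first

    def to_disk(self, filepath):
        """Write the generator state to a JSON file (assumed layout)."""
        state = {"next_id": self.next_id, "batch_registry": self.batch_registry}
        with open(filepath, "w") as f:
            json.dump(state, f)

    @classmethod
    def from_disk(cls, filepath):
        """Rebuild a generator from a JSON state file written by to_disk."""
        with open(filepath) as f:
            state = json.load(f)
        # JSON stores tuples as lists, so convert the ranges back.
        registry = {k: tuple(v) for k, v in state["batch_registry"].items()}
        return cls(next_id=state["next_id"], batch_registry=registry)


def create_sketch_generator(filepath=None):
    """Load saved state when a path is given; otherwise start from scratch."""
    if filepath is None:
        return SketchIdGenerator()
    return SketchIdGenerator.from_disk(filepath)


if __name__ == "__main__":
    gen = create_sketch_generator()
    gen.register_batch("part-0", 3)
    path = os.path.join(tempfile.mkdtemp(), "id_state.json")
    gen.to_disk(path)
    restored = create_sketch_generator(path)
    print(restored.next_id, restored.batch_registry)
```

Restoring from a state file means a restarted job continues numbering where the previous run stopped and returns the same range for a batch it has already seen, which is presumably why the real factory accepts a filepath.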