`filters.models.qe_models`#

Module Contents#

Classes#

`COMETQEModel`	Wrapper class for any COMET quality estimation model (https://github.com/Unbabel/COMET).
`PyMarianQEModel`	Wrapper class for Cometoid/PyMarian quality estimation models.
`QEModel`	Abstract model for all quality estimation models for bitext.

Data#

`COMET_IMPORT_MSG`
`PYMARIAN_IMPORT_MSG`
`comet`
`pymarian`

API#

class filters.models.qe_models.COMETQEModel(name: str, model: collections.abc.Callable, gpu: bool = False)#

Bases: filters.models.qe_models.QEModel

Wrapper class for any COMET quality estimation model (https://github.com/Unbabel/COMET).

Initialization

Args: name (str): The name of the model. It does not have to be a key of MODEL_NAME_TO_HF_PATH (defined in some subclasses), but using one is suggested. model: A loaded model object. Its type depends on the kind of model that was loaded. gpu (bool, optional): Whether inference runs on GPU. Defaults to False.

MODEL_NAME_TO_HF_PATH: Final[dict[str, str]]#: None

classmethod load_model(model_name: str, gpu: bool = False) → filters.models.qe_models.COMETQEModel#: See parent class docstring for details on functionality and arguments.

predict(input_list: list) → list[float]#

Implements quality estimation score prediction for COMET model.

Args: input_list (List): A list of bitext pairs wrapped as dictionaries.

Returns: List[float]: List of quality scores.

static wrap_qe_input(src: str, tgt: str, reverse: bool = False) → dict[str, str]#: See parent class docstring for details on functionality and arguments.
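
As a rough illustration of what a wrapped bitext pair could look like: COMET's reference-free models take dictionaries keyed by `src` and `mt`, so a wrapper along the following lines is plausible. The function name `wrap_qe_input_sketch` is hypothetical, and the way `reverse` is applied (swapping the two sides before wrapping) is an assumption rather than the library's confirmed behavior.

```python
from __future__ import annotations


def wrap_qe_input_sketch(src: str, tgt: str, reverse: bool = False) -> dict[str, str]:
    """Hypothetical stand-in for COMETQEModel.wrap_qe_input.

    Assumption: reverse=True swaps the source and target sides before wrapping.
    """
    if reverse:
        src, tgt = tgt, src
    # COMET expects the source under "src" and the translation under "mt".
    return {"src": src, "mt": tgt}


pair = wrap_qe_input_sketch("Guten Morgen", "Good morning")
reversed_pair = wrap_qe_input_sketch("Guten Morgen", "Good morning", reverse=True)
```

A list of such dictionaries matches the `input_list` shape that `predict` is documented to accept.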

filters.models.qe_models.COMET_IMPORT_MSG#: ‘To run QE filtering with COMET, you need to install from PyPI with: pip install unbabel-comet. Mor…’

filters.models.qe_models.PYMARIAN_IMPORT_MSG#: ‘To run QE filtering with Cometoid/PyMarian, you need to install PyMarian. More information at https:…’

class filters.models.qe_models.PyMarianQEModel(name: str, model: collections.abc.Callable, gpu: bool = False)#

Bases: filters.models.qe_models.QEModel

Wrapper class for Cometoid/PyMarian quality estimation models.

Initialization

Args: name (str): The name of the model. It does not have to be a key of MODEL_NAME_TO_HF_PATH (defined in some subclasses), but using one is suggested. model: A loaded model object. Its type depends on the kind of model that was loaded. gpu (bool, optional): Whether inference runs on GPU. Defaults to False.

MARIAN_CPU_ARGS#: ‘ --cpu-threads 1 -w 2000’

MARIAN_GPU_ARGS#: ‘ -w 8000 --mini-batch 32 -d 0’

MODEL_NAME_TO_HF_PATH: Final[dict[str, str]]#: None

SHARD_SIZE#: 5000

classmethod load_model(model_name: str, gpu: bool = False) → filters.models.qe_models.PyMarianQEModel#: See parent class docstring for details on functionality and arguments.

predict(input_list: list) → list[float]#

Implements quality estimation score prediction for Cometoid/PyMarian model.

Args: input_list (List): A list of bitext pairs, each wrapped by wrap_qe_input.

Returns: List[float]: List of quality scores.
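
This page does not say how `SHARD_SIZE` is used. One plausible reading is that `predict` scores its input in fixed-size chunks so that very long lists are not handed to the model at once. The sketch below illustrates that idea only; `predict_in_shards` and `score_fn` are hypothetical names and are not part of the documented API.

```python
from __future__ import annotations

from collections.abc import Callable, Sequence

SHARD_SIZE = 5000  # value documented for PyMarianQEModel.SHARD_SIZE


def predict_in_shards(
    inputs: Sequence,
    score_fn: Callable[[Sequence], list[float]],
    shard_size: int = SHARD_SIZE,
) -> list[float]:
    """Score inputs chunk by chunk and concatenate the results in order.

    Assumption: score_fn returns exactly one score per item in the chunk.
    """
    scores: list[float] = []
    for start in range(0, len(inputs), shard_size):
        shard = inputs[start:start + shard_size]
        scores.extend(score_fn(shard))
    return scores
```

Because the chunks are processed in order, the returned scores line up one-to-one with the input list.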

static wrap_qe_input(src: str, tgt: str, reverse: bool = False) → list[str]#: See parent class docstring for details on functionality and arguments.

class filters.models.qe_models.QEModel(name: str, model: collections.abc.Callable, gpu: bool = False)#

Bases: abc.ABC

Abstract model for all quality estimation models for bitext.

Initialization

Args: name (str): The name of the model. It does not have to be a key of MODEL_NAME_TO_HF_PATH (defined in some subclasses), but using one is suggested. model: A loaded model object. Its type depends on the kind of model that was loaded. gpu (bool, optional): Whether inference runs on GPU. Defaults to False.

abstract classmethod load_model(model_name: str) → filters.models.qe_models.QEModel#

An abstract method that loads the model according to a model name.

Args: model_name (str): The name of the model to be loaded. Depending on the implementation, this could be a Hugging Face model name, a path, or something else.

abstract predict(**kwargs) → list[float]#

An abstract method that calls the underlying model to produce estimated quality scores.

Returns: List[float]: List of quality scores.

abstract static wrap_qe_input(src: str, tgt: str, reverse: bool = False) → list[str]#

An abstract method that, given the source and target strings of a bitext pair, wraps them into the input format accepted by the underlying model.

Args: src (str): Source side string of the bitext. tgt (str): Target side string of the bitext. reverse (bool, optional): Whether to reverse the source and target side of the bitext. Defaults to False.
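
To show how the three abstract methods fit together, here is a minimal, self-contained sketch. `QEModelSketch` is a local stand-in that mirrors the interface documented above rather than the real class, and `LengthRatioQEModel` with its `length_ratio` scorer is a toy invented for illustration. The attribute names `_name`, `_model`, and `_gpu` are also assumptions.

```python
from __future__ import annotations

from abc import ABC, abstractmethod
from collections.abc import Callable


class QEModelSketch(ABC):
    """Local stand-in mirroring the documented QEModel interface."""

    def __init__(self, name: str, model: Callable, gpu: bool = False) -> None:
        self._name = name
        self._model = model
        self._gpu = gpu

    @classmethod
    @abstractmethod
    def load_model(cls, model_name: str) -> "QEModelSketch": ...

    @abstractmethod
    def predict(self, **kwargs) -> list[float]: ...

    @staticmethod
    @abstractmethod
    def wrap_qe_input(src: str, tgt: str, reverse: bool = False) -> list[str]: ...


def length_ratio(pair: list[str]) -> float:
    """Toy scorer: shorter length divided by longer length (0.0 if both are empty)."""
    a, b = len(pair[0]), len(pair[1])
    return min(a, b) / max(a, b) if max(a, b) else 0.0


class LengthRatioQEModel(QEModelSketch):
    """Hypothetical subclass; a real one would load and call an actual QE model."""

    @classmethod
    def load_model(cls, model_name: str) -> "LengthRatioQEModel":
        return cls(name=model_name, model=length_ratio)

    def predict(self, input_list: list) -> list[float]:
        return [self._model(pair) for pair in input_list]

    @staticmethod
    def wrap_qe_input(src: str, tgt: str, reverse: bool = False) -> list[str]:
        return [tgt, src] if reverse else [src, tgt]


model = LengthRatioQEModel.load_model("toy-length-ratio")
pairs = [model.wrap_qe_input("abcd", "ab"), model.wrap_qe_input("", "")]
scores = model.predict(pairs)
```

The flow is the same for any subclass: `load_model` builds the instance, `wrap_qe_input` formats each pair, and `predict` turns the wrapped pairs into one score each.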

filters.models.qe_models.comet#: ‘safe_import(…)’

filters.models.qe_models.pymarian#: ‘safe_import(…)’
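
The `comet` and `pymarian` entries are produced by `safe_import(…)` calls whose arguments are elided on this page. The general optional-dependency pattern they suggest, pairing a deferred import with an install message such as `COMET_IMPORT_MSG`, can be sketched as follows. `optional_import` and `MissingDependency` are hypothetical names, and this is not the library's `safe_import`.

```python
from __future__ import annotations

import importlib


class MissingDependency:
    """Placeholder that raises a helpful ImportError when an attribute is used."""

    def __init__(self, module_name: str, msg: str) -> None:
        self._module_name = module_name
        self._msg = msg

    def __getattr__(self, attr: str):
        # Only reached for attributes not set in __init__.
        raise ImportError(f"{self._module_name} is not installed. {self._msg}")


def optional_import(module_name: str, msg: str):
    """Return the module if it can be imported, else a MissingDependency placeholder."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        return MissingDependency(module_name, msg)
```

With this pattern the module itself imports cleanly even when the optional package is absent, and the install hint surfaces only when the missing package is actually used.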
