filters.models.qe_models
Module Contents
Classes
- COMETQEModel: Wrapper class for any COMET quality estimation model (https://github.com/Unbabel/COMET).
- PyMarianQEModel: Abstract model for all quality estimation models for bitext.
- QEModel: Abstract model for all quality estimation models for bitext.
Data
- COMET_IMPORT_MSG
- PYMARIAN_IMPORT_MSG
- comet
- pymarian
API
- class filters.models.qe_models.COMETQEModel(name: str, model: collections.abc.Callable, gpu: bool = False)
Bases: filters.models.qe_models.QEModel
Wrapper class for any COMET quality estimation model (https://github.com/Unbabel/COMET).
Initialization
Args:
    name (str): The name of the model. It is not directly tied to MODEL_NAME_TO_HF_PATH as defined in some subclasses, but using a key from that mapping is suggested.
    model: A loaded model object. The type of the object depends on the loaded model type.
    gpu (bool, optional): Whether inference runs on GPU. Defaults to False.
- MODEL_NAME_TO_HF_PATH: Final[dict[str, str]]
None
- classmethod load_model(model_name: str, gpu: bool = False)
See parent class docstring for details on functionality and arguments.
- predict(input_list: list) -> list[float]
Implements quality estimation score prediction for COMET models.
Args:
    input_list (List): A list of bitext pairs wrapped as dictionaries.
Returns:
    List[float]: List of quality scores.
- static wrap_qe_input(src: str, tgt: str, reverse: bool = False)
See parent class docstring for details on functionality and arguments.
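As a rough illustration of the "bitext pairs wrapped as dictionaries" that predict expects, the sketch below builds such a list by hand. Reference-free COMET models read the source text under "src" and the translation under "mt"; the helper name wrap_comet_input is hypothetical, and the exact keys and reverse handling used by COMETQEModel.wrap_qe_input are an assumption here.

```python
def wrap_comet_input(src: str, tgt: str, reverse: bool = False) -> dict[str, str]:
    """Hypothetical stand-in for COMETQEModel.wrap_qe_input (assumed behavior)."""
    if reverse:
        # Assumption: reversing simply swaps the two sides of the bitext.
        src, tgt = tgt, src
    # Reference-free COMET inputs carry the source as "src" and the
    # translation as "mt".
    return {"src": src, "mt": tgt}


pairs = [("Hello world", "Hallo Welt"), ("Good morning", "Guten Morgen")]
input_list = [wrap_comet_input(s, t) for s, t in pairs]
```

A list built this way has the shape described for predict(input_list); the scores themselves depend on the loaded model.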
- filters.models.qe_models.COMET_IMPORT_MSG
'To run QE filtering with COMET, you need to install from PyPI with: pip install unbabel-comet. Mor…'
- filters.models.qe_models.PYMARIAN_IMPORT_MSG
'To run QE filtering with Cometoid/PyMarian, you need to install PyMarian. More information at https:…'
- class filters.models.qe_models.PyMarianQEModel(name: str, model: collections.abc.Callable, gpu: bool = False)
Bases: filters.models.qe_models.QEModel
Abstract model for all quality estimation models for bitext.
Initialization
Args:
    name (str): The name of the model. It is not directly tied to MODEL_NAME_TO_HF_PATH as defined in some subclasses, but using a key from that mapping is suggested.
    model: A loaded model object. The type of the object depends on the loaded model type.
    gpu (bool, optional): Whether inference runs on GPU. Defaults to False.
- MARIAN_CPU_ARGS
' --cpu-threads 1 -w 2000'
- MARIAN_GPU_ARGS
' -w 8000 --mini-batch 32 -d 0'
- MODEL_NAME_TO_HF_PATH: Final[dict[str, str]]
None
- SHARD_SIZE
5000
- classmethod load_model(model_name: str, gpu: bool = False)
See parent class docstring for details on functionality and arguments.
- predict(input_list: list) -> list[float]
Implements quality estimation score prediction for Cometoid/PyMarian models.
Args:
    input_list (List): A list of bitext pairs wrapped as dictionaries.
Returns:
    List[float]: List of quality scores.
- static wrap_qe_input(src: str, tgt: str, reverse: bool = False) -> list[str]
See parent class docstring for details on functionality and arguments.
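The SHARD_SIZE constant suggests that inputs are scored in fixed-size chunks, although this page does not say how the class actually uses it. The following is a minimal sketch of that idea under that assumption; iter_shards, predict_in_shards, and score_fn are hypothetical names, not part of the module.

```python
from collections.abc import Callable, Iterator

SHARD_SIZE = 5000  # value documented for PyMarianQEModel.SHARD_SIZE


def iter_shards(items: list, shard_size: int = SHARD_SIZE) -> Iterator[list]:
    """Yield consecutive slices of at most shard_size items."""
    for start in range(0, len(items), shard_size):
        yield items[start:start + shard_size]


def predict_in_shards(
    input_list: list,
    score_fn: Callable[[list], list[float]],
    shard_size: int = SHARD_SIZE,
) -> list[float]:
    """Score each shard separately and concatenate the results in order.

    score_fn is a placeholder for whatever call scores one batch of pairs.
    """
    scores: list[float] = []
    for shard in iter_shards(input_list, shard_size):
        scores.extend(score_fn(shard))
    return scores
```

Sharding of this kind bounds how many pairs are handed to the scorer at once while keeping the output aligned with the input order.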
- class filters.models.qe_models.QEModel(name: str, model: collections.abc.Callable, gpu: bool = False)
Bases: abc.ABC
Abstract model for all quality estimation models for bitext.
Initialization
Args:
    name (str): The name of the model. It is not directly tied to MODEL_NAME_TO_HF_PATH as defined in some subclasses, but using a key from that mapping is suggested.
    model: A loaded model object. The type of the object depends on the loaded model type.
    gpu (bool, optional): Whether inference runs on GPU. Defaults to False.
- abstractmethod classmethod load_model(model_name: str) -> filters.models.qe_models.QEModel
An abstract method that loads the model according to a model name.
Args:
    model_name (str): The name of the model to be loaded. This could be a Hugging Face model name, a path, or something else, depending on the implementation.
- abstractmethod predict(**kwargs) -> list[float]
An abstract method that calls the underlying model to produce estimated quality scores.
Returns:
    List[float]: List of quality scores.
- abstractmethod static wrap_qe_input(src: str, tgt: str, reverse: bool = False) -> list[str]
An abstract method that, given the individual source and target strings of a bitext pair, wraps them into the format accepted by the underlying model.
Args:
    src (str): Source side string of the bitext.
    tgt (str): Target side string of the bitext.
    reverse (bool, optional): Whether to reverse the source and target side of the bitext. Defaults to False.
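To make the three abstract methods concrete, here is a minimal sketch of a subclass. So that it runs without the package installed, it subclasses QEModelSketch, a stand-in that only mirrors the signatures documented above; it is not the real QEModel. LengthRatioQEModel and its length-ratio score are likewise invented for illustration, and the private attribute names are assumptions.

```python
from abc import ABC, abstractmethod
from collections.abc import Callable


class QEModelSketch(ABC):
    """Stand-in mirroring the documented QEModel interface (not the real class)."""

    def __init__(self, name: str, model: Callable, gpu: bool = False) -> None:
        self._name = name
        self._model = model
        self._gpu = gpu

    @classmethod
    @abstractmethod
    def load_model(cls, model_name: str) -> "QEModelSketch": ...

    @abstractmethod
    def predict(self, **kwargs) -> list[float]: ...

    @staticmethod
    @abstractmethod
    def wrap_qe_input(src: str, tgt: str, reverse: bool = False) -> list[str]: ...


class LengthRatioQEModel(QEModelSketch):
    """Toy model: scores a pair by the ratio of shorter to longer string length."""

    @classmethod
    def load_model(cls, model_name: str) -> "LengthRatioQEModel":
        def score(pair: list[str]) -> float:
            a, b = len(pair[0]), len(pair[1])
            return min(a, b) / max(a, b) if max(a, b) else 0.0

        # The "loaded model" here is just a plain function.
        return cls(name=model_name, model=score)

    def predict(self, input_list: list) -> list[float]:
        return [self._model(pair) for pair in input_list]

    @staticmethod
    def wrap_qe_input(src: str, tgt: str, reverse: bool = False) -> list[str]:
        return [tgt, src] if reverse else [src, tgt]


model = LengthRatioQEModel.load_model("toy-length-ratio")
scores = model.predict([model.wrap_qe_input("abcd", "ab")])
```

The same three-step flow (load_model, wrap_qe_input for each pair, then predict on the list) is what the documented subclasses follow with real models.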
- filters.models.qe_models.comet
'safe_import(…)'
- filters.models.qe_models.pymarian
'safe_import(…)'
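The comet and pymarian entries, together with the two import messages, point to an optional-dependency pattern: the module can be imported without either backend, and a helpful message is shown only when a backend is actually needed. The arguments and behavior of safe_import are elided on this page, so the sketch below does not reproduce it; optional_import and require are hypothetical helpers built on the standard library.

```python
import importlib
from types import ModuleType
from typing import Optional


def optional_import(module_name: str) -> Optional[ModuleType]:
    """Return the imported module, or None if it is not installed."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        return None


def require(module: Optional[ModuleType], msg: str) -> ModuleType:
    """Raise ImportError with an installation hint if the module is missing."""
    if module is None:
        raise ImportError(msg)
    return module


# "json" ships with Python; the second name is deliberately nonexistent.
json_mod = optional_import("json")
missing = optional_import("no_such_module_for_this_sketch")
```

Deferring the error to the point of use means that users who only need one backend are not forced to install the other.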