core.models.huggingface.fastconformer_model

Module Contents

Classes

ParakeetHuggingFaceModel

Wrapper for Parakeet sound encoders.

Functions

get_nemo_sound_model

Load (and cache) a NeMo ASR encoder + preprocessor for the given nemo:// model id.

Data

API

core.models.huggingface.fastconformer_model._NEMO_SOUND_MODEL_CACHE: dict[str, tuple]

None

core.models.huggingface.fastconformer_model.get_nemo_sound_model(sound_model_type)

Load (and cache) a NeMo ASR encoder + preprocessor for the given nemo:// model id.
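The load-and-cache behaviour described above can be sketched as a module-level dictionary keyed by model id. This is a minimal illustration, not the module's actual code: `_CACHE`, `get_cached_model`, and `fake_loader` are hypothetical names, and the real function loads a NeMo model rather than calling an injected loader.

```python
from typing import Callable

# Hypothetical stand-in for a module-level cache such as _NEMO_SOUND_MODEL_CACHE.
_CACHE: dict[str, tuple] = {}


def get_cached_model(model_id: str, loader: Callable[[str], tuple]) -> tuple:
    """Return the (encoder, preprocessor) pair for model_id, loading it at most once."""
    if model_id not in _CACHE:
        # First request for this id: run the (expensive) loader and remember the result.
        _CACHE[model_id] = loader(model_id)
    return _CACHE[model_id]


# Fake loader used only for illustration; it records how often it is called.
load_calls: list[str] = []


def fake_loader(model_id: str) -> tuple:
    load_calls.append(model_id)
    return (f"encoder<{model_id}>", f"preprocessor<{model_id}>")


first = get_cached_model("nemo://example-model", fake_loader)
second = get_cached_model("nemo://example-model", fake_loader)
```

With this pattern the second call returns the same tuple object as the first, and the loader runs only once per model id.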

class core.models.huggingface.fastconformer_model.ParakeetHuggingFaceModel(config)

Bases: megatron.core.models.huggingface.HuggingFaceModule

Wrapper for Parakeet sound encoders.

Supports two backends, selected by config.sound_model_type prefix:

  • nemo://<model_name> loads a NeMo ASR encoder + preprocessor.

  • hf://<model_name> loads the upstream Hugging Face FastConformer model via transformers.AutoModel / AutoFeatureExtractor.
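Prefix-based backend selection of this kind can be sketched as follows. The helper name `split_sound_model_type` and the `ValueError` for unrecognised prefixes are assumptions made for illustration; the source does not say how the class handles an unknown prefix.

```python
def split_sound_model_type(sound_model_type: str) -> tuple[str, str]:
    """Split a 'nemo://<name>' or 'hf://<name>' string into (backend, model_name)."""
    for prefix, backend in (("nemo://", "nemo"), ("hf://", "hf")):
        if sound_model_type.startswith(prefix):
            # Strip the scheme-like prefix and keep the rest as the model name.
            return backend, sound_model_type[len(prefix):]
    # Assumption: reject anything without a known prefix.
    raise ValueError(f"Unsupported sound_model_type: {sound_model_type!r}")


backend, model_name = split_sound_model_type("hf://some-org/some-model")
```

Here `backend` is `"hf"` and `model_name` is `"some-org/some-model"`; a `nemo://` id is split the same way.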

Initialization

_model_dtype() → torch.dtype

Return the dtype of the encoder’s first parameter (defaults to bf16).

_sampling_rate() → int

Return the sampling rate the feature extractor expects (default 16 kHz).

forward(*args, **kwargs)

Forward pass returning (hidden_states, lengths).

Parameters:
  • args[0] – Sound clips tensor.

  • args[1] – Sound length tensor (used by NeMo backend; ignored for HF).
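The positional calling convention above can be illustrated with a toy stand-in. Everything here is hypothetical: `ToyEncoder` uses plain Python lists instead of tensors, its "encoding" is an identity copy, and deriving lengths from the clips in the HF-like branch is an assumption, since the source only says that `args[1]` is ignored there.

```python
class ToyEncoder:
    """Toy stand-in that mimics the forward(*args, **kwargs) calling convention."""

    def __init__(self, backend: str):
        self.backend = backend  # "nemo" or "hf"

    def forward(self, *args, **kwargs):
        clips = args[0]  # args[0]: sound clips
        if self.backend == "nemo":
            lengths = args[1]  # args[1]: sound lengths, used by the NeMo-like branch
        else:
            # HF-like branch: args[1] is ignored; lengths are derived here (assumption).
            lengths = [len(clip) for clip in clips]
        hidden_states = [list(clip) for clip in clips]  # placeholder "encoding"
        return hidden_states, lengths


clips = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]
nemo_hidden, nemo_lengths = ToyEncoder("nemo").forward(clips, [3, 2])
hf_hidden, hf_lengths = ToyEncoder("hf").forward(clips, [3, 2])
```

In both branches the caller unpacks a two-element tuple, matching the documented `(hidden_states, lengths)` return value.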