Featured Community Checkpoints
Community fine-tunes built on NVIDIA NeMo ASR checkpoints and published on Hugging Face. For NVIDIA-published checkpoints, see ASR Model Checkpoints and the NVIDIA Hugging Face organization.
Note
Community checkpoints are maintained by their authors, not by the NeMo team. Use each model’s Hugging Face model card and the framework project linked below for up-to-date setup and inference instructions.
| Checkpoint | What’s special | Framework |
|---|---|---|
| | SALT multilingual ASR for 10 East African languages. Hybrid TDT+CTC FastConformer (600M), fine-tuned from parakeet-tdt-0.6b-v3. | NeMo |
| | German medical documentation ASR (PEFT). WER 11.73% → 3.28% on a 122-sample medical eval set. | NeMo |
| | ATC English ASR on jacktol/ATC-ASR-Dataset. Test WER 5.99%. | NeMo |
| | Swahili ASR fine-tune on ~5 hours of Common Voice data. | NeMo |
| | German medical/neurology ASR for Apple Silicon. WER 1.04% on the author’s medical validation set. | MLX |
| | Quantised Parakeet TDT (Q4_K ~467 MB). 25 EU languages, word-level timestamps. | GGUF (CrispASR) |
| | Quantised Canary 1B (Q4_K ~673 MB). Multilingual ASR and speech translation. | GGUF (CrispASR) |
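The WER figures quoted in the table are word error rates. As a rough illustration of what those numbers mean, the sketch below computes WER as the word-level edit distance divided by the number of reference words, and then the relative reduction implied by the 11.73% → 3.28% entry. This is a minimal sketch, not the scoring code used by any of the listed authors: the function names and the sample sentences are hypothetical, and real evaluations usually apply text normalization (casing, punctuation, number formatting) that is skipped here and that varies between model cards.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def relative_wer_reduction(baseline_wer: float, new_wer: float) -> float:
    """Fraction of the baseline error removed by the new model."""
    return (baseline_wer - new_wer) / baseline_wer


# Hypothetical sentences: one of five reference words is dropped.
print(word_error_rate("the patient has a fever", "the patient has fever"))

# Figures taken from the German medical documentation row.
print(f"{relative_wer_reduction(11.73, 3.28):.1%}")
```

Under these assumptions, the 11.73% → 3.28% entry corresponds to roughly a 72% relative reduction in WER. Because test sets and normalization differ, WER values from different rows are not directly comparable.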
Submit a Community Checkpoint
To suggest a checkpoint for this page, open a GitHub issue with the Hugging Face model link, NeMo base checkpoint, task, languages, evaluation results, and inference framework.
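One way to make sure a submission covers all six requested items is to assemble the issue text from a small structure that rejects empty fields. The sketch below is only an illustration: the class, the field names, and the placeholder values are hypothetical, and the project does not prescribe this format.

```python
from dataclasses import dataclass, fields


@dataclass
class CheckpointSubmission:
    """The six items requested for a submission (field names are hypothetical)."""
    hugging_face_model_link: str
    nemo_base_checkpoint: str
    task: str
    languages: str
    evaluation_results: str
    inference_framework: str


def render_issue_body(submission: CheckpointSubmission) -> str:
    """Return one bullet per field, raising if any field is blank."""
    lines = []
    for field in fields(submission):
        value = getattr(submission, field.name).strip()
        if not value:
            raise ValueError(f"missing field: {field.name}")
        label = field.name.replace("_", " ")
        lines.append(f"- {label}: {value}")
    return "\n".join(lines)


# Placeholder values only; replace each with the details of your own model.
example = CheckpointSubmission(
    hugging_face_model_link="<link to your Hugging Face model>",
    nemo_base_checkpoint="parakeet-tdt-0.6b-v3",
    task="ASR",
    languages="<languages covered>",
    evaluation_results="<metric, value, and evaluation set>",
    inference_framework="NeMo",
)
print(render_issue_body(example))
```

The printed bullets can be pasted into the issue description and edited by hand.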