Tutorials

The best way to get started with NeMo is to work through one of our tutorials.

Most NeMo tutorials can be run on Google Colab.

To run a tutorial:

  1. Click the Colab link (see table below).

  2. Connect to an instance with a GPU. For example, click Runtime > Change runtime type and select GPU for the hardware accelerator.
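Once the runtime is connected, it can help to confirm that a GPU is actually attached before running a tutorial. The following is a minimal sketch, not part of NeMo: the helper name `list_gpus` is hypothetical, and it assumes the NVIDIA driver utility `nvidia-smi` is on the PATH whenever a GPU is present, which is typically the case on Colab GPU runtimes.

```python
import shutil
import subprocess


def list_gpus():
    """Return one description line per visible NVIDIA GPU, or [] if none are found.

    Hypothetical helper for illustration; it only inspects `nvidia-smi` output.
    """
    # nvidia-smi ships with the NVIDIA driver; if it is not on PATH,
    # we assume the runtime has no GPU attached.
    if shutil.which("nvidia-smi") is None:
        return []
    try:
        result = subprocess.run(
            ["nvidia-smi", "-L"],  # -L lists the GPUs attached to the system
            capture_output=True,
            text=True,
            timeout=30,
        )
    except (OSError, subprocess.TimeoutExpired):
        return []
    if result.returncode != 0:
        return []
    # Keep only the per-device lines, which start with "GPU <index>:".
    return [
        line.strip()
        for line in result.stdout.splitlines()
        if line.strip().startswith("GPU ")
    ]


gpus = list_gpus()
if gpus:
    print("GPU runtime detected:")
    for gpu in gpus:
        print("  " + gpu)
else:
    print("No GPU detected. In Colab, click Runtime > Change runtime type and select GPU.")
```

If the cell reports no GPU, change the runtime type as described in step 2 and run the check again.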

Tutorials

| Domain | Title | GitHub URL |
| --- | --- | --- |
| General | Getting Started: Exploring Nemo Fundamentals | NeMo Fundamentals |
| General | Getting Started: Sample Conversational AI application | Audio translator example |
| General | Getting Started: Voice swap application | Voice swap example |
| General | Exploring NeMo Model Construction | NeMo Models |
| General | Exploring NeMo Adapters | NeMo Adapters |
| General | Publishing NeMo models on Hugging Face Hub | NeMo Models on HF Hub |
| ASR | ASR with NeMo | ASR with NeMo |
| ASR | ASR with Subword Tokenization | ASR with Subword Tokenization |
| ASR | Offline ASR Inference with Beam Search and External Language Model Rescoring | Offline ASR |
| ASR | Online ASR inference with Microphone | Online ASR Microphone |
| ASR | Fine-tuning CTC Models on New Languages | ASR CTC Language Fine-Tuning |
| ASR | Intro to Transducers | Intro to Transducers |
| ASR | ASR with Transducers | ASR with Transducers |
| ASR | ASR with Adapters | ASR with Adapters |
| ASR | Speech Commands | Speech Commands |
| ASR | Online and Offline Speech Commands Inference | Online Offline Microphone Speech Commands |
| ASR | Voice Activity Detection (VAD) | Voice Activity Detection |
| ASR | Online and Offline VAD Inference | Online Offline Microphone VAD |
| ASR | Speaker Recognition and Verification | Speaker Recognition and Verification |
| ASR | Speaker Diarization Inference | Speaker Diarization Inference |
| ASR | ASR with Speaker Diarization | ASR with Speaker Diarization |
| ASR | Online Noise Augmentation | Online Noise Augmentation |
| ASR | ASR for Telephony Speech | ASR for Telephony Speech |
| ASR | Streaming inference for ASR | Streaming inference |
| ASR | Buffered Transducer inference for ASR | Buffered Transducer inference |
| ASR | Buffered Transducer inference with LCS Merge Algorithm | Buffered Transducer inference with LCS Merge |
| ASR | Offline ASR with VAD for CTC models | Offline ASR with VAD for CTC models |
| ASR | Self-supervised pre-training for ASR | Self-supervised Pre-training for ASR |
| NLP | Using Pretrained Language Models for Downstream Tasks | Pretrained Language Models for Downstream Tasks |
| NLP | Exploring NeMo NLP Tokenizers | NLP Tokenizers |
| NLP | Text Classification (Sentiment Analysis) with BERT | Text Classification (Sentiment Analysis) |
| NLP | Question Answering with SQuAD | Question Answering Squad |
| NLP | Token Classification (Named Entity Recognition) | Token Classification: Named Entity Recognition |
| NLP | Joint Intent Classification and Slot Filling | Joint Intent and Slot Classification |
| NLP | GLUE Benchmark | GLUE Benchmark |
| NLP | Punctuation and Capitalization | Punctuation and Capitalization |
| NLP | Entity Linking | Entity Linking |
| NLP | Named Entity Recognition - BioMegatron | Named Entity Recognition - BioMegatron |
| NLP | Relation Extraction - BioMegatron | Relation Extraction - BioMegatron |
| NLP | P-Tuning/Prompt-Tuning | P-Tuning/Prompt-Tuning |
| TTS | Speech Synthesis | TTS Inference |
| TTS | Speech Synthesis | FastPitch Duration and Pitch Control |
| TTS | Speech Synthesis | Tacotron2 Training |
| TTS | Speech Synthesis | TalkNet Training |
| TTS | Speech Synthesis | FastPitch Fine-Tuning |
| Tools | CTC Segmentation | CTC Segmentation |
| Text Processing | Text Normalization and Inverse Normalization for ASR and TTS | Text Normalization |
| Text Processing | Inverse Text Normalization for ASR - Thutmose Tagger | Inverse Text Normalization with Thutmose Tagger |
| Text Processing | Constructing Normalization Grammars with WFSTs | WFST Tutorial |