Models

Currently, NeMo's speaker diarization pipeline uses the MarbleNet model for Voice Activity Detection (VAD) and the SpeakerNet model for speaker embedding extraction.
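The two-stage structure described above (VAD first, then one speaker embedding per detected speech segment) can be sketched in plain Python. This is only an illustration of the data flow, not NeMo code: `toy_vad`, `toy_embedding`, and `diarization_front_end` are hypothetical names, the energy threshold stands in for MarbleNet, and the (mean, peak) vector stands in for a learned SpeakerNet embedding.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Segment = Tuple[int, int]  # (start_frame, end_frame), end exclusive


def toy_vad(frame_energies: Sequence[float], threshold: float = 0.5) -> List[Segment]:
    """Hypothetical stand-in for a VAD model such as MarbleNet.

    A frame counts as speech when its energy exceeds `threshold`;
    consecutive speech frames are merged into one segment.
    """
    segments: List[Segment] = []
    start: Optional[int] = None
    for i, energy in enumerate(frame_energies):
        if energy > threshold and start is None:
            start = i  # speech begins
        elif energy <= threshold and start is not None:
            segments.append((start, i))  # speech ends
            start = None
    if start is not None:  # speech runs to the end of the input
        segments.append((start, len(frame_energies)))
    return segments


def toy_embedding(frames: Sequence[float]) -> List[float]:
    """Hypothetical stand-in for a speaker embedding model such as SpeakerNet.

    Returns a 2-dimensional (mean, peak) vector instead of a learned embedding.
    """
    return [sum(frames) / len(frames), max(frames)]


def diarization_front_end(
    frame_energies: Sequence[float],
    vad: Callable[[Sequence[float]], List[Segment]] = toy_vad,
    embed: Callable[[Sequence[float]], List[float]] = toy_embedding,
) -> List[Tuple[Segment, List[float]]]:
    """Run VAD, then extract one embedding per detected speech segment."""
    segments = vad(frame_energies)
    return [(seg, embed(frame_energies[seg[0]:seg[1]])) for seg in segments]


energies = [0.1, 0.9, 0.8, 0.2, 0.1, 0.7, 0.6, 0.7, 0.0]
result = diarization_front_end(energies)
```

With this toy input, the VAD stage yields the segments `(1, 3)` and `(5, 8)`, and each segment receives its own embedding. In a real pipeline the embeddings would then be passed on to later diarization stages, which this sketch does not cover.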