Models
======

Currently, NeMo's Speaker Diarization pipeline uses the `MarbleNet <../speech_classification/models.html#marblenet-vad>`__ model for Voice Activity Detection (VAD) and the `SpeakerNet <../speaker_recognition/models.html#speakernet>`__ model for Speaker Embedding Extraction.