Speech Data Processor

Speech Data Processor (SDP) is a toolkit that makes it easy to:
  1. write the code to process a new dataset with as little boilerplate as possible.
  2. share the steps used to process a speech dataset.
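
To make the two goals above concrete, here is a minimal sketch of the general idea: small reusable steps applied in order to manifest entries. It is not SDP's actual API. The names `lowercase_text`, `drop_long_audio`, `run_pipeline`, and `PIPELINE`, the sample entries, and the 20-second threshold are all hypothetical and chosen only for illustration; the entry fields follow the NeMo manifest convention of `audio_filepath`, `duration`, and `text`.

```python
import json
from typing import Callable, Dict, Iterable, List, Optional

# One manifest entry, e.g. {"audio_filepath": ..., "duration": ..., "text": ...}
Entry = Dict[str, object]
# A step takes one entry and returns the modified entry, or None to drop it.
Step = Callable[[Entry], Optional[Entry]]


def lowercase_text(entry: Entry) -> Optional[Entry]:
    """Hypothetical step: normalize the transcript to lowercase."""
    return {**entry, "text": str(entry["text"]).lower()}


def drop_long_audio(max_duration: float) -> Step:
    """Hypothetical step factory: drop entries longer than max_duration seconds."""
    def step(entry: Entry) -> Optional[Entry]:
        return entry if float(entry["duration"]) <= max_duration else None
    return step


def run_pipeline(entries: Iterable[Entry], steps: List[Step]) -> List[Entry]:
    """Apply every step in order to each entry, skipping dropped entries."""
    result: List[Entry] = []
    for entry in entries:
        current: Optional[Entry] = entry
        for step in steps:
            current = step(current)
            if current is None:
                break  # a step dropped this entry; stop processing it
        if current is not None:
            result.append(current)
    return result


# The ordered list of steps is the part that can be shared and reused.
PIPELINE: List[Step] = [lowercase_text, drop_long_audio(max_duration=20.0)]

# Made-up sample entries, for illustration only.
raw = [
    {"audio_filepath": "a.wav", "duration": 3.2, "text": "Hello World"},
    {"audio_filepath": "b.wav", "duration": 45.0, "text": "Too Long"},
]
processed = run_pipeline(raw, PIPELINE)
for entry in processed:
    print(json.dumps(entry))  # one JSON object per line, as in a manifest file
```

In this sketch the dataset-specific code shrinks to the step list, and that list is what one would share with others. SDP's real processors and configuration format are described in its own documentation.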

SDP is hosted on GitHub at [NVIDIA/NeMo-speech-data-processor](https://github.com/NVIDIA/NeMo-speech-data-processor).

To learn more about SDP, please check the [documentation](https://nvidia.github.io/NeMo-speech-data-processor/).

By NVIDIA CORPORATION

© Copyright 2021-2023 NVIDIA Corporation & Affiliates. All rights reserved.

Last updated on Dec 01, 2023.