Speech Data Explorer
====================

.. note::

    The tool can be found under `NeMo/tools/speech_data_explorer <https://github.com/NVIDIA/NeMo/tree/main/tools/speech_data_explorer>`__.

A `Dash <https://plotly.com/dash/>`__-based tool for interactive exploration of ASR/TTS datasets.

Features:

* dataset statistics (alphabet, vocabulary, duration-based histograms)
* navigation across the dataset (sorting, filtering)
* inspection of individual utterances (waveform, spectrogram, audio player)
* error analysis (Word Error Rate, Character Error Rate, Word Match Rate, Mean Word Accuracy, diff)
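
As a rough illustration of the first two error metrics listed above, Word Error Rate and Character Error Rate can be defined as the Levenshtein (edit) distance between the reference and the ASR transcript, divided by the reference length in words or characters. The sketch below is a minimal, self-contained version of that textbook definition. The function names are hypothetical, and the tool itself may compute or normalize these metrics differently (for example, reporting percentages).

.. code:: python

    def edit_distance(ref, hyp):
        """Levenshtein distance between two sequences (insert, delete, substitute)."""
        prev = list(range(len(hyp) + 1))  # distances from an empty reference prefix
        for i, r in enumerate(ref, start=1):
            curr = [i]  # deleting i reference items to reach an empty hypothesis
            for j, h in enumerate(hyp, start=1):
                cost = 0 if r == h else 1
                curr.append(min(prev[j] + 1,          # deletion
                                curr[j - 1] + 1,      # insertion
                                prev[j - 1] + cost))  # match or substitution
            prev = curr
        return prev[-1]


    def word_error_rate(reference, hypothesis):
        """Word-level edit distance divided by the number of reference words."""
        ref_words = reference.split()
        if not ref_words:
            raise ValueError("reference must contain at least one word")
        return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


    def char_error_rate(reference, hypothesis):
        """Character-level edit distance divided by the reference length."""
        if not reference:
            raise ValueError("reference must not be empty")
        return edit_distance(reference, hypothesis) / len(reference)

In this sketch, a three-word reference with one substituted word yields a Word Error Rate of 1/3.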

Make sure the tool's requirements are installed, then run:

.. code::

    python data_explorer.py path_to_manifest.json


The JSON manifest file should contain the following fields:

* ``audio_filepath`` (path to audio file)
* ``duration`` (duration of the audio file in seconds)
* ``text`` (reference transcript)

Error analysis requires a ``pred_text`` field (ASR transcript) for all utterances.

Any additional field will be parsed and displayed in the 'Samples' tab.
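
The manifest fields described above can be put together with a short script. The sketch below assumes the usual NeMo manifest convention of one JSON object per line. The file paths, durations, transcripts, the extra ``speaker`` field, and the helper names ``write_manifest`` and ``read_manifest`` are placeholders for illustration only.

.. code:: python

    import json

    # Hypothetical entries: paths, durations and transcripts are placeholders.
    entries = [
        {
            "audio_filepath": "audio/utt_0001.wav",
            "duration": 3.2,               # seconds
            "text": "hello world",         # reference transcript
            "pred_text": "hello word",     # ASR transcript, needed for error analysis
            "speaker": "spk_01",           # additional field (assumed name)
        },
        {
            "audio_filepath": "audio/utt_0002.wav",
            "duration": 1.7,
            "text": "good morning",
            "pred_text": "good morning",
            "speaker": "spk_02",
        },
    ]


    def write_manifest(entries, path):
        """Write one JSON object per line (assumed manifest layout)."""
        with open(path, "w", encoding="utf-8") as f:
            for entry in entries:
                f.write(json.dumps(entry, ensure_ascii=False) + "\n")


    def read_manifest(path):
        """Read the manifest back, skipping blank lines."""
        with open(path, encoding="utf-8") as f:
            return [json.loads(line) for line in f if line.strip()]

Calling ``write_manifest(entries, "path_to_manifest.json")`` would produce a file that can be passed to ``data_explorer.py`` as shown earlier, provided the audio paths point to real files.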