Command-line Clients#

Data Center

The Riva Speech Docker image contains sample command-line drivers for the Riva Speech AI services. Pull and run the container by using the following commands. The client expects that a Riva server is running with models deployed, and all command-line drivers accept an optional argument to specify the location of the server. No GPU is required to run the sample clients.

docker pull nvcr.io/nvidia/riva/riva-speech:2.8.0
docker run -it --rm nvcr.io/nvidia/riva/riva-speech:2.8.0

Embedded

The sample command-line clients are present in the Riva Speech AI server container. Refer to the Quick Start Guide for steps on how to launch the Riva Speech AI server container.
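Once you are inside the container, it can be useful to confirm that the sample binaries resolve on the PATH before trying the examples that follow. The snippet below is a minimal sketch, not part of the Riva tooling: the helper name find_missing_clients is hypothetical, and the list of binaries is simply the set of client names used on this page.

```python
import shutil

# Binary client names as they appear in the examples on this page.
RIVA_BINARY_CLIENTS = [
    "riva_streaming_asr_client",
    "riva_asr_client",
    "riva_nlp_classify_tokens",
    "riva_nlp_qa",
    "riva_nlp_punct",
    "riva_tts_client",
    "riva_tts_perf_client",
]


def find_missing_clients(names=RIVA_BINARY_CLIENTS):
    """Return the names that cannot be resolved on the current PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    missing = find_missing_clients()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All sample client binaries were found.")
```

An empty result only means the executables are present; it does not check that a Riva server is reachable or that models are deployed.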

Speech Recognition#

Both binary and Python clients are included in the Docker image with options to perform ASR inference in streaming as well as offline (nonstreaming) mode.

Binary Streaming Example#

Run the ASR streaming client, where --audio_file specifies the audio file that is to be transcribed. Other options can be found in riva_streaming_asr_client --help.

riva_streaming_asr_client --audio_file /opt/riva/wav/test/1272-135031-0001.wav
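When substituting your own recording for the sample file, it may help to look at the WAV header first. The following sketch uses only the Python standard library; describe_wav is a hypothetical helper, and it assumes the file is an uncompressed PCM WAV that the wave module can read. It makes no claim about which formats the Riva server accepts.

```python
import wave


def describe_wav(path):
    """Read basic properties from a PCM WAV header using the standard library."""
    with wave.open(str(path), "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_rate_hz": rate,
            "bits_per_sample": wav.getsampwidth() * 8,
            # Duration is the frame count divided by frames per second.
            "duration_s": frames / rate if rate else 0.0,
        }
```

For example, describe_wav("/opt/riva/wav/test/1272-135031-0001.wav") would report the channel count, sample rate, bit depth, and duration of the sample file used above.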

Binary Offline/Batch (nonstreaming) Example#

Run the binary ASR offline client, where --audio_file specifies the audio file that is to be transcribed. Other options can be found in riva_asr_client --help.

riva_asr_client --audio_file /opt/riva/wav/test/1272-135031-0001.wav
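To transcribe every file in a directory rather than a single file, one option is to generate one client invocation per WAV file. This is a sketch under stated assumptions: build_asr_commands and run_all are hypothetical helpers, and only the --audio_file option shown above is used.

```python
import subprocess
from pathlib import Path


def build_asr_commands(wav_dir, streaming=False):
    """Build one client command (as an argv list) per .wav file in wav_dir."""
    client = "riva_streaming_asr_client" if streaming else "riva_asr_client"
    wav_files = sorted(Path(wav_dir).glob("*.wav"))
    return [[client, "--audio_file", str(path)] for path in wav_files]


def run_all(commands):
    """Run each command in turn; raises CalledProcessError on the first failure."""
    for command in commands:
        subprocess.run(command, check=True)
```

Calling run_all(build_asr_commands("/opt/riva/wav/test")) inside the container would process the files sequentially; passing streaming=True switches to the streaming client.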

Python Streaming Example#

Run the Python ASR streaming client, where --input-file specifies the audio file that is to be transcribed. Other options can be found in the riva_streaming_asr_client.py script.

python riva_streaming_asr_client.py --input-file=/opt/riva/wav/test/1272-135031-0001.wav

The transcribe_mic.py example is the preferred, cross-platform way to interact with Riva. Using a microphone on an arbitrary remote system, running any operating system on any hardware platform, is also recommended.

python transcribe_mic.py --input-device <device_id>

The <device_id> of your audio input device can be obtained by running:

python transcribe_mic.py --list-devices

Other Python ASR clients are also available.

  1. transcribe_file.py transcribes audio files in streaming mode.

  2. transcribe_file_offline.py transcribes audio files in offline mode. You can play audio while transcribing using PyAudio.

Binary Offline Speaker Diarization Example#

Run the binary ASR offline client, where --audio_file specifies the audio file to be diarized and --speaker_diarization is the flag to enable diarization.

riva_asr_client --audio_file /opt/riva/wav/en-US_sample.wav --speaker_diarization=true

Python Offline Speaker Diarization Example#

Run the Python ASR offline client, where --input-file specifies the audio file to be diarized and --speaker-diarization is the flag to enable diarization.

python transcribe_file_offline.py --input-file /opt/riva/wav/en-US_sample.wav --speaker-diarization
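The two diarization examples differ in flag spelling: the binary client uses underscores and an explicit =true, while the Python client uses hyphens and a bare switch. The sketch below captures that difference in one place; diarization_command is a hypothetical helper that only reproduces the two command lines shown above.

```python
import shlex


def diarization_command(audio_path, client="binary"):
    """Return the argv list for an offline diarization run with either client."""
    if client == "binary":
        # Binary client: underscore-style flags, boolean given as =true.
        return [
            "riva_asr_client",
            "--audio_file", str(audio_path),
            "--speaker_diarization=true",
        ]
    if client == "python":
        # Python client: hyphen-style flags, diarization is a bare switch.
        return [
            "python", "transcribe_file_offline.py",
            "--input-file", str(audio_path),
            "--speaker-diarization",
        ]
    raise ValueError(f"unknown client kind: {client!r}")


if __name__ == "__main__":
    for kind in ("binary", "python"):
        print(shlex.join(diarization_command("/opt/riva/wav/en-US_sample.wav", kind)))
```

Running the script prints both command lines in copy-pastable form.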

Natural Language Processing#

Both binary and Python clients are included in the Docker image for supported NLP tasks.

Binary NER Example#

Run the Token Classification Model (NER) client, where --queries specifies the file containing the queries for which to provide token labels. Other options can be found in riva_nlp_classify_tokens --help.

riva_nlp_classify_tokens --queries=/work/test_files/nlu/queries.txt

Binary QA Example#

Run the binary Question Answering client, where --questions specifies the file containing questions and --contexts specifies the file containing the context paragraph. Other options can be found in riva_nlp_qa --help.

riva_nlp_qa --questions=/work/test_files/nlu/qa_questions.txt --contexts=/work/test_files/nlu/qa_contexts.txt

Python QA Example#

Run the Python Question Answering client. Pass --context and --query to the qa_client.py script to try different Question Answering examples.

python qa_client.py

Python Intent Slot Example#

Run the Python Intent Slot Classification client, where --query specifies the query string.

python intentslot_client.py --query "your query" --model riva_intent_weather
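To try several queries against the same model, the command above can be generated once per query. This is a minimal sketch; intent_slot_commands is a hypothetical helper, and the default model name is the one used in the example above.

```python
import shlex


def intent_slot_commands(queries, model="riva_intent_weather",
                         script="intentslot_client.py"):
    """Build one intent/slot client command (as an argv list) per query string."""
    return [
        ["python", script, "--query", query, "--model", model]
        for query in queries
    ]


if __name__ == "__main__":
    for command in intent_slot_commands(["Is it raining?", "How hot is it today?"]):
        # shlex.join adds the shell quoting needed for queries with spaces.
        print(shlex.join(command))
```

Because each command is an argv list, it can be passed directly to subprocess.run without any shell quoting.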

Binary Punctuation Example#

Run the Punctuation client, where --queries specifies the file containing inputs to punctuate. To write the output of the punctuation model to a file with --output, you must also specify --parallel_requests=1.

riva_nlp_punct --queries=/work/test_files/nlu/punctuation_input.txt
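The pairing of --output with --parallel_requests=1 is easy to forget, so a small wrapper can add the second flag whenever the first is present. This is a sketch only: punct_command is a hypothetical helper, and the --output=<path> spelling is an assumption modeled on the --queries=<path> form used above.

```python
def punct_command(queries_file, output_file=None):
    """Build the punctuation client argv, pairing --output with --parallel_requests=1."""
    command = ["riva_nlp_punct", f"--queries={queries_file}"]
    if output_file is not None:
        # Writing to a file requires a single parallel request.
        command += [f"--output={output_file}", "--parallel_requests=1"]
    return command
```

For example, punct_command("/work/test_files/nlu/punctuation_input.txt", "punctuated.txt") yields a command that includes both flags, while omitting the second argument reproduces the invocation shown above.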

Speech Synthesis#

Binary and Python clients are included in the Docker image for TTS.

Binary TTS Client Example#

Run the binary TTS client, where --text specifies the input text for which to synthesize audio. The client writes its output to a .wav file, whose path can be set with --audio_file. Other options can be found in riva_tts_client --help.

riva_tts_client --text="I had a dream yesterday." --audio_file=/opt/riva/wav/output.wav
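To synthesize several sentences into separate files, the same two options can be filled in per sentence. The sketch below is illustrative: tts_commands is a hypothetical helper, and the utterance_NNN.wav naming scheme is an arbitrary choice, not a Riva convention.

```python
from pathlib import Path


def tts_commands(sentences, out_dir):
    """Build one riva_tts_client argv per sentence, each with its own output file."""
    out_dir = Path(out_dir)
    commands = []
    for index, text in enumerate(sentences, start=1):
        wav_path = out_dir / f"utterance_{index:03d}.wav"
        commands.append(
            ["riva_tts_client", f"--text={text}", f"--audio_file={wav_path}"]
        )
    return commands
```

Each argv list can be handed to subprocess.run as is; because no shell is involved, sentences containing spaces or quotes need no extra escaping.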

Binary TTS Performance Client Example#

Run the binary TTS performance client, which reports latency and throughput. The --text option specifies a single input text, and --text_file specifies a file containing multiple text inputs. Other options can be found in riva_tts_perf_client --help.

riva_tts_perf_client --text_file=/work/test_files/tts/ljs_audio_text_test_filelist_small.txt

Python Client Examples#

The talk.py script implements a client that performs inference in either offline or streaming mode. In streaming mode, audio is streamed back to the client in chunks.

python talk.py --stream --output-device <device_id>

The <device_id> of your audio output device can be obtained by running:

python talk.py --list-devices