Transcription using gRPC API

Copy an example audio file from the NIM container to the host machine or use your own.

docker cp $CONTAINER_ID:/opt/riva/wav/en-US_sample.wav .

Streaming mode example

Ensure that the NIM is deployed with a streaming mode model.

python3 python-clients/scripts/asr/transcribe_file.py \
   --server 0.0.0.0:50051 \
   --list-models

The input speech file is streamed to the service chunk by chunk.

python3 python-clients/scripts/asr/transcribe_file.py \
   --server 0.0.0.0:50051 \
   --language-code en-US --automatic-punctuation \
   --input-file en-US_sample.wav
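
The transcribe_file.py script wraps the streaming gRPC API. The same request can also be issued directly from Python; the following is a minimal sketch using the nvidia-riva-client package (riva.client), assuming that package is installed and that the chunk size shown is reasonable for your audio.

import riva.client

# Connect to the NIM gRPC endpoint (assumes nvidia-riva-client is installed).
auth = riva.client.Auth(uri="0.0.0.0:50051")
asr = riva.client.ASRService(auth)

config = riva.client.RecognitionConfig(
    encoding=riva.client.AudioEncoding.LINEAR_PCM,
    language_code="en-US",
    max_alternatives=1,
    enable_automatic_punctuation=True,
)
# Fill in the sample rate and channel count from the WAV header.
riva.client.add_audio_file_specs_to_config(config, "en-US_sample.wav")
streaming_config = riva.client.StreamingRecognitionConfig(config=config, interim_results=False)

# Stream the file chunk by chunk and print the final transcripts.
with riva.client.AudioChunkFileIterator("en-US_sample.wav", 4800) as audio_chunks:
    for response in asr.streaming_response_generator(audio_chunks, streaming_config):
        for result in response.results:
            if result.is_final:
                print(result.alternatives[0].transcript)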

Offline mode example

Ensure that the NIM is deployed with an offline mode model.

python3 python-clients/scripts/asr/transcribe_file_offline.py \
   --server 0.0.0.0:50051 \
   --list-models

The input speech file is sent to the service in one shot.

Transcription using gRPC and HTTP APIs

python3 python-clients/scripts/asr/transcribe_file_offline.py \
   --server 0.0.0.0:50051 \
   --language-code en-US --automatic-punctuation \
   --input-file en-US_sample.wav
curl -s http://0.0.0.0:9000/v1/audio/transcriptions -F language=en \
   -F file="@en-US_sample.wav"
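
The offline gRPC request can also be sent directly from Python; the following is a minimal sketch using the nvidia-riva-client package (riva.client), assuming that package is installed.

import riva.client

# Connect to the NIM gRPC endpoint (assumes nvidia-riva-client is installed).
auth = riva.client.Auth(uri="0.0.0.0:50051")
asr = riva.client.ASRService(auth)

config = riva.client.RecognitionConfig(
    encoding=riva.client.AudioEncoding.LINEAR_PCM,
    language_code="en-US",
    max_alternatives=1,
    enable_automatic_punctuation=True,
)

# Send the whole file in one shot and print the transcript.
with open("en-US_sample.wav", "rb") as fh:
    audio_bytes = fh.read()

response = asr.offline_recognize(audio_bytes, config)
for result in response.results:
    print(result.alternatives[0].transcript)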

Copy a sample audio file from the NIM container to the host machine, or use your own.

docker cp $CONTAINER_ID:/opt/riva/wav/en-US_sample.wav .

The Parakeet TDT NIM supports only the offline API.

Ensure that the NIM is deployed with an offline mode model.

python3 python-clients/scripts/asr/transcribe_file_offline.py \
   --server 0.0.0.0:50051 \
   --list-models

Transcription using gRPC and HTTP APIs

python3 python-clients/scripts/asr/transcribe_file_offline.py \
   --server 0.0.0.0:50051 \
   --language-code en-US \
   --input-file en-US_sample.wav
curl -s http://0.0.0.0:9000/v1/audio/transcriptions -F language=en-US \
   -F file="@en-US_sample.wav"
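
The same HTTP request as the curl command above can be issued from Python; a minimal sketch, assuming the requests package is installed:

import requests

# POST the audio file to the HTTP transcription endpoint, mirroring the curl command above.
with open("en-US_sample.wav", "rb") as fh:
    resp = requests.post(
        "http://0.0.0.0:9000/v1/audio/transcriptions",
        data={"language": "en-US"},
        files={"file": fh},
    )
resp.raise_for_status()
print(resp.json())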

The Parakeet 1.1b RNNT Multilingual model supports streaming speech-to-text transcription in multiple languages. The model identifies the spoken language and provides the transcript in that language.

Transcription using gRPC API

Copy an example audio file from the NIM container to the host machine or use your own.

docker cp $CONTAINER_ID:/opt/riva/wav/en-US_sample.wav .
docker cp $CONTAINER_ID:/opt/riva/wav/fr-FR_sample.wav .

Streaming mode example

Ensure that the NIM is deployed with a streaming mode model.

python3 python-clients/scripts/asr/transcribe_file.py \
   --server 0.0.0.0:50051 \
   --list-models

The input speech file is streamed to the service chunk by chunk.

# Transcribe English speech
python3 python-clients/scripts/asr/transcribe_file.py \
   --server 0.0.0.0:50051 \
   --language-code multi --automatic-punctuation \
   --input-file en-US_sample.wav

# Transcribe French speech
python3 python-clients/scripts/asr/transcribe_file.py \
   --server 0.0.0.0:50051 \
   --language-code multi --automatic-punctuation \
   --input-file fr-FR_sample.wav

Offline mode example

Ensure that the NIM is deployed with an offline mode model.

python3 python-clients/scripts/asr/transcribe_file_offline.py \
   --server 0.0.0.0:50051 \
   --list-models

The input speech file is sent to the service in one shot.

Transcription using gRPC and HTTP APIs

# Transcribe English speech
python3 python-clients/scripts/asr/transcribe_file_offline.py \
   --server 0.0.0.0:50051 \
   --language-code multi --automatic-punctuation \
   --input-file en-US_sample.wav

# Transcribe French speech
python3 python-clients/scripts/asr/transcribe_file_offline.py \
   --server 0.0.0.0:50051 \
   --language-code multi --automatic-punctuation \
   --input-file fr-FR_sample.wav

# Transcribe English speech
curl -s http://0.0.0.0:9000/v1/audio/transcriptions -F language=multi \
   -F file="@en-US_sample.wav"

# Transcribe French speech
curl -s http://0.0.0.0:9000/v1/audio/transcriptions -F language=multi \
   -F file="@fr-FR_sample.wav"

Transcription using gRPC API

Copy an example audio file from the NIM container to the host machine or use your own.

docker cp $CONTAINER_ID:/opt/riva/wav/es-US_sample.wav .

Streaming mode example

Ensure that the NIM is deployed with a streaming mode model.

python3 python-clients/scripts/asr/transcribe_file.py \
   --server 0.0.0.0:50051 \
   --list-models

The input speech file is streamed to the service chunk by chunk.

python3 python-clients/scripts/asr/transcribe_file.py \
   --server 0.0.0.0:50051 \
   --language-code es-US --automatic-punctuation \
   --input-file es-US_sample.wav

Offline mode example

Ensure that the NIM is deployed with an offline mode model.

python3 python-clients/scripts/asr/transcribe_file_offline.py \
   --server 0.0.0.0:50051 \
   --list-models

The input speech file is sent to the service in one shot.

Transcription using gRPC and HTTP APIs

python3 python-clients/scripts/asr/transcribe_file_offline.py \
   --server 0.0.0.0:50051 \
   --language-code es-US --automatic-punctuation \
   --input-file es-US_sample.wav
curl -s http://0.0.0.0:9000/v1/audio/transcriptions -F language=es \
   -F file="@es-US_sample.wav"

Whisper supports transcription in multiple languages. Refer to Supported Languages for the list of all available languages and their corresponding codes. Specifying the input language as multi enables automatic language detection. Specifying the correct language is recommended because it improves accuracy and latency. The Whisper model has punctuation enabled by default.

Copy an example audio file from the NIM container to the host machine or use your own.

docker cp $CONTAINER_ID:/opt/riva/wav/en-US_sample.wav .

Ensure that the NIM is deployed with the Whisper Large v3 model.

python3 python-clients/scripts/asr/transcribe_file_offline.py \
   --server 0.0.0.0:50051 \
   --list-models

Transcription using gRPC and HTTP APIs

python3 python-clients/scripts/asr/transcribe_file_offline.py \
   --server 0.0.0.0:50051 \
   --language en --input-file en-US_sample.wav
curl -s http://0.0.0.0:9000/v1/audio/transcriptions -F language=en \
   -F file="@en-US_sample.wav"

When the language code is not known beforehand, the language code multi can be passed. The model predicts the language for each 30-second chunk and returns it to the client. The following commands print the transcript along with the predicted language.

python3 python-clients/scripts/asr/transcribe_file_offline.py \
   --server 0.0.0.0:50051 \
   --language-code multi \
   --input-file en-US_sample.wav
curl -s http://0.0.0.0:9000/v1/audio/transcriptions -F language=multi \
   -F file="@en-US_sample.wav"
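
For reference, the same language-detection request can be made over HTTP from Python; a minimal sketch, assuming the requests package is installed. It prints the raw JSON response so the transcript and the language information returned by the server can be inspected.

import requests

# HTTP transcription with automatic language detection (language=multi).
with open("en-US_sample.wav", "rb") as fh:
    resp = requests.post(
        "http://0.0.0.0:9000/v1/audio/transcriptions",
        data={"language": "multi"},
        files={"file": fh},
    )
resp.raise_for_status()
print(resp.json())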

Note

The Whisper model supports offline mode only.

Whisper supports translation from multiple languages to English. Refer to Supported Languages for the list of all available languages and their corresponding codes. Specifying the input language as multi enables automatic language detection. Specifying the correct input language is recommended because it improves accuracy and latency.

Copy an example audio file from the NIM container to the host machine or use your own.

docker cp $CONTAINER_ID:/opt/riva/wav/fr-FR_sample.wav .

Translation using gRPC and HTTP APIs

python3 python-clients/scripts/asr/transcribe_file_offline.py \
   --server 0.0.0.0:50051 \
   --language fr --input-file fr-FR_sample.wav \
   --custom-configuration task:translate
curl -s http://0.0.0.0:9000/v1/audio/translations -F language=fr \
   -F file="@fr-FR_sample.wav"

Note

The Whisper model supports offline mode only.

Canary supports transcription in the following languages: en-US, en-GB, es-ES, ar-AR, es-US, pt-BR, fr-FR, de-DE, it-IT, ja-JP, ko-KR, ru-RU, and hi-IN. Specifying the input language is required. The Canary model has punctuation enabled by default.

Copy an example audio file from the NIM container to the host machine or use your own.

docker cp $CONTAINER_ID:/opt/riva/wav/en-US_sample.wav .

Ensure that the NIM is deployed with the Canary model.

python3 python-clients/scripts/asr/transcribe_file_offline.py \
   --server 0.0.0.0:50051 \
   --list-models

Transcription using gRPC and HTTP APIs

python3 python-clients/scripts/asr/transcribe_file_offline.py \
   --server 0.0.0.0:50051 \
   --language en-US --input-file en-US_sample.wav
curl -s http://0.0.0.0:9000/v1/audio/transcriptions -F language=en-US \
   -F file="@en-US_sample.wav"

Note

The Canary model supports offline mode only.

Canary supports translation between en-US and es-ES, ar-AR, es-US, pt-BR, fr-FR, de-DE, it-IT, ja-JP, ko-KR, ru-RU, and hi-IN languages.

Copy an example audio file from the NIM container to the host machine or use your own.

docker cp $CONTAINER_ID:/opt/riva/wav/fr-FR_sample.wav .
docker cp $CONTAINER_ID:/opt/riva/examples/asr_lib/1272-135031-0000.wav .

Translation to English

python3 python-clients/scripts/asr/transcribe_file_offline.py \
   --server 0.0.0.0:50051 \
   --language fr-FR --input-file fr-FR_sample.wav \
   --custom-configuration target_language:en-US,task:translate
curl -s http://0.0.0.0:9000/v1/audio/translations -F language=fr-FR \
   -F target_language=en-US -F file="@fr-FR_sample.wav"

Translation from English

python3 python-clients/scripts/asr/transcribe_file_offline.py \
   --server 0.0.0.0:50051 \
   --language en-US --input-file 1272-135031-0000.wav \
   --custom-configuration target_language:fr-FR,task:translate
curl -s http://0.0.0.0:9000/v1/audio/translations -F language=en-US \
   -F target_language=fr-FR -F file="@1272-135031-0000.wav"
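
The translation endpoint can also be called from Python; a minimal sketch of the translation-to-English request, assuming the requests package is installed:

import requests

# Translate French speech to English over HTTP, mirroring the curl command above.
with open("fr-FR_sample.wav", "rb") as fh:
    resp = requests.post(
        "http://0.0.0.0:9000/v1/audio/translations",
        data={"language": "fr-FR", "target_language": "en-US"},
        files={"file": fh},
    )
resp.raise_for_status()
print(resp.json())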

Note

The Canary model supports offline mode only.