Canary#
Canary is a multilingual encoder-decoder model that supports transcription and translation.
Note: The Canary model supports offline mode only.
For client installation and sample audio instructions, refer to the Deploy and Run ASR Models page.
Deploy the NIM Container#
export CONTAINER_ID=canary-1b
export NIM_TAGS_SELECTOR="mode=ofl"
docker run -it --rm --name=$CONTAINER_ID \
--runtime=nvidia \
--gpus '"device=0"' \
--shm-size=8GB \
-e NGC_API_KEY \
-e NIM_HTTP_API_PORT=9000 \
-e NIM_GRPC_API_PORT=50051 \
-p 9000:9000 \
-p 50051:50051 \
-e NIM_TAGS_SELECTOR \
nvcr.io/nim/nvidia/$CONTAINER_ID:latest
For additional profile options, refer to the ASR support matrix.
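The container can take a while to load the model after `docker run` returns control, so it can help to wait before sending requests. The following is a minimal sketch of a readiness poll using only the Python standard library. The `/v1/health/ready` path is an assumption about the HTTP API on port 9000 and is not stated on this page; `http_ready` and `wait_until_ready` are hypothetical helper names.

```python
import http.client
import time
import urllib.request
from typing import Callable

# Assumption: the HTTP port from the docker command (9000) serves a readiness
# endpoint at this path. Adjust the URL if your NIM version differs.
READY_URL = "http://0.0.0.0:9000/v1/health/ready"


def http_ready(url: str = READY_URL, timeout_s: float = 2.0) -> bool:
    """Return True if the URL answers with HTTP 200, False on any error."""
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as resp:
            return resp.status == 200
    except (OSError, http.client.HTTPException):
        # Covers refused connections, timeouts, and non-2xx responses.
        return False


def wait_until_ready(
    probe: Callable[[], bool] = http_ready,
    attempts: int = 60,
    interval_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call probe() up to `attempts` times, sleeping between failed tries."""
    for attempt in range(attempts):
        if probe():
            return True
        if attempt < attempts - 1:
            sleep(interval_s)
    return False

# Example (not executed here): wait_until_ready() returns True once the
# service reports ready, or False after roughly attempts * interval_s seconds.
```

The probe and sleep functions are injectable so the loop can be exercised without a running container.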
Run Inference#
Transcription#
Canary supports transcription in the following languages: ar-AR, cs-CZ, da-DK, de-DE, en-GB, en-US, es-ES, es-US, fr-CA, fr-FR, he-IL, hi-IN, it-IT, ja-JP, ko-KR, nb-NO, nl-NL, nn-NO, pl-PL, pt-BR, pt-PT, ru-RU, sv-SE, th-TH, tr-TR, zh-CN. You must specify the input language. Punctuation is enabled by default.
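Because the input language is required, a client can check the language code before sending audio. The following is a small sketch that encodes the list above; `TRANSCRIPTION_LANGUAGES` and `check_transcription_language` are hypothetical names, not part of the Python client.

```python
# Language codes copied from the transcription list on this page.
TRANSCRIPTION_LANGUAGES = frozenset(
    """
    ar-AR cs-CZ da-DK de-DE en-GB en-US es-ES es-US fr-CA fr-FR he-IL hi-IN
    it-IT ja-JP ko-KR nb-NO nl-NL nn-NO pl-PL pt-BR pt-PT ru-RU sv-SE th-TH
    tr-TR zh-CN
    """.split()
)


def check_transcription_language(code: str) -> str:
    """Return the code unchanged if it is listed, otherwise raise ValueError."""
    if code not in TRANSCRIPTION_LANGUAGES:
        raise ValueError(
            f"{code!r} is not a listed transcription language; "
            f"expected one of: {', '.join(sorted(TRANSCRIPTION_LANGUAGES))}"
        )
    return code
```

The check is case-sensitive, matching the codes exactly as they are written in the list.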
Copy a sample audio file from the NIM container or use your own.
docker cp $CONTAINER_ID:/opt/riva/wav/en-US_sample.wav .
Verify that the NIM is running and the Canary model is deployed by listing the available models.
python3 python-clients/scripts/asr/transcribe_file_offline.py \
--server 0.0.0.0:50051 \
--list-models
Transcribe the sample file with the Python client over gRPC (port 50051) or with curl over HTTP (port 9000).
python3 python-clients/scripts/asr/transcribe_file_offline.py \
--server 0.0.0.0:50051 \
--language en-US --input-file en-US_sample.wav
curl -s http://0.0.0.0:9000/v1/audio/transcriptions -F language=en-US \
-F file="@en-US_sample.wav"
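To transcribe several files, the Python client command above can be assembled programmatically. The following sketch builds the same argument list and runs it with `subprocess`; it uses only the flags shown on this page. `transcribe_command` and `run_transcription` are hypothetical helpers, and the script path assumes the `python-clients` repository sits in the current directory.

```python
import subprocess

# Path as used in the commands on this page (relative to the working directory).
CLIENT_SCRIPT = "python-clients/scripts/asr/transcribe_file_offline.py"


def transcribe_command(input_file, language, server="0.0.0.0:50051"):
    """Build the argument list for one offline transcription request."""
    return [
        "python3", CLIENT_SCRIPT,
        "--server", server,
        "--language", language,
        "--input-file", str(input_file),
    ]


def run_transcription(input_file, language, server="0.0.0.0:50051"):
    """Run the client script and return whatever it prints to stdout."""
    result = subprocess.run(
        transcribe_command(input_file, language, server),
        check=True,           # raise CalledProcessError on a non-zero exit
        capture_output=True,
        text=True,
    )
    return result.stdout

# Example (needs a running NIM and the client repository):
# print(run_transcription("en-US_sample.wav", "en-US"))
```

Passing the arguments as a list avoids shell quoting issues with file names that contain spaces.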
Translation#
Canary supports translation from en-US to the following languages: ar-AR, bg-BG, cs-CZ, da-DK, de-DE, el-GR, en-US, et-EE, fi-FI, fr-FR, hi-IN, hr-HR, hu-HU, id-ID, it-IT, ja-JP, ko-KR, lt-LT, lv-LV, nb-NO, nl-NL, pl-PL, pt-BR, pt-PT, ro-RO, ru-RU, sk-SK, sl-SI, sv-SE, th-TH, tr-TR, uk-UA, vi-VN, zh-CN.
It also supports translation to en-US from the following languages: ar-AR, cs-CZ, da-DK, de-DE, es-ES, es-US, fr-CA, fr-FR, he-IL, hi-IN, it-IT, ja-JP, ko-KR, nb-NO, nl-NL, nn-NO, pl-PL, pt-BR, pt-PT, ru-RU, sv-SE, tr-TR, zh-CN.
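The two lists are not symmetric: every supported pair has en-US on one side, and the set of targets from en-US differs from the set of sources to en-US. The following sketch encodes both lists so a client can check a pair before sending a request; `is_supported_pair` and the constant names are hypothetical.

```python
# Targets when the source is en-US (copied from the list on this page).
FROM_ENGLISH_TARGETS = frozenset(
    """
    ar-AR bg-BG cs-CZ da-DK de-DE el-GR en-US et-EE fi-FI fr-FR hi-IN hr-HR
    hu-HU id-ID it-IT ja-JP ko-KR lt-LT lv-LV nb-NO nl-NL pl-PL pt-BR pt-PT
    ro-RO ru-RU sk-SK sl-SI sv-SE th-TH tr-TR uk-UA vi-VN zh-CN
    """.split()
)

# Sources when the target is en-US (copied from the list on this page).
TO_ENGLISH_SOURCES = frozenset(
    """
    ar-AR cs-CZ da-DK de-DE es-ES es-US fr-CA fr-FR he-IL hi-IN it-IT ja-JP
    ko-KR nb-NO nl-NL nn-NO pl-PL pt-BR pt-PT ru-RU sv-SE tr-TR zh-CN
    """.split()
)


def is_supported_pair(source: str, target: str) -> bool:
    """Return True if source -> target appears in the lists above."""
    if source == "en-US":
        return target in FROM_ENGLISH_TARGETS
    return target == "en-US" and source in TO_ENGLISH_SOURCES
```

For example, `is_supported_pair("fr-FR", "de-DE")` is `False` because neither side is en-US, and `is_supported_pair("th-TH", "en-US")` is `False` because th-TH appears only as a target.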
Copy sample audio files from the NIM container or use your own.
docker cp $CONTAINER_ID:/opt/riva/wav/fr-FR_sample.wav .
docker cp $CONTAINER_ID:/opt/riva/examples/asr_lib/1272-135031-0000.wav .
Translation to English#
python3 python-clients/scripts/asr/transcribe_file_offline.py \
--server 0.0.0.0:50051 \
--language fr-FR --input-file fr-FR_sample.wav \
--custom-configuration target_language:en-US,task:translate
curl -s http://0.0.0.0:9000/v1/audio/translations -F language=fr-FR \
-F target_language=en-US -F file="@fr-FR_sample.wav"
Translation from English#
python3 python-clients/scripts/asr/transcribe_file_offline.py \
--server 0.0.0.0:50051 \
--language en-US --input-file 1272-135031-0000.wav \
--custom-configuration target_language:fr-FR,task:translate
curl -s http://0.0.0.0:9000/v1/audio/translations -F language=en-US \
-F target_language=fr-FR -F file="@1272-135031-0000.wav"
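Both translation directions use the same two commands and differ only in the source language, target language, and input file. The following sketch builds the argument lists for the Python client and for curl from those three values, using only the flags and form fields shown above. The function names are hypothetical.

```python
CLIENT_SCRIPT = "python-clients/scripts/asr/transcribe_file_offline.py"


def custom_configuration(target_language):
    """Format the --custom-configuration value used for translation."""
    return f"target_language:{target_language},task:translate"


def grpc_translate_command(input_file, source, target, server="0.0.0.0:50051"):
    """Argument list for the Python client (gRPC port)."""
    return [
        "python3", CLIENT_SCRIPT,
        "--server", server,
        "--language", source,
        "--input-file", str(input_file),
        "--custom-configuration", custom_configuration(target),
    ]


def http_translate_command(input_file, source, target,
                           base_url="http://0.0.0.0:9000"):
    """Argument list for curl against the HTTP translations endpoint."""
    return [
        "curl", "-s", f"{base_url}/v1/audio/translations",
        "-F", f"language={source}",
        "-F", f"target_language={target}",
        "-F", f"file=@{input_file}",  # '@' tells curl to upload the file
    ]

# Example (needs a running NIM):
# import subprocess
# subprocess.run(http_translate_command("fr-FR_sample.wav", "fr-FR", "en-US"),
#                check=True)
```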