Parakeet RNNT

The Parakeet 1.1b RNNT Multilingual model supports streaming speech-to-text transcription in multiple languages. It identifies the spoken language and returns the transcript in that language.

For client installation and sample audio instructions, refer to the Deploy and Run ASR Models page.

Deploy the NIM Container

export CONTAINER_ID=parakeet-1-1b-rnnt-multilingual
export NIM_TAGS_SELECTOR="mode=all,diarizer=disabled"

docker run -it --rm --name=$CONTAINER_ID \
  --runtime=nvidia \
  --gpus '"device=0"' \
  --shm-size=8GB \
  -e NGC_API_KEY \
  -e NIM_HTTP_API_PORT=9000 \
  -e NIM_GRPC_API_PORT=50051 \
  -p 9000:9000 \
  -p 50051:50051 \
  -e NIM_TAGS_SELECTOR \
  nvcr.io/nim/nvidia/$CONTAINER_ID:latest
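
The container may need some time after `docker run` before it accepts requests. The following is a minimal sketch of a readiness poll against the HTTP port mapped above (9000). The helper name `wait_until_ready` and its parameters are hypothetical, and the health path used in the usage note is an assumption that this section does not confirm.

```python
import http.client
import time
import urllib.request


def wait_until_ready(url, timeout_s=600.0, interval_s=5.0, probe=None, sleep=time.sleep):
    """Poll `url` until it answers HTTP 200 or `timeout_s` elapses.

    `probe` and `sleep` are injectable so the loop can be exercised
    without a running server.
    """

    def default_probe(target):
        # Any connection or protocol error simply means "not ready yet".
        try:
            with urllib.request.urlopen(target, timeout=5) as resp:
                return resp.status == 200
        except (OSError, http.client.HTTPException):
            return False

    probe = probe or default_probe
    deadline = time.monotonic() + timeout_s
    while True:
        if probe(url):
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval_s)
```

Assuming the service exposes a health route such as `/v1/health/ready`, a call like `wait_until_ready("http://0.0.0.0:9000/v1/health/ready")` returns `True` once the endpoint responds with HTTP 200 and `False` if the timeout expires first.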

For additional profile options, refer to the ASR support matrix.

Run Inference

Copy sample audio files from the NIM container or use your own.

docker cp $CONTAINER_ID:/opt/riva/wav/en-US_sample.wav .
docker cp $CONTAINER_ID:/opt/riva/wav/fr-FR_sample.wav .

Streaming

Make sure the NIM is deployed with a streaming-mode model. To check, list the models the server exposes:

python3 python-clients/scripts/asr/transcribe_file.py \
  --server 0.0.0.0:50051 \
  --list-models

In streaming mode, the client sends the input audio file to the service chunk by chunk.
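
To illustrate what chunk-by-chunk delivery means, here is a minimal sketch that splits a WAV file into fixed-duration pieces using Python's standard `wave` module. The function name `iter_wav_chunks` and the 100 ms default are illustrative assumptions. The sample client may use a different chunk size and adds the gRPC plumbing that is omitted here.

```python
import wave


def iter_wav_chunks(wav_file, chunk_ms=100):
    """Yield raw PCM byte chunks of about `chunk_ms` milliseconds each.

    `wav_file` may be a path or a binary file-like object.
    """
    with wave.open(wav_file, "rb") as wav:
        # Number of audio frames that cover one chunk at this sample rate.
        frames_per_chunk = max(1, wav.getframerate() * chunk_ms // 1000)
        while True:
            data = wav.readframes(frames_per_chunk)
            if not data:
                break  # end of file
            yield data
```

A streaming client would send each yielded chunk as soon as it is read, for example `for chunk in iter_wav_chunks("en-US_sample.wav"): ...`, instead of loading the whole file first.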

# Transcribe English speech
python3 python-clients/scripts/asr/transcribe_file.py \
  --server 0.0.0.0:50051 \
  --language-code multi --automatic-punctuation \
  --input-file en-US_sample.wav

# Transcribe French speech
python3 python-clients/scripts/asr/transcribe_file.py \
  --server 0.0.0.0:50051 \
  --language-code multi --automatic-punctuation \
  --input-file fr-FR_sample.wav

Offline

Make sure the NIM is deployed with an offline-mode model. To check, list the models the server exposes:

python3 python-clients/scripts/asr/transcribe_file_offline.py \
  --server 0.0.0.0:50051 \
  --list-models

In offline mode, the client sends the whole input audio file to the service in a single request.

# Transcribe English speech
python3 python-clients/scripts/asr/transcribe_file_offline.py \
  --server 0.0.0.0:50051 \
  --language-code multi --automatic-punctuation \
  --input-file en-US_sample.wav

# Transcribe French speech
python3 python-clients/scripts/asr/transcribe_file_offline.py \
  --server 0.0.0.0:50051 \
  --language-code multi --automatic-punctuation \
  --input-file fr-FR_sample.wav

The same files can also be transcribed through the HTTP API on port 9000:

# Transcribe English speech
curl -s http://0.0.0.0:9000/v1/audio/transcriptions -F language=multi \
  -F file="@en-US_sample.wav"

# Transcribe French speech
curl -s http://0.0.0.0:9000/v1/audio/transcriptions -F language=multi \
  -F file="@fr-FR_sample.wav"
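
The curl requests above post a multipart form with a `language` field and a `file` field. The same request can be sketched in Python with only the standard library. The helper names `build_multipart` and `transcribe` are hypothetical, and because this section does not describe the response format, the sketch returns the raw response body rather than parsing it.

```python
import os
import urllib.request
import uuid


def build_multipart(fields, file_field, filename, file_bytes):
    """Encode text fields plus one file as a multipart/form-data body."""
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        header = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
        )
        parts.append(header.encode() + str(value).encode() + b"\r\n")
    file_header = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{file_field}"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    )
    parts.append(file_header.encode() + file_bytes)
    parts.append(f"\r\n--{boundary}--\r\n".encode())  # closing boundary
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def transcribe(path, url="http://0.0.0.0:9000/v1/audio/transcriptions", language="multi"):
    """POST one audio file, mirroring the two -F fields of the curl call."""
    with open(path, "rb") as f:
        body, content_type = build_multipart(
            {"language": language}, "file", os.path.basename(path), f.read()
        )
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}, method="POST"
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read().decode("utf-8")  # raw response body, unparsed
```

With the container running, `print(transcribe("en-US_sample.wav"))` sends the same fields as the first curl command.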

The realtime client connects to the HTTP port (9000) instead of the gRPC port:

# Transcribe English speech
python3 python-clients/scripts/asr/realtime_asr_client.py \
  --server 0.0.0.0:9000 \
  --language-code multi --automatic-punctuation \
  --input-file en-US_sample.wav

# Transcribe French speech
python3 python-clients/scripts/asr/realtime_asr_client.py \
  --server 0.0.0.0:9000 \
  --language-code multi --automatic-punctuation \
  --input-file fr-FR_sample.wav