Parakeet CTC Spanish (es-US)

Deploy the Parakeet CTC Spanish (es-US) model as a NIM container and run streaming or offline transcription.

Deploy the NIM Container

The following command deploys the Parakeet CTC Spanish (es-US) model in mode=all, which enables both streaming and offline inference.

export CONTAINER_ID=parakeet-ctc-0.6b-es
export NIM_TAGS_SELECTOR="mode=all,vad=default,diarizer=disabled"

docker run -it --rm --name=$CONTAINER_ID \
  --runtime=nvidia \
  --gpus '"device=0"' \
  --shm-size=8GB \
  -e NGC_API_KEY \
  -e NIM_HTTP_API_PORT=9000 \
  -e NIM_GRPC_API_PORT=50051 \
  -p 9000:9000 \
  -p 50051:50051 \
  -e NIM_TAGS_SELECTOR \
  nvcr.io/nim/nvidia/$CONTAINER_ID:latest

For additional profile options, refer to the ASR support matrix.

Prepare a Sample Audio File

To list the sample audio files bundled in the container, run the following command:

docker exec $CONTAINER_ID ls /opt/riva/wav/

This should return a list of sample audio files. For example:

es-US_sample.wav

To copy a sample audio file to your local machine, run the following command:

docker cp $CONTAINER_ID:/opt/riva/wav/es-US_sample.wav .
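Before running inference, you can optionally check the copied file's sample rate, channel count, and duration with Python's standard wave module. This is a convenience sketch, not part of the deployment; the bundled sample should already be in a supported format.

```python
import os
import wave

def describe_wav(path):
    """Return (sample_rate_hz, channels, duration_seconds) of a WAV file."""
    with wave.open(path, "rb") as w:
        rate = w.getframerate()
        return rate, w.getnchannels(), w.getnframes() / rate

# Only inspect the sample if it has been copied to the current directory.
if os.path.exists("es-US_sample.wav"):
    rate, channels, seconds = describe_wav("es-US_sample.wav")
    print(f"{rate} Hz, {channels} channel(s), {seconds:.2f} s")
```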

Run Inference

Run inference on the sample audio file in streaming or offline mode.

Streaming

Ensure the NIM is deployed with a streaming mode model. Verify by running:

python3 python-clients/scripts/asr/transcribe_file.py \
  --server 0.0.0.0:50051 \
  --list-models

You should see a model with streaming in its name. In streaming mode, the client sends the input speech file to the service chunk by chunk.

To transcribe with the gRPC client, run the following command:

python3 python-clients/scripts/asr/transcribe_file.py \
  --server 0.0.0.0:50051 \
  --language-code es-US --automatic-punctuation \
  --input-file es-US_sample.wav

Alternatively, use the realtime client, which connects over the HTTP port:

python3 python-clients/scripts/asr/realtime_asr_client.py \
  --server 0.0.0.0:9000 \
  --language-code es-US --automatic-punctuation \
  --input-file es-US_sample.wav
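The clients above handle chunking internally. To illustrate what chunk-by-chunk streaming means, the following stdlib-only sketch reads a WAV file in roughly 100 ms pieces; the chunk size is an illustrative choice, not the clients' actual value.

```python
import wave

def audio_chunks(path, chunk_ms=100):
    """Yield raw PCM chunks of roughly chunk_ms milliseconds from a WAV file,
    mimicking how a streaming client feeds audio to the service piecewise."""
    with wave.open(path, "rb") as w:
        frames_per_chunk = max(1, w.getframerate() * chunk_ms // 1000)
        while True:
            data = w.readframes(frames_per_chunk)
            if not data:
                break
            yield data
```

In a real streaming session, each yielded chunk would be sent over the open connection as it is produced, and interim transcripts would arrive before the file finishes.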

Offline

Ensure the NIM is deployed with an offline mode model. Verify by running:

python3 python-clients/scripts/asr/transcribe_file_offline.py \
  --server 0.0.0.0:50051 \
  --list-models

You should see a model with offline in its name. In offline mode, the client sends the entire input speech file to the service in a single request.

To transcribe with the gRPC client, run the following command:

python3 python-clients/scripts/asr/transcribe_file_offline.py \
  --server 0.0.0.0:50051 \
  --language-code es-US --automatic-punctuation \
  --input-file es-US_sample.wav

Alternatively, send the file to the HTTP endpoint with curl:

curl -s http://0.0.0.0:9000/v1/audio/transcriptions -F language=es-US \
  -F file="@es-US_sample.wav"
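The curl call above posts the file as multipart/form-data. If you prefer Python without third-party dependencies, the following stdlib sketch builds the same kind of request; the multipart encoding is hand-rolled here only because urllib has no built-in helper for it.

```python
import os
import urllib.request
import uuid

def build_multipart(fields, file_name, file_bytes, boundary=None):
    """Encode form fields plus one file part as multipart/form-data,
    matching what the curl command above sends."""
    boundary = boundary or uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            f'--{boundary}\r\nContent-Disposition: form-data; '
            f'name="{name}"\r\n\r\n{value}\r\n'.encode())
    parts.append(
        f'--{boundary}\r\nContent-Disposition: form-data; name="file"; '
        f'filename="{file_name}"\r\nContent-Type: audio/wav\r\n\r\n'.encode()
        + file_bytes + b"\r\n")
    parts.append(f"--{boundary}--\r\n".encode())
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"

def transcribe(path, url="http://0.0.0.0:9000/v1/audio/transcriptions"):
    """POST a WAV file to the HTTP transcription endpoint and return the
    raw response body as text."""
    with open(path, "rb") as f:
        body, content_type = build_multipart(
            {"language": "es-US"}, os.path.basename(path), f.read())
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type})
    with urllib.request.urlopen(req) as resp:
        return resp.read().decode()
```

For example, transcribe("es-US_sample.wav") returns the service's response body; this requires the container from the deployment step to be running on port 9000.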