Nemotron ASR Streaming

The Nemotron ASR Streaming model supports streaming speech-to-text transcription in English only.

For client installation and sample audio instructions, refer to the Deploy and Run ASR Models page.

Deploy the NIM Container

export CONTAINER_ID=nemotron-asr-streaming
export NIM_TAGS_SELECTOR="mode=str"

docker run -it --rm --name=$CONTAINER_ID \
  --runtime=nvidia \
  --gpus '"device=0"' \
  --shm-size=8GB \
  -e NGC_API_KEY \
  -e NIM_HTTP_API_PORT=9000 \
  -e NIM_GRPC_API_PORT=50051 \
  -p 9000:9000 \
  -p 50051:50051 \
  -e NIM_TAGS_SELECTOR \
  nvcr.io/nim/nvidia/$CONTAINER_ID:latest

For additional profile options, refer to the ASR support matrix.
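
The container may need some time to load the model before it accepts requests. A minimal sketch of a readiness poll follows, assuming the NIM exposes an HTTP health endpoint at `/v1/health/ready` on the port set by `NIM_HTTP_API_PORT` (9000 above). The path and the helper names `is_ready` and `wait_until_ready` are assumptions for illustration, not part of the documented client.

```python
import http.client
import time
import urllib.error
import urllib.request


def is_ready(url: str, timeout: float = 2.0) -> bool:
    """Return True if a GET request to `url` answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Connection refused, timeout, or a non-2xx status: not ready yet.
        return False


def wait_until_ready(probe, attempts: int = 60, delay: float = 5.0) -> bool:
    """Call `probe()` until it returns True or `attempts` calls have been made."""
    for attempt in range(attempts):
        if probe():
            return True
        if attempt < attempts - 1:
            time.sleep(delay)  # back off before the next probe
    return False
```

For example, `wait_until_ready(lambda: is_ready("http://localhost:9000/v1/health/ready"))` would block until the assumed endpoint returns 200 or the attempts run out. Passing the probe as a callable keeps the retry loop independent of the endpoint, so the URL is easy to change if your deployment differs.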

Run Inference

Copy a sample audio file from the NIM container or use your own.

docker cp $CONTAINER_ID:/opt/riva/wav/en-US_sample.wav .
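
If you use your own audio, it can help to confirm that the file is a readable PCM WAV and to note its sample rate and channel count before sending it. The sketch below uses only Python's standard `wave` module. The helper name `describe_wav` is hypothetical, and this page does not state which sample rates or channel layouts the model accepts, so the values are for inspection only.

```python
import wave


def describe_wav(path: str) -> dict:
    """Return basic header fields of a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_width_bytes": wav.getsampwidth(),
            "sample_rate_hz": rate,
            "duration_s": frames / rate,
        }
```

Calling `describe_wav("en-US_sample.wav")` on the copied sample returns these fields as a dictionary. Files that are not uncompressed PCM WAV cause `wave.open` to raise `wave.Error`.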

Streaming (gRPC)

Ensure that the NIM is deployed in streaming mode. The following command lists the models available on the server.

python3 python-clients/scripts/asr/transcribe_file.py \
  --server 0.0.0.0:50051 \
  --list-models

The following command streams the input speech file to the service chunk by chunk.

python3 python-clients/scripts/asr/transcribe_file.py \
  --server 0.0.0.0:50051 \
  --language-code en-US --automatic-punctuation \
  --input-file en-US_sample.wav
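
To illustrate what chunk-by-chunk streaming means, here is a minimal sketch that splits a WAV file into fixed-duration blocks of raw PCM bytes using Python's standard `wave` module. It is not the implementation used by `transcribe_file.py`. The helper name `iter_wav_chunks` and the 100 ms default are assumptions chosen for illustration.

```python
import wave
from typing import Iterator


def iter_wav_chunks(path: str, chunk_ms: int = 100) -> Iterator[bytes]:
    """Yield raw PCM chunks of roughly `chunk_ms` milliseconds from a WAV file."""
    with wave.open(path, "rb") as wav:
        # Number of audio frames that cover `chunk_ms` at this sample rate.
        frames_per_chunk = max(1, wav.getframerate() * chunk_ms // 1000)
        while True:
            data = wav.readframes(frames_per_chunk)
            if not data:
                break  # end of file
            yield data  # the final chunk may be shorter than the rest
```

A streaming client would send each yielded chunk to the service in order instead of uploading the whole file at once, which lets the service return partial transcripts while audio is still arriving.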

Realtime API

The realtime client connects to the HTTP API port (9000 in the deployment above) instead of the gRPC port.

python3 python-clients/scripts/asr/realtime_asr_client.py \
  --server 0.0.0.0:9000 \
  --language-code en-US --automatic-punctuation \
  --input-file en-US_sample.wav

Note

The Nemotron ASR Streaming model supports streaming mode only.