Release Notes

Release 1.4.0

Updated the Parakeet 1.1b CTC English (en-US) NIM with a new model profile for telephony use cases.
Added support for configuring Silero VAD parameters on a per-request basis.
Improved partial transcripts.

Canary models may produce incorrect translations for certain languages such as Spanish (es-ES) and Korean (ko-KR).
RMIR model format deployment is supported only on GPUs with compute capability <= 9.0.
ASR HTTP API (Whisper) accepts only file, model, and language parameters in the request. All other parameters are ignored.
ASR gRPC API (Whisper):
- Supports only the offline Recognize API.
- Does not support customization parameters (e.g. profanity_filter, enable_word_time_offsets, enable_automatic_punctuation, verbatim_transcripts) in the RecognitionConfig message.
Diarizer functionality is not supported in this release.
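
Because the Whisper HTTP API silently ignores everything except file, model, and language, a client may want to detect unsupported fields before sending a request. The following is a minimal sketch of such a check, not part of any NVIDIA client library: the helper name `split_whisper_params` and the sample values (including the model name and the extra `temperature` field) are hypothetical and used only for illustration.

```python
# Parameters that the known issue above lists as accepted by the
# Whisper HTTP API. Everything else is ignored by the service.
WHISPER_HTTP_PARAMS = ("file", "model", "language")


def split_whisper_params(params):
    """Split request fields into (accepted, ignored).

    `accepted` is a dict holding only the fields the service honors;
    `ignored` is a sorted list of the field names it would drop.
    This is a hypothetical client-side helper, not a Riva API.
    """
    accepted = {k: v for k, v in params.items() if k in WHISPER_HTTP_PARAMS}
    ignored = sorted(k for k in params if k not in WHISPER_HTTP_PARAMS)
    return accepted, ignored


if __name__ == "__main__":
    # Sample values are placeholders chosen for illustration only.
    fields = {
        "file": "sample.wav",
        "model": "example-whisper-model",
        "language": "en",
        "temperature": 0.2,
    }
    accepted, ignored = split_whisper_params(fields)
    print("sent:", sorted(accepted))
    print("would be ignored:", ignored)
```

Surfacing the ignored names on the client side makes the limitation visible instead of letting a setting appear to take effect when it does not.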

New ASR models
- Parakeet 1.1b RNNT Multilingual
The Parakeet 1.1b CTC en-US model is now available as a separate NIM container. A new profile with Silero VAD has also been added to provide improved noise robustness and more accurate end-of-speech detection.

Added a size-optimized container for the Parakeet 0.6b CTC en-US ASR NIM with WSL support.
Added a profile for the Parakeet 0.6b CTC en-US model with a lower GPU memory requirement, using a batch size of 1 for edge deployment.

New ASR models
- Canary 1b Multilingual
- Canary 0.6b Turbo Multilingual
- Whisper Large v3
- Conformer Spanish (es-US)
Simplified deployment with automatic profile selection based on hardware.
Added support for deployment in WSL2 environments.

Canary models may produce incorrect translation output for certain languages, such as es-ES and ko-KR.
Deployment with the RMIR model format is supported only on GPUs with compute capability <= 9.0.
The ASR HTTP API for Whisper accepts only the file, model, and language parameters in the request. Other parameters are ignored.
The ASR gRPC API for Whisper supports only the offline Recognize API.
The ASR gRPC API for Whisper does not support customization parameters (e.g. profanity_filter, enable_word_time_offsets, enable_automatic_punctuation, verbatim_transcripts) in the RecognitionConfig message.
Diarizer functionality is not supported in this release.
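
The RMIR limitation above can be turned into a simple pre-deployment check. The sketch below assumes the GPU's compute capability is already available as a "major.minor" string; how that string is obtained is left out, and the function names are hypothetical rather than part of any NVIDIA tooling.

```python
# Highest compute capability on which RMIR deployment is supported,
# taken from the known issue above (compute capability <= 9.0).
RMIR_MAX_COMPUTE_CAPABILITY = (9, 0)


def parse_compute_capability(text):
    """Parse a 'major.minor' string such as '8.6' into (major, minor)."""
    major, _, minor = text.strip().partition(".")
    return int(major), int(minor or 0)


def supports_rmir(compute_capability):
    """Return True if the given compute capability is within the RMIR limit.

    Comparing (major, minor) tuples avoids the float pitfall where
    '8.10' would otherwise be read as 8.1.
    """
    return parse_compute_capability(compute_capability) <= RMIR_MAX_COMPUTE_CAPABILITY


if __name__ == "__main__":
    for cc in ("8.6", "9.0", "10.0"):
        status = "supported" if supports_rmir(cc) else "not supported"
        print(f"compute capability {cc}: RMIR {status}")
```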

New ASR models
- Conformer-CTC Spanish (es-US)
- Whisper Large v3 with support for multilingual transcription and translation to English

The ASR HTTP API for Whisper accepts only the file, model, and language parameters in the request. Other parameters are ignored.
The ASR gRPC API for Whisper supports only the offline Recognize API.
The ASR gRPC API for Whisper does not support customization parameters (e.g. profanity_filter, enable_word_time_offsets, enable_automatic_punctuation, verbatim_transcripts) in the RecognitionConfig message.

This is the first general release of NVIDIA NIM for Riva, featuring support for the following models: