Release Notes#

Release 1.5.0#

Key Features and Enhancements#

Known Issues#

  • The Whisper Large v3 NIM does not support NVIDIA B200 GPUs.

  • The Whisper Large v3 NIM uses OpenAI’s Whisper model, which may result in hallucinations, inaccuracies, and incorrect inverse text normalizations in some cases.

  • The profanity filter may not work for non-English languages in some cases.

  • The Parakeet 0.6b TDT model uses chunked inference and can sometimes produce incorrect transcripts compared to the NeMo model. Refer to Model Accuracy for details.

  • Canary models may produce incorrect translations for certain languages such as Spanish (es-ES) and Korean (ko-KR).

  • ASR HTTP API (Whisper) accepts only file, model, and language parameters in the request. All other parameters are ignored.

  • ASR gRPC API (Whisper)

    • Supports only the offline Recognize API.

    • Does not support customization parameters (e.g. profanity_filter, enable_word_time_offsets, enable_automatic_punctuation, verbatim_transcripts) in the RecognitionConfig message.

  • SSL/TLS encryption does not support TTS APIs. MTLS encryption is not supported in this release.