Release Notes#

Release 1.1.0#

Summary#

This is the 1.1.0 release of NIM for VLMs.

Visual Language Models#

Limitations#

  • PEFT is not supported.

  • Following Meta’s guidance, function calling is not supported.

  • Following Meta’s guidance, only one image per request is supported.

  • Following Meta’s guidance, system messages are not allowed with images.

  • Following the official vLLM implementation, images are always added to the front of user messages.

  • Maximum concurrency can be low when using the vLLM backend.

  • Image and vision encoder Prometheus metrics are not available with the vLLM backend.

  • With context length larger than 32k, the accuracy of Llama-3.2-90B-Vision-Instruct can be degraded.

  • When deploying an optimized profile on AWS A10G, you might see [TensorRT-LLM][ERROR] ICudaEngine::createExecutionContextWithoutDeviceMemory: Error Code 1: Cuda Runtime (an illegal memory access was encountered). Use the vLLM backend instead as described here.