About NVIDIA NIM for Visual Generative AI

NVIDIA NIM for Visual Generative AI lets you deploy popular visual generative AI models and run them with optimized inference performance.

With this NIM, you describe an image in text, and the model generates an image that matches your description.
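
A minimal Python sketch of this text-to-image flow follows. It is an illustration under stated assumptions, not the documented client: the endpoint URL `http://localhost:8000/v1/infer`, the request fields `prompt`, `seed`, and `steps`, and the response shape `{"artifacts": [{"base64": ...}]}` are all assumed here, so check them against the API reference for your NIM before relying on them.

```python
import base64
import json
import urllib.request


def build_payload(prompt, seed=0, steps=50):
    """Return the request body. Field names are assumptions, not confirmed API."""
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    return {"prompt": prompt, "seed": seed, "steps": steps}


def decode_first_image(response_body):
    """Decode the first image from a response assumed to look like
    {"artifacts": [{"base64": "<encoded image>"}]}."""
    artifacts = response_body.get("artifacts") or []
    if not artifacts:
        raise ValueError("response contains no artifacts")
    return base64.b64decode(artifacts[0]["base64"])


def generate_image(prompt, url="http://localhost:8000/v1/infer"):
    """POST the prompt to a running service and return the raw image bytes.

    The default URL is an assumption about where the container listens.
    """
    data = json.dumps(build_payload(prompt)).encode("utf-8")
    request = urllib.request.Request(
        url,
        data=data,
        headers={"Content-Type": "application/json", "Accept": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return decode_first_image(json.load(response))
```

Calling `generate_image` with a text description returns the image bytes, which you can then write to a file. Keeping `build_payload` and `decode_first_image` separate from the network call makes it possible to check both without a running container.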

For example, you might provide the following description: “realistic futuristic city-downtown with short buildings, sunset” and the service produces an image like the following example:

Enterprise-Ready Features

NIM abstracts away model inference internals, such as the execution engine and runtime operations. NVIDIA NIM for Visual Generative AI offers the following enterprise-ready features:

High Performance: NIM is optimized for high-performance deep learning inference with NVIDIA TensorRT and NVIDIA Triton Inference Server.

Scalable Deployment: NIM provides performant deployment that scales quickly and seamlessly from a few users to millions.

Enterprise-Grade Security: NVIDIA emphasizes security by continuously monitoring and patching CVEs in the software stack and by conducting internal penetration tests.

Architecture

NVIDIA NIM microservices are packaged as container images. Each image works with a specific model, such as black-forest-labs/flux.1-dev.

The containers can run on any NVIDIA GPU with sufficient GPU memory. The model is not packaged in the image; the container can download it automatically from NVIDIA NGC. NVIDIA produces several model profiles that are optimized for popular data-center GPU models. If NVIDIA does not produce a model profile for your GPU, you can optimize the model for that GPU yourself. Refer to the Support Matrix for more information.
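
The choice between a prebuilt profile and a local optimization can be pictured with the following Python sketch. It only illustrates the decision described above and is not how NIM selects profiles internally: the `ModelProfile` type, its fields, and the matching rule (substring match on the GPU name plus a memory check) are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class ModelProfile:
    """Hypothetical description of a prebuilt, GPU-specific model profile."""
    name: str           # illustrative profile identifier
    gpu: str            # GPU model the profile was optimized for
    min_memory_gb: int  # GPU memory the profile needs (illustrative value)


def select_profile(
    profiles: Sequence[ModelProfile], gpu_name: str, gpu_memory_gb: int
) -> Optional[ModelProfile]:
    """Return the first prebuilt profile that fits this GPU.

    Returns None when no profile matches, which corresponds to the case
    where the model has to be optimized for the local GPU instead.
    """
    for profile in profiles:
        matches_gpu = profile.gpu.lower() in gpu_name.lower()
        fits_memory = gpu_memory_gb >= profile.min_memory_gb
        if matches_gpu and fits_memory:
            return profile
    return None
```

A `None` result signals the fallback path: no prebuilt profile exists for the GPU, or the GPU lacks the memory the profile needs.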

Because the container can download the model from NGC, NVIDIA recommends caching the model on local storage to improve container startup times. In addition, because each container is built from common base images, downloading additional containers or newer images is faster than the initial download.
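
One common way to cache the model is to mount a host directory into the container so that downloads survive restarts. The Python sketch below assembles such a `docker run` command. The Docker flags are standard, but the in-container cache path `/opt/nim/.cache`, the port `8000`, and the `NGC_API_KEY` environment variable are assumptions to verify against the getting-started instructions for your image.

```python
from pathlib import Path


def build_run_command(image, cache_dir, port=8000, container_cache="/opt/nim/.cache"):
    """Build a `docker run` argument list that mounts a local model cache.

    `container_cache` and `port` are assumed defaults, not confirmed values.
    The function only builds the command; it does not create the directory
    or start the container.
    """
    cache_path = Path(cache_dir).expanduser().resolve()
    return [
        "docker", "run", "--rm",
        "--gpus", "all",                            # expose the host GPUs
        "-e", "NGC_API_KEY",                        # pass the key through from the host environment
        "-v", f"{cache_path}:{container_cache}",    # persist downloaded models on the host
        "-p", f"{port}:{port}",                     # publish the service port
        image,
    ]
```

You can pass the resulting list to `subprocess.run(cmd, check=True)`. On the second and later starts, a model already present in the mounted directory does not need to be downloaded again, which is where the startup-time improvement comes from.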

A security scan report is available for each container. The report provides a security rating for the image, a breakdown of CVE severity by package, and links to detailed information about each CVE.

Supported Models

Model name	Description
FLUX.1-Dev	This model provides high-quality results. Its Depth and Canny variants add controls for guiding the output image. Choose this model when you want top quality and robust image-to-image options.
FLUX.1-Schnell	This is a distilled model that achieves similar quality in fewer steps. Because it generates images quickly, it is well suited to rapid experimentation. It is tailored for local development and personal use and is freely available under the Apache 2.0 license.