Models

About Model Profiles

The models for NVIDIA NIM microservices use model engines that are tuned for a specific combination of NVIDIA GPU model, number of GPUs, precision, and similar parameters. NVIDIA produces model engines for several popular combinations, which are referred to as model profiles. Each model profile is identified by a profile ID, a unique 64-character string of hexadecimal digits.

The available model profiles are stored in a file in the NIM container file system. The file is referred to as the model manifest file and the default path is /opt/nim/etc/default/model_manifest.yaml in the container.
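Because a profile ID is a 64-character hexadecimal string, candidate IDs can be picked out of the manifest text without knowing its layout. The following is a minimal sketch under that assumption only: the helper names `is_profile_id`, `find_profile_ids`, and `load_profile_ids` are hypothetical, the manifest schema is not assumed, and the pattern would also match any other 64-digit hexadecimal value (such as a checksum) if the manifest contains one.

```python
import re
from pathlib import Path

# Default manifest location inside the container, as stated above.
DEFAULT_MANIFEST = Path("/opt/nim/etc/default/model_manifest.yaml")

# A profile ID is described as exactly 64 hexadecimal digits.
# Assumption: IDs are written in lowercase, as in the tables on this page.
PROFILE_ID_RE = re.compile(r"\b[0-9a-f]{64}\b")


def is_profile_id(text: str) -> bool:
    """Return True if text is exactly one 64-digit lowercase hex string."""
    return PROFILE_ID_RE.fullmatch(text) is not None


def find_profile_ids(manifest_text: str) -> list[str]:
    """Return the distinct 64-digit hex strings in order of first appearance."""
    seen: dict[str, None] = {}
    for match in PROFILE_ID_RE.finditer(manifest_text):
        seen.setdefault(match.group(0), None)
    return list(seen)


def load_profile_ids(path: Path = DEFAULT_MANIFEST) -> list[str]:
    """Read the manifest file as plain text and scan it for candidate IDs."""
    return find_profile_ids(path.read_text(encoding="utf-8"))
```

Running `load_profile_ids()` inside the container would list the candidate IDs found in the default manifest. Treating the file as plain text avoids depending on a YAML parser or on the manifest's key names, at the cost of not knowing which GPU or variant each ID belongs to.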

FLUX.1-dev Model Profiles

FLUX.1-dev is a collection of generative AI models that create high-quality, realistic images. FLUX.1-dev generates images from simple text prompts, while FLUX.1-Depth-dev and FLUX.1-Canny-dev give greater control by combining the text prompt with an input image that guides the structure of the output image.

| GPU | Backend | Resolution | Variant | Precision | Model Profile ID |
|---|---|---|---|---|---|
| GeForce RTX 5090 (Beta) | TensorRT | 768-1344x768-1344 | base | FP4 | 1b2d236d5fa4e0425e80ff17c9480ed73f2a66a5190a102299b3c9b8936670ff |
| GeForce RTX 5090 (Beta) | TensorRT | 768-1344x768-1344 | canny | FP4 | 8b42564dd5dc5dc021b47027fc25e8de3c3f20541b06643b80143facd338480b |
| GeForce RTX 5090 (Beta) | TensorRT | 768-1344x768-1344 | depth | FP4 | 66188a8ebcad93374ef35c7fb89df3db16ea9176aee3515ad1a4d333d9fc8676 |
| GeForce RTX 5090 (Beta) | TensorRT | 768-1344x768-1344 | base+canny+depth | FP4 | b44b6dbfc4d414f5b2d11c401606380d616939bf4f9470de78b9e25de6f143e3 |
| GeForce RTX 5080 (Beta) | TensorRT | 768-1344x768-1344 | base | FP4 | 365d6883d978bb2f2c00f5af2678115e0d92c2d09f1fe4f8bcdd813b8d731a5f |
| GeForce RTX 5080 (Beta) | TensorRT | 768-1344x768-1344 | canny | FP4 | 36c44753a9a188e8a36e717c4cd2d08c7c8cc4281f59c750cfda49bd9e72a0bf |
| GeForce RTX 5080 (Beta) | TensorRT | 768-1344x768-1344 | depth | FP4 | 387b0d749f1f6c39f7dd9b57e1e6872f809c6bf0422c71cda164be32c0fb7d79 |
| GeForce RTX 5080 (Beta) | TensorRT | 768-1344x768-1344 | base+canny+depth | FP4 | b454d497c90956b1bf546720c1df00c1888865050d72290191f36ada319ecc6c |
| GeForce RTX 4090 (Beta) | TensorRT | 768-1344x768-1344 | base | FP8 | 9a5a75332be4d894e9d663e76a90c7fbcd8eef45b98e0ad575ff7eac581ec89c |
| GeForce RTX 4090 (Beta) | TensorRT | 768-1344x768-1344 | canny | FP8 | 1968070ac84e762080833782908552598dfbb2e81c3d8997c4feecefe605e57f |
| GeForce RTX 4090 (Beta) | TensorRT | 768-1344x768-1344 | depth | FP8 | 73d59547d8a73efc756b3e8fabedc5b129f6986e811e88d2f025dc5af4b084fe |
| GeForce RTX 4090 (Beta) | TensorRT | 768-1344x768-1344 | base+canny+depth | FP8 | c556479d91c84a3f6df68bbdc2e13edaab457cd38a5d65a16c154059609c2095 |
| GeForce RTX 4090 Laptop (Beta) | TensorRT | 768-1344x768-1344 | base | FP8 | 93bd95c143ef54b2ad47c47ac3d5742f9fff5ff80baa8228ce19a8577a75ebc8 |
| GeForce RTX 4090 Laptop (Beta) | TensorRT | 768-1344x768-1344 | canny | FP8 | 75af532c3d833d82ad27fab8bb190f60fbb3a91b0cf70bea33d294a7c8ce5baf |
| GeForce RTX 4090 Laptop (Beta) | TensorRT | 768-1344x768-1344 | depth | FP8 | bdf9998149e94cdaf5221aa9baebc30f27449925da8eac6d1508cd945cdb643a |
| GeForce RTX 4090 Laptop (Beta) | TensorRT | 768-1344x768-1344 | base+canny+depth | FP8 | 61070e912036a7a9140d5e64126bd623522293ea3076c2f167d963a94c863b13 |
| GeForce RTX 4080 (Beta) | TensorRT | 768-1344x768-1344 | base | FP8 | 96295541130ed46e3de0b25d1a95e7409784bb19da7b4b97cb586eac4e4ab778 |
| GeForce RTX 4080 (Beta) | TensorRT | 768-1344x768-1344 | canny | FP8 | 842dee095df8d6ad5a2b8678605e677fea46882ff1eba1ecde76a186e8b0d1c5 |
| GeForce RTX 4080 (Beta) | TensorRT | 768-1344x768-1344 | depth | FP8 | b490222762872588294023feaecc384bbba054ae06256abd7b166d5e007cb764 |
| GeForce RTX 4080 (Beta) | TensorRT | 768-1344x768-1344 | base+canny+depth | FP8 | 2a0ff19006f215b4dbb2266240c12d91ac6a005c402124c3ca3916141096fd0a |
| NVIDIA RTX 6000 Ada Generation (Beta) | TensorRT | 768-1344x768-1344 | base | FP8 | 9ed3f545f2316939af1984fb703115e9b706f8c7e9b4eb452f37f86a06df5bbc |
| NVIDIA RTX 6000 Ada Generation (Beta) | TensorRT | 768-1344x768-1344 | canny | FP8 | cc19715f2bd209a45773ec4131c346b4c88b44d3e8f67145e719d63f6bf512d4 |
| NVIDIA RTX 6000 Ada Generation (Beta) | TensorRT | 768-1344x768-1344 | depth | FP8 | a02d1b01eb43980224ebc91a471d415be2886849bce69374e9c2a63289d8debe |
| NVIDIA RTX 6000 Ada Generation (Beta) | TensorRT | 768-1344x768-1344 | base+canny+depth | FP8 | cf766e0c4e718cccf1e771e27d7bb8181120ea21219533f8d9d166f1df1bbedd |
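The TensorRT profiles listed above amount to a mapping from a (GPU, variant) pair to a profile ID. The sketch below illustrates that lookup with a few of the listed profiles; the name `find_profile` and the dictionary are hypothetical, the "(Beta)" marker is dropped from the GPU names, and this is not necessarily how a NIM container selects a profile internally.

```python
from typing import Optional

# (GPU name, variant) -> profile ID. Only a few of the listed TensorRT
# profiles are copied here to keep the example short.
TENSORRT_PROFILES = {
    ("GeForce RTX 5090", "base"):
        "1b2d236d5fa4e0425e80ff17c9480ed73f2a66a5190a102299b3c9b8936670ff",
    ("GeForce RTX 5090", "base+canny+depth"):
        "b44b6dbfc4d414f5b2d11c401606380d616939bf4f9470de78b9e25de6f143e3",
    ("GeForce RTX 4090", "base"):
        "9a5a75332be4d894e9d663e76a90c7fbcd8eef45b98e0ad575ff7eac581ec89c",
    ("GeForce RTX 4090 Laptop", "base"):
        "93bd95c143ef54b2ad47c47ac3d5742f9fff5ff80baa8228ce19a8577a75ebc8",
    ("NVIDIA RTX 6000 Ada Generation", "depth"):
        "a02d1b01eb43980224ebc91a471d415be2886849bce69374e9c2a63289d8debe",
}


def find_profile(gpu_name: str, variant: str) -> Optional[str]:
    """Return the profile ID for an exact GPU name and variant, or None."""
    key = (gpu_name.strip(), variant.strip().lower())
    return TENSORRT_PROFILES.get(key)


if __name__ == "__main__":
    print(find_profile("GeForce RTX 4090", "base"))
    print(find_profile("GeForce RTX 3090", "base"))  # not listed, so None
```

The lookup matches the GPU name exactly rather than by substring, because "GeForce RTX 4090" and "GeForce RTX 4090 Laptop" are separate entries with different profile IDs.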

If your GPU model is not listed, you can use the following generic model profiles with the PyTorch backend, or create a model profile for your GPU using these instructions. PyTorch checkpoints are not quantized, so they consume more GPU memory.

| GPU | Backend | Resolution | Variant | Precision | Model Profile ID |
|---|---|---|---|---|---|
| Generic | PyTorch | 768-1344x768-1344 | base | BF16 | f0d0d4ac2ea5b121defa3e82a1fe82f289856cf5db49aa99e670e8851d8f0305 |
| Generic | PyTorch | 768-1344x768-1344 | canny | BF16 | 351a04dd6ca4e445f1ae4fe0da0190133c79ed4eedd2965e5da41cbb2b48826c |
| Generic | PyTorch | 768-1344x768-1344 | depth | BF16 | 7280cf728c45505c1a8def558d9c18534096c0fe9a976b138818e31b33e859b7 |
| Generic | PyTorch | 768-1344x768-1344 | base+canny+depth | BF16 | f02c296542632aef64d11cbb13026c2502da2c290cc5b05f507a4922eedd1dda |
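The note that unquantized PyTorch checkpoints consume more GPU memory can be made concrete with a rough, weights-only estimate: BF16 stores each weight in 16 bits, FP8 in 8 bits, and FP4 in 4 bits. The sketch below is a back-of-the-envelope calculation, not a measurement. The 12-billion parameter count is an assumption taken from the public description of the FLUX.1 [dev] transformer rather than from this page, and the estimate ignores text encoders, the image decoder, activations, and any layers that a quantized engine may keep at higher precision.

```python
# Bits used to store one weight at each precision listed in the tables.
BITS_PER_WEIGHT = {"BF16": 16, "FP8": 8, "FP4": 4}


def weight_memory_gib(num_params: float, precision: str) -> float:
    """Return a weights-only memory estimate in GiB for the given precision."""
    total_bytes = num_params * BITS_PER_WEIGHT[precision] / 8
    return total_bytes / 2**30


if __name__ == "__main__":
    # Assumption: a 12-billion-parameter transformer (not stated on this page).
    PARAM_COUNT = 12e9
    for precision in BITS_PER_WEIGHT:
        estimate = weight_memory_gib(PARAM_COUNT, precision)
        print(f"{precision}: about {estimate:.1f} GiB for weights alone")
```

Under these assumptions the BF16 weights need about twice the memory of FP8 and four times that of FP4, which is why the generic profiles are the most demanding option for a given variant.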