Models

About Model Profiles

The models for NVIDIA NIM microservices use model engines that are tuned for a specific combination of NVIDIA GPU model, number of GPUs, precision, and similar parameters. NVIDIA produces model engines for several popular combinations, which are referred to as model profiles. Each model profile is identified by a profile ID, a unique 64-character string of hexadecimal digits.
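Because a profile ID has a fixed shape, a quick format check can catch copy-and-paste mistakes before an ID is used in a deployment. The following is a minimal sketch; the helper name `is_profile_id` is hypothetical and is not part of any NIM tooling. It checks only the format, not whether the profile actually exists.

```python
import re

# A profile ID is described as a 64-character string of hexadecimal digits.
PROFILE_ID_RE = re.compile(r"[0-9a-f]{64}")


def is_profile_id(value: str) -> bool:
    """Return True if value has the shape of a model profile ID.

    Hypothetical helper: surrounding whitespace is ignored and
    uppercase hex digits are accepted by lowercasing first.
    """
    return PROFILE_ID_RE.fullmatch(value.strip().lower()) is not None


if __name__ == "__main__":
    print(is_profile_id("1b2d236d5fa4e0425e80ff17c9480ed73f2a66a5190a102299b3c9b8936670ff"))
    print(is_profile_id("not-a-profile-id"))
```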

The available model profiles are listed in the model manifest file, which is stored in the NIM container file system. The default path in the container is /opt/nim/etc/default/model_manifest.yaml.
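One way to see which IDs a container ships with is to scan the manifest text for 64-character hexadecimal strings. The sketch below deliberately makes no assumption about the YAML schema of the manifest, which is not described here. As a consequence, it may also pick up other 64-character hex values in the file, such as checksums, so treat the result as a list of candidates. The function name `candidate_profile_ids` is hypothetical.

```python
import re
from pathlib import Path

# Default manifest location inside the NIM container, as stated above.
MANIFEST_PATH = Path("/opt/nim/etc/default/model_manifest.yaml")

# Exactly 64 lowercase hex digits, not embedded in a longer word.
HEX64_RE = re.compile(r"\b[0-9a-f]{64}\b")


def candidate_profile_ids(manifest_text: str) -> list[str]:
    """Return unique 64-hex-digit strings in first-seen order.

    Schema-agnostic: this is a plain text scan, so values that are not
    profile IDs (for example checksums) may also be returned.
    """
    return list(dict.fromkeys(HEX64_RE.findall(manifest_text)))


if __name__ == "__main__":
    # Only meaningful when run inside the container.
    if MANIFEST_PATH.exists():
        for candidate in candidate_profile_ids(MANIFEST_PATH.read_text()):
            print(candidate)
    else:
        print(f"{MANIFEST_PATH} not found; run this inside the NIM container.")
```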

FLUX.1-dev Model Profiles

FLUX.1-dev is a collection of generative image AI models that create high-quality, realistic images. FLUX.1-dev generates images from simple text prompts, while FLUX.1-Depth-dev and FLUX.1-Canny-dev give greater control by combining the text prompt with an input image that guides the structure of the output image.

| GPU | Backend | Resolution | Variant | Precision | Model Profile ID |
|---|---|---|---|---|---|
| GeForce RTX 5090 (Beta) | TensorRT | 768-1344x768-1344 | base | FP4 | 1b2d236d5fa4e0425e80ff17c9480ed73f2a66a5190a102299b3c9b8936670ff |
| GeForce RTX 5090 (Beta) | TensorRT | 768-1344x768-1344 | canny | FP4 | 8b42564dd5dc5dc021b47027fc25e8de3c3f20541b06643b80143facd338480b |
| GeForce RTX 5090 (Beta) | TensorRT | 768-1344x768-1344 | depth | FP4 | 66188a8ebcad93374ef35c7fb89df3db16ea9176aee3515ad1a4d333d9fc8676 |
| GeForce RTX 5090 (Beta) | TensorRT | 768-1344x768-1344 | base+canny+depth | FP4 | b44b6dbfc4d414f5b2d11c401606380d616939bf4f9470de78b9e25de6f143e3 |
| GeForce RTX 5090 Laptop (Beta) | TensorRT | 768-1344x768-1344 | base | FP4 | 13c5b586b0dc4a9478c9812c38209453d13e72200cdc14a2796666b6adc54dca |
| GeForce RTX 5090 Laptop (Beta) | TensorRT | 768-1344x768-1344 | canny | FP4 | 7145b304ffec84de388bb116224db971ebd72c5b6dbd0f897e5b8b18b527ce3a |
| GeForce RTX 5090 Laptop (Beta) | TensorRT | 768-1344x768-1344 | depth | FP4 | 5587d816d9f9f241d2c2cb582decd808ef3fb8a125609278965a870719fc509d |
| GeForce RTX 5090 Laptop (Beta) | TensorRT | 768-1344x768-1344 | base+canny+depth | FP4 | 62ca080d1980cbb17478cfb7c7bc48031a8d931ad8d4a70ca3971da545160282 |
| GeForce RTX 5080 (Beta) | TensorRT | 768-1344x768-1344 | base | FP4 | 365d6883d978bb2f2c00f5af2678115e0d92c2d09f1fe4f8bcdd813b8d731a5f |
| GeForce RTX 5080 (Beta) | TensorRT | 768-1344x768-1344 | canny | FP4 | 36c44753a9a188e8a36e717c4cd2d08c7c8cc4281f59c750cfda49bd9e72a0bf |
| GeForce RTX 5080 (Beta) | TensorRT | 768-1344x768-1344 | depth | FP4 | 387b0d749f1f6c39f7dd9b57e1e6872f809c6bf0422c71cda164be32c0fb7d79 |
| GeForce RTX 5080 (Beta) | TensorRT | 768-1344x768-1344 | base+canny+depth | FP4 | b454d497c90956b1bf546720c1df00c1888865050d72290191f36ada319ecc6c |
| GeForce RTX 5080 Laptop (Beta) | TensorRT | 768-1344x768-1344 | base | FP4 | 9ad7ffac9b8260d15ab286637d444363a6899159e903e9cce3594a58be1489f9 |
| GeForce RTX 5080 Laptop (Beta) | TensorRT | 768-1344x768-1344 | canny | FP4 | c8cfa63ee8cba592b3f52edefa18a5fda9e8f512ee3da8bc938a90336a0e75ea |
| GeForce RTX 5080 Laptop (Beta) | TensorRT | 768-1344x768-1344 | depth | FP4 | 3bdeb471bb31950b3a7a759b5dea3aeb80083fd328a2cee445463fcf79141373 |
| GeForce RTX 5080 Laptop (Beta) | TensorRT | 768-1344x768-1344 | base+canny+depth | FP4 | b912befc35951aa88e450d4b0ff7ec9576688c44f434cec624d814f954b16c10 |
| GeForce RTX 5070TI (Beta) | TensorRT | 768-1344x768-1344 | base | FP4 | 34f18766736a8842a8248cffc18881bf850d04698ab1e25fa7e9fc65fae82688 |
| GeForce RTX 5070TI (Beta) | TensorRT | 768-1344x768-1344 | canny | FP4 | d65f03f4d849fd152ce78961bec9868652db607f7e7f8d02eeea68de9e964cfc |
| GeForce RTX 5070TI (Beta) | TensorRT | 768-1344x768-1344 | depth | FP4 | 48e32cc14e07205437fa4484893e017fe6ce7149de6ef3b935e61482cc43d3e7 |
| GeForce RTX 5070TI (Beta) | TensorRT | 768-1344x768-1344 | base+canny+depth | FP4 | 094a53dd3d6b4a67e8ba8b215f996acb0f0114afc8b1a2503068ebd7e2dc4b67 |
| GeForce RTX 4090 (Beta) | TensorRT | 768-1344x768-1344 | base | FP8 | 9a5a75332be4d894e9d663e76a90c7fbcd8eef45b98e0ad575ff7eac581ec89c |
| GeForce RTX 4090 (Beta) | TensorRT | 768-1344x768-1344 | canny | FP8 | 1968070ac84e762080833782908552598dfbb2e81c3d8997c4feecefe605e57f |
| GeForce RTX 4090 (Beta) | TensorRT | 768-1344x768-1344 | depth | FP8 | 73d59547d8a73efc756b3e8fabedc5b129f6986e811e88d2f025dc5af4b084fe |
| GeForce RTX 4090 (Beta) | TensorRT | 768-1344x768-1344 | base+canny+depth | FP8 | c556479d91c84a3f6df68bbdc2e13edaab457cd38a5d65a16c154059609c2095 |
| GeForce RTX 4090 Laptop (Beta) | TensorRT | 768-1344x768-1344 | base | FP8 | 93bd95c143ef54b2ad47c47ac3d5742f9fff5ff80baa8228ce19a8577a75ebc8 |
| GeForce RTX 4090 Laptop (Beta) | TensorRT | 768-1344x768-1344 | canny | FP8 | 75af532c3d833d82ad27fab8bb190f60fbb3a91b0cf70bea33d294a7c8ce5baf |
| GeForce RTX 4090 Laptop (Beta) | TensorRT | 768-1344x768-1344 | depth | FP8 | bdf9998149e94cdaf5221aa9baebc30f27449925da8eac6d1508cd945cdb643a |
| GeForce RTX 4090 Laptop (Beta) | TensorRT | 768-1344x768-1344 | base+canny+depth | FP8 | 61070e912036a7a9140d5e64126bd623522293ea3076c2f167d963a94c863b13 |
| GeForce RTX 4080 (Beta) | TensorRT | 768-1344x768-1344 | base | FP8 | 96295541130ed46e3de0b25d1a95e7409784bb19da7b4b97cb586eac4e4ab778 |
| GeForce RTX 4080 (Beta) | TensorRT | 768-1344x768-1344 | canny | FP8 | 842dee095df8d6ad5a2b8678605e677fea46882ff1eba1ecde76a186e8b0d1c5 |
| GeForce RTX 4080 (Beta) | TensorRT | 768-1344x768-1344 | depth | FP8 | b490222762872588294023feaecc384bbba054ae06256abd7b166d5e007cb764 |
| GeForce RTX 4080 (Beta) | TensorRT | 768-1344x768-1344 | base+canny+depth | FP8 | 2a0ff19006f215b4dbb2266240c12d91ac6a005c402124c3ca3916141096fd0a |
| NVIDIA RTX 6000 Ada Generation (Beta) | TensorRT | 768-1344x768-1344 | base | FP8 | 9ed3f545f2316939af1984fb703115e9b706f8c7e9b4eb452f37f86a06df5bbc |
| NVIDIA RTX 6000 Ada Generation (Beta) | TensorRT | 768-1344x768-1344 | canny | FP8 | cc19715f2bd209a45773ec4131c346b4c88b44d3e8f67145e719d63f6bf512d4 |
| NVIDIA RTX 6000 Ada Generation (Beta) | TensorRT | 768-1344x768-1344 | depth | FP8 | a02d1b01eb43980224ebc91a471d415be2886849bce69374e9c2a63289d8debe |
| NVIDIA RTX 6000 Ada Generation (Beta) | TensorRT | 768-1344x768-1344 | base+canny+depth | FP8 | cf766e0c4e718cccf1e771e27d7bb8181120ea21219533f8d9d166f1df1bbedd |
| GeForce RTX 5090D (Beta) | TensorRT | 768-1344x768-1344 | base | FP4 | c6d1fad563e06a49946adfa773b9117b0485ec7cd0640386f0a5884bb350a51a |
| GeForce RTX 5090D (Beta) | TensorRT | 768-1344x768-1344 | canny | FP4 | a1f563c2ce47feeff632d0306083ad45e05d268cffb080a34caf5f2ed14ebbcc |
| GeForce RTX 5090D (Beta) | TensorRT | 768-1344x768-1344 | depth | FP4 | 9b45f1c8bb44d13e6d6067799e90f472001845bd76bbe4da9669214deda62eda |
| GeForce RTX 5090D (Beta) | TensorRT | 768-1344x768-1344 | base+canny+depth | FP4 | d4ffdd037cbdb279689bc6f5cd969de4cdf2e63b47edc055413b759cc25bdcff |
| GeForce RTX 4090D (Beta) | TensorRT | 768-1344x768-1344 | base | FP8 | ea9115a32e460d58aa89e79baee8fa1668305d5a74558d81ebfddb41a2fb3c28 |
| GeForce RTX 4090D (Beta) | TensorRT | 768-1344x768-1344 | canny | FP8 | 9cb28f018271bcb561d19c9d95363bcbe2581c1a4af618e7256f37e930b9e034 |
| GeForce RTX 4090D (Beta) | TensorRT | 768-1344x768-1344 | depth | FP8 | cc002253373b12b9a6e43172f61732187cf5b4665be018acfd93150b14f0d50b |
| GeForce RTX 4090D (Beta) | TensorRT | 768-1344x768-1344 | base+canny+depth | FP8 | 7368ed71d08315b9b8ee7fccfe0606a9755c03fd8043551d1c30d8a06207d1f9 |

If your GPU model is not listed, you can use the generic model profiles below, which use the PyTorch backend, or create a model profile for your GPU using these instructions. PyTorch checkpoints are not quantized, so they consume more GPU memory.

| GPU | Backend | Resolution | Variant | Precision | Model Profile ID |
|---|---|---|---|---|---|
| Generic | PyTorch | 768-1344x768-1344 | base | BF16 | f0d0d4ac2ea5b121defa3e82a1fe82f289856cf5db49aa99e670e8851d8f0305 |
| Generic | PyTorch | 768-1344x768-1344 | canny | BF16 | 351a04dd6ca4e445f1ae4fe0da0190133c79ed4eedd2965e5da41cbb2b48826c |
| Generic | PyTorch | 768-1344x768-1344 | depth | BF16 | 7280cf728c45505c1a8def558d9c18534096c0fe9a976b138818e31b33e859b7 |
| Generic | PyTorch | 768-1344x768-1344 | base+canny+depth | BF16 | f02c296542632aef64d11cbb13026c2502da2c290cc5b05f507a4922eedd1dda |
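The selection rule described above (use the GPU-specific TensorRT profile when the GPU is listed, otherwise fall back to a generic PyTorch profile) can be sketched as a simple lookup. This is an illustration only: the function name `select_profile` and the dictionary layout are hypothetical, only two of the GPU-specific rows are copied in for brevity, and the GPU name must match the table entry without its "(Beta)" suffix.

```python
# (GPU name, variant) -> profile ID. Subset of the TensorRT table above;
# extend with the remaining rows as needed.
TENSORRT_PROFILES = {
    ("GeForce RTX 5090", "base"):
        "1b2d236d5fa4e0425e80ff17c9480ed73f2a66a5190a102299b3c9b8936670ff",
    ("GeForce RTX 4090", "base"):
        "9a5a75332be4d894e9d663e76a90c7fbcd8eef45b98e0ad575ff7eac581ec89c",
}

# variant -> profile ID, copied from the generic PyTorch table above.
GENERIC_PYTORCH_PROFILES = {
    "base": "f0d0d4ac2ea5b121defa3e82a1fe82f289856cf5db49aa99e670e8851d8f0305",
    "canny": "351a04dd6ca4e445f1ae4fe0da0190133c79ed4eedd2965e5da41cbb2b48826c",
    "depth": "7280cf728c45505c1a8def558d9c18534096c0fe9a976b138818e31b33e859b7",
    "base+canny+depth":
        "f02c296542632aef64d11cbb13026c2502da2c290cc5b05f507a4922eedd1dda",
}


def select_profile(gpu: str, variant: str = "base") -> tuple[str, str]:
    """Return (backend, profile_id) for a GPU and model variant.

    Prefers a GPU-specific TensorRT profile and falls back to the
    generic PyTorch profile for the same variant. Raises KeyError
    if the variant itself is unknown.
    """
    profile_id = TENSORRT_PROFILES.get((gpu, variant))
    if profile_id is not None:
        return "TensorRT", profile_id
    return "PyTorch", GENERIC_PYTORCH_PROFILES[variant]


if __name__ == "__main__":
    print(select_profile("GeForce RTX 4090", "base"))
    print(select_profile("Some Unlisted GPU", "depth"))
```

Keep in mind that the PyTorch fallback uses unquantized BF16 checkpoints, so it needs more GPU memory than the FP4 or FP8 TensorRT profiles.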