Models
About Model Profiles
The models for NVIDIA NIM microservices use model engines that are tuned for specific NVIDIA GPU models, number of GPUs, precision, and so on. NVIDIA produces model engines for several popular combinations and these are referred to as model profiles. Each model profile is identified by a unique 64-character string of hexadecimal digits that is referred to as a profile ID.
The available model profiles are stored in a file in the NIM container file system. This file is referred to as the model manifest file, and its default path in the container is /opt/nim/etc/default/model_manifest.yaml.
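Because a profile ID is described as a 64-character string of hexadecimal digits, a copied ID can be sanity-checked before use. The following is a minimal sketch; the helper name `is_profile_id` is hypothetical and is not part of NIM. It checks only the format, not whether the ID exists in the model manifest file.

```python
import re

# A profile ID is described as exactly 64 hexadecimal digits.
_PROFILE_ID_RE = re.compile(r"[0-9a-f]{64}")


def is_profile_id(value: str) -> bool:
    """Return True if value has the shape of a model profile ID.

    Surrounding whitespace is ignored and uppercase hex digits are accepted,
    which is a leniency of this sketch rather than documented NIM behavior.
    """
    return _PROFILE_ID_RE.fullmatch(value.strip().lower()) is not None


if __name__ == "__main__":
    full_id = "1b2d236d5fa4e0425e80ff17c9480ed73f2a66a5190a102299b3c9b8936670ff"
    print(is_profile_id(full_id))      # well-formed ID
    print(is_profile_id("1b2d236d"))   # truncated ID
```

A check like this catches truncated or mis-pasted IDs early, before a container is started with them.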
FLUX.1-dev Model Profiles
FLUX.1-dev is a collection of generative AI image models that create high-quality, realistic images. FLUX.1-dev generates images from simple text prompts, while FLUX.1-Depth-dev and FLUX.1-Canny-dev give greater control by combining the text prompt with an input image that guides the structure of the output image.
GPU | Backend | Resolution | Variant | Precision | Model Profile ID |
---|---|---|---|---|---|
GeForce RTX 5090 (Beta) | TensorRT | 768-1344x768-1344 | base | FP4 | 1b2d236d5fa4e0425e80ff17c9480ed73f2a66a5190a102299b3c9b8936670ff |
GeForce RTX 5090 (Beta) | TensorRT | 768-1344x768-1344 | canny | FP4 | 8b42564dd5dc5dc021b47027fc25e8de3c3f20541b06643b80143facd338480b |
GeForce RTX 5090 (Beta) | TensorRT | 768-1344x768-1344 | depth | FP4 | 66188a8ebcad93374ef35c7fb89df3db16ea9176aee3515ad1a4d333d9fc8676 |
GeForce RTX 5090 (Beta) | TensorRT | 768-1344x768-1344 | base+canny+depth | FP4 | b44b6dbfc4d414f5b2d11c401606380d616939bf4f9470de78b9e25de6f143e3 |
GeForce RTX 5090 Laptop (Beta) | TensorRT | 768-1344x768-1344 | base | FP4 | 13c5b586b0dc4a9478c9812c38209453d13e72200cdc14a2796666b6adc54dca |
GeForce RTX 5090 Laptop (Beta) | TensorRT | 768-1344x768-1344 | canny | FP4 | 7145b304ffec84de388bb116224db971ebd72c5b6dbd0f897e5b8b18b527ce3a |
GeForce RTX 5090 Laptop (Beta) | TensorRT | 768-1344x768-1344 | depth | FP4 | 5587d816d9f9f241d2c2cb582decd808ef3fb8a125609278965a870719fc509d |
GeForce RTX 5090 Laptop (Beta) | TensorRT | 768-1344x768-1344 | base+canny+depth | FP4 | 62ca080d1980cbb17478cfb7c7bc48031a8d931ad8d4a70ca3971da545160282 |
GeForce RTX 5080 (Beta) | TensorRT | 768-1344x768-1344 | base | FP4 | 365d6883d978bb2f2c00f5af2678115e0d92c2d09f1fe4f8bcdd813b8d731a5f |
GeForce RTX 5080 (Beta) | TensorRT | 768-1344x768-1344 | canny | FP4 | 36c44753a9a188e8a36e717c4cd2d08c7c8cc4281f59c750cfda49bd9e72a0bf |
GeForce RTX 5080 (Beta) | TensorRT | 768-1344x768-1344 | depth | FP4 | 387b0d749f1f6c39f7dd9b57e1e6872f809c6bf0422c71cda164be32c0fb7d79 |
GeForce RTX 5080 (Beta) | TensorRT | 768-1344x768-1344 | base+canny+depth | FP4 | b454d497c90956b1bf546720c1df00c1888865050d72290191f36ada319ecc6c |
GeForce RTX 5080 Laptop (Beta) | TensorRT | 768-1344x768-1344 | base | FP4 | 9ad7ffac9b8260d15ab286637d444363a6899159e903e9cce3594a58be1489f9 |
GeForce RTX 5080 Laptop (Beta) | TensorRT | 768-1344x768-1344 | canny | FP4 | c8cfa63ee8cba592b3f52edefa18a5fda9e8f512ee3da8bc938a90336a0e75ea |
GeForce RTX 5080 Laptop (Beta) | TensorRT | 768-1344x768-1344 | depth | FP4 | 3bdeb471bb31950b3a7a759b5dea3aeb80083fd328a2cee445463fcf79141373 |
GeForce RTX 5080 Laptop (Beta) | TensorRT | 768-1344x768-1344 | base+canny+depth | FP4 | b912befc35951aa88e450d4b0ff7ec9576688c44f434cec624d814f954b16c10 |
GeForce RTX 5070TI (Beta) | TensorRT | 768-1344x768-1344 | base | FP4 | 34f18766736a8842a8248cffc18881bf850d04698ab1e25fa7e9fc65fae82688 |
GeForce RTX 5070TI (Beta) | TensorRT | 768-1344x768-1344 | canny | FP4 | d65f03f4d849fd152ce78961bec9868652db607f7e7f8d02eeea68de9e964cfc |
GeForce RTX 5070TI (Beta) | TensorRT | 768-1344x768-1344 | depth | FP4 | 48e32cc14e07205437fa4484893e017fe6ce7149de6ef3b935e61482cc43d3e7 |
GeForce RTX 5070TI (Beta) | TensorRT | 768-1344x768-1344 | base+canny+depth | FP4 | 094a53dd3d6b4a67e8ba8b215f996acb0f0114afc8b1a2503068ebd7e2dc4b67 |
GeForce RTX 4090 (Beta) | TensorRT | 768-1344x768-1344 | base | FP8 | 9a5a75332be4d894e9d663e76a90c7fbcd8eef45b98e0ad575ff7eac581ec89c |
GeForce RTX 4090 (Beta) | TensorRT | 768-1344x768-1344 | canny | FP8 | 1968070ac84e762080833782908552598dfbb2e81c3d8997c4feecefe605e57f |
GeForce RTX 4090 (Beta) | TensorRT | 768-1344x768-1344 | depth | FP8 | 73d59547d8a73efc756b3e8fabedc5b129f6986e811e88d2f025dc5af4b084fe |
GeForce RTX 4090 (Beta) | TensorRT | 768-1344x768-1344 | base+canny+depth | FP8 | c556479d91c84a3f6df68bbdc2e13edaab457cd38a5d65a16c154059609c2095 |
GeForce RTX 4090 Laptop (Beta) | TensorRT | 768-1344x768-1344 | base | FP8 | 93bd95c143ef54b2ad47c47ac3d5742f9fff5ff80baa8228ce19a8577a75ebc8 |
GeForce RTX 4090 Laptop (Beta) | TensorRT | 768-1344x768-1344 | canny | FP8 | 75af532c3d833d82ad27fab8bb190f60fbb3a91b0cf70bea33d294a7c8ce5baf |
GeForce RTX 4090 Laptop (Beta) | TensorRT | 768-1344x768-1344 | depth | FP8 | bdf9998149e94cdaf5221aa9baebc30f27449925da8eac6d1508cd945cdb643a |
GeForce RTX 4090 Laptop (Beta) | TensorRT | 768-1344x768-1344 | base+canny+depth | FP8 | 61070e912036a7a9140d5e64126bd623522293ea3076c2f167d963a94c863b13 |
GeForce RTX 4080 (Beta) | TensorRT | 768-1344x768-1344 | base | FP8 | 96295541130ed46e3de0b25d1a95e7409784bb19da7b4b97cb586eac4e4ab778 |
GeForce RTX 4080 (Beta) | TensorRT | 768-1344x768-1344 | canny | FP8 | 842dee095df8d6ad5a2b8678605e677fea46882ff1eba1ecde76a186e8b0d1c5 |
GeForce RTX 4080 (Beta) | TensorRT | 768-1344x768-1344 | depth | FP8 | b490222762872588294023feaecc384bbba054ae06256abd7b166d5e007cb764 |
GeForce RTX 4080 (Beta) | TensorRT | 768-1344x768-1344 | base+canny+depth | FP8 | 2a0ff19006f215b4dbb2266240c12d91ac6a005c402124c3ca3916141096fd0a |
NVIDIA RTX 6000 Ada Generation (Beta) | TensorRT | 768-1344x768-1344 | base | FP8 | 9ed3f545f2316939af1984fb703115e9b706f8c7e9b4eb452f37f86a06df5bbc |
NVIDIA RTX 6000 Ada Generation (Beta) | TensorRT | 768-1344x768-1344 | canny | FP8 | cc19715f2bd209a45773ec4131c346b4c88b44d3e8f67145e719d63f6bf512d4 |
NVIDIA RTX 6000 Ada Generation (Beta) | TensorRT | 768-1344x768-1344 | depth | FP8 | a02d1b01eb43980224ebc91a471d415be2886849bce69374e9c2a63289d8debe |
NVIDIA RTX 6000 Ada Generation (Beta) | TensorRT | 768-1344x768-1344 | base+canny+depth | FP8 | cf766e0c4e718cccf1e771e27d7bb8181120ea21219533f8d9d166f1df1bbedd |
GeForce RTX 5090D (Beta) | TensorRT | 768-1344x768-1344 | base | FP4 | c6d1fad563e06a49946adfa773b9117b0485ec7cd0640386f0a5884bb350a51a |
GeForce RTX 5090D (Beta) | TensorRT | 768-1344x768-1344 | canny | FP4 | a1f563c2ce47feeff632d0306083ad45e05d268cffb080a34caf5f2ed14ebbcc |
GeForce RTX 5090D (Beta) | TensorRT | 768-1344x768-1344 | depth | FP4 | 9b45f1c8bb44d13e6d6067799e90f472001845bd76bbe4da9669214deda62eda |
GeForce RTX 5090D (Beta) | TensorRT | 768-1344x768-1344 | base+canny+depth | FP4 | d4ffdd037cbdb279689bc6f5cd969de4cdf2e63b47edc055413b759cc25bdcff |
GeForce RTX 4090D (Beta) | TensorRT | 768-1344x768-1344 | base | FP8 | ea9115a32e460d58aa89e79baee8fa1668305d5a74558d81ebfddb41a2fb3c28 |
GeForce RTX 4090D (Beta) | TensorRT | 768-1344x768-1344 | canny | FP8 | 9cb28f018271bcb561d19c9d95363bcbe2581c1a4af618e7256f37e930b9e034 |
GeForce RTX 4090D (Beta) | TensorRT | 768-1344x768-1344 | depth | FP8 | cc002253373b12b9a6e43172f61732187cf5b4665be018acfd93150b14f0d50b |
GeForce RTX 4090D (Beta) | TensorRT | 768-1344x768-1344 | base+canny+depth | FP8 | 7368ed71d08315b9b8ee7fccfe0606a9755c03fd8043551d1c30d8a06207d1f9 |
If your GPU model is not listed, you can use the generic model profiles below with the PyTorch backend, or create a model profile for your GPU using these instructions. PyTorch checkpoints are not quantized, so they consume more GPU memory.
GPU | Backend | Resolution | Variant | Precision | Model Profile ID |
---|---|---|---|---|---|
Generic | PyTorch | 768-1344x768-1344 | base | BF16 | f0d0d4ac2ea5b121defa3e82a1fe82f289856cf5db49aa99e670e8851d8f0305 |
Generic | PyTorch | 768-1344x768-1344 | canny | BF16 | 351a04dd6ca4e445f1ae4fe0da0190133c79ed4eedd2965e5da41cbb2b48826c |
Generic | PyTorch | 768-1344x768-1344 | depth | BF16 | 7280cf728c45505c1a8def558d9c18534096c0fe9a976b138818e31b33e859b7 |
Generic | PyTorch | 768-1344x768-1344 | base+canny+depth | BF16 | f02c296542632aef64d11cbb13026c2502da2c290cc5b05f507a4922eedd1dda |
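The selection rule described above (use the GPU-specific TensorRT profile when one is listed, otherwise fall back to the generic PyTorch profile) can be sketched as a small lookup. This is an illustration only: the `PROFILES` dictionary copies a few rows from the tables, the names `select_profile` and `profile_env` are hypothetical, and the use of a `NIM_MODEL_PROFILE` environment variable to pass the ID to the container is an assumption that should be checked against the NIM configuration documentation for your release.

```python
# A few rows copied from the tables above: (GPU, variant) -> profile ID.
# The GPU keys omit the "(Beta)" suffix; that naming is a choice of this sketch.
PROFILES = {
    ("GeForce RTX 5090", "base"):
        "1b2d236d5fa4e0425e80ff17c9480ed73f2a66a5190a102299b3c9b8936670ff",
    ("GeForce RTX 4090", "base"):
        "9a5a75332be4d894e9d663e76a90c7fbcd8eef45b98e0ad575ff7eac581ec89c",
    ("GeForce RTX 4090", "canny"):
        "1968070ac84e762080833782908552598dfbb2e81c3d8997c4feecefe605e57f",
    ("Generic", "base"):
        "f0d0d4ac2ea5b121defa3e82a1fe82f289856cf5db49aa99e670e8851d8f0305",
    ("Generic", "canny"):
        "351a04dd6ca4e445f1ae4fe0da0190133c79ed4eedd2965e5da41cbb2b48826c",
    ("Generic", "depth"):
        "7280cf728c45505c1a8def558d9c18534096c0fe9a976b138818e31b33e859b7",
}


def select_profile(gpu: str, variant: str) -> str:
    """Return the GPU-specific profile ID, or the generic one if the GPU is unlisted."""
    specific = PROFILES.get((gpu, variant))
    if specific is not None:
        return specific
    generic = PROFILES.get(("Generic", variant))
    if generic is None:
        raise KeyError(f"no profile known for variant {variant!r}")
    return generic


def profile_env(gpu: str, variant: str) -> dict:
    """Build an environment mapping for the container.

    NIM_MODEL_PROFILE is assumed here, not confirmed by this page.
    """
    return {"NIM_MODEL_PROFILE": select_profile(gpu, variant)}


if __name__ == "__main__":
    print(profile_env("GeForce RTX 4090", "canny"))   # listed GPU
    print(profile_env("GeForce RTX 3090", "depth"))   # unlisted GPU, generic fallback
```

Keep in mind that the generic fallback uses unquantized BF16 checkpoints, so it needs more GPU memory than the FP4 or FP8 TensorRT profiles.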