Models

About Model Profiles

NVIDIA NIM microservices use model engines that are tuned for a specific combination of NVIDIA GPU model, number of GPUs, precision, and similar factors. NVIDIA produces model engines for several popular combinations; these are referred to as model profiles. Each model profile is identified by a profile ID, which is a unique 64-character string of hexadecimal digits.
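Because a profile ID has a fixed shape (64 hexadecimal digits), a copied ID can be sanity-checked before it is used. The following is a minimal sketch; the helper name `is_profile_id` is hypothetical and not part of NIM, and the check confirms only the format, not that the profile exists.

```python
import re

# A profile ID is described above as 64 hexadecimal digits.
_PROFILE_ID_RE = re.compile(r"[0-9a-f]{64}")


def is_profile_id(value: str) -> bool:
    """Return True if value has the shape of a model profile ID.

    Hypothetical helper: surrounding whitespace and letter case are ignored.
    """
    return _PROFILE_ID_RE.fullmatch(value.strip().lower()) is not None


if __name__ == "__main__":
    # Generic PyTorch "base" profile ID from the table later on this page.
    print(is_profile_id("f0d0d4ac2ea5b121defa3e82a1fe82f289856cf5db49aa99e670e8851d8f0305"))
    # A truncated ID fails the check.
    print(is_profile_id("f0d0d4ac"))
```

A check like this catches truncated or mistyped IDs early.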

The available model profiles are stored in the model manifest file in the NIM container file system. The default path of this file in the container is /opt/nim/etc/default/model_manifest.yaml.
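One way to see which profile IDs a container ships with is to scan the manifest for 64-digit hexadecimal strings. The sketch below makes no assumption about the YAML schema, which this page does not describe; it only pattern-matches the text. The function name `find_profile_ids` is hypothetical, and the manifest may contain other 64-digit hex values (for example checksums) that are not profile IDs, so treat the result as a list of candidates.

```python
import re
from pathlib import Path

# Default manifest path inside the container, as stated above.
MANIFEST_PATH = Path("/opt/nim/etc/default/model_manifest.yaml")

# Exactly 64 hex digits, not embedded in a longer hex run.
_HEX64_RE = re.compile(r"(?<![0-9a-fA-F])[0-9a-fA-F]{64}(?![0-9a-fA-F])")


def find_profile_ids(text: str) -> list[str]:
    """Return unique 64-hex-digit strings, lowercased, in order of first appearance."""
    seen: dict[str, None] = {}
    for match in _HEX64_RE.finditer(text):
        seen.setdefault(match.group(0).lower(), None)
    return list(seen)


if __name__ == "__main__":
    if MANIFEST_PATH.exists():
        for candidate in find_profile_ids(MANIFEST_PATH.read_text(encoding="utf-8")):
            print(candidate)
    else:
        print(f"{MANIFEST_PATH} not found; run this inside the NIM container")
```

Run the script inside the container, where the manifest path exists, and compare the printed candidates with the tables on this page.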

FLUX.1-dev Model Profiles

FLUX.1-dev is a collection of generative AI image models that create high-quality, realistic images. FLUX.1-dev generates images from simple text prompts, while FLUX.1-Depth-dev and FLUX.1-Canny-dev provide greater control by combining the text prompt with an input image that guides the structure of the output image.

GPU	Backend	Resolution	Variant	Precision	Model Profile ID
GeForce RTX 5090 (Beta)	TensorRT	768-1344x768-1344	base	FP4	1b2d236d5fa4e0425e80ff17c9480ed73f2a66a5190a102299b3c9b8936670ff
GeForce RTX 5090 (Beta)	TensorRT	768-1344x768-1344	canny	FP4	8b42564dd5dc5dc021b47027fc25e8de3c3f20541b06643b80143facd338480b
GeForce RTX 5090 (Beta)	TensorRT	768-1344x768-1344	depth	FP4	66188a8ebcad93374ef35c7fb89df3db16ea9176aee3515ad1a4d333d9fc8676
GeForce RTX 5090 (Beta)	TensorRT	768-1344x768-1344	base+canny+depth	FP4	b44b6dbfc4d414f5b2d11c401606380d616939bf4f9470de78b9e25de6f143e3
GeForce RTX 5090 Laptop (Beta)	TensorRT	768-1344x768-1344	base	FP4	13c5b586b0dc4a9478c9812c38209453d13e72200cdc14a2796666b6adc54dca
GeForce RTX 5090 Laptop (Beta)	TensorRT	768-1344x768-1344	canny	FP4	7145b304ffec84de388bb116224db971ebd72c5b6dbd0f897e5b8b18b527ce3a
GeForce RTX 5090 Laptop (Beta)	TensorRT	768-1344x768-1344	depth	FP4	5587d816d9f9f241d2c2cb582decd808ef3fb8a125609278965a870719fc509d
GeForce RTX 5090 Laptop (Beta)	TensorRT	768-1344x768-1344	base+canny+depth	FP4	62ca080d1980cbb17478cfb7c7bc48031a8d931ad8d4a70ca3971da545160282
GeForce RTX 5080 (Beta)	TensorRT	768-1344x768-1344	base	FP4	365d6883d978bb2f2c00f5af2678115e0d92c2d09f1fe4f8bcdd813b8d731a5f
GeForce RTX 5080 (Beta)	TensorRT	768-1344x768-1344	canny	FP4	36c44753a9a188e8a36e717c4cd2d08c7c8cc4281f59c750cfda49bd9e72a0bf
GeForce RTX 5080 (Beta)	TensorRT	768-1344x768-1344	depth	FP4	387b0d749f1f6c39f7dd9b57e1e6872f809c6bf0422c71cda164be32c0fb7d79
GeForce RTX 5080 (Beta)	TensorRT	768-1344x768-1344	base+canny+depth	FP4	b454d497c90956b1bf546720c1df00c1888865050d72290191f36ada319ecc6c
GeForce RTX 5080 Laptop (Beta)	TensorRT	768-1344x768-1344	base	FP4	9ad7ffac9b8260d15ab286637d444363a6899159e903e9cce3594a58be1489f9
GeForce RTX 5080 Laptop (Beta)	TensorRT	768-1344x768-1344	canny	FP4	c8cfa63ee8cba592b3f52edefa18a5fda9e8f512ee3da8bc938a90336a0e75ea
GeForce RTX 5080 Laptop (Beta)	TensorRT	768-1344x768-1344	depth	FP4	3bdeb471bb31950b3a7a759b5dea3aeb80083fd328a2cee445463fcf79141373
GeForce RTX 5080 Laptop (Beta)	TensorRT	768-1344x768-1344	base+canny+depth	FP4	b912befc35951aa88e450d4b0ff7ec9576688c44f434cec624d814f954b16c10
GeForce RTX 5070TI (Beta)	TensorRT	768-1344x768-1344	base	FP4	34f18766736a8842a8248cffc18881bf850d04698ab1e25fa7e9fc65fae82688
GeForce RTX 5070TI (Beta)	TensorRT	768-1344x768-1344	canny	FP4	d65f03f4d849fd152ce78961bec9868652db607f7e7f8d02eeea68de9e964cfc
GeForce RTX 5070TI (Beta)	TensorRT	768-1344x768-1344	depth	FP4	48e32cc14e07205437fa4484893e017fe6ce7149de6ef3b935e61482cc43d3e7
GeForce RTX 5070TI (Beta)	TensorRT	768-1344x768-1344	base+canny+depth	FP4	094a53dd3d6b4a67e8ba8b215f996acb0f0114afc8b1a2503068ebd7e2dc4b67
GeForce RTX 4090 (Beta)	TensorRT	768-1344x768-1344	base	FP8	9a5a75332be4d894e9d663e76a90c7fbcd8eef45b98e0ad575ff7eac581ec89c
GeForce RTX 4090 (Beta)	TensorRT	768-1344x768-1344	canny	FP8	1968070ac84e762080833782908552598dfbb2e81c3d8997c4feecefe605e57f
GeForce RTX 4090 (Beta)	TensorRT	768-1344x768-1344	depth	FP8	73d59547d8a73efc756b3e8fabedc5b129f6986e811e88d2f025dc5af4b084fe
GeForce RTX 4090 (Beta)	TensorRT	768-1344x768-1344	base+canny+depth	FP8	c556479d91c84a3f6df68bbdc2e13edaab457cd38a5d65a16c154059609c2095
GeForce RTX 4090 Laptop (Beta)	TensorRT	768-1344x768-1344	base	FP8	93bd95c143ef54b2ad47c47ac3d5742f9fff5ff80baa8228ce19a8577a75ebc8
GeForce RTX 4090 Laptop (Beta)	TensorRT	768-1344x768-1344	canny	FP8	75af532c3d833d82ad27fab8bb190f60fbb3a91b0cf70bea33d294a7c8ce5baf
GeForce RTX 4090 Laptop (Beta)	TensorRT	768-1344x768-1344	depth	FP8	bdf9998149e94cdaf5221aa9baebc30f27449925da8eac6d1508cd945cdb643a
GeForce RTX 4090 Laptop (Beta)	TensorRT	768-1344x768-1344	base+canny+depth	FP8	61070e912036a7a9140d5e64126bd623522293ea3076c2f167d963a94c863b13
GeForce RTX 4080 (Beta)	TensorRT	768-1344x768-1344	base	FP8	96295541130ed46e3de0b25d1a95e7409784bb19da7b4b97cb586eac4e4ab778
GeForce RTX 4080 (Beta)	TensorRT	768-1344x768-1344	canny	FP8	842dee095df8d6ad5a2b8678605e677fea46882ff1eba1ecde76a186e8b0d1c5
GeForce RTX 4080 (Beta)	TensorRT	768-1344x768-1344	depth	FP8	b490222762872588294023feaecc384bbba054ae06256abd7b166d5e007cb764
GeForce RTX 4080 (Beta)	TensorRT	768-1344x768-1344	base+canny+depth	FP8	2a0ff19006f215b4dbb2266240c12d91ac6a005c402124c3ca3916141096fd0a
NVIDIA RTX 6000 Ada Generation (Beta)	TensorRT	768-1344x768-1344	base	FP8	9ed3f545f2316939af1984fb703115e9b706f8c7e9b4eb452f37f86a06df5bbc
NVIDIA RTX 6000 Ada Generation (Beta)	TensorRT	768-1344x768-1344	canny	FP8	cc19715f2bd209a45773ec4131c346b4c88b44d3e8f67145e719d63f6bf512d4
NVIDIA RTX 6000 Ada Generation (Beta)	TensorRT	768-1344x768-1344	depth	FP8	a02d1b01eb43980224ebc91a471d415be2886849bce69374e9c2a63289d8debe
NVIDIA RTX 6000 Ada Generation (Beta)	TensorRT	768-1344x768-1344	base+canny+depth	FP8	cf766e0c4e718cccf1e771e27d7bb8181120ea21219533f8d9d166f1df1bbedd
GeForce RTX 5090D (Beta)	TensorRT	768-1344x768-1344	base	FP4	c6d1fad563e06a49946adfa773b9117b0485ec7cd0640386f0a5884bb350a51a
GeForce RTX 5090D (Beta)	TensorRT	768-1344x768-1344	canny	FP4	a1f563c2ce47feeff632d0306083ad45e05d268cffb080a34caf5f2ed14ebbcc
GeForce RTX 5090D (Beta)	TensorRT	768-1344x768-1344	depth	FP4	9b45f1c8bb44d13e6d6067799e90f472001845bd76bbe4da9669214deda62eda
GeForce RTX 5090D (Beta)	TensorRT	768-1344x768-1344	base+canny+depth	FP4	d4ffdd037cbdb279689bc6f5cd969de4cdf2e63b47edc055413b759cc25bdcff
GeForce RTX 4090D (Beta)	TensorRT	768-1344x768-1344	base	FP8	ea9115a32e460d58aa89e79baee8fa1668305d5a74558d81ebfddb41a2fb3c28
GeForce RTX 4090D (Beta)	TensorRT	768-1344x768-1344	canny	FP8	9cb28f018271bcb561d19c9d95363bcbe2581c1a4af618e7256f37e930b9e034
GeForce RTX 4090D (Beta)	TensorRT	768-1344x768-1344	depth	FP8	cc002253373b12b9a6e43172f61732187cf5b4665be018acfd93150b14f0d50b
GeForce RTX 4090D (Beta)	TensorRT	768-1344x768-1344	base+canny+depth	FP8	7368ed71d08315b9b8ee7fccfe0606a9755c03fd8043551d1c30d8a06207d1f9

If your GPU model is not listed, you can use the following generic model profiles with the PyTorch backend, or create a model profile for your GPU using these instructions. PyTorch checkpoints are not quantized, so they consume more GPU memory than the FP4 and FP8 TensorRT profiles.

GPU	Backend	Resolution	Variant	Precision	Model Profile ID
Generic	PyTorch	768-1344x768-1344	base	BF16	f0d0d4ac2ea5b121defa3e82a1fe82f289856cf5db49aa99e670e8851d8f0305
Generic	PyTorch	768-1344x768-1344	canny	BF16	351a04dd6ca4e445f1ae4fe0da0190133c79ed4eedd2965e5da41cbb2b48826c
Generic	PyTorch	768-1344x768-1344	depth	BF16	7280cf728c45505c1a8def558d9c18534096c0fe9a976b138818e31b33e859b7
Generic	PyTorch	768-1344x768-1344	base+canny+depth	BF16	f02c296542632aef64d11cbb13026c2502da2c290cc5b05f507a4922eedd1dda
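The selection rule described above (use the GPU-specific TensorRT profile when one is listed, otherwise fall back to the generic PyTorch profile) can be sketched as a table lookup. The names `PROFILES` and `select_profile` are hypothetical. To keep the example short, the dictionary holds only the GeForce RTX 4090 and Generic rows copied from the tables above; a complete version would include every row.

```python
# (GPU, variant) -> (backend, precision, profile ID).
# Subset of the tables above: GeForce RTX 4090 and Generic rows only.
PROFILES = {
    ("GeForce RTX 4090", "base"): (
        "TensorRT", "FP8",
        "9a5a75332be4d894e9d663e76a90c7fbcd8eef45b98e0ad575ff7eac581ec89c"),
    ("GeForce RTX 4090", "canny"): (
        "TensorRT", "FP8",
        "1968070ac84e762080833782908552598dfbb2e81c3d8997c4feecefe605e57f"),
    ("GeForce RTX 4090", "depth"): (
        "TensorRT", "FP8",
        "73d59547d8a73efc756b3e8fabedc5b129f6986e811e88d2f025dc5af4b084fe"),
    ("GeForce RTX 4090", "base+canny+depth"): (
        "TensorRT", "FP8",
        "c556479d91c84a3f6df68bbdc2e13edaab457cd38a5d65a16c154059609c2095"),
    ("Generic", "base"): (
        "PyTorch", "BF16",
        "f0d0d4ac2ea5b121defa3e82a1fe82f289856cf5db49aa99e670e8851d8f0305"),
    ("Generic", "canny"): (
        "PyTorch", "BF16",
        "351a04dd6ca4e445f1ae4fe0da0190133c79ed4eedd2965e5da41cbb2b48826c"),
    ("Generic", "depth"): (
        "PyTorch", "BF16",
        "7280cf728c45505c1a8def558d9c18534096c0fe9a976b138818e31b33e859b7"),
    ("Generic", "base+canny+depth"): (
        "PyTorch", "BF16",
        "f02c296542632aef64d11cbb13026c2502da2c290cc5b05f507a4922eedd1dda"),
}


def select_profile(gpu: str, variant: str) -> tuple[str, str, str]:
    """Return (backend, precision, profile_id) for a GPU and variant.

    Falls back to the Generic PyTorch profile when the GPU is not listed.
    """
    if (gpu, variant) in PROFILES:
        return PROFILES[(gpu, variant)]
    if ("Generic", variant) in PROFILES:
        return PROFILES[("Generic", variant)]
    raise KeyError(f"no profile for variant {variant!r}")


if __name__ == "__main__":
    # Listed GPU: resolves to the TensorRT FP8 profile.
    print(select_profile("GeForce RTX 4090", "depth"))
    # Unlisted GPU: resolves to the generic PyTorch BF16 profile.
    print(select_profile("Some Other GPU", "depth"))
```

This page does not say how a profile ID is passed to the container. A common approach, assumed here rather than confirmed by this section, is an environment variable such as `NIM_MODEL_PROFILE`; check the NIM configuration documentation before relying on it.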