# Mistral Models

This page provides detailed technical specifications for the Mistral model family supported by NeMo Customizer. For information about supported features and capabilities, refer to Tested Models.

## Mistral-7B-Instruct-v0.3

| Property | Value |
|---|---|
| Creator | Mistral AI |
| Architecture | transformer |
| Description | Mistral-7B-Instruct-v0.3 is an instruction-tuned model optimized for dialogue and instruction-following tasks. |
| Max I/O Tokens | 4096 |
| Parameters | 7 billion |
| Training Data | Not specified |
| Default Name | `mistralai/Mistral-7B-Instruct-v0.3` |
| HuggingFace | `mistralai/Mistral-7B-Instruct-v0.3` |

### Training Options

- LoRA: 1x 80GB GPU, tensor parallel size 1
- Full SFT: 1x 80GB GPU, tensor parallel size 1
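
The options above can be sketched as a customization job request. This is an illustrative assumption, not a verified API call: the field names (`config`, `dataset`, `hyperparameters`, `finetuning_type`), the dataset name, and the endpoint in the comment are all placeholders you should check against your NeMo Customizer release.

```python
def build_lora_job(model: str, dataset: str) -> dict:
    """Assemble a LoRA fine-tuning request body (field names are assumptions)."""
    return {
        "config": model,                   # default model name from the table above
        "dataset": {"name": dataset},      # hypothetical dataset reference
        "hyperparameters": {
            "training_type": "sft",
            "finetuning_type": "lora",     # swap for full-weight SFT if desired
            "epochs": 2,
            "lora": {"adapter_dim": 16},
        },
    }

job = build_lora_job("mistralai/Mistral-7B-Instruct-v0.3", "my-dataset")
# POST this body to your Customizer jobs endpoint, e.g.:
# requests.post(f"{CUSTOMIZER_URL}/v1/customization/jobs", json=job)
```

Either training option fits on a single 80 GB GPU for this model, so the job body does not need any parallelism overrides.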

### Deployment Configuration

- LoRA:
  - NIM Image: `nvcr.io/nim/nvidia/llm-nim:1.15.5`
  - GPU Count: 1x 80GB
- Full SFT:
  - NIM Image: `nvcr.io/nim/nvidia/llm-nim:1.15.5`
  - GPU Count: 1x 80GB
  - Additional Environment Variables:
    - `NIM_MODEL_PROFILE`: `vllm`
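
As a minimal sketch, the deployment settings above can be expressed as a small spec builder. The key names (`image`, `gpu`, `env`) are illustrative assumptions about the shape of a deployment spec; only the image tag, GPU count, and environment variable values come from the list above.

```python
def build_deployment_config(full_sft: bool) -> dict:
    """Sketch a NIM deployment spec for Mistral-7B-Instruct-v0.3.

    Key names are illustrative assumptions; values come from the
    Deployment Configuration list above.
    """
    config = {
        "image": "nvcr.io/nim/nvidia/llm-nim:1.15.5",  # NIM image for both variants
        "gpu": 1,                                      # 1x 80GB GPU in both cases
        "env": {},
    }
    if full_sft:
        # Full-SFT checkpoints additionally require the vLLM model profile.
        config["env"]["NIM_MODEL_PROFILE"] = "vllm"
    return config
```

The only difference between the two variants is the extra `NIM_MODEL_PROFILE` environment variable for full-SFT checkpoints; LoRA adapters deploy with the same image and GPU count but no profile override.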

## Ministral-3-3B-Instruct-2512

| Property | Value |
|---|---|
| Creator | Mistral AI |
| Architecture | transformer |
| Description | Ministral-3-3B-Instruct-2512 is a compact instruction-tuned model from Mistral AI designed for efficient deployment. |
| Max I/O Tokens | 4096 |
| Parameters | 3 billion |
| Training Data | Not specified |
| Default Name | `mistralai/Ministral-3-3B-Instruct-2512` |
| HuggingFace | `mistralai/Ministral-3-3B-Instruct-2512` |

### Training Options

- LoRA: 1x 80GB GPU, tensor parallel size 1
- Full SFT: 2x 80GB GPU, tensor parallel size 1

Note: Deployment using NIM is not supported for this model.

## Ministral-3-3B-Reasoning-2512

| Property | Value |
|---|---|
| Creator | Mistral AI |
| Architecture | transformer |
| Description | Ministral-3-3B-Reasoning-2512 is a compact model from Mistral AI optimized for reasoning tasks. |
| Max I/O Tokens | 4096 |
| Parameters | 3 billion |
| Training Data | Not specified |
| Default Name | `mistralai/Ministral-3-3B-Reasoning-2512` |
| HuggingFace | `mistralai/Ministral-3-3B-Reasoning-2512` |

### Training Options

- LoRA: 1x 80GB GPU, tensor parallel size 1
- Full SFT: 2x 80GB GPU, tensor parallel size 1

Note: Deployment using NIM is not supported for this model.