# Mistral Models
This page provides detailed technical specifications for the Mistral model family supported by NeMo Customizer. For information about supported features and capabilities, refer to Tested Models.
## Mistral-7B-Instruct-v0.3
| Property | Value |
|---|---|
| Creator | Mistral AI |
| Architecture | transformer |
| Description | Mistral-7B-Instruct-v0.3 is an instruction-tuned model optimized for dialogue and instruction-following tasks. |
| Max I/O Tokens | 4096 |
| Parameters | 7 billion |
| Training Data | Not specified |
| Default Name | `mistralai/Mistral-7B-Instruct-v0.3` |
| HuggingFace | |
### Training Options
- LoRA: 1x 80GB GPU, tensor parallel size 1
- Full SFT: 1x 80GB GPU, tensor parallel size 1
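The training options above correspond to hyperparameters on a Customizer job request. The following is a minimal sketch of starting a LoRA job; the endpoint path, config name, dataset fields, and hyperparameter names are assumptions based on the general NeMo Customizer API pattern, so verify them against the API reference for your deployment:

```shell
# Sketch only: CUSTOMIZER_URL, the config identifier, the dataset name, and the
# hyperparameter fields below are illustrative assumptions -- substitute values
# reported by your own Customizer service.
curl "${CUSTOMIZER_URL}/v1/customization/jobs" \
  -H "Content-Type: application/json" \
  -d '{
        "config": "mistralai/mistral-7b-instruct-v0.3",
        "dataset": {"name": "my-dataset", "namespace": "default"},
        "hyperparameters": {
          "training_type": "sft",
          "finetuning_type": "lora",
          "epochs": 2,
          "lora": {"adapter_dim": 16}
        }
      }'
```

A full-SFT job would use `"finetuning_type": "all_weights"`-style settings instead of the `lora` block; consult the Customizer hyperparameter documentation for the exact field names.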
### Deployment Configuration
- LoRA:
  - NIM Image: `nvcr.io/nim/nvidia/llm-nim:1.15.5`
  - GPU Count: 1x 80GB
- Full SFT:
  - NIM Image: `nvcr.io/nim/nvidia/llm-nim:1.15.5`
  - GPU Count: 1x 80GB
  - Additional Environment Variables: `NIM_MODEL_PROFILE: vllm`
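The deployment settings above map onto a container launch. The following is a minimal sketch for the full-SFT case, assuming Docker, a valid `NGC_API_KEY` in the environment, and a single 80GB GPU; the volume mount and port are illustrative defaults, so check the NIM documentation for this image before relying on them:

```shell
# Sketch: launch the NIM container with the vLLM profile requested above.
# The cache path and published port are assumptions for illustration.
docker run --rm --gpus '"device=0"' \
  -e NGC_API_KEY \
  -e NIM_MODEL_PROFILE=vllm \
  -v "$HOME/.cache/nim:/opt/nim/.cache" \
  -p 8000:8000 \
  nvcr.io/nim/nvidia/llm-nim:1.15.5
```

For the LoRA deployment, the same image and GPU count apply without the `NIM_MODEL_PROFILE` variable.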
## Ministral-3-3B-Instruct-2512
| Property | Value |
|---|---|
| Creator | Mistral AI |
| Architecture | transformer |
| Description | Ministral-3-3B-Instruct-2512 is a compact instruction-tuned model from Mistral AI designed for efficient deployment. |
| Max I/O Tokens | 4096 |
| Parameters | 3 billion |
| Training Data | Not specified |
| Default Name | `mistralai/Ministral-3-3B-Instruct-2512` |
| HuggingFace | |
### Training Options
- LoRA: 1x 80GB GPU, tensor parallel size 1
- Full SFT: 2x 80GB GPU, tensor parallel size 1
> **Note:** Deployment using NIM is not supported for this model.
## Ministral-3-3B-Reasoning-2512
| Property | Value |
|---|---|
| Creator | Mistral AI |
| Architecture | transformer |
| Description | Ministral-3-3B-Reasoning-2512 is a compact model from Mistral AI optimized for reasoning tasks. |
| Max I/O Tokens | 4096 |
| Parameters | 3 billion |
| Training Data | Not specified |
| Default Name | `mistralai/Ministral-3-3B-Reasoning-2512` |
| HuggingFace | |
### Training Options
- LoRA: 1x 80GB GPU, tensor parallel size 1
- Full SFT: 2x 80GB GPU, tensor parallel size 1
> **Note:** Deployment using NIM is not supported for this model.