Ministral3 (Bidirectional) for Embedding


NeMo AutoModel provides a bidirectional variant of Mistral AI’s Ministral3 for embedding and dense retrieval tasks. Unlike the standard causal (left-to-right) Ministral3 used for text generation, this variant uses bidirectional attention, so each token can attend to both past and future tokens in the sequence, producing richer representations for semantic similarity and dense retrieval.
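The difference between the two attention patterns can be illustrated with a toy mask. This is a minimal sketch in plain Python, not NeMo AutoModel code: the helper names `causal_mask`, `bidirectional_mask`, and `visible_positions` are hypothetical, and padding is ignored for simplicity.

```python
def causal_mask(n):
    """Causal attention: position i may attend only to positions j <= i."""
    return [[j <= i for j in range(n)] for i in range(n)]


def bidirectional_mask(n):
    """Bidirectional attention: every position may attend to every position."""
    return [[True] * n for _ in range(n)]


def visible_positions(mask, i):
    """Return the indices that position i is allowed to attend to."""
    return [j for j, allowed in enumerate(mask[i]) if allowed]


tokens = ["dense", "retrieval", "works"]
causal = causal_mask(len(tokens))
bidir = bidirectional_mask(len(tokens))

# Under causal attention the first token sees only itself.
print(visible_positions(causal, 0))  # [0]
# Under bidirectional attention it also sees the tokens that follow it.
print(visible_positions(bidir, 0))   # [0, 1, 2]
```

In a real batch, padding positions would still be masked out in both cases; only the left-to-right restriction is lifted.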

The bidirectional encoder can be loaded directly from text-only checkpoints (e.g. mistralai/Ministral-3B-Instruct). When given a Ministral3 VLM checkpoint (e.g. mistralai/Ministral-3-3B-Base-2512 or mistralai/Ministral-3-3B-Instruct-2512), it automatically extracts the language model.

Tasks: Embedding, Dense Retrieval
Architecture: Ministral3BidirectionalModel
Parameters: 3B
HF Org: mistralai

Available Models

Any Ministral3 checkpoint can be loaded as a bidirectional backbone. The following configurations are tested:

  • Ministral-3-3B-Base-2512 — VLM checkpoint, language model is extracted automatically
  • Ministral-3-3B-Instruct-2512 — VLM checkpoint, language model is extracted automatically

Embedding Models

The bidirectional bi-encoder path is used for embedding generation and dense retrieval.

| Architecture | Task | Auto Class | Description |
| --- | --- | --- | --- |
| Ministral3BidirectionalModel | Embedding | NeMoAutoModelBiEncoder | Bidirectional Ministral3 with mean pooling for dense embeddings |
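Once queries and documents have been embedded, dense retrieval typically reduces to ranking documents by vector similarity. The sketch below assumes cosine similarity as the scoring function and uses hand-written two-dimensional vectors in place of real model outputs; the function names `cosine` and `rank` are hypothetical and not part of NeMo AutoModel.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank(query, docs):
    """Return document indices ordered from most to least similar."""
    scored = [(cosine(query, doc), idx) for idx, doc in enumerate(docs)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [idx for _, idx in scored]


# Toy embeddings standing in for bi-encoder outputs (illustrative values).
query = [1.0, 0.0]
docs = [
    [0.0, 1.0],   # orthogonal to the query
    [0.9, 0.1],   # nearly parallel to the query
    [-1.0, 0.0],  # opposite direction
]

print(rank(query, docs))  # [1, 0, 2]
```

A bi-encoder embeds queries and documents independently, so document vectors can be computed once and reused across queries.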

Pooling Strategies

The bi-encoder supports multiple pooling strategies to aggregate token representations into a single embedding vector:

| Strategy | Description |
| --- | --- |
| avg | Average of all token hidden states (default) |
| cls | First token hidden state |
| last | Last non-padding token hidden state |
| weighted_avg | Weighted average of token hidden states |
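The four strategies can be sketched on a single toy sequence. This is an illustrative plain-Python version, not the library's implementation: the function name `pool` is hypothetical, the sequence is assumed to be right-padded, averaging is assumed to skip padding tokens, and the `weighted_avg` weights are assumed to grow linearly with position, since the exact weighting scheme is not specified here.

```python
def pool(hidden, mask, strategy="avg"):
    """Collapse per-token vectors into one embedding.

    hidden: list of token vectors (one list of floats per token)
    mask:   1 for real tokens, 0 for padding (right padding assumed)
    """
    dim = len(hidden[0])
    real = [i for i, m in enumerate(mask) if m]

    if strategy == "cls":
        return list(hidden[0])          # first token
    if strategy == "last":
        return list(hidden[real[-1]])   # last non-padding token

    if strategy == "avg":
        weights = {i: 1.0 for i in real}
    elif strategy == "weighted_avg":
        # Assumption: linear position weights 1, 2, 3, ... over real tokens.
        weights = {i: float(rank + 1) for rank, i in enumerate(real)}
    else:
        raise ValueError(f"unknown pooling strategy: {strategy}")

    total = sum(weights.values())
    return [sum(weights[i] * hidden[i][d] for i in real) / total
            for d in range(dim)]


# Three real tokens followed by one padding token.
hidden = [[1.0, 0.0], [3.0, 2.0], [5.0, 4.0], [9.0, 9.0]]
mask = [1, 1, 1, 0]

print(pool(hidden, mask, "avg"))   # [3.0, 2.0]
print(pool(hidden, mask, "cls"))   # [1.0, 0.0]
print(pool(hidden, mask, "last"))  # [5.0, 4.0]
print(pool(hidden, mask, "weighted_avg"))
```

With linear weights, later tokens contribute more to the result, so the `weighted_avg` output here sits between the `avg` and `last` vectors.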

Example HF Models

| Model | HF ID |
| --- | --- |
| Ministral-3 3B Base | mistralai/Ministral-3-3B-Base-2512 |
| Ministral-3 3B Instruct | mistralai/Ministral-3-3B-Instruct-2512 |

Try with NeMo AutoModel

1. Install NeMo AutoModel. Refer to the Installation Guide for details:

$ uv pip install nemo-automodel

2. Clone the repo to get the example recipes:

$ git clone https://github.com/NVIDIA-NeMo/Automodel.git
$ cd Automodel

3. Run the recipe from inside the repo (point any Llama bi-encoder recipe at a Ministral3 checkpoint, or write a recipe targeting mistralai/Ministral-3-3B-Base-2512):

$ torchrun --nproc-per-node=8 examples/retrieval/bi_encoder/finetune.py --config examples/retrieval/bi_encoder/llama3_2_1b.yaml
$ torchrun --nproc-per-node=8 examples/retrieval/bi_encoder/finetune.py --config examples/retrieval/bi_encoder/ministral3_3b_instruct.yaml


Hugging Face Model Cards