Phi-3 / Phi-4
Phi-3 and Phi-4 are Microsoft’s small language model families. Both use the same decoder-only transformer architecture (Phi3ForCausalLM). Phi-4-mini and Phi-4 achieve strong benchmark results at 3.8B and 14B parameters, respectively.
Available Models
- Phi-4: 14B
- Phi-4-mini-instruct: 3.8B
- Phi-3.5-mini-instruct: 3.8B
- Phi-3-medium-128k-instruct: 14B
- Phi-3-mini-128k-instruct: 3.8B
- Phi-3-mini-4k-instruct: 3.8B
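As a rough illustration of what the parameter counts listed above mean in practice, the sketch below estimates the memory needed to hold the weights alone. It assumes 2 bytes per parameter for bf16/fp16 and 4 bytes for fp32. It ignores activations, optimizer state, and KV cache, so real requirements are higher. The helper name `weight_memory_gb` is illustrative and not part of any library.

```python
# Parameter counts in billions, taken from the model list above.
PARAMS_BILLIONS = {
    "Phi-4": 14.0,
    "Phi-4-mini-instruct": 3.8,
    "Phi-3.5-mini-instruct": 3.8,
    "Phi-3-medium-128k-instruct": 14.0,
    "Phi-3-mini-128k-instruct": 3.8,
    "Phi-3-mini-4k-instruct": 3.8,
}

# Assumed storage size per parameter for common dtypes.
BYTES_PER_PARAM = {"fp32": 4, "bf16": 2, "fp16": 2}


def weight_memory_gb(model_name: str, dtype: str = "bf16") -> float:
    """Approximate memory (in GB, 1 GB = 1e9 bytes) for the weights only."""
    params = PARAMS_BILLIONS[model_name] * 1e9
    return params * BYTES_PER_PARAM[dtype] / 1e9


if __name__ == "__main__":
    for name in PARAMS_BILLIONS:
        print(f"{name}: ~{weight_memory_gb(name):.1f} GB of weights in bf16")
```

Under these assumptions the 3.8B models need about 7.6 GB for bf16 weights and the 14B models about 28 GB. This is a lower bound for inference and well below what full fine-tuning requires.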
Architecture
Phi3ForCausalLM
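One way to confirm that a checkpoint uses this architecture is to inspect the `architectures` field of its Hugging Face `config.json`. The sketch below uses only the standard library. The sample config string is a hand-written excerpt for illustration, not copied from a specific checkpoint, and the helper names are hypothetical.

```python
import json


def declared_architectures(config_text: str) -> list:
    """Return the architecture class names declared in a config.json string."""
    config = json.loads(config_text)
    return list(config.get("architectures", []))


def is_phi3_causal_lm(config_text: str) -> bool:
    """True if the config declares the Phi3ForCausalLM architecture."""
    return "Phi3ForCausalLM" in declared_architectures(config_text)


# Hand-written excerpt for illustration; a real config.json has many more keys.
SAMPLE_CONFIG = '{"architectures": ["Phi3ForCausalLM"], "model_type": "phi3"}'

if __name__ == "__main__":
    print(is_phi3_causal_lm(SAMPLE_CONFIG))
```

In practice, you would pass the contents of the `config.json` file from the downloaded checkpoint directory in place of `SAMPLE_CONFIG`.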
Example HF Models
Example Recipes
Try with NeMo AutoModel
1. Install (full instructions):
2. Clone the repo to get the example recipes:
3. Run the recipe from inside the repo:
Run with Docker
1. Pull the container and mount a checkpoint directory:
2. Navigate to the AutoModel directory (where the recipes are):
3. Run the recipe:
See the Installation Guide and LLM Fine-Tuning Guide.
Fine-Tuning
See the LLM Fine-Tuning Guide.