Falcon

Released in April 2023, TII’s Falcon is an Apache 2.0 licensed model built on the transformer decoder architecture, with key adjustments such as multi-group attention, rotary position embeddings (RoPE), parallel attention and MLP blocks, and the removal of bias terms from linear layers. More information is available in the companion paper, “The Falcon Series of Open Language Models”. Falcon is offered in 7B, 40B, and 180B parameter sizes.
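The multi-group attention and parallel attention/MLP blocks mentioned above can be illustrated with a minimal pure-Python sketch. It is not the NeMo or TII implementation. The function names, the contiguous head-to-group assignment, the single shared normalization, and the toy head counts are all assumptions chosen for illustration.

```python
from typing import Callable, List

Vector = List[float]
Layer = Callable[[Vector], Vector]


def kv_group_for_head(head: int, n_heads: int, n_kv_groups: int) -> int:
    """Return the key/value group that a query head reads from.

    Assumes contiguous blocks of query heads share one key/value group
    (hypothetical layout for illustration). n_kv_groups == 1 corresponds
    to multi-query attention; n_kv_groups == n_heads corresponds to
    ordinary multi-head attention.
    """
    if n_heads % n_kv_groups != 0:
        raise ValueError("n_heads must be divisible by n_kv_groups")
    if not 0 <= head < n_heads:
        raise ValueError("head index out of range")
    heads_per_group = n_heads // n_kv_groups
    return head // heads_per_group


def parallel_block(x: Vector, attn: Layer, mlp: Layer, norm: Layer) -> Vector:
    """Parallel form: attention and MLP both read the same normalized input,
    and both outputs are added to the residual in a single step."""
    h = norm(x)
    a = attn(h)
    m = mlp(h)
    return [xi + ai + mi for xi, ai, mi in zip(x, a, m)]


def sequential_block(x: Vector, attn: Layer, mlp: Layer, norm: Layer) -> Vector:
    """Conventional form, shown for contrast: the MLP reads the output
    of the attention sub-layer instead of the block input."""
    h = [xi + ai for xi, ai in zip(x, attn(norm(x)))]
    return [hi + mi for hi, mi in zip(h, mlp(norm(h)))]


if __name__ == "__main__":
    # Toy sizes, not an actual Falcon configuration.
    print([kv_group_for_head(h, n_heads=8, n_kv_groups=2) for h in range(8)])
```

With 8 query heads and 2 key/value groups, heads 0 to 3 share group 0 and heads 4 to 7 share group 1, so only two sets of keys and values need to be computed and cached. In the parallel block, the attention and MLP calls do not depend on each other, which is what allows them to be evaluated side by side.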

Feature	Status
Data parallelism	✓
Tensor parallelism	✓
Pipeline parallelism	✓
Interleaved Pipeline Parallelism Schedule	N/A
Sequence parallelism	✓
Selective activation checkpointing	✓
Gradient checkpointing	✓
Partial gradient checkpointing	✓
FP32/TF32	✓
AMP/FP16	✗
BF16	✓
TransformerEngine/FP8	✗
Multi-GPU	✓
Multi-Node	✓
Inference	N/A
Slurm	✓
Base Command Manager	✓
Base Command Platform	✓
Distributed data preprocessing	✓
NVfuser	✗
P-Tuning and Prompt Tuning	✓
IA3 and Adapter learning	✓
Distributed Optimizer	✓
Distributed Checkpoint	✓
Fully Sharded Data Parallel	N/A
