bridge.recipes.nemotron_omni.nemotron_omni
Nemotron Omni SFT/PEFT recipes (CORD v2 VL, Valor32k-AVQA audio-visual, temporal video).
All recipes use nemotron_omni_step (pass --step_func nemotron_omni_step).
Module Contents
Functions
nemotron_omni_cord_v2_sft_config | Return a VL SFT config for Nemotron Omni on CORD v2.
nemotron_omni_cord_v2_peft_config | Return a LoRA PEFT config for Nemotron Omni on CORD v2.
_nemotron_omni_base_config | Shared model/training config for all Nemotron Omni recipes.
nemotron_omni_valor32k_sft_config | Return an Energon SFT config with temporal video embedder enabled.
nemotron_omni_valor32k_peft_config | LoRA PEFT recipe on temporal-video Energon path (temporal_patch_dim=2).
Data
API
- bridge.recipes.nemotron_omni.nemotron_omni._DEFAULT_HF_PATH#
'nvidia/Nemotron-3-Nano-Omni-30B-A3B-Reasoning-BF16'
- bridge.recipes.nemotron_omni.nemotron_omni.nemotron_omni_cord_v2_sft_config(
- hf_path: str = _DEFAULT_HF_PATH,
Return a VL SFT config for Nemotron Omni on CORD v2.
Vision-language fine-tuning on the CORD v2 receipt-parsing dataset. Default configuration: 1 node, 8 GPUs (TP=4). Uses nemotron_omni_step (pass --step_func nemotron_omni_step).
- Parameters:
hf_path – HuggingFace model ID or local path to the Nemotron Omni model.
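The default layout above (1 node, 8 GPUs, TP=4) implies how many data-parallel replicas remain. The following is a minimal arithmetic sketch, not part of the recipe module; it assumes pipeline and context parallelism are both 1, which this page does not state, and the helper name `data_parallel_size` is hypothetical.

```python
def data_parallel_size(num_nodes: int, gpus_per_node: int, tp: int,
                       pp: int = 1, cp: int = 1) -> int:
    """Return the number of data-parallel replicas left after model parallelism.

    pp (pipeline parallel) and cp (context parallel) default to 1; that is an
    assumption for illustration, not something this page documents.
    """
    world_size = num_nodes * gpus_per_node
    model_parallel = tp * pp * cp
    if world_size % model_parallel != 0:
        raise ValueError(
            f"world size {world_size} is not divisible by tp*pp*cp={model_parallel}"
        )
    return world_size // model_parallel


# Documented default: 1 node, 8 GPUs, tensor parallel size 4.
dp = data_parallel_size(num_nodes=1, gpus_per_node=8, tp=4)
print(dp)  # 2
```

Under these assumptions each optimizer step sees two model replicas, so the global batch size would need to be a multiple of 2 times the micro-batch size.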
- bridge.recipes.nemotron_omni.nemotron_omni.nemotron_omni_cord_v2_peft_config(
- hf_path: str = _DEFAULT_HF_PATH,
Return a LoRA PEFT config for Nemotron Omni on CORD v2.
LoRA adapters are applied to the language-model attention and Mamba projections. The vision encoder/projection and sound encoder/projection are frozen. Default configuration: 1 node, 8 GPUs (TP=4). Uses nemotron_omni_step (pass --step_func nemotron_omni_step).
- Parameters:
hf_path – HuggingFace model ID or local path to the Nemotron Omni model.
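The split described above (adapters on language-model attention and Mamba projections, vision and sound towers frozen) can be pictured as a name-based filter. This is only an illustrative sketch: every module name, prefix, and suffix below is hypothetical and chosen for the example, and the recipe's actual target-module list may differ.

```python
# Hypothetical prefixes for the frozen vision/sound components.
FROZEN_PREFIXES = (
    "vision_model.",
    "vision_projection.",
    "sound_model.",
    "sound_projection.",
)
# Hypothetical suffixes for attention and Mamba projection layers.
LORA_SUFFIXES = ("linear_qkv", "linear_proj", "in_proj", "out_proj")


def is_lora_target(module_name: str) -> bool:
    """Return True if a module would receive a LoRA adapter in this sketch."""
    if module_name.startswith(FROZEN_PREFIXES):
        return False
    return module_name.startswith("language_model.") and module_name.endswith(
        LORA_SUFFIXES
    )


# Made-up module names, for illustration only.
modules = [
    "language_model.decoder.layers.0.self_attention.linear_qkv",
    "language_model.decoder.layers.1.mixer.in_proj",
    "language_model.decoder.layers.2.mlp.linear_fc1",
    "vision_model.encoder.layers.0.self_attention.linear_qkv",
    "sound_projection.linear_fc1",
]
targets = [name for name in modules if is_lora_target(name)]
for name in targets:
    print(name)
```

In this sketch only the first two names are selected; the MLP layer and everything under the vision or sound prefixes are left without adapters.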
- bridge.recipes.nemotron_omni.nemotron_omni._nemotron_omni_base_config(
- hf_path: str = _DEFAULT_HF_PATH,
Shared model/training config for all Nemotron Omni recipes.
- bridge.recipes.nemotron_omni.nemotron_omni.nemotron_omni_valor32k_sft_config(
- hf_path: str = _DEFAULT_HF_PATH,
Return an Energon SFT config with temporal video embedder enabled.
Uses RADIO's separate_video_embedder to fuse temporal frame pairs (2 consecutive frames → 1 vision embedding) instead of discarding every other frame. Requires dynamic_resolution=True. The shard path must be set via CLI override: dataset.path=<path>.
Uses nemotron_omni_step (pass --step_func nemotron_omni_step).
- Parameters:
hf_path – HuggingFace model ID or local path to the Nemotron Omni model.
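The difference between fusing frame pairs and discarding every other frame can be shown on frame indices alone. The sketch below is a conceptual illustration, not the RADIO embedder: `pair_frames` is a hypothetical helper, and repeating the last frame to pad an odd-length clip is an assumption made for the example.

```python
def pair_frames(frames, temporal_patch_dim=2):
    """Group consecutive frames into tuples of length temporal_patch_dim.

    Each tuple stands for the input to one fused vision embedding. If the clip
    length is not a multiple of temporal_patch_dim, the last frame is repeated
    (an assumption of this sketch).
    """
    frames = list(frames)
    if not frames:
        return []
    remainder = len(frames) % temporal_patch_dim
    if remainder:
        frames += [frames[-1]] * (temporal_patch_dim - remainder)
    return [
        tuple(frames[i:i + temporal_patch_dim])
        for i in range(0, len(frames), temporal_patch_dim)
    ]


frame_ids = list(range(8))
paired = pair_frames(frame_ids)   # every frame contributes to an embedding
subsampled = frame_ids[::2]       # odd-indexed frames are dropped

print(paired)      # [(0, 1), (2, 3), (4, 5), (6, 7)]
print(subsampled)  # [0, 2, 4, 6]
```

Both approaches yield four temporal positions for an eight-frame clip, but only the paired grouping keeps information from all eight frames.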
- bridge.recipes.nemotron_omni.nemotron_omni.nemotron_omni_valor32k_peft_config(
- hf_path: str = _DEFAULT_HF_PATH,
LoRA PEFT recipe on temporal-video Energon path (temporal_patch_dim=2).