bridge.models.nemotronh.nemotron_h_provider#
Module Contents#
Classes#
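| NemotronHModelProvider | Configuration for Nemotron-H models. |
| NemotronHModelProvider4B | Configuration for a 4B parameter Nemotron-H model. |
| NemotronHModelProvider8B | Configuration for an 8B parameter Nemotron-H model. |
| NemotronHModelProvider47B | Configuration for a 47B parameter Nemotron-H model. |
| NemotronHModelProvider56B | Configuration for a 56B parameter Nemotron-H model. |
| NemotronNanoModelProvider9Bv2 | Configuration for a 9B parameter Nemotron Nano v2 model. |
| NemotronNanoModelProvider12Bv2 | Configuration for the Nemotron Nano v2 12B model. |
| NemotronHModel4BProvider | Deprecated alias for NemotronHModelProvider4B. |
| NemotronHModel8BProvider | Deprecated alias for NemotronHModelProvider8B. |
| NemotronHModel47BProvider | Deprecated alias for NemotronHModelProvider47B. |
| NemotronHModel56BProvider | Deprecated alias for NemotronHModelProvider56B. |
| NemotronNano9Bv2Provider | Deprecated alias for NemotronNanoModelProvider9Bv2. |
| NemotronNano12Bv2Provider | Deprecated alias for NemotronNanoModelProvider12Bv2. |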
Functions#
| _warn_deprecated | |
Data#
| logger | |
API#
- bridge.models.nemotronh.nemotron_h_provider.logger#
‘getLogger(…)’
- class bridge.models.nemotronh.nemotron_h_provider.NemotronHModelProvider#
Bases:
megatron.bridge.models.mamba.mamba_provider.MambaModelProvider
Configuration for Nemotron-H models.
- seq_length: int#
8192
- mamba_num_groups: int#
8
- mamba_head_dim: int#
64
- num_query_groups: int#
8
- make_vocab_size_divisible_by: int#
128
- activation_func: callable#
None
- masked_softmax_fusion: bool#
True
- apply_query_key_layer_scaling: bool#
False
- persist_layer_norm: bool#
True
- attention_softmax_in_fp32: bool#
False
- first_last_layers_bf16: bool#
True
- is_hybrid_model: bool#
True
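The effect of `make_vocab_size_divisible_by` can be illustrated with a small sketch. It assumes the convention commonly used in Megatron-style training code, where the vocabulary is padded up to a multiple of this value times the tensor-parallel size. The helper name `padded_vocab_size` and its `tensor_parallel_size` argument are hypothetical and are not part of this module.

```python
def padded_vocab_size(
    vocab_size: int,
    divisible_by: int = 128,       # default of make_vocab_size_divisible_by above
    tensor_parallel_size: int = 1,  # assumption: padding also scales with TP size
) -> int:
    """Round vocab_size up to the next multiple of divisible_by * tensor_parallel_size."""
    multiple = divisible_by * tensor_parallel_size
    # Ceiling division using integer arithmetic only.
    return -(-vocab_size // multiple) * multiple


if __name__ == "__main__":
    for size, tp in [(131072, 1), (50257, 1), (50257, 8)]:
        print(size, tp, padded_vocab_size(size, tensor_parallel_size=tp))
```

A vocabulary that is already a multiple of the padding unit is returned unchanged; otherwise the embedding table gains a few unused rows so that it splits evenly across tensor-parallel ranks.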
- class bridge.models.nemotronh.nemotron_h_provider.NemotronHModelProvider4B#
Bases:
bridge.models.nemotronh.nemotron_h_provider.NemotronHModelProvider
Configuration for a 4B parameter Nemotron-H model.
- hybrid_override_pattern: str#
‘M-M-M-M*-M-M-M-M-M*-M-M-M-M-M*-M-M-M-M-M*-M-M-M-M-M-’
- num_layers: int#
52
- hidden_size: int#
3072
- mamba_num_heads: int#
112
- kv_channels: int#
128
- mamba_state_dim: int#
128
- ffn_hidden_size: int#
12288
- num_attention_heads: int#
32
- use_mamba_mem_eff_path: bool#
False
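To make `hybrid_override_pattern` easier to read, the sketch below counts the layer types in the 4B pattern. It assumes the Megatron hybrid-layer convention in which `M` marks a Mamba layer, `*` an attention layer, and `-` an MLP layer, with one character per layer. The function `summarize_hybrid_pattern` is a hypothetical helper, not part of this module.

```python
from collections import Counter

# Assumed symbol meanings (Megatron hybrid-layer convention).
LAYER_SYMBOLS = {"M": "mamba", "*": "attention", "-": "mlp"}


def summarize_hybrid_pattern(pattern: str) -> dict[str, int]:
    """Count how many layers of each type a hybrid override pattern describes."""
    unknown = set(pattern) - set(LAYER_SYMBOLS)
    if unknown:
        raise ValueError(f"Unknown layer symbols in pattern: {sorted(unknown)}")
    counts = Counter(LAYER_SYMBOLS[ch] for ch in pattern)
    return {name: counts.get(name, 0) for name in LAYER_SYMBOLS.values()}


# Pattern copied from NemotronHModelProvider4B above.
PATTERN_4B = "M-M-M-M*-M-M-M-M-M*-M-M-M-M-M*-M-M-M-M-M*-M-M-M-M-M-"

summary = summarize_hybrid_pattern(PATTERN_4B)
print(summary)          # {'mamba': 24, 'attention': 4, 'mlp': 24}
print(len(PATTERN_4B))  # 52, matching num_layers
```

The pattern length equals `num_layers`, so a mismatch between the two is a quick sanity check when editing a pattern by hand.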
- class bridge.models.nemotronh.nemotron_h_provider.NemotronHModelProvider8B#
Bases:
bridge.models.nemotronh.nemotron_h_provider.NemotronHModelProvider
Configuration for an 8B parameter Nemotron-H model.
- hybrid_override_pattern: str#
‘M-M-M-M*-M-M-M-M-M*-M-M-M-M-M*-M-M-M-M-M*-M-M-M-M-M-’
- num_layers: int#
52
- hidden_size: int#
4096
- mamba_state_dim: int#
128
- mamba_num_heads: int#
128
- ffn_hidden_size: int#
21504
- num_attention_heads: int#
32
- class bridge.models.nemotronh.nemotron_h_provider.NemotronHModelProvider47B#
Bases:
bridge.models.nemotronh.nemotron_h_provider.NemotronHModelProvider
Configuration for a 47B parameter Nemotron-H model.
- hybrid_override_pattern: str#
‘M-M-M-M-M-M-M-M-M*-M-M-M-M-M-M-M-M-M-M*-M-M-M-M-M*-M-M-M-M-M*-M-M-M-M-M-M-M---MM---M-M*-M-M-M-M-M-’
- num_layers: int#
98
- hidden_size: int#
8192
- mamba_state_dim: int#
256
- mamba_num_heads: int#
256
- ffn_hidden_size: int#
30720
- num_attention_heads: int#
64
- class bridge.models.nemotronh.nemotron_h_provider.NemotronHModelProvider56B#
Bases:
bridge.models.nemotronh.nemotron_h_provider.NemotronHModelProvider
Configuration for a 56B parameter Nemotron-H model.
- hybrid_override_pattern: str#
‘M-M-M-M*-M-M-M-M-M*-M-M-M-M-M*-M-M-M-M-M*-M-M-M-M-M*-M-M-M-M-M*-M-M-M-M-M*-M-M-M-M-M*-M-M-M-M-M*-M-M…’
- num_layers: int#
118
- hidden_size: int#
8192
- mamba_state_dim: int#
256
- mamba_num_heads: int#
256
- ffn_hidden_size: int#
32768
- num_attention_heads: int#
64
- class bridge.models.nemotronh.nemotron_h_provider.NemotronNanoModelProvider9Bv2#
Bases:
bridge.models.nemotronh.nemotron_h_provider.NemotronHModelProvider
Configuration for a 9B parameter Nemotron Nano v2 model.
- hybrid_override_pattern: str#
‘M-M-M-MM-M-M-M*-M-M-M*-M-M-M-M*-M-M-M-M*-M-MM-M-M-M-M-M-’
- num_layers: int#
56
- hidden_size: int#
4480
- mamba_num_heads: int#
128
- mamba_state_dim: int#
128
- ffn_hidden_size: int#
15680
- num_attention_heads: int#
40
- mamba_head_dim: int#
80
- seq_length: int#
131072
- class bridge.models.nemotronh.nemotron_h_provider.NemotronNanoModelProvider12Bv2#
Bases:
bridge.models.nemotronh.nemotron_h_provider.NemotronHModelProvider
Configuration for the Nemotron Nano v2 12B model.
- hybrid_override_pattern: str#
‘M-M-M-M*-M-M-M-M*-M-M-M-M*-M-M-M-M*-M-M-M-M*-M-M-M-M*-M-M-M-M-’
- num_layers: int#
62
- hidden_size: int#
5120
- mamba_num_heads: int#
128
- mamba_state_dim: int#
128
- ffn_hidden_size: int#
20480
- num_attention_heads: int#
40
- mamba_head_dim: int#
80
- seq_length: int#
131072
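The Mamba settings of the variants above can be compared with a little arithmetic. The sketch assumes that the inner width of a Mamba-2 style mixer is `mamba_num_heads * mamba_head_dim`, and that variants which do not list `mamba_head_dim` inherit the base default of 64. The `MambaDims` class is a hypothetical stand-in used only for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MambaDims:
    """Hypothetical holder for the two Mamba sizing fields documented above."""

    num_heads: int
    head_dim: int = 64  # base-class default for mamba_head_dim

    @property
    def inner_dim(self) -> int:
        # Assumption: inner width is heads times per-head dimension.
        return self.num_heads * self.head_dim


# Values copied from the provider classes in this module.
VARIANTS = {
    "4B": MambaDims(num_heads=112),
    "8B": MambaDims(num_heads=128),
    "47B": MambaDims(num_heads=256),
    "56B": MambaDims(num_heads=256),
    "Nano 9B v2": MambaDims(num_heads=128, head_dim=80),
    "Nano 12B v2": MambaDims(num_heads=128, head_dim=80),
}

for name, dims in VARIANTS.items():
    print(f"{name}: {dims.num_heads} heads x {dims.head_dim} = {dims.inner_dim}")
```

Under this assumption, the two Nano v2 variants reach a wider Mamba inner dimension than the 8B model by raising the per-head dimension to 80 rather than by adding heads.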
- bridge.models.nemotronh.nemotron_h_provider._warn_deprecated(old_cls: str, new_cls: str) -> None#
- class bridge.models.nemotronh.nemotron_h_provider.NemotronHModel4BProvider#
Bases:
bridge.models.nemotronh.nemotron_h_provider.NemotronHModelProvider4B
Deprecated alias for NemotronHModelProvider4B.
Deprecated: This alias remains for backward compatibility and will be removed in a future release. Import and use NemotronHModelProvider4B instead.
- __post_init__() -> None#
- class bridge.models.nemotronh.nemotron_h_provider.NemotronHModel8BProvider#
Bases:
bridge.models.nemotronh.nemotron_h_provider.NemotronHModelProvider8B
Deprecated alias for NemotronHModelProvider8B.
Deprecated: This alias remains for backward compatibility and will be removed in a future release. Import and use NemotronHModelProvider8B instead.
- __post_init__() -> None#
- class bridge.models.nemotronh.nemotron_h_provider.NemotronHModel47BProvider#
Bases:
bridge.models.nemotronh.nemotron_h_provider.NemotronHModelProvider47B
Deprecated alias for NemotronHModelProvider47B.
Deprecated: This alias remains for backward compatibility and will be removed in a future release. Import and use NemotronHModelProvider47B instead.
- __post_init__() -> None#
- class bridge.models.nemotronh.nemotron_h_provider.NemotronHModel56BProvider#
Bases:
bridge.models.nemotronh.nemotron_h_provider.NemotronHModelProvider56B
Deprecated alias for NemotronHModelProvider56B.
Deprecated: This alias remains for backward compatibility and will be removed in a future release. Import and use NemotronHModelProvider56B instead.
- __post_init__() -> None#
- class bridge.models.nemotronh.nemotron_h_provider.NemotronNano9Bv2Provider#
Bases:
bridge.models.nemotronh.nemotron_h_provider.NemotronNanoModelProvider9Bv2
Deprecated alias for NemotronNanoModelProvider9Bv2.
Deprecated: This alias remains for backward compatibility and will be removed in a future release. Import and use NemotronNanoModelProvider9Bv2 instead.
- __post_init__() -> None#
- class bridge.models.nemotronh.nemotron_h_provider.NemotronNano12Bv2Provider#
Bases:
bridge.models.nemotronh.nemotron_h_provider.NemotronNanoModelProvider12Bv2
Deprecated alias for NemotronNanoModelProvider12Bv2.
Deprecated: This alias remains for backward compatibility and will be removed in a future release. Import and use NemotronNanoModelProvider12Bv2 instead.
- __post_init__() -> None#
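The deprecated aliases above all follow the same shape: a subclass of the renamed provider whose `__post_init__` reports the deprecation. A minimal, self-contained sketch of that pattern follows. Only the `_warn_deprecated(old_cls, new_cls)` signature comes from this page; its body, the warning category, the message text, and the `ExampleProvider` classes are assumptions made for illustration.

```python
import warnings
from dataclasses import dataclass


def _warn_deprecated(old_cls: str, new_cls: str) -> None:
    # Assumed behavior: emit a DeprecationWarning naming the replacement class.
    warnings.warn(
        f"{old_cls} is deprecated and will be removed in a future release; "
        f"use {new_cls} instead.",
        DeprecationWarning,
        stacklevel=3,
    )


@dataclass
class ExampleProvider:
    """Hypothetical stand-in for a provider under its new name."""

    num_layers: int = 52

    def __post_init__(self) -> None:
        pass  # a real provider would validate or derive fields here


@dataclass
class ExampleProviderOldName(ExampleProvider):
    """Hypothetical deprecated alias that keeps the old import working."""

    def __post_init__(self) -> None:
        _warn_deprecated("ExampleProviderOldName", "ExampleProvider")
        super().__post_init__()


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    legacy = ExampleProviderOldName()

print(type(legacy).__name__, legacy.num_layers, len(caught))
```

Because the alias subclasses the new provider, existing `isinstance` checks and field defaults keep working while callers migrate to the new name.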