bridge.models.conversion.transformers_compat
Compatibility utilities for HuggingFace transformers 5.0+ configs.
Module Contents
Functions
| rope_theta_from_hf | Extract rope_theta from a HuggingFace config. |
| rope_local_base_freq_from_hf | Extract rope_local_base_freq from a HuggingFace config. |
| rope_scaling_factor_from_hf | Extract rope scaling factor from a HuggingFace config. |
| full_attention_interval_from_hf | Extract the full-attention interval from a Qwen3-Next-style HuggingFace config. |
API
- bridge.models.conversion.transformers_compat.rope_theta_from_hf(config) → float
Extract rope_theta from a HuggingFace config.
This function reads rope_theta (the rotary position embedding base frequency) from a HuggingFace config. It supports both the legacy format (a direct rope_theta attribute) and the transformers 5.0+ format (a rope_parameters dictionary).
- Parameters:
config – HuggingFace configuration object.
- Returns:
The rope_theta value for rotary embeddings.
- Return type:
float
- Raises:
ValueError – If rope_theta is not found in either format.
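The dual-format lookup described above can be sketched as follows. This is a minimal illustration, not the module's actual implementation: the function name rope_theta_sketch is hypothetical, the order in which the two layouts are checked is an assumption, and types.SimpleNamespace stands in for a real HuggingFace config object.

```python
from types import SimpleNamespace


def rope_theta_sketch(config) -> float:
    """Hypothetical stand-in for rope_theta_from_hf (not the library's code)."""
    # Legacy layout: rope_theta is a direct attribute on the config.
    theta = getattr(config, "rope_theta", None)
    if theta is not None:
        return float(theta)
    # transformers 5.0+ layout: the value lives inside the rope_parameters dict.
    params = getattr(config, "rope_parameters", None)
    if isinstance(params, dict) and params.get("rope_theta") is not None:
        return float(params["rope_theta"])
    # Neither layout carries the value, so fail loudly as the docs describe.
    raise ValueError("rope_theta not found in config")


# Stand-in configs for the two layouts (values are illustrative only).
legacy_cfg = SimpleNamespace(rope_theta=10000.0)
new_cfg = SimpleNamespace(rope_parameters={"rope_type": "default", "rope_theta": 1000000.0})

legacy_theta = rope_theta_sketch(legacy_cfg)
new_theta = rope_theta_sketch(new_cfg)

# A config with neither layout should raise ValueError.
try:
    rope_theta_sketch(SimpleNamespace())
    missing_raised = False
except ValueError:
    missing_raised = True
```

Callers that can tolerate a missing value would wrap the call in a try/except ValueError block, as the last lines of the sketch do.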
- bridge.models.conversion.transformers_compat.rope_local_base_freq_from_hf(config) → float
Extract rope_local_base_freq from a HuggingFace config.
Similar to rope_theta_from_hf but for the local base frequency parameter used by some models (e.g., Gemma3).
- Parameters:
config – HuggingFace configuration object.
- Returns:
The rope_local_base_freq value.
- Return type:
float
- Raises:
ValueError – If rope_local_base_freq is not found in either format.
- bridge.models.conversion.transformers_compat.rope_scaling_factor_from_hf(config, default: float = 1.0) → float
Extract rope scaling factor from a HuggingFace config.
This function reads the rope scaling factor from a HuggingFace config. It supports both the legacy format (a rope_scaling dictionary) and the transformers 5.0+ format (a rope_parameters dictionary).
- Parameters:
config – HuggingFace configuration object.
default – Default value to return if no scaling factor is found.
- Returns:
The rope scaling factor value, or default if not found.
- Return type:
float
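Unlike the rope_theta lookup, this function falls back to a default instead of raising. A minimal sketch of that behavior follows. The name rope_scaling_factor_sketch is hypothetical, and both the lookup order and the use of the "factor" key in each dictionary are assumptions about the implementation.

```python
from types import SimpleNamespace


def rope_scaling_factor_sketch(config, default: float = 1.0) -> float:
    """Hypothetical stand-in for rope_scaling_factor_from_hf (not the library's code)."""
    # Check the legacy rope_scaling dict first, then the 5.0+ rope_parameters dict.
    # The order and the "factor" key are assumptions made for this sketch.
    for attr in ("rope_scaling", "rope_parameters"):
        section = getattr(config, attr, None)
        if isinstance(section, dict) and section.get("factor") is not None:
            return float(section["factor"])
    # No scaling information in either layout: return the caller's default.
    return default


# Stand-in configs (values are illustrative only).
legacy_cfg = SimpleNamespace(rope_scaling={"rope_type": "linear", "factor": 2.0})
new_cfg = SimpleNamespace(rope_parameters={"rope_type": "yarn", "factor": 4.0, "rope_theta": 10000.0})
unscaled_cfg = SimpleNamespace(rope_scaling=None)

legacy_factor = rope_scaling_factor_sketch(legacy_cfg)
new_factor = rope_scaling_factor_sketch(new_cfg)
unscaled_factor = rope_scaling_factor_sketch(unscaled_cfg)
```

Returning a default of 1.0 lets callers treat a model without RoPE scaling the same way as one with an explicit factor of 1.0.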
- bridge.models.conversion.transformers_compat.full_attention_interval_from_hf(config, default: int = 4) → int
Extract the full-attention interval from a Qwen3-Next-style HuggingFace config.
In transformers <5.5 the interval was stored directly as config.full_attention_interval. In transformers >=5.5 the field was removed; the kwarg is consumed in __post_init__ and converted into config.layer_types (a list whose i-th entry is "linear_attention" or "full_attention" according to (i + 1) % interval). This helper handles both layouts.
- Parameters:
config – HuggingFace configuration object (e.g. Qwen3NextConfig).
default – Value to return if neither layout is present.
- Returns:
The interval at which standard attention layers appear.
- Return type:
int
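The relationship between the interval and layer_types can be illustrated with a short sketch. Both function names below are hypothetical. Recovering the interval from the position of the first "full_attention" entry is one possible approach and is an assumption, not necessarily how the helper is implemented.

```python
from types import SimpleNamespace


def build_layer_types(num_layers: int, interval: int) -> list:
    """Mimic the layer_types pattern described above: layer i is full attention
    when (i + 1) % interval == 0, otherwise linear attention."""
    return [
        "full_attention" if (i + 1) % interval == 0 else "linear_attention"
        for i in range(num_layers)
    ]


def full_attention_interval_sketch(config, default: int = 4) -> int:
    """Hypothetical stand-in for full_attention_interval_from_hf (not the library's code)."""
    # Older layout: the interval is stored directly on the config.
    interval = getattr(config, "full_attention_interval", None)
    if interval is not None:
        return int(interval)
    # Newer layout: infer the interval from layer_types. The first
    # "full_attention" entry sits at index interval - 1 (assumed approach).
    layer_types = getattr(config, "layer_types", None)
    if layer_types and "full_attention" in layer_types:
        return list(layer_types).index("full_attention") + 1
    # Neither layout is present.
    return default


# Stand-in configs (values are illustrative only).
old_cfg = SimpleNamespace(full_attention_interval=3)
new_cfg = SimpleNamespace(layer_types=build_layer_types(num_layers=12, interval=4))

old_interval = full_attention_interval_sketch(old_cfg)
new_interval = full_attention_interval_sketch(new_cfg)
fallback_interval = full_attention_interval_sketch(SimpleNamespace())
```

Note that in this sketch a layer_types list with no "full_attention" entry (for example, when the interval exceeds the number of layers) falls through to the default.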