`bridge.models.conversion.transformers_compat`#

Compatibility utilities for HuggingFace transformers 5.0+ configs.

Module Contents#

Functions#

`rope_theta_from_hf`	Extract rope_theta from a HuggingFace config.
`rope_local_base_freq_from_hf`	Extract rope_local_base_freq from a HuggingFace config.
`rope_scaling_factor_from_hf`	Extract rope scaling factor from a HuggingFace config.
`full_attention_interval_from_hf`	Extract the full-attention layout from a Qwen3-Next-style HuggingFace config.

API#

bridge.models.conversion.transformers_compat.rope_theta_from_hf(config) → float#

Extract rope_theta from a HuggingFace config.

This utility function extracts rope_theta (the rotary position embedding base frequency) from a HuggingFace config. It supports both the legacy format (a direct rope_theta attribute) and the transformers 5.0+ format (a rope_parameters dictionary).

Parameters: config – HuggingFace configuration object.
Returns: The rope_theta value for rotary embeddings.
Return type: float
Raises: ValueError – If rope_theta is not found in either format.
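
The dual-format lookup described above can be pictured with the following minimal sketch. It is not the module's actual implementation: the function name `sketch_rope_theta`, the order in which the two layouts are checked, and the `SimpleNamespace` stand-ins for real config objects are all assumptions made for illustration.

```python
from types import SimpleNamespace


def sketch_rope_theta(config) -> float:
    """Illustrative only; hypothetical stand-in for rope_theta_from_hf."""
    # Legacy layout: rope_theta is a direct attribute on the config.
    theta = getattr(config, "rope_theta", None)
    if theta is not None:
        return float(theta)
    # transformers 5.0+ layout: the value sits inside the rope_parameters dict.
    params = getattr(config, "rope_parameters", None)
    if isinstance(params, dict) and params.get("rope_theta") is not None:
        return float(params["rope_theta"])
    # Neither layout carries the value, so fail loudly.
    raise ValueError("rope_theta not found in config")


# Hypothetical configs, one per layout.
legacy_cfg = SimpleNamespace(rope_theta=10000.0)
modern_cfg = SimpleNamespace(rope_parameters={"rope_theta": 1000000.0})

legacy_theta = sketch_rope_theta(legacy_cfg)
modern_theta = sketch_rope_theta(modern_cfg)
```

Raising instead of returning a fallback matches the documented contract: a silently wrong base frequency would be hard to detect after conversion.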

bridge.models.conversion.transformers_compat.rope_local_base_freq_from_hf(config) → float#

Extract rope_local_base_freq from a HuggingFace config.

Similar to rope_theta_from_hf but for the local base frequency parameter used by some models (e.g., Gemma3).

Parameters: config – HuggingFace configuration object.
Returns: The rope_local_base_freq value.
Return type: float
Raises: ValueError – If rope_local_base_freq is not found in either format.

bridge.models.conversion.transformers_compat.rope_scaling_factor_from_hf(config, default: float = 1.0) → float#

Extract rope scaling factor from a HuggingFace config.

This utility function extracts the rope scaling factor from a HuggingFace config. It supports both the legacy format (a rope_scaling dict) and the transformers 5.0+ format (a rope_parameters dictionary).

Parameters:

config – HuggingFace configuration object.
default – Default value to return if no scaling factor is found.

Returns:

The rope scaling factor value, or default if not found.

Return type:

float
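
As a rough sketch of the fallback behaviour, the snippet below checks both dictionaries and returns `default` when neither yields a value. The function name `sketch_rope_scaling_factor` is hypothetical, and reading the value from a `"factor"` key is an assumption based on the key HuggingFace configs conventionally use for rope scaling; the real helper may differ.

```python
from types import SimpleNamespace


def sketch_rope_scaling_factor(config, default: float = 1.0) -> float:
    """Illustrative only; hypothetical stand-in for rope_scaling_factor_from_hf."""
    # Check the legacy dict first, then the transformers 5.0+ dict.
    for attr in ("rope_scaling", "rope_parameters"):
        mapping = getattr(config, attr, None)
        # Assumption: the scaling factor is stored under the "factor" key.
        if isinstance(mapping, dict) and mapping.get("factor") is not None:
            return float(mapping["factor"])
    # No scaling information present: fall back to the caller's default.
    return default


# Hypothetical configs covering each case.
legacy_cfg = SimpleNamespace(rope_scaling={"rope_type": "linear", "factor": 8.0})
modern_cfg = SimpleNamespace(rope_parameters={"rope_theta": 10000.0, "factor": 4.0})
plain_cfg = SimpleNamespace(rope_scaling=None)

factors = [
    sketch_rope_scaling_factor(legacy_cfg),
    sketch_rope_scaling_factor(modern_cfg),
    sketch_rope_scaling_factor(plain_cfg),
]
```

Unlike the rope_theta helper, a missing value is not an error here, since many models use no rope scaling at all.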

bridge.models.conversion.transformers_compat.full_attention_interval_from_hf(config, default: int = 4) → int | list[int]#

Extract the full-attention layout from a Qwen3-Next-style HuggingFace config.

In transformers <5.5 the interval is stored directly as config.full_attention_interval. In transformers >=5.5 the field was removed: the kwarg is consumed in __post_init__ and converted into config.layer_types, a list whose i-th entry is "full_attention" when (i + 1) % interval == 0 and "linear_attention" otherwise. This helper handles both layouts.

Parameters:

config – HuggingFace configuration object (e.g. Qwen3NextConfig).
default – Value to return if neither layout is present.

Returns:

The legacy interval (an int), or the exact per-layer pattern (a list[int]) in which 1 marks linear attention and 0 marks full attention.

Return type:

int | list[int]

Raises:

ValueError – If an explicit layer type is unsupported.
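
To make the two layouts concrete, the sketch below builds a `layer_types` list from an interval and maps it to the 1/0 pattern described above. It is an illustration under stated assumptions, not the module's code: the names `sketch_layer_types` and `sketch_full_attention_layout` are hypothetical, as is the choice to check the legacy attribute before `layer_types`.

```python
from types import SimpleNamespace


def sketch_layer_types(num_layers: int, interval: int) -> list[str]:
    """Hypothetical helper: every interval-th layer uses full attention."""
    return [
        "full_attention" if (i + 1) % interval == 0 else "linear_attention"
        for i in range(num_layers)
    ]


def sketch_full_attention_layout(config, default: int = 4):
    """Illustrative only; hypothetical stand-in for full_attention_interval_from_hf."""
    # Older layout: the interval is stored directly on the config.
    interval = getattr(config, "full_attention_interval", None)
    if interval is not None:
        return int(interval)
    # Newer layout: only the expanded per-layer list is available.
    layer_types = getattr(config, "layer_types", None)
    if layer_types is None:
        return default
    codes = {"linear_attention": 1, "full_attention": 0}
    pattern = []
    for layer_type in layer_types:
        if layer_type not in codes:
            raise ValueError(f"Unsupported layer type: {layer_type!r}")
        pattern.append(codes[layer_type])
    return pattern


# Hypothetical configs, one per layout.
old_cfg = SimpleNamespace(full_attention_interval=4)
new_cfg = SimpleNamespace(layer_types=sketch_layer_types(8, 4))

old_layout = sketch_full_attention_layout(old_cfg)
new_layout = sketch_full_attention_layout(new_cfg)
```

Returning the full per-layer list rather than re-deriving an interval preserves layouts that are not strictly periodic.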
