nemo_automodel.components.models.deepseek_v3.rope_utils

Module Contents

Functions

Name                          Description
apply_rotary_emb              Applies rotary positional embeddings to the input tensor.
apply_rotary_emb_qk           -
freqs_cis_from_position_ids   -
precompute_freqs_cis          Precomputes frequency-based complex exponential values for rotary positional embeddings.
yarn_get_mscale               -

API

nemo_automodel.components.models.deepseek_v3.rope_utils.apply_rotary_emb(
x: torch.Tensor,
freqs_cis: torch.Tensor,
qkv_format: str = 'bshd',
unsqueeze_dim: int | None = None
) -> torch.Tensor

Applies rotary positional embeddings to the input tensor.

Parameters:

x
torch.Tensor

Input tensor to which the rotary positional embeddings are applied.

freqs_cis
torch.Tensor

Precomputed complex exponential values for positional embeddings.

Returns: torch.Tensor

Tensor with rotary embeddings applied.
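
As an illustration of what "applying" rotary embeddings means, the following is a minimal plain-Python sketch, not the library's implementation. It assumes the interleaved pairing convention, where each consecutive pair of features is treated as one complex number and multiplied by a unit complex exponential. The helper name apply_rotary_sketch is hypothetical, it works on a single feature vector rather than a torch.Tensor, and it ignores the qkv_format and unsqueeze_dim arguments.

```python
import cmath
import math


def apply_rotary_sketch(x: list[float], freqs_cis: list[complex]) -> list[float]:
    """Rotate each pair (x[2i], x[2i+1]) by the unit complex number freqs_cis[i].

    Assumption: interleaved pairing. Other implementations pair the first
    and second halves of the vector instead.
    """
    if len(x) != 2 * len(freqs_cis):
        raise ValueError("x must have exactly two features per rotation factor")
    out: list[float] = []
    for i, rot in enumerate(freqs_cis):
        # View the feature pair as a complex number and rotate it.
        z = complex(x[2 * i], x[2 * i + 1]) * rot
        out.extend([z.real, z.imag])
    return out


# Usage: a quarter-turn rotation maps the pair (1, 0) onto (0, 1).
quarter_turn = cmath.rect(1.0, math.pi / 2)
rotated = apply_rotary_sketch([1.0, 0.0], [quarter_turn])
# rotated is approximately [0.0, 1.0], up to floating-point rounding
```

Because every factor has magnitude 1, the rotation changes the direction of each feature pair but not the norm of the vector.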

nemo_automodel.components.models.deepseek_v3.rope_utils.apply_rotary_emb_qk(
q: torch.Tensor,
k: torch.Tensor,
freqs_cis: torch.Tensor,
format: str = 'bshd',
rope_fusion: bool = True,
cu_seqlens: torch.Tensor | None = None,
cp_size: int = 1,
cp_rank: int = 0
) -> tuple[torch.Tensor, torch.Tensor]

nemo_automodel.components.models.deepseek_v3.rope_utils.freqs_cis_from_position_ids(
position_ids: torch.Tensor,
freqs: torch.Tensor,
qkv_format: str = 'bshd',
for_fused_rope: bool = False,
cp_size: int = 1
) -> torch.Tensor

nemo_automodel.components.models.deepseek_v3.rope_utils.precompute_freqs_cis(
qk_rope_head_dim: int,
max_seq_len: int,
rope_theta: float,
rope_scaling: dict[str, float | int] | None
) -> torch.Tensor

Precomputes frequency-based complex exponential values for rotary positional embeddings.

Parameters:

qk_rope_head_dim
int

Dimensionality of the rotary positional embeddings.

max_seq_len
int

Maximum sequence length.

original_seq_len
int

Original sequence length.

beta_fast
int

Fast beta value for the exponential computation.

beta_slow
int

Slow beta value for the exponential computation.

rope_theta
float

Base value for the exponential computation.

rope_factor
float

Factor value for the exponential computation.

Returns: torch.Tensor

Precomputed complex exponential values for positional embeddings.
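
To make "frequency-based complex exponential values" concrete, here is a minimal plain-Python sketch of the standard, unscaled rotary table. It is not the library's implementation. It assumes the usual RoPE inverse frequencies theta^(-2i/d) and ignores rope_scaling along with the scaling-related values listed above (original_seq_len, beta_fast, beta_slow, rope_factor). The name precompute_freqs_cis_sketch and the default theta of 10000.0 are hypothetical, and the result is a nested list rather than a torch.Tensor.

```python
import cmath


def precompute_freqs_cis_sketch(
    head_dim: int, max_seq_len: int, theta: float = 10000.0
) -> list[list[complex]]:
    """Return table[pos][i] = exp(1j * pos * theta ** (-2 * i / head_dim)).

    Assumption: plain RoPE with no long-context (YaRN-style) scaling.
    """
    if head_dim % 2 != 0:
        raise ValueError("head_dim must be even")
    # One inverse frequency per feature pair; later pairs rotate more slowly.
    inv_freq = [theta ** (-2.0 * i / head_dim) for i in range(head_dim // 2)]
    # cmath.rect(1.0, angle) is the unit complex number cos(angle) + 1j*sin(angle).
    return [
        [cmath.rect(1.0, pos * f) for f in inv_freq]
        for pos in range(max_seq_len)
    ]


# Usage: 3 positions, head dimension 4, so 2 rotation factors per position.
table = precompute_freqs_cis_sketch(head_dim=4, max_seq_len=3)
```

Row 0 of the table is the identity rotation for every pair, since position 0 gives an angle of zero.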

nemo_automodel.components.models.deepseek_v3.rope_utils.yarn_get_mscale(
scale = 1,
mscale = 1
)