nemo_automodel.components.speculative.dspark.core

DSpark online training wrapper.

The DSpark draft is self-contained: it samples anchors, builds the block attention mask, runs the semi-autoregressive backbone and Markov head, and emits everything the objective needs. This module is therefore a thin wrapper: it calls the draft with the target supervision and computes the three-term loss (`ce_loss`, `l1_loss`, and `confidence_loss`).

Module Contents

Classes

| Name | Description |
| --- | --- |
| `DSparkStepMetrics` | Per-step training outputs for the DSpark draft (loss + its three terms). |
| `DSparkTrainerModule` | DSpark online training wrapper computing the three-term objective. |

Data

__all__

API

class nemo_automodel.components.speculative.dspark.core.DSparkStepMetrics(
    loss: torch.Tensor,
    ce_loss: torch.Tensor,
    l1_loss: torch.Tensor,
    confidence_loss: torch.Tensor
)

Dataclass

Per-step training outputs for the DSpark draft (loss + its three terms).

ce_loss

Tensor

confidence_loss

Tensor

l1_loss

Tensor

loss

Tensor

class nemo_automodel.components.speculative.dspark.core.DSparkTrainerModule(
    draft_model: nemo_automodel.components.speculative.dspark.draft_qwen3.Qwen3DSparkModel,
    loss_decay_gamma: typing.Optional[float] = None,
    ce_loss_alpha: float = 0.1,
    l1_loss_alpha: float = 0.9,
    confidence_head_alpha: float = 1.0
)

Bases: Module

DSpark online training wrapper computing the three-term objective.

ce_loss_alpha

= float(ce_loss_alpha)

confidence_head_alpha

= float(confidence_head_alpha)

l1_loss_alpha

= float(l1_loss_alpha)

nemo_automodel.components.speculative.dspark.core.DSparkTrainerModule.forward(
    input_ids: torch.Tensor,
    target_hidden_states: torch.Tensor,
    loss_mask: torch.Tensor,
    target_last_hidden_states: typing.Optional[torch.Tensor] = None
) -> nemo_automodel.components.speculative.dspark.core.DSparkStepMetrics

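Run the draft on the target supervision and compute the DSpark loss. Returns a `DSparkStepMetrics` holding the total `loss` and its three terms (`ce_loss`, `l1_loss`, `confidence_loss`).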
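
The page does not say how the three terms are combined into `loss`. The sketch below illustrates one plausible reading, assuming the total is a plain weighted sum using the constructor defaults shown above (`ce_loss_alpha=0.1`, `l1_loss_alpha=0.9`, `confidence_head_alpha=1.0`). The names `StepMetricsSketch` and `combine_terms` are hypothetical, plain floats stand in for `torch.Tensor` values, and `loss_decay_gamma` is left out because its effect is not described here. Treat it as an illustration of the data flow, not as the module's actual implementation.

```python
from dataclasses import dataclass


@dataclass
class StepMetricsSketch:
    """Hypothetical stand-in for DSparkStepMetrics (floats instead of tensors)."""

    loss: float
    ce_loss: float
    l1_loss: float
    confidence_loss: float


def combine_terms(
    ce_loss: float,
    l1_loss: float,
    confidence_loss: float,
    ce_loss_alpha: float = 0.1,
    l1_loss_alpha: float = 0.9,
    confidence_head_alpha: float = 1.0,
) -> StepMetricsSketch:
    # ASSUMPTION: the total objective is a weighted sum of the three terms.
    # The real DSparkTrainerModule.forward may weight or reduce them differently.
    loss = (
        ce_loss_alpha * ce_loss
        + l1_loss_alpha * l1_loss
        + confidence_head_alpha * confidence_loss
    )
    # Report the total together with the unweighted terms, mirroring the
    # four fields listed for DSparkStepMetrics.
    return StepMetricsSketch(
        loss=loss,
        ce_loss=ce_loss,
        l1_loss=l1_loss,
        confidence_loss=confidence_loss,
    )


# Made-up per-term values, purely for illustration.
metrics = combine_terms(ce_loss=2.0, l1_loss=1.0, confidence_loss=0.5)
print(metrics)
```

Returning the individual terms alongside the total makes it possible to log each one separately during training, which is presumably why `DSparkStepMetrics` exposes all four fields.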

nemo_automodel.components.speculative.dspark.core.__all__ = ['DSparkTrainerModule', 'DSparkStepMetrics']