nemo_automodel.components.speculative.dspark.core

DSpark online training wrapper.

The DSpark draft is self-contained: it samples anchors, builds the block attention mask, runs the semi-autoregressive backbone + Markov head, and emits everything the objective needs. This module is therefore a thin wrapper that calls the draft with the target supervision and computes the three-term loss.

Module Contents

Classes

| Name | Description |
| --- | --- |
| DSparkStepMetrics | Per-step training outputs for the DSpark draft (loss + its three terms). |
| DSparkTrainerModule | DSpark online training wrapper computing the three-term objective. |

Data

__all__

API

class nemo_automodel.components.speculative.dspark.core.DSparkStepMetrics(
loss: torch.Tensor,
ce_loss: torch.Tensor,
l1_loss: torch.Tensor,
confidence_loss: torch.Tensor
)
Dataclass

Per-step training outputs for the DSpark draft (loss + its three terms).

ce_loss: Tensor
confidence_loss: Tensor
l1_loss: Tensor
loss: Tensor

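Because DSparkStepMetrics is a dataclass, its four fields can be flattened generically for logging. The following is a minimal sketch, not part of the library: `StandInStepMetrics` and `metrics_to_floats` are hypothetical names, and plain Python floats stand in for the scalar tensors so the example runs without PyTorch.

```python
from dataclasses import dataclass, fields, is_dataclass


@dataclass
class StandInStepMetrics:
    """Hypothetical stand-in that reuses the field names of DSparkStepMetrics."""

    loss: float
    ce_loss: float
    l1_loss: float
    confidence_loss: float


def metrics_to_floats(metrics, prefix="train/"):
    """Flatten a metrics dataclass into a {name: float} dict for logging."""
    if not is_dataclass(metrics):
        raise TypeError("expected a dataclass instance")
    # float() accepts Python numbers and single-element tensors alike.
    return {prefix + f.name: float(getattr(metrics, f.name)) for f in fields(metrics)}


example = StandInStepMetrics(loss=1.6, ce_loss=2.0, l1_loss=1.0, confidence_loss=0.5)
logged = metrics_to_floats(example)
```

With real tensors, `float()` on a GPU scalar forces a host synchronization, so a training loop may prefer to convert metrics only on logging steps.
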
class nemo_automodel.components.speculative.dspark.core.DSparkTrainerModule(
draft_model: nemo_automodel.components.speculative.dspark.draft_qwen3.Qwen3DSparkModel,
loss_decay_gamma: typing.Optional[float] = None,
ce_loss_alpha: float = 0.1,
l1_loss_alpha: float = 0.9,
confidence_head_alpha: float = 1.0
)

Bases: Module

DSpark online training wrapper computing the three-term objective.

ce_loss_alpha = float(ce_loss_alpha)
confidence_head_alpha = float(confidence_head_alpha)
l1_loss_alpha = float(l1_loss_alpha)

nemo_automodel.components.speculative.dspark.core.DSparkTrainerModule.forward(
input_ids: torch.Tensor,
target_hidden_states: torch.Tensor,
loss_mask: torch.Tensor,
target_last_hidden_states: typing.Optional[torch.Tensor] = None
) -> nemo_automodel.components.speculative.dspark.core.DSparkStepMetrics

Run the draft on the target supervision and compute the DSpark loss.
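
This page does not state how the three terms are combined into `loss`. The sketch below assumes a plain weighted sum using the constructor's alpha values as weights. That formula is an assumption drawn from the parameter names, not a confirmed description of the implementation. `combine_dspark_terms` is a hypothetical helper, and `loss_decay_gamma` is left out because its role is not documented here. Plain floats stand in for scalar tensors.

```python
import math


def combine_dspark_terms(
    ce_loss,
    l1_loss,
    confidence_loss,
    ce_loss_alpha=0.1,          # default from the DSparkTrainerModule signature
    l1_loss_alpha=0.9,          # default from the DSparkTrainerModule signature
    confidence_head_alpha=1.0,  # default from the DSparkTrainerModule signature
):
    """ASSUMED combination: a weighted sum of the three loss terms."""
    return (
        ce_loss_alpha * ce_loss
        + l1_loss_alpha * l1_loss
        + confidence_head_alpha * confidence_loss
    )


# 0.1 * 2.0 + 0.9 * 1.0 + 1.0 * 0.5
total = combine_dspark_terms(ce_loss=2.0, l1_loss=1.0, confidence_loss=0.5)
assert math.isclose(total, 1.6)
```

Under this assumption, the defaults weight the L1 term nine times as heavily as the cross-entropy term, and setting an alpha to 0.0 drops that term from the total.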

nemo_automodel.components.speculative.dspark.core.__all__ = ['DSparkTrainerModule', 'DSparkStepMetrics']