nemo_automodel.components.speculative.dflash.core
DFlash online training wrapper.
Ported from SpecForge’s specforge/core/dflash.py. DFlashTrainerModule
samples a set of anchor positions per sequence, builds one parallel draft block
per anchor (the block’s first token is the real anchor token, the rest are
MASK), runs the draft model under a bespoke block attention mask, and
computes a block-wise cross-entropy loss against the ground-truth continuation
of each anchor.
Module Contents
Classes
API
Per-step training outputs for the DFlash draft.
Bases: Module
DFlash online training wrapper with block-wise CE loss.
Embed each block as [anchor_token, MASK, MASK, ...] (invalid blocks all MASK).
Absolute position ids for the parallel draft blocks (anchor + offset).
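The block layout and position ids described above can be illustrated with plain Python lists. This is a minimal sketch, not the module's implementation: `build_blocks`, its arguments, and the `mask_id` value are hypothetical names chosen for illustration, and the real code operates on batched tensors and embeds the resulting token ids.

```python
from typing import List, Tuple


def build_blocks(
    input_ids: List[int],
    anchors: List[int],
    keep_mask: List[bool],
    block_size: int,
    mask_id: int,
) -> Tuple[List[List[int]], List[List[int]]]:
    """Hypothetical helper: build draft-block token ids and position ids.

    Valid blocks are [anchor_token, MASK, MASK, ...]; invalid blocks
    (keep_mask False) are all MASK. Position ids are anchor + offset.
    """
    block_ids: List[List[int]] = []
    position_ids: List[List[int]] = []
    for anchor, keep in zip(anchors, keep_mask):
        if keep:
            # First slot holds the real anchor token, the rest are MASK.
            block = [input_ids[anchor]] + [mask_id] * (block_size - 1)
        else:
            # Invalid block: every slot is MASK.
            block = [mask_id] * block_size
        block_ids.append(block)
        # Absolute positions of the block inside the original sequence.
        position_ids.append([anchor + offset for offset in range(block_size)])
    return block_ids, position_ids


ids, pos = build_blocks(
    input_ids=[10, 11, 12, 13, 14, 15],
    anchors=[1, 0],
    keep_mask=[True, False],
    block_size=3,
    mask_id=99,  # assumed MASK token id, for illustration only
)
print(ids)
print(pos)
```

Because every block carries its own absolute positions, all blocks can be processed in one parallel forward pass even though they start at different points in the sequence.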
Randomly sample anchor positions per sample; returns (anchors, keep_mask).
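A rough sketch of what sampling anchors with a keep mask might look like for a single sequence follows. The function name, the padding value `0` for unused anchor slots, and the validity rule (an anchor needs `block_size` tokens after it) are assumptions made for illustration; the sketch also treats every token as supervised, which the real module does not necessarily do.

```python
import random
from typing import List, Tuple


def sample_anchors(
    seq_len: int,
    block_size: int,
    num_anchors: int,
    rng: random.Random,
) -> Tuple[List[int], List[bool]]:
    """Hypothetical helper: pick up to num_anchors positions for one sequence.

    Assumes an anchor at position a is valid when a + block_size < seq_len,
    i.e. the anchor plus a full block fits inside the sequence.
    """
    last_valid = seq_len - block_size - 1
    candidates = list(range(last_valid + 1)) if last_valid >= 0 else []
    k = min(num_anchors, len(candidates))
    chosen = sorted(rng.sample(candidates, k))
    # Pad to a fixed length so every sequence yields the same shape;
    # keep_mask marks which slots hold real anchors.
    anchors = chosen + [0] * (num_anchors - k)
    keep_mask = [True] * k + [False] * (num_anchors - k)
    return anchors, keep_mask


rng = random.Random(0)
print(sample_anchors(seq_len=10, block_size=4, num_anchors=3, rng=rng))
print(sample_anchors(seq_len=4, block_size=4, num_anchors=3, rng=rng))
```

Returning a fixed-length anchor list together with a mask keeps shapes uniform across a batch; slots with `keep_mask` set to `False` correspond to the invalid, all-MASK blocks.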
Parallel block-wise training forward pass.
Bases: ValueError
Raised when a batch has no sample long enough to form a DFlash block.
A DFlash anchor needs at least block_size + 1 supervised tokens (the
anchor plus its block). Datasets routinely contain some short conversations,
so the training loop catches this exception and skips the offending
micro-batch rather than aborting the run.
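The catch-and-skip pattern can be sketched as below. Everything here is illustrative: `NoValidAnchorError` is a hypothetical stand-in for the exception documented above (whose actual name is not shown on this page), and `training_step` is a placeholder that only checks sequence lengths instead of running a model.

```python
from typing import List, Tuple


class NoValidAnchorError(ValueError):
    """Hypothetical stand-in for the exception described above."""


def training_step(batch: List[List[int]], block_size: int) -> float:
    """Placeholder step: fails if no sequence can host an anchor plus its block."""
    if all(len(seq) < block_size + 1 for seq in batch):
        raise NoValidAnchorError("no sample long enough to form a block")
    return float(len(batch))  # dummy value standing in for a loss


def run(batches: List[List[List[int]]], block_size: int) -> Tuple[List[float], int]:
    """Run all micro-batches, skipping those that raise NoValidAnchorError."""
    losses: List[float] = []
    skipped = 0
    for batch in batches:
        try:
            losses.append(training_step(batch, block_size))
        except NoValidAnchorError:
            # Skip only this micro-batch; any other error still propagates.
            skipped += 1
    return losses, skipped


batches = [
    [[1] * 8, [1] * 2],  # one long sequence is enough
    [[1] * 3],           # too short for block_size + 1 tokens
    [[1] * 6],
]
print(run(batches, block_size=4))
```

Subclassing `ValueError` with a dedicated type lets the loop catch this one recoverable condition without swallowing unrelated `ValueError`s raised elsewhere in the step.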