nemo_automodel.components.speculative.dflash
DFlash speculative-decoding training components.
DFlash drafts a whole block of tokens in parallel via MASK-token “denoising”
conditioned on the target model’s hidden states, in contrast to EAGLE’s
autoregressive single-step drafting. See
nemo_automodel.components.speculative.dflash.core for the training wrapper.
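The contrast between block-parallel and single-step drafting can be illustrated with a toy sketch. Everything below is hypothetical and is not part of the nemo_automodel API: `ToyDraftModel`, `MASK_ID`, `draft_block_parallel`, and `draft_autoregressive` are names invented for illustration, the "model" is a simple counting rule rather than a neural network, and the conditioning on the target model's hidden states is omitted entirely. The sketch only shows that filling a block of MASK positions takes one forward pass, while single-step drafting takes one pass per token.

```python
from typing import List

VOCAB_SIZE = 100
MASK_ID = VOCAB_SIZE  # hypothetical mask-token id, outside the toy vocabulary


class ToyDraftModel:
    """Toy stand-in for a draft model (assumption: not the real DFlash draft).

    Each masked position is predicted from the nearest preceding real token
    and its distance to that token, so every masked position can be filled
    in the same forward pass.
    """

    def __init__(self) -> None:
        self.forward_calls = 0

    def forward(self, tokens: List[int]) -> List[int]:
        self.forward_calls += 1
        preds: List[int] = []
        last_real, distance = 0, 0
        for tok in tokens:
            if tok == MASK_ID:
                distance += 1
                preds.append((last_real + distance) % VOCAB_SIZE)
            else:
                last_real, distance = tok, 0
                preds.append(tok)
        return preds


def draft_block_parallel(model: ToyDraftModel, context: List[int], block_size: int) -> List[int]:
    """Append block_size MASK tokens and fill them all in one forward pass."""
    preds = model.forward(context + [MASK_ID] * block_size)
    return preds[len(context):]


def draft_autoregressive(model: ToyDraftModel, context: List[int], block_size: int) -> List[int]:
    """Draft one token per forward pass, feeding each draft back as input."""
    tokens = list(context)
    for _ in range(block_size):
        preds = model.forward(tokens + [MASK_ID])
        tokens.append(preds[-1])
    return tokens[len(context):]


context = [3, 4, 5]

parallel_model = ToyDraftModel()
parallel_draft = draft_block_parallel(parallel_model, context, block_size=4)

stepwise_model = ToyDraftModel()
stepwise_draft = draft_autoregressive(stepwise_model, context, block_size=4)

print(parallel_draft, parallel_model.forward_calls)
print(stepwise_draft, stepwise_model.forward_calls)
```

In this toy setting both strategies produce the same four draft tokens, but the block-parallel version calls `forward` once and the single-step version calls it four times. A real draft model would not generally agree across the two strategies; the counting rule is chosen only to keep the example checkable.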
Submodules
nemo_automodel.components.speculative.dflash.core
nemo_automodel.components.speculative.dflash.draft_qwen3
nemo_automodel.components.speculative.dflash.registry
nemo_automodel.components.speculative.dflash.target