nemo_automodel.components.speculative.dspark.markov_head
Module Contents
Classes
Functions
Data
API
Bases: VanillaMarkov
RNN-based head that maintains recurrent state across positions within a block.
Unlike the memoryless Markov heads, this head lets position k access the full prefix history x_{<k} through a GRU-like recurrent state.
Single RNN step.
Parameters:
[*, r] previous recurrent state
[*, r] W1[x_{k-1}]
[*, d] backbone hidden at step k
Returns: torch.Tensor
[*, r]
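The step above can be illustrated with a minimal sketch of a GRU-style update that matches the documented shapes. This is not the library's implementation: the class name `GRULikeStep`, the layers `proj_hidden`, `gate`, and `cand`, and the exact gating formula are all assumptions made for illustration. Only the input and output shapes come from the description above.

```python
import torch
import torch.nn as nn


class GRULikeStep(nn.Module):
    """Hypothetical GRU-style update; illustrative only, not the library code."""

    def __init__(self, r: int, d: int):
        super().__init__()
        # Assumed: project the backbone hidden state from d down to r dims.
        self.proj_hidden = nn.Linear(d, r)
        # Assumed: gates and candidate see [state, token embedding, projected hidden].
        self.gate = nn.Linear(3 * r, 2 * r)
        self.cand = nn.Linear(3 * r, r)

    def forward(self, state, token_emb, hidden):
        # state:     [*, r] previous recurrent state
        # token_emb: [*, r] embedding of the previous token, W1[x_{k-1}]
        # hidden:    [*, d] backbone hidden at step k
        h = self.proj_hidden(hidden)  # [*, r]
        gates = self.gate(torch.cat([state, token_emb, h], dim=-1))
        z, g = gates.chunk(2, dim=-1)
        z, g = torch.sigmoid(z), torch.sigmoid(g)  # update gate, reset gate
        cand = torch.tanh(self.cand(torch.cat([g * state, token_emb, h], dim=-1)))
        # Convex mix of the old state and the candidate state.
        return (1 - z) * state + z * cand  # [*, r]
```

Because every layer acts on the last dimension only, the same module works for any leading batch shape, which is what the `[*, r]` notation suggests.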
Apply RNN bias during training (teacher-forced, unrolled over block_size).
Parameters:
[B, num_blocks, block_size, V]
[B, num_blocks, block_size] - previous token IDs for each position
[B, num_blocks, block_size, d]
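As a rough sketch of what a teacher-forced unroll over `block_size` can look like, the example below steps a recurrent cell across the positions of each block and adds a state-dependent bias to the logits. Several things here are assumptions rather than documented behavior: the function name `apply_rnn_bias_teacher_forced`, the use of `nn.GRUCell` as the recurrent step, the `embed` and `out_proj` layers, the zero initial state at the start of each block, and the return shape (taken to match the input logits).

```python
import torch
import torch.nn as nn


def apply_rnn_bias_teacher_forced(logits, prev_ids, hidden, embed, cell, out_proj):
    """Hypothetical teacher-forced unroll; illustrative only.

    logits:   [B, num_blocks, block_size, V]
    prev_ids: [B, num_blocks, block_size] previous token IDs
    hidden:   [B, num_blocks, block_size, d]
    """
    B, nb, bs, V = logits.shape
    # Assumed: the state starts at zero for every block and never crosses blocks.
    state = logits.new_zeros(B * nb, cell.hidden_size)
    biased = []
    for k in range(bs):
        # Ground-truth previous token (teacher forcing) plus backbone hidden.
        x = torch.cat([embed(prev_ids[:, :, k]), hidden[:, :, k]], dim=-1)
        state = cell(x.reshape(B * nb, -1), state)  # [B * nb, r]
        bias = out_proj(state).view(B, nb, V)       # [B, nb, V]
        biased.append(logits[:, :, k] + bias)
    return torch.stack(biased, dim=2)               # [B, nb, bs, V]
```

All blocks are processed in parallel by folding `num_blocks` into the batch dimension; only the loop over `block_size` is sequential, so the bias at position k depends on previous tokens up to position k and nothing later.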
Stateless single-step bias (state initialized to zero).
This method is provided for compatibility only and does NOT carry state across steps. For full RNN behavior, use apply_block_logits or sample_block_tokens.
Autoregressive sampling with RNN state.
Parameters:
[batch, proposal_len, vocab]
[batch] - token preceding the first draft position
[batch, proposal_len, d]
sampling temperature
Returns: torch.Tensor
[batch, proposal_len]
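The sampling loop described above might be sketched as follows. Unlike the teacher-forced case, each step feeds the token it just sampled back into the recurrence. The function name `sample_block_sketch`, the `embed`, `cell`, and `out_proj` arguments, the zero initial state, and the convention that a non-positive temperature means greedy decoding are all assumptions for illustration, not confirmed library behavior.

```python
import torch
import torch.nn as nn


@torch.no_grad()
def sample_block_sketch(logits, first_prev, hidden, embed, cell, out_proj,
                        temperature=1.0):
    """Hypothetical autoregressive sampler with recurrent state; illustrative only.

    logits:     [batch, proposal_len, vocab]
    first_prev: [batch] token preceding the first draft position
    hidden:     [batch, proposal_len, d]
    """
    batch, proposal_len, _ = logits.shape
    state = logits.new_zeros(batch, cell.hidden_size)  # assumed zero init
    prev = first_prev
    tokens = []
    for k in range(proposal_len):
        # Advance the state using the token chosen at the previous step.
        x = torch.cat([embed(prev), hidden[:, k]], dim=-1)
        state = cell(x, state)
        step_logits = logits[:, k] + out_proj(state)   # [batch, vocab]
        if temperature <= 0:
            # Assumed convention: non-positive temperature means argmax.
            prev = step_logits.argmax(dim=-1)
        else:
            probs = torch.softmax(step_logits / temperature, dim=-1)
            prev = torch.multinomial(probs, num_samples=1).squeeze(-1)
        tokens.append(prev)
    return torch.stack(tokens, dim=1)                  # [batch, proposal_len]
```

The loop is inherently sequential because the input at step k is the output of step k-1, which is why this path cannot be batched over positions the way the training unroll batches over blocks.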
Bases: Module