core.inference.contexts.base_context
Module Contents
Classes
BaseInferenceContext | Base class for inference contexts.
API
- class core.inference.contexts.base_context.BaseInferenceContext(materialize_only_last_token_logits: bool)
Bases: abc.ABC
Base class for inference contexts.
Currently extended by StaticInferenceContext and DynamicInferenceContext. Extend this class for any future context types.
Initialization
- Parameters:
materialize_only_last_token_logits (bool) – If True, only the last-token logits are extracted during decode.
- abstractmethod is_static_batching() -> bool
Return True if the context uses static batching.
- is_dynamic_batching() -> bool
Return True if the context uses dynamic batching.
- increment_sequence_len_offset(increment: int) -> None
Update sequence length offset. No-op for dynamic batching.
- increment_batch_size_offset(increment: int) -> None
Update batch size offset. No-op for dynamic batching.
- reset_batch_size_offset() -> None
Reset batch size offset to 0. No-op for dynamic batching.
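The interface documented above can be illustrated with a minimal, self-contained sketch. It does not import the real module. `SketchBaseContext`, `SketchStaticContext`, and `SketchDynamicContext` are hypothetical stand-ins that mirror the documented method signatures, and the attribute names `sequence_len_offset` and `batch_size_offset`, as well as the choice to derive `is_dynamic_batching()` from `is_static_batching()`, are assumptions rather than the library's actual implementation.

```python
import abc


class SketchBaseContext(abc.ABC):
    """Hypothetical stand-in that mirrors the documented interface."""

    def __init__(self, materialize_only_last_token_logits: bool):
        self.materialize_only_last_token_logits = materialize_only_last_token_logits
        # Assumed attribute names; the documentation does not name them.
        self.sequence_len_offset = 0
        self.batch_size_offset = 0

    @abc.abstractmethod
    def is_static_batching(self) -> bool:
        """Return True if the context uses static batching."""

    def is_dynamic_batching(self) -> bool:
        # Assumption: the two batching modes are mutually exclusive.
        return not self.is_static_batching()

    def increment_sequence_len_offset(self, increment: int) -> None:
        # Documented as a no-op for dynamic batching.
        if self.is_static_batching():
            self.sequence_len_offset += increment

    def increment_batch_size_offset(self, increment: int) -> None:
        # Documented as a no-op for dynamic batching.
        if self.is_static_batching():
            self.batch_size_offset += increment

    def reset_batch_size_offset(self) -> None:
        # Documented as a no-op for dynamic batching.
        if self.is_static_batching():
            self.batch_size_offset = 0


class SketchStaticContext(SketchBaseContext):
    def is_static_batching(self) -> bool:
        return True


class SketchDynamicContext(SketchBaseContext):
    def is_static_batching(self) -> bool:
        return False


static_ctx = SketchStaticContext(materialize_only_last_token_logits=True)
static_ctx.increment_sequence_len_offset(4)
static_ctx.increment_batch_size_offset(2)
static_ctx.reset_batch_size_offset()

dynamic_ctx = SketchDynamicContext(materialize_only_last_token_logits=True)
dynamic_ctx.increment_sequence_len_offset(4)  # leaves the offset unchanged
```

Under these assumptions, a new context type only has to implement `is_static_batching()`. The offset methods then behave as the descriptions state: they update state for a static context and leave a dynamic context untouched.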