core.inference.contexts.base_context#

Module Contents#

Classes#

BaseInferenceContext

Base class for inference contexts.

API#

class core.inference.contexts.base_context.BaseInferenceContext(materialize_only_last_token_logits: bool)#

Bases: abc.ABC

Base class for inference contexts.

Currently extended by StaticInferenceContext and DynamicInferenceContext. Extend this class for any future context types.

Initialization

Parameters:

materialize_only_last_token_logits (bool) – If True, only the last-token logits are extracted during decode.

abstractmethod is_static_batching() bool#

Return True if context uses static batching.

is_dynamic_batching() bool#

Return True if context uses dynamic batching.
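The relationship between the two batching predicates can be illustrated with a minimal, self-contained sketch. It does not import the library: SketchInferenceContext, SketchStaticContext, and SketchDynamicContext are hypothetical stand-ins that only mirror the documented signatures, and the body of is_dynamic_batching (the negation of is_static_batching) is an assumption rather than the confirmed implementation.

```python
import abc


class SketchInferenceContext(abc.ABC):
    """Hypothetical stand-in mirroring the documented interface."""

    def __init__(self, materialize_only_last_token_logits: bool):
        self.materialize_only_last_token_logits = materialize_only_last_token_logits

    @abc.abstractmethod
    def is_static_batching(self) -> bool:
        """Return True if context uses static batching."""

    def is_dynamic_batching(self) -> bool:
        # Assumption: dynamic batching is the complement of static batching.
        return not self.is_static_batching()


class SketchStaticContext(SketchInferenceContext):
    def is_static_batching(self) -> bool:
        return True


class SketchDynamicContext(SketchInferenceContext):
    def is_static_batching(self) -> bool:
        return False


static_ctx = SketchStaticContext(materialize_only_last_token_logits=True)
dynamic_ctx = SketchDynamicContext(materialize_only_last_token_logits=True)

# The base class declares an abstract method, so it cannot be instantiated.
try:
    SketchInferenceContext(materialize_only_last_token_logits=True)
    base_is_instantiable = True
except TypeError:
    base_is_instantiable = False
```

Under this assumption, a subclass only has to implement is_static_batching; the dynamic predicate follows from it.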

increment_sequence_len_offset(increment: int) None#

Update sequence length offset. No-op for dynamic batching.

increment_batch_size_offset(increment: int) None#

Update batch size offset. No-op for dynamic batching.

reset_batch_size_offset() None#

Reset batch size offset to 0. No-op for dynamic batching.
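The "no-op for dynamic batching" behavior of the three offset methods can be sketched as follows. This is an illustrative stand-in, not the library class: SketchContext and its subclasses are hypothetical, and the attribute names sequence_len_offset and batch_size_offset are assumptions chosen to match the method names.

```python
import abc


class SketchContext(abc.ABC):
    """Hypothetical stand-in; attribute names below are assumptions."""

    def __init__(self, materialize_only_last_token_logits: bool):
        self.materialize_only_last_token_logits = materialize_only_last_token_logits
        self.sequence_len_offset = 0  # assumed attribute name
        self.batch_size_offset = 0    # assumed attribute name

    @abc.abstractmethod
    def is_static_batching(self) -> bool:
        """Return True if context uses static batching."""

    def increment_sequence_len_offset(self, increment: int) -> None:
        # Only static batching tracks this offset; otherwise do nothing.
        if self.is_static_batching():
            self.sequence_len_offset += increment

    def increment_batch_size_offset(self, increment: int) -> None:
        if self.is_static_batching():
            self.batch_size_offset += increment

    def reset_batch_size_offset(self) -> None:
        if self.is_static_batching():
            self.batch_size_offset = 0


class SketchStatic(SketchContext):
    def is_static_batching(self) -> bool:
        return True


class SketchDynamic(SketchContext):
    def is_static_batching(self) -> bool:
        return False


static_ctx = SketchStatic(materialize_only_last_token_logits=False)
dynamic_ctx = SketchDynamic(materialize_only_last_token_logits=False)

for ctx in (static_ctx, dynamic_ctx):
    ctx.increment_sequence_len_offset(4)
    ctx.increment_batch_size_offset(2)

# Snapshot the batch offsets before resetting them.
static_batch_before_reset = static_ctx.batch_size_offset
dynamic_batch_before_reset = dynamic_ctx.batch_size_offset

for ctx in (static_ctx, dynamic_ctx):
    ctx.reset_batch_size_offset()
```

In this sketch the static context accumulates and resets its offsets, while the same calls leave the dynamic context untouched, so callers can invoke the methods without first checking the batching mode.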