core.inference.contexts.base_context#

Module Contents#

Classes#

BaseInferenceContext

Base class for inference contexts.

API#

class core.inference.contexts.base_context.BaseInferenceContext(
inference_config: megatron.core.inference.config.InferenceConfig,
)#

Bases: abc.ABC

Base class for inference contexts.

Currently extended by StaticInferenceContext and DynamicInferenceContext. Extend this class for any future context types.
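As a rough illustration of the extension pattern described above, the sketch below uses a local stand-in class rather than the real Megatron class, so it runs without Megatron installed. The names `BaseInferenceContextSketch` and `PagedContextSketch` are hypothetical, and two details are assumptions rather than documented behavior: that the base class stores the config, and that `is_dynamic_batching()` is the negation of `is_static_batching()`.

```python
import abc


class BaseInferenceContextSketch(abc.ABC):
    """Local stand-in that mirrors the documented interface (not the Megatron class)."""

    def __init__(self, inference_config):
        # Assumption: the base class keeps a reference to the config object.
        self.config = inference_config

    @abc.abstractmethod
    def is_static_batching(self) -> bool:
        """Return True if context uses static batching."""

    def is_dynamic_batching(self) -> bool:
        # Assumption: dynamic batching is treated as the complement of static batching.
        return not self.is_static_batching()


class PagedContextSketch(BaseInferenceContextSketch):
    """Hypothetical future context type; it only has to implement the abstract method."""

    def is_static_batching(self) -> bool:
        return False


# A real context would receive an InferenceConfig; None is a placeholder here.
ctx = PagedContextSketch(inference_config=None)
print(ctx.is_static_batching(), ctx.is_dynamic_batching())
```

Because `is_static_batching()` is abstract, instantiating the stand-in base class directly raises `TypeError`; a subclass has to implement it before it can be constructed.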

Initialization

Args:
inference_config (InferenceConfig): Configuration object for the inference context.

abstractmethod is_static_batching() → bool#

Return True if context uses static batching.

is_dynamic_batching() → bool#

Return True if context uses dynamic batching.

increment_sequence_len_offset(increment: int) → None#

Update sequence length offset. No-op for dynamic batching.

increment_batch_size_offset(increment: int) → None#

Update batch size offset. No-op for dynamic batching.

reset_batch_size_offset() → None#

Reset batch size offset to 0. No-op for dynamic batching.
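The "no-op for dynamic batching" behavior of the three offset methods can be pictured with a small stand-in. This is only a sketch under stated assumptions, not the Megatron implementation: the class names are hypothetical, and the attribute names `sequence_len_offset` and `batch_size_offset` are assumed for illustration.

```python
import abc


class OffsetContextSketch(abc.ABC):
    """Local stand-in showing offset updates that are skipped for dynamic batching."""

    def __init__(self):
        # Assumed attribute names; both offsets start at 0.
        self.sequence_len_offset = 0
        self.batch_size_offset = 0

    @abc.abstractmethod
    def is_static_batching(self) -> bool:
        """Return True if context uses static batching."""

    def increment_sequence_len_offset(self, increment: int) -> None:
        # Only static batching tracks the offset; otherwise do nothing.
        if self.is_static_batching():
            self.sequence_len_offset += increment

    def increment_batch_size_offset(self, increment: int) -> None:
        if self.is_static_batching():
            self.batch_size_offset += increment

    def reset_batch_size_offset(self) -> None:
        if self.is_static_batching():
            self.batch_size_offset = 0


class StaticSketch(OffsetContextSketch):
    def is_static_batching(self) -> bool:
        return True


class DynamicSketch(OffsetContextSketch):
    def is_static_batching(self) -> bool:
        return False


static_ctx = StaticSketch()
static_ctx.increment_sequence_len_offset(8)
static_ctx.increment_batch_size_offset(2)
static_offsets_before_reset = (static_ctx.sequence_len_offset, static_ctx.batch_size_offset)
static_ctx.reset_batch_size_offset()

dynamic_ctx = DynamicSketch()
dynamic_ctx.increment_sequence_len_offset(8)
dynamic_ctx.increment_batch_size_offset(2)
dynamic_ctx.reset_batch_size_offset()
```

In this sketch the static context ends with a sequence length offset of 8 and a batch size offset reset to 0, while the dynamic context's offsets are never modified.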