nemo_automodel.components.speculative.eagle.remote.protocol

Shared wire protocol between the remote target server and client.

The control plane is HTTP: the client POSTs input_ids to the server. In the NCCL data path, the HTTP response carries only tensor metadata (dtype and shape) as JSON, so the client knows what to receive; the tensors themselves arrive over NCCL. In the fallback path, the response body is instead the binary blob in the wire module's format.

Module Contents

Functions

| Name | Description |
| --- | --- |
| decode_nccl_metadata | Decode the JSON metadata body into (keys_order, metadata). |
| dtype_from_code | Map a wire dtype code back to a torch.dtype. |
| encode_nccl_metadata | Encode tensor metadata (dtype code + shape) as a JSON HTTP body. |

Data

EP_DISCONNECT

EP_GENERATE

EP_HEALTH

EP_HEARTBEAT

EP_INIT_NCCL

EP_INPUT_EMBEDDINGS

EP_MODEL_INFO

EP_SET_VOCAB_MAPPING

NCCL_HEADER

SUPERVISION_KEYS

API

nemo_automodel.components.speculative.eagle.remote.protocol.decode_nccl_metadata(
raw: bytes
) -> tuple[list[str], dict[str, typing.Optional[dict]]]

Decode the JSON metadata body into (keys_order, metadata).
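
The JSON layout of the metadata body is not specified on this page. As a minimal sketch of the decode step, the example below assumes a body with a "keys_order" list and a "metadata" object that maps each key to either null or a {"dtype", "shape"} entry. The function name decode_metadata_sketch and both field names are hypothetical, chosen only to mirror the documented return type.

```python
import json
from typing import Optional


def decode_metadata_sketch(raw: bytes) -> tuple[list[str], dict[str, Optional[dict]]]:
    """Hypothetical decoder; the JSON field names are assumptions, not the real wire format."""
    body = json.loads(raw.decode("utf-8"))
    keys_order = list(body["keys_order"])
    # A missing or null entry means "no tensor for this key", matching Optional[dict].
    metadata = {key: body["metadata"].get(key) for key in keys_order}
    return keys_order, metadata


# Example body as a server might send it under the assumed layout.
raw = json.dumps(
    {
        "keys_order": ["input_ids", "loss_mask"],
        "metadata": {"input_ids": {"dtype": 3, "shape": [2, 8]}, "loss_mask": None},
    }
).encode("utf-8")

keys_order, metadata = decode_metadata_sketch(raw)
```

Returning keys_order separately from the dict keeps the receive order explicit, which matters when the tensors are then received one at a time in a fixed sequence.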

nemo_automodel.components.speculative.eagle.remote.protocol.dtype_from_code(
code: int
) -> torch.dtype

Map a wire dtype code back to a torch.dtype.
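
The actual code table is not listed on this page. The sketch below illustrates the idea with a hypothetical table that maps integer codes to dtype names; the codes, the table, and the function name dtype_name_from_code_sketch are all assumptions. It returns names rather than torch.dtype objects so that it runs without PyTorch installed.

```python
# Hypothetical code table; the real integer codes are defined elsewhere in the package.
_CODE_TO_DTYPE_NAME = {
    0: "float32",
    1: "float16",
    2: "bfloat16",
    3: "int64",
    4: "bool",
}


def dtype_name_from_code_sketch(code: int) -> str:
    """Map an assumed wire dtype code to a dtype name, rejecting unknown codes."""
    try:
        return _CODE_TO_DTYPE_NAME[code]
    except KeyError:
        raise ValueError(f"unknown dtype code: {code}") from None


name = dtype_name_from_code_sketch(2)
```

With PyTorch available, getattr(torch, name) would turn such a name into the corresponding torch.dtype.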

nemo_automodel.components.speculative.eagle.remote.protocol.encode_nccl_metadata(
tensor_dict: dict[str, typing.Optional[torch.Tensor]],
keys_order: list[str]
) -> bytes

Encode tensor metadata (dtype code + shape) as a JSON HTTP body.

Only metadata is encoded — no tensor data. The client uses it to allocate the receive buffers before the NCCL recv.
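
To illustrate metadata-only encoding, here is a minimal sketch under stated assumptions: FakeTensor is a stand-in for torch.Tensor so the example runs without PyTorch, and the JSON field names ("keys_order", "metadata", "dtype", "shape") are hypothetical rather than the real wire layout. Only the dtype code and shape are serialized; no element data is touched.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class FakeTensor:
    """Stand-in for torch.Tensor (assumption) carrying only what the metadata needs."""

    dtype_code: int
    shape: tuple[int, ...]


def encode_metadata_sketch(
    tensor_dict: dict[str, Optional[FakeTensor]],
    keys_order: list[str],
) -> bytes:
    """Hypothetical encoder: emits dtype code and shape per key, never tensor data."""
    metadata = {}
    for key in keys_order:
        tensor = tensor_dict.get(key)
        if tensor is None:
            metadata[key] = None  # the receiver allocates no buffer for this key
        else:
            metadata[key] = {"dtype": tensor.dtype_code, "shape": list(tensor.shape)}
    return json.dumps({"keys_order": keys_order, "metadata": metadata}).encode("utf-8")


body = encode_metadata_sketch(
    {"input_ids": FakeTensor(3, (2, 8)), "loss_mask": None},
    ["input_ids", "loss_mask"],
)
decoded = json.loads(body)
```

On the receiving side, each non-null entry gives enough information to allocate an empty buffer of the right dtype and shape before the matching NCCL receive.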

nemo_automodel.components.speculative.eagle.remote.protocol.EP_DISCONNECT = 'disconnect'
nemo_automodel.components.speculative.eagle.remote.protocol.EP_GENERATE = 'generate'
nemo_automodel.components.speculative.eagle.remote.protocol.EP_HEALTH = 'health'
nemo_automodel.components.speculative.eagle.remote.protocol.EP_HEARTBEAT = 'heartbeat'
nemo_automodel.components.speculative.eagle.remote.protocol.EP_INIT_NCCL = 'init_nccl'
nemo_automodel.components.speculative.eagle.remote.protocol.EP_INPUT_EMBEDDINGS = 'input_embeddings'
nemo_automodel.components.speculative.eagle.remote.protocol.EP_MODEL_INFO = 'model_info'
nemo_automodel.components.speculative.eagle.remote.protocol.EP_SET_VOCAB_MAPPING = 'set_vocab_mapping'
nemo_automodel.components.speculative.eagle.remote.protocol.NCCL_HEADER = 'X-NeMo-NCCL'
nemo_automodel.components.speculative.eagle.remote.protocol.SUPERVISION_KEYS = ['aux_hidden_states', 'target_probs', 'position_mask', 'input_ids', 'loss_mask']
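
The endpoint constants are bare path segments, and SUPERVISION_KEYS lists the tensors a response is expected to carry. As a rough sketch of how a client might use them, the example below copies two constant values from this page and adds two hypothetical helpers, endpoint_url and missing_supervision_keys; the base URL is a placeholder, and joining with a single slash is an assumption about how the server routes requests.

```python
# Values copied from the constants documented above.
EP_GENERATE = "generate"
SUPERVISION_KEYS = ["aux_hidden_states", "target_probs", "position_mask", "input_ids", "loss_mask"]


def endpoint_url(base_url: str, endpoint: str) -> str:
    """Hypothetical helper: join a base URL and an endpoint name with one slash."""
    return f"{base_url.rstrip('/')}/{endpoint}"


def missing_supervision_keys(received: dict) -> list[str]:
    """Hypothetical helper: list expected keys absent from a received mapping."""
    return [key for key in SUPERVISION_KEYS if key not in received]


# Placeholder host and port, for illustration only.
url = endpoint_url("http://localhost:8000/", EP_GENERATE)
missing = missing_supervision_keys({"input_ids": None, "loss_mask": None})
```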