nemo_automodel.components.speculative.eagle.remote.wire
Compact binary tensor serialization for the remote target data plane.
This is the fallback path used when NCCL GPU-to-GPU transfer is unavailable: tensors are encoded as dtype + shape + raw contiguous bytes and shipped inside the HTTP body. The format is little-endian and self-delimiting.
Format::

    [4B]            magic 0x4E4D4554 ("NMET")
    per entry:
      [4B]          key_len (uint32)
      [key_len B]   key UTF-8
      [1B]          flags bit0 = is_none
      if not none:
        [1B]          dtype_code (see _DTYPE_TABLE)
        [1B]          ndim
        [ndim x 8B]   shape (int64)
        [8B]          nbytes (uint64)
        [nbytes B]    data raw contiguous tensor bytes
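The layout above can be walked with nothing more than the standard-library struct module. The following is a minimal sketch, not the module's implementation: the names MAGIC, F32, pack_entry and unpack_entries are hypothetical, the dtype code 0 is a placeholder (the real codes live in _DTYPE_TABLE, which is not shown here), the magic is assumed to be stored as the four literal bytes "NMET", and entries are assumed to run to the end of the blob.

```python
import struct

MAGIC = b"NMET"  # assumption: magic stored as these four literal bytes
F32 = 0          # hypothetical dtype code; real values come from _DTYPE_TABLE


def pack_entry(key, payload=None, dtype_code=F32, shape=()):
    """Pack one entry; payload=None yields a none entry (flags bit0 set)."""
    key_bytes = key.encode("utf-8")
    out = struct.pack("<I", len(key_bytes)) + key_bytes
    if payload is None:
        return out + struct.pack("<B", 1)           # flags only, nothing follows
    out += struct.pack("<BBB", 0, dtype_code, len(shape))  # flags, dtype, ndim
    out += struct.pack(f"<{len(shape)}q", *shape)   # int64 per dimension
    out += struct.pack("<Q", len(payload)) + payload
    return out


def unpack_entries(blob):
    """Walk a blob and return {key: None | (dtype_code, shape, payload)}."""
    if blob[:4] != MAGIC:
        raise ValueError("bad magic")
    pos, entries = 4, {}
    while pos < len(blob):                          # assumption: no entry count
        (key_len,) = struct.unpack_from("<I", blob, pos)
        pos += 4
        key = blob[pos:pos + key_len].decode("utf-8")
        pos += key_len
        flags = blob[pos]
        pos += 1
        if flags & 1:                               # is_none: entry ends here
            entries[key] = None
            continue
        dtype_code, ndim = struct.unpack_from("<BB", blob, pos)
        pos += 2
        shape = struct.unpack_from(f"<{ndim}q", blob, pos)
        pos += 8 * ndim
        (nbytes,) = struct.unpack_from("<Q", blob, pos)
        pos += 8
        entries[key] = (dtype_code, shape, blob[pos:pos + nbytes])
        pos += nbytes
    return entries


values = struct.pack("<6f", 0.0, 1.0, 2.0, 3.0, 4.0, 5.0)  # six float32 values
blob = MAGIC + pack_entry("hidden", values, F32, (2, 3)) + pack_entry("mask")
decoded = unpack_entries(blob)
```

Because every variable-length field is preceded by its length, a reader never needs to scan for terminators, which is what makes the format self-delimiting. A none entry costs only key_len, the key and one flags byte.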
Module Contents
API
Decode a wire-format blob back into a dict of tensors on map_location.
Encode a dict of CPU tensors into the wire format.
None values are preserved. The caller is responsible for moving tensors to CPU first; CUDA tensors are rejected to keep the data path explicit.
Encode and return immutable bytes (HTTP body).