dynamo.nixl_connect.ReadableOperation

An operation which enables a remote worker to read data from the local worker.

To create the operation, provide a set of local Descriptor objects that reference the memory to be transferred to a remote worker. Once the operation is created, the memory referenced by those descriptors is immediately readable by any remote worker that holds the necessary metadata. The RDMA metadata (RdmaMetadata) required to access that memory is available via the operation's .metadata() method. Once acquired, the metadata must be delivered to the remote worker over a secondary channel, most likely HTTP or TCP+NATS.

Disposing of the object instructs the RDMA subsystem to cancel the operation. The operation should therefore be awaited until it completes, unless cancellation is intended.

Example Usage

    import torch
    import dynamo.nixl_connect

    async def send_data(
      self,
      local_tensor: torch.Tensor
    ) -> None:
      # Describe the local memory that the remote worker will be allowed to read.
      descriptor = dynamo.nixl_connect.Descriptor(local_tensor)

      with self.connector.create_readable(descriptor) as read_op:
        op_metadata = read_op.metadata()

        # Send the metadata to the remote worker via sideband communication.
        await self.notify_remote_data(op_metadata)
        # Wait for the remote worker to finish reading local_tensor.
        # In effect, this sends the data to the remote worker.
        await read_op.wait_for_completion()
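The reason the example awaits completion inside the `with` block can be illustrated with a small stand-in. The sketch below does not use the real library: `FakeReadableOperation`, its `cancelled` and `completed` flags, and `signal_completion()` are hypothetical names that only mimic the documented rule that disposal before completion cancels the operation.

```python
import asyncio


class FakeReadableOperation:
    """Hypothetical stand-in that mimics the documented lifecycle; not the real class."""

    def __init__(self) -> None:
        self._done = asyncio.Event()
        self.completed = False
        self.cancelled = False

    def __enter__(self) -> "FakeReadableOperation":
        return self

    def __exit__(self, exc_type, exc, tb) -> None:
        # Disposal before completion is treated as a cancellation.
        if not self.completed:
            self.cancelled = True

    def signal_completion(self) -> None:
        # Stands in for the completion signal sent by the remote worker.
        self.completed = True
        self._done.set()

    async def wait_for_completion(self) -> None:
        await self._done.wait()


async def disposed_early() -> FakeReadableOperation:
    with FakeReadableOperation() as op:
        pass  # leaves the block without awaiting completion
    return op


async def awaited() -> FakeReadableOperation:
    with FakeReadableOperation() as op:
        # Simulate the remote worker finishing its read shortly afterwards.
        asyncio.get_running_loop().call_later(0.01, op.signal_completion)
        await op.wait_for_completion()
    return op


early_op = asyncio.run(disposed_early())
awaited_op = asyncio.run(awaited())
print(early_op.cancelled, awaited_op.cancelled)  # True False
```

In the stand-in, leaving the block early marks the operation as cancelled, while awaiting `wait_for_completion()` first lets it finish before disposal.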

Methods

metadata

def metadata(self) -> RdmaMetadata:

Generates and returns the RDMA metadata (RdmaMetadata) required for a remote worker to read from the operation. Once acquired, the metadata needs to be provided to a remote worker via a secondary channel, most likely HTTP or TCP+NATS.

wait_for_completion

async def wait_for_completion(self) -> None:

Suspends the calling coroutine until the operation has received a completion signal from a remote worker. Because the method is a coroutine, it must be awaited.
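Since `wait_for_completion` is a coroutine, a caller may want to bound how long it waits for the remote worker. The following is a minimal sketch of that pattern using the standard `asyncio.wait_for`. `FakeOperation`, `signal_completion()`, and `wait_with_timeout()` are hypothetical helpers written for this illustration, and whether the real operation tolerates having its wait cancelled by a timeout is an assumption that should be verified against the library.

```python
import asyncio


class FakeOperation:
    """Hypothetical stand-in exposing only wait_for_completion()."""

    def __init__(self) -> None:
        self._done = asyncio.Event()

    def signal_completion(self) -> None:
        # Stands in for the completion signal sent by the remote worker.
        self._done.set()

    async def wait_for_completion(self) -> None:
        await self._done.wait()


async def wait_with_timeout(op, seconds: float) -> bool:
    """Return True if the operation completed within `seconds`, else False."""
    try:
        await asyncio.wait_for(op.wait_for_completion(), timeout=seconds)
        return True
    except asyncio.TimeoutError:
        return False


async def demo() -> tuple:
    # No remote worker ever signals completion, so the wait times out.
    never_read = FakeOperation()
    unread_result = await wait_with_timeout(never_read, 0.05)

    # The simulated remote worker signals completion after 10 ms.
    read_soon = FakeOperation()
    asyncio.get_running_loop().call_later(0.01, read_soon.signal_completion)
    read_result = await wait_with_timeout(read_soon, 1.0)
    return unread_result, read_result


unread_result, read_result = asyncio.run(demo())
print(unread_result, read_result)  # False True
```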

Properties

status

@property
def status(self) -> OperationStatus:

Returns the OperationStatus value that describes the current state of the operation.
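One way a caller might use a `status` property is to observe state changes without awaiting the operation directly. The sketch below is illustrative only: `FakeOperationStatus` and its two members are hypothetical and are not the members of the library's OperationStatus, and `FakeOperation` and `record_transitions()` are stand-ins written for this example.

```python
import asyncio
from enum import Enum, auto


class FakeOperationStatus(Enum):
    # Hypothetical members for illustration; consult OperationStatus for the real ones.
    IN_PROGRESS = auto()
    COMPLETE = auto()


class FakeOperation:
    """Hypothetical stand-in exposing a read-only status property."""

    def __init__(self) -> None:
        self._status = FakeOperationStatus.IN_PROGRESS

    @property
    def status(self) -> FakeOperationStatus:
        return self._status

    def signal_completion(self) -> None:
        # Stands in for the completion signal sent by the remote worker.
        self._status = FakeOperationStatus.COMPLETE


async def record_transitions(op, terminal, interval: float = 0.005) -> list:
    """Poll op.status until it equals `terminal`, recording each distinct state."""
    seen = [op.status]
    while seen[-1] is not terminal:
        await asyncio.sleep(interval)
        current = op.status
        if current is not seen[-1]:
            seen.append(current)
    return seen


async def demo() -> list:
    op = FakeOperation()
    asyncio.get_running_loop().call_later(0.02, op.signal_completion)
    return await record_transitions(op, FakeOperationStatus.COMPLETE)


transitions = asyncio.run(demo())
print([s.name for s in transitions])  # ['IN_PROGRESS', 'COMPLETE']
```

Polling suits logging or diagnostics. To simply wait for the transfer to finish, awaiting `wait_for_completion()` is the more direct choice.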