dynamo.nixl_connect.WritableOperation#

An operation which enables a remote worker to write data to the local worker.

To create the operation, provide a set of local Descriptor objects that reference the memory intended to receive data from a remote worker. Once the operation is created, that memory becomes immediately writable by any remote worker that holds the necessary metadata. The RDMA metadata (RdmaMetadata) required to access the memory is available via the operation's .metadata() method. Once acquired, the metadata must be delivered to the remote worker over a secondary channel, most likely HTTP or TCP+NATS.

Disposing of the object instructs the RDMA subsystem to cancel the operation. Await the operation until it completes unless cancellation is intended. Cancellation is handled asynchronously.
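The disposal rule above can be illustrated with a minimal sketch. It does not use the real library: StandInOperation is a hypothetical stand-in whose __exit__ only mimics the documented behavior (cancel on disposal if the operation has not completed), and complete() is an invented helper that simulates the remote worker's completion signal.

```python
import asyncio


class StandInOperation:
    """Hypothetical stand-in, NOT the real WritableOperation."""

    def __init__(self) -> None:
        self._done = asyncio.Event()
        self.cancelled = False

    def __enter__(self) -> "StandInOperation":
        return self

    def __exit__(self, exc_type, exc, tb) -> None:
        # Mimics the documented rule: disposal cancels an unfinished operation.
        if not self._done.is_set():
            self.cancelled = True

    def complete(self) -> None:
        # Invented helper: simulates the remote worker's completion signal.
        self._done.set()

    async def wait_for_completion(self) -> None:
        await self._done.wait()


async def awaited_before_exit() -> bool:
    with StandInOperation() as op:
        # Simulate the remote worker finishing shortly after the request.
        asyncio.get_running_loop().call_later(0.01, op.complete)
        await op.wait_for_completion()
    return op.cancelled


async def exited_without_awaiting() -> bool:
    with StandInOperation() as op:
        pass  # Leaving the block early disposes of the operation.
    return op.cancelled


async def main() -> tuple:
    return await awaited_before_exit(), await exited_without_awaiting()
```

In this sketch, only the function that leaves the with block without awaiting ends up with a cancelled operation, which is why the usage example awaits wait_for_completion() inside the block.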

Example Usage#

    async def recv_data(
      self,
      local_tensor: torch.Tensor
    ) -> None:
      descriptor = dynamo.nixl_connect.Descriptor(local_tensor)

      with self.connector.create_writable(descriptor) as write_op:
        op_metadata = write_op.metadata()

        # Send the metadata to the remote worker via sideband communication.
        await self.request_remote_data(op_metadata)
        # Wait for the remote worker to finish writing to local_tensor,
        # i.e. receive the data from the remote worker.
        await write_op.wait_for_completion()

Methods#

metadata#

def metadata(self) -> RdmaMetadata:

Generates and returns the RDMA metadata (RdmaMetadata) required for a remote worker to write to the operation. Once acquired, the metadata needs to be provided to a remote worker via a secondary channel, most likely HTTP or TCP+NATS.

wait_for_completion#

async def wait_for_completion(self) -> None:

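Suspends the caller until the operation has received a completion signal from a remote worker. Because the method is a coroutine, it must be awaited, and the event loop remains free to run other tasks in the meantime.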
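Because wait_for_completion() is an ordinary awaitable, the wait can be bounded with the standard library's asyncio.wait_for. The following is a sketch under assumptions, not part of the documented API: StandInOperation and wait_with_timeout are hypothetical names, and how the real operation reacts when its wait is cancelled by a timeout is not stated here.

```python
import asyncio


class StandInOperation:
    """Hypothetical stand-in exposing only wait_for_completion()."""

    def __init__(self, delay: float) -> None:
        self._delay = delay

    async def wait_for_completion(self) -> None:
        # Simulates waiting for the remote worker's completion signal.
        await asyncio.sleep(self._delay)


async def wait_with_timeout(op, timeout: float) -> bool:
    """Return True if the operation completed within `timeout` seconds."""
    try:
        await asyncio.wait_for(op.wait_for_completion(), timeout)
        return True
    except asyncio.TimeoutError:
        # The remote worker did not signal completion in time.
        return False
```

A caller might treat a False result as a cue to leave the with block, which per the description above cancels the operation.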

Properties#

status#

@property
def status(self) -> OperationStatus:

Returns the OperationStatus that describes the current state of the operation.
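One possible use of the status property is logging or branching on whether an operation has reached a terminal state. The sketch below uses a hypothetical StandInStatus enum; its member names are assumptions for illustration only, and the real OperationStatus members may differ.

```python
import enum


class StandInStatus(enum.Enum):
    """Hypothetical states; the real OperationStatus members may differ."""

    IN_PROGRESS = enum.auto()
    COMPLETE = enum.auto()
    CANCELLED = enum.auto()
    ERRORED = enum.auto()


# States after which the operation will make no further progress (assumed).
TERMINAL = {StandInStatus.COMPLETE, StandInStatus.CANCELLED, StandInStatus.ERRORED}


def describe(status: StandInStatus) -> str:
    """Build a log-friendly summary of an operation's state."""
    suffix = "finished" if status in TERMINAL else "still running"
    return f"{status.name.lower()} ({suffix})"
```

With the real library, the value passed to such a helper would come from write_op.status, read without awaiting since it is a property.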