core.inference.engines.async_zmq_communicator#

Module Contents#

Classes#

AsyncZMQCommunicator

An asyncio-friendly communicator abstraction using ZMQ. Can be used to implement collective operations like all-reduce, and bcast which are asyncio friendly on top of ZMQ sockets. Only to be used with small amounts of data (e.g., 1 integer) on the CPU.

API#

class core.inference.engines.async_zmq_communicator.AsyncZMQCommunicator(
zmq_context: zmq.Context,
process_group: torch.distributed.ProcessGroup,
)#

An asyncio-friendly communicator abstraction using ZMQ. Can be used to implement collective operations like all-reduce, and bcast which are asyncio friendly on top of ZMQ sockets. Only to be used with small amounts of data (e.g., 1 integer) on the CPU.

Initialization

Constructor for AsyncZMQCommunicator. Sets up ZMQ sockets for communication among ranks in the given process group.

Parameters:
  • zmq_context (zmq.Context) – ZMQ context to create sockets.

  • process_group (dist.ProcessGroup) – Process group for communication.

async all_reduce_max(local_val: int) int#

Asyncio friendly all reduce max operation. Gathers on rank 0, computes max, and broadcasts the result.

close()#

Close the ZMQ sockets.