nemo_rl.weight_sync.collective_weight_synchronizer#

NCCL collective weight synchronizer for non-colocated deployments.

Handles weight transfer between policy and generation workers running on separate GPU clusters using NCCL collective communication. The policy broadcasts its weights, and generation workers receive them via the established NCCL process group.

Lifecycle per sync:

  1. policy.broadcast_weights_for_collective() – send via NCCL generation.update_weights_from_collective() – receive via NCCL

  2. Verify transfer success

No offload/restore steps are needed since policy and generation run on separate GPUs with dedicated memory.

Module Contents#

Classes#

CollectiveWeightSynchronizer

Weight synchronizer using NCCL collectives for non-colocated deployments.

API#

class nemo_rl.weight_sync.collective_weight_synchronizer.CollectiveWeightSynchronizer(
policy: Any,
generation: Any,
train_cluster: Any,
inference_cluster: Any,
)#

Bases: nemo_rl.weight_sync.interfaces.WeightSynchronizer

Weight synchronizer using NCCL collectives for non-colocated deployments.

Policy and generation workers run on separate GPU clusters. Weights are synchronized via NCCL broadcast over a pre-established process group.

Parameters:
  • policy – Policy object implementing ColocatablePolicyInterface.

  • generation – Generation object implementing GenerationInterface.

  • train_cluster – RayVirtualCluster for the training workers, used to obtain the master address/port and world size for collective init.

  • inference_cluster – RayVirtualCluster for the inference workers.

Initialization

sync_weights(
*,
timer: Optional[nemo_rl.utils.timer.Timer] = None,
kv_scales: Optional[dict[str, float]] = None,
) None#
property is_stale: bool#
mark_stale() None#
init_communicator() None#
shutdown() None#