TorchProcessGroup
class nvmath.distributed.TorchProcessGroup()
ProcessGroup implemented on torch.distributed.
- Parameters:
device_id – Device used by the torch.distributed process group backend.
torch_process_group – torch.distributed process group handle (e.g. returned by torch.distributed.new_group()), or None to use the default torch process group.
Attributes
- MIN_ALL_REDUCE_OBJ_BUFFER_SIZE = 128
- allreduce_obj_buffer_size
Current buffer size for allreduce_object(), in bytes.
- device_id
Device used by the communication backend of this torch process group.
- nranks
- rank
Methods
- allreduce_buffer(array: ndarray, *, op: ReductionOp)
Allreduce an array.
- Parameters:
array – Input and output of the collective. The function operates in-place.
op – One of the values from the ReduceOp enum. Specifies the operation used for element-wise reduction.
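The in-place, element-wise behaviour described for allreduce_buffer can be illustrated with a small single-process model. This is only a sketch of the semantics, not the library's implementation: it does not call nvmath or torch, the name simulate_allreduce_buffer is hypothetical, plain Python lists stand in for the ndarray buffers, and a Python callable stands in for the reduction-op enum value.

```python
from typing import Callable, List


def simulate_allreduce_buffer(
    buffers: List[List[float]], op: Callable[[float, float], float]
) -> None:
    """Hypothetical model: overwrite every rank's buffer with the
    element-wise reduction of all ranks' buffers."""
    length = len(buffers[0])
    if any(len(buf) != length for buf in buffers):
        raise ValueError("all ranks must contribute buffers of the same length")

    # Fold the buffers element by element, starting from rank 0's copy.
    reduced = list(buffers[0])
    for buf in buffers[1:]:
        reduced = [op(x, y) for x, y in zip(reduced, buf)]

    # In place: each rank's input buffer is also its output buffer.
    for buf in buffers:
        buf[:] = reduced


# One list per simulated rank.
rank_buffers = [[1.0, 2.0], [10.0, 20.0], [100.0, 200.0]]
simulate_allreduce_buffer(rank_buffers, lambda x, y: x + y)
```

After the call, every simulated rank holds the same element-wise sum and nothing is returned, which mirrors the "input and output of the collective" wording above.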
- allreduce_object(obj: T, *, op: Callable[[T, T], T])
Reduces all Python objects contributed by members of the group. The result is a single reduced object, which is returned on every process.
- Parameters:
obj – Object contributed by this process.
op – A Python function that takes two objects and returns a single (reduced) object.
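The contract of allreduce_object (every process contributes one object, a two-argument function combines them, and every process receives the same result) can be modelled in a single process. The sketch below is an assumption-laden illustration rather than the library's code: simulate_allreduce_object and merge_max are hypothetical names, no nvmath or torch call is made, and the left-to-right fold order is chosen for the example only.

```python
from functools import reduce
from typing import Callable, List, TypeVar

T = TypeVar("T")


def simulate_allreduce_object(
    contributions: List[T], op: Callable[[T, T], T]
) -> List[T]:
    """Hypothetical model: combine one object per rank with `op`
    and return the same reduced object for every rank."""
    reduced = reduce(op, contributions)
    return [reduced for _ in contributions]


def merge_max(a: dict, b: dict) -> dict:
    """Example op: key-wise maximum of two dicts of non-negative sizes."""
    return {k: max(a.get(k, 0), b.get(k, 0)) for k in a.keys() | b.keys()}


# One contributed object per simulated rank.
per_rank = [
    {"workspace": 64},
    {"workspace": 256, "scratch": 8},
    {"workspace": 128},
]
results = simulate_allreduce_object(per_rank, merge_max)
```

Since the order in which a real implementation combines contributions is not stated here, an op that is associative and commutative (as merge_max is) is the safe choice: its result does not depend on that order.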