bridge.models.decorators.torchrun
Module Contents
Functions
torchrun_main | A decorator that wraps the main function of a torchrun script.
API
- bridge.models.decorators.torchrun.torchrun_main(fn)
A decorator that wraps the main function of a torchrun script. It applies the torch.distributed.elastic.multiprocessing.errors.record decorator to record any exceptions, and it ensures that the distributed process group is destroyed on successful completion. If an exception is raised, it prints the traceback and performs a hard exit so that torchrun can terminate all other processes.
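The behavior described above can be illustrated with a minimal sketch. This is not the library's actual implementation: the name torchrun_main_sketch, the ImportError fallback, the use of os._exit(1) as the "hard exit", and returning the wrapped function's result are all assumptions made for illustration.

```python
import functools
import os
import sys
import traceback

try:
    import torch.distributed as dist
    from torch.distributed.elastic.multiprocessing.errors import record
except ImportError:
    # Assumption: fallback so the sketch can be read and run without PyTorch.
    # A real torchrun script always has PyTorch available.
    dist = None

    def record(fn):
        return fn


def torchrun_main_sketch(fn):
    """Hypothetical stand-in for torchrun_main (illustration only)."""
    # `record` stores exception details for torchrun, then re-raises.
    recorded = record(fn)

    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        try:
            result = recorded(*args, **kwargs)
        except Exception:
            # Print the traceback, then exit immediately without running
            # cleanup handlers, so torchrun sees the failure and can
            # terminate the remaining ranks.
            traceback.print_exc()
            sys.stderr.flush()
            os._exit(1)
        # Success path: tear down the process group if one was created.
        if dist is not None and dist.is_available() and dist.is_initialized():
            dist.destroy_process_group()
        return result

    return wrapper


@torchrun_main_sketch
def main():
    # Training or inference logic would go here.
    return 42


if __name__ == "__main__":
    main()
```

A hard exit is used in the sketch instead of re-raising because a rank that hangs in a collective call during normal interpreter shutdown could otherwise keep the job alive; exiting at once lets the launcher notice the failure promptly.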