core.dist_checkpointing.strategies.nvrx#
Helpers for interacting with the experimental nvidia-resiliency-ext API.
Module Contents#
Functions#
has_nvrx_async_support | Checks whether the NVRx async checkpointing symbols Megatron uses are importable. |
make_nvrx_async_request | Builds an AsyncRequest using the expected NVRx API. |
API#
- core.dist_checkpointing.strategies.nvrx.has_nvrx_async_support() → bool#
Checks whether the NVRx async checkpointing symbols Megatron uses are importable.
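An importability check of this kind is commonly written as a guarded import followed by an attribute lookup. The sketch below is a generic illustration, not the Megatron implementation: the helper name `symbols_importable` is hypothetical, and the exact NVRx module path and symbol names that `has_nvrx_async_support` looks for are not stated here, so standard-library modules are used in the demonstration instead.

```python
import importlib
from typing import Iterable


def symbols_importable(module_name: str, names: Iterable[str]) -> bool:
    """Return True if `module_name` imports and exposes every name in `names`.

    Hypothetical helper for illustration only; the real check may differ.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # The module (or one of its own dependencies) is not installed.
        return False
    # Every requested symbol must exist as an attribute of the module.
    return all(hasattr(module, name) for name in names)


# Demonstration with the standard library rather than NVRx itself.
json_ok = symbols_importable("json", ["dumps", "loads"])
missing_symbol = symbols_importable("json", ["no_such_symbol"])
missing_module = symbols_importable("no_such_module_xyz_123", ["anything"])
```

A caller would typically branch on the result, falling back to a synchronous or built-in path when the optional dependency is unavailable.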
- core.dist_checkpointing.strategies.nvrx.make_nvrx_async_request(
- async_request_cls: type,
- async_fn: Callable[..., Any],
- async_fn_args: Any,
- finalize_fns: list[Callable[..., Any]],
- async_fn_kwargs: Dict[str, Any] | None = None,
- preload_fn: Callable[..., Any] | None = None,
- )
Builds an AsyncRequest using the expected NVRx API.
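To illustrate how the parameters above might map onto a request object, here is a minimal sketch under stated assumptions. `StandInAsyncRequest` is a hypothetical stand-in class, not the real NVRx `AsyncRequest`, and `build_request` is an illustrative helper that only mirrors the documented parameter list; how the actual function forwards its arguments is an assumption.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Optional


@dataclass
class StandInAsyncRequest:
    """Hypothetical stand-in; NOT the real NVRx AsyncRequest."""

    async_fn: Callable[..., Any]
    async_fn_args: Any
    finalize_fns: List[Callable[..., Any]]
    async_fn_kwargs: Dict[str, Any] = field(default_factory=dict)
    preload_fn: Optional[Callable[..., Any]] = None


def build_request(
    async_request_cls: type,
    async_fn: Callable[..., Any],
    async_fn_args: Any,
    finalize_fns: List[Callable[..., Any]],
    async_fn_kwargs: Optional[Dict[str, Any]] = None,
    preload_fn: Optional[Callable[..., Any]] = None,
) -> Any:
    """Illustrative builder: forward optional fields only when provided."""
    optional: Dict[str, Any] = {}
    if async_fn_kwargs is not None:
        optional["async_fn_kwargs"] = async_fn_kwargs
    if preload_fn is not None:
        optional["preload_fn"] = preload_fn
    return async_request_cls(async_fn, async_fn_args, finalize_fns, **optional)


def save_fn(path: str) -> str:
    # Placeholder for the work that would run asynchronously.
    return f"saved:{path}"


finalized: List[str] = []
request = build_request(
    StandInAsyncRequest,
    save_fn,
    ("ckpt.pt",),
    [lambda: finalized.append("done")],
)
```

Leaving out unset optional arguments lets the request class apply its own defaults, which is one way to stay tolerant of an API that is still experimental.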