core.full_cuda_graph#
Full iteration CUDA graph for training.
Module Contents#
Classes#
| StaticBufferLoader | Load data to static buffers. |
| FullCudaGraphWrapper | Wrapper class to enable FullIterationCUDAgraph. |
Functions#
| copy_tensors_in_struct | Copy src to new tensors. |
| clone_tensors_in_struct | Copy src to pre-existing tensors in tgt. |
Data#
| logger | |
API#
- core.full_cuda_graph.logger#
‘getLogger(…)’
- core.full_cuda_graph.copy_tensors_in_struct(src)#
Copy src to new tensors.
- core.full_cuda_graph.clone_tensors_in_struct(tgt, src)#
Copy src to pre-existing tensors in tgt.
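The difference between the two helpers is whether the destination tensors are newly allocated or reused. The following is a minimal sketch of that distinction, not the library's implementation: `FakeTensor` is a stand-in for `torch.Tensor` so the example runs without PyTorch, `copy_struct` and `clone_into_struct` are hypothetical names, and the assumption that a "struct" means nested dicts, lists, and tuples is not confirmed by this page.

```python
class FakeTensor:
    """Stand-in for torch.Tensor so the sketch runs without PyTorch."""

    def __init__(self, values):
        self.values = list(values)

    def clone(self):
        # Mirrors torch.Tensor.clone(): new storage, same contents.
        return FakeTensor(self.values)

    def copy_(self, other):
        # Mirrors torch.Tensor.copy_(): overwrite in place, keep identity.
        self.values[:] = other.values
        return self


def copy_struct(src):
    """Return a structure shaped like src whose tensors are fresh copies."""
    if isinstance(src, FakeTensor):
        return src.clone()
    if isinstance(src, dict):
        return {k: copy_struct(v) for k, v in src.items()}
    if isinstance(src, (list, tuple)):
        return type(src)(copy_struct(v) for v in src)
    return src  # non-tensor leaves are passed through unchanged


def clone_into_struct(tgt, src):
    """Copy tensor contents from src into the existing tensors of tgt."""
    if isinstance(tgt, FakeTensor):
        tgt.copy_(src)
    elif isinstance(tgt, dict):
        for k in tgt:
            clone_into_struct(tgt[k], src[k])
    elif isinstance(tgt, (list, tuple)):
        for t, s in zip(tgt, src):
            clone_into_struct(t, s)
```

Reusing the destination objects matters for CUDA graphs because a captured graph replays against fixed memory addresses, so new data has to be written into the same buffers rather than into fresh allocations.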
- class core.full_cuda_graph.StaticBufferLoader#
Load data to static buffers.
Initialization
- static_buffers: dict#
None
- __call__(inputs, stage, microbatch)#
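As an illustration of what loading into static buffers can look like, here is a small sketch under stated assumptions. `SketchBufferLoader` is a hypothetical class, `FakeTensor` stands in for `torch.Tensor`, and keying `static_buffers` by `(stage, microbatch)` is a guess at the layout; this page only documents that `static_buffers` is a dict and that the call takes `inputs`, `stage`, and `microbatch`.

```python
class FakeTensor:
    """Stand-in for torch.Tensor so the sketch runs without PyTorch."""

    def __init__(self, values):
        self.values = list(values)

    def clone(self):
        return FakeTensor(self.values)

    def copy_(self, other):
        self.values[:] = other.values
        return self


class SketchBufferLoader:
    """Hypothetical loader: one persistent buffer set per (stage, microbatch)."""

    def __init__(self):
        self.static_buffers = {}

    def __call__(self, inputs, stage, microbatch):
        key = (stage, microbatch)  # assumed key layout, not from the source
        if key not in self.static_buffers:
            # First use of this slot: allocate buffers by copying the inputs.
            self.static_buffers[key] = {k: v.clone() for k, v in inputs.items()}
        else:
            # Later iterations: overwrite contents, keep the same objects.
            for name, buf in self.static_buffers[key].items():
                buf.copy_(inputs[name])
        return self.static_buffers[key]
```

In this sketch the second call for the same slot returns the very same buffer objects with updated contents, which is the property a replayed graph relies on.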
- class core.full_cuda_graph.FullCudaGraphWrapper(forward_backward_func, cuda_graph_warmup_steps=1)#
Wrapper class to enable FullIterationCUDAgraph.
Initialization
- curr_iteration#
None
- cuda_graph#
None
- result#
None
- data_read(data_iterator, model, training, num_microbatches)#
Read all microbatch inputs from the dataloader and copy them to static buffers.
- __call__(*args, **kwargs)#
- curr_iter(stage)#
Return the current training or validation iteration for the given stage.
- next_iter(stage)#
Increment the current training or validation iteration for the given stage.
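The signature suggests a warmup-then-capture-then-replay flow with a separate iteration counter per stage. The sketch below illustrates only that control flow and is not the wrapper's actual code: `SketchGraphWrapper` is hypothetical, graph capture is replaced by a plain Python closure, the stage names `"training"` and `"validation"` are assumed, and the explicit `stage` argument is a simplification of the documented `__call__(*args, **kwargs)`.

```python
class SketchGraphWrapper:
    """Hypothetical wrapper: eager during warmup, capture once, then replay."""

    def __init__(self, forward_backward_func, cuda_graph_warmup_steps=1):
        self.forward_backward_func = forward_backward_func
        self.cuda_graph_warmup_steps = cuda_graph_warmup_steps
        self.curr_iteration = {"training": 0, "validation": 0}  # assumed stages
        self.captured = {"training": None, "validation": None}
        self.log = []  # records which path each call took

    def curr_iter(self, stage):
        return self.curr_iteration[stage]

    def next_iter(self, stage):
        self.curr_iteration[stage] += 1

    def __call__(self, stage, *args):
        if self.curr_iter(stage) < self.cuda_graph_warmup_steps:
            self.log.append("eager")
            result = self.forward_backward_func(*args)
        elif self.captured[stage] is None:
            # Placeholder for graph capture: remember a replayable closure
            # bound to the argument objects seen at capture time.
            self.log.append("capture")
            self.captured[stage] = lambda: self.forward_backward_func(*args)
            result = self.captured[stage]()
        else:
            # Replay ignores the new arguments and reuses the captured ones.
            self.log.append("replay")
            result = self.captured[stage]()
        self.next_iter(stage)
        return result
```

Because replay reads from the objects bound at capture time, fresh data only reaches the computation if it is first written into those same objects, which is the role static buffers play.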