core.full_cuda_graph#

Full-iteration CUDA graph for training.

Module Contents#

Classes#

StaticBufferLoader

Load data to static buffers.

FullCudaGraphWrapper

Wrapper class that enables a full-iteration CUDA graph.

Functions#

copy_tensors_in_struct

Copy src to new tensors.

clone_tensors_in_struct

Copy src to pre-existing tensors in tgt.

Data#

API#

core.full_cuda_graph.logger#

‘getLogger(…)’

core.full_cuda_graph.copy_tensors_in_struct(src)#

Copy src to new tensors.

core.full_cuda_graph.clone_tensors_in_struct(tgt, src)#

Copy src to pre-existing tensors in tgt.
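
The difference between the two helpers can be illustrated with a small sketch. It is not the module's implementation: `FakeTensor` is a hypothetical stand-in for `torch.Tensor` (only `clone()` and `copy_()` are mimicked), `copy_to_new` and `copy_into_existing` are illustrative names, and the structures are assumed to be nested lists, tuples, and dicts.

```python
class FakeTensor:
    """Hypothetical stand-in for torch.Tensor; mimics only clone() and copy_()."""

    def __init__(self, values):
        self.values = list(values)

    def clone(self):
        # Allocate new storage holding the same values.
        return FakeTensor(self.values)

    def copy_(self, other):
        # Overwrite this object's storage in place; its identity is kept.
        self.values[:] = other.values
        return self


def copy_to_new(src):
    """Return a structure shaped like src whose tensors are fresh copies."""
    if isinstance(src, FakeTensor):
        return src.clone()
    if isinstance(src, (list, tuple)):
        return type(src)(copy_to_new(item) for item in src)
    if isinstance(src, dict):
        return {key: copy_to_new(value) for key, value in src.items()}
    return src  # non-tensor leaves are passed through unchanged


def copy_into_existing(tgt, src):
    """Copy tensors from src into the tensors already held by tgt."""
    if isinstance(src, FakeTensor):
        tgt.copy_(src)
    elif isinstance(src, (list, tuple)):
        for tgt_item, src_item in zip(tgt, src):
            copy_into_existing(tgt_item, src_item)
    elif isinstance(src, dict):
        for key in src:
            copy_into_existing(tgt[key], src[key])
    # Non-tensor leaves in tgt are left as they are.


batch = {"tokens": FakeTensor([1, 2, 3]), "masks": [FakeTensor([1, 1, 0])]}
static = copy_to_new(batch)             # first call: allocate the buffers
static_tokens = static["tokens"]        # remember the buffer object

next_batch = {"tokens": FakeTensor([4, 5, 6]), "masks": [FakeTensor([0, 1, 1])]}
copy_into_existing(static, next_batch)  # later calls: reuse the same buffers
```

Keeping the buffer objects stable matters because a captured CUDA graph replays against fixed memory addresses, so fresh inputs have to be written into the same storage rather than into newly allocated tensors.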

class core.full_cuda_graph.StaticBufferLoader#

Load data to static buffers.

Initialization

static_buffers: dict#

None

__call__(inputs, stage, microbatch)#

class core.full_cuda_graph.FullCudaGraphWrapper(forward_backward_func, cuda_graph_warmup_steps=1)#

Wrapper class that enables a full-iteration CUDA graph.

Initialization

curr_iteration#

None

cuda_graph#

None

result#

None

data_read(data_iterator, model, training, num_microbatches)#

Read all microbatch inputs from Dataloader and copy to static buffers.
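
The idea of reading every microbatch up front into reusable buffers can be sketched as follows. This is a simplified illustration under stated assumptions, not the library's code: `StaticLoaderSketch` and `read_microbatches` are hypothetical names, plain lists stand in for tensors, the `(stage, microbatch)` key layout is assumed, and the `model` and `training` arguments are replaced by a single `stage` string.

```python
class StaticLoaderSketch:
    """Hypothetical loader keeping one reusable buffer per (stage, microbatch)."""

    def __init__(self):
        self.static_buffers = {}

    def __call__(self, inputs, stage, microbatch):
        key = (stage, microbatch)
        if key not in self.static_buffers:
            # First time this slot is seen: allocate the buffer once.
            self.static_buffers[key] = list(inputs)
        else:
            # Later iterations: overwrite the existing buffer in place.
            self.static_buffers[key][:] = inputs
        return self.static_buffers[key]


def read_microbatches(data_iterator, loader, stage, num_microbatches):
    """Pull num_microbatches items from the iterator into static buffers."""
    return [
        loader(next(data_iterator), stage, microbatch)
        for microbatch in range(num_microbatches)
    ]


loader = StaticLoaderSketch()
data = iter([[1, 2], [3, 4], [5, 6], [7, 8]])

# Two consecutive iterations, each consuming two microbatches.
first = read_microbatches(data, loader, "training", num_microbatches=2)
second = read_microbatches(data, loader, "training", num_microbatches=2)
```

In this sketch the second iteration returns the same buffer objects as the first, now holding the new data, which is the property a replayed graph relies on.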

__call__(*args, **kwargs)#

curr_iter(stage)#

Return current training/validation iteration.

next_iter(stage)#

Increment current training/validation iteration.
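
A rough sketch of how per-stage iteration counters could drive a warmup, capture, and replay cycle is shown below. Everything here is an assumption made for illustration: `FullIterationSketch` is a hypothetical class, the stage names `"training"` and `"validation"` are guessed from the method descriptions, and a Python closure stands in for a captured CUDA graph so the example runs without a GPU.

```python
class FullIterationSketch:
    """Hypothetical wrapper: eager warmup steps, one capture, then replay."""

    def __init__(self, step_func, warmup_steps=1):
        self.step_func = step_func
        self.warmup_steps = warmup_steps
        self.curr_iteration = {"training": 0, "validation": 0}
        self.graph = {"training": None, "validation": None}
        self.log = []  # records which path each call took

    def curr_iter(self, stage):
        return self.curr_iteration[stage]

    def next_iter(self, stage):
        self.curr_iteration[stage] += 1

    def __call__(self, stage, static_inputs):
        if self.curr_iter(stage) < self.warmup_steps:
            # Warmup: run the step eagerly.
            result = self.step_func(static_inputs)
            self.log.append("eager")
        elif self.graph[stage] is None:
            # Capture once; a closure over the static buffer stands in
            # for a recorded graph (assumption, not a real CUDA graph).
            self.graph[stage] = lambda: self.step_func(static_inputs)
            result = self.graph[stage]()
            self.log.append("capture")
        else:
            # Replay: reruns the recorded work on the same buffer.
            result = self.graph[stage]()
            self.log.append("replay")
        self.next_iter(stage)
        return result


buffer = [1.0, 2.0]
wrapper = FullIterationSketch(step_func=sum, warmup_steps=1)

losses = []
for new_values in ([1.0, 2.0], [3.0, 4.0], [5.0, 6.0]):
    buffer[:] = new_values  # refresh the static buffer in place
    losses.append(wrapper("training", buffer))
```

The replay step sees the refreshed values only because the buffer object captured earlier is updated in place; passing a newly allocated list on each call would leave the recorded closure reading stale data.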