Pipeline Run Methods
A DALI pipeline can be run in one of the following ways:
- Simple run method, which runs the computations and returns the results. This option corresponds to the nvidia.dali.types.PipelineAPIType.BASIC() API type.
- nvidia.dali.Pipeline.schedule_run(), nvidia.dali.Pipeline.share_outputs(), and nvidia.dali.Pipeline.release_outputs(), which allow fine-grained control over the lifetime of the output buffers. This option corresponds to the nvidia.dali.types.PipelineAPIType.SCHEDULED() API type.
- Built-in iterators for PyTorch, JAX, PaddlePaddle, and TensorFlow. This option corresponds to the nvidia.dali.types.PipelineAPIType.ITERATOR() API type.
The first API, the nvidia.dali.Pipeline.run() method, completes the following tasks:
- Launches the DALI pipeline.
- Executes the prefetch iterations if necessary.
- Waits until the first batch is ready.
- Returns the resulting buffers.
Buffers are marked as in-use until the next call to nvidia.dali.Pipeline.run(). This can be wasteful: the data is usually copied to the DL framework’s native storage objects, after which the DALI pipeline outputs could already be returned to DALI for reuse.
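The run-based flow can be sketched as a small helper. This is a minimal illustration, not part of DALI: `run_batches` and `copy_outputs` are hypothetical names, and `pipe` is assumed to be an already constructed nvidia.dali.Pipeline. Only `pipe.run()` is a DALI call here.

```python
def run_batches(pipe, num_batches, copy_outputs):
    """Collect `num_batches` batches from `pipe` using the simple run() API.

    `copy_outputs` is a user-supplied callable (an assumption of this sketch)
    that copies the returned buffers into storage owned by the caller.
    """
    results = []
    for _ in range(num_batches):
        # run() launches the pipeline, prefetches if needed, waits for the
        # batch, and returns the output buffers.
        outputs = pipe.run()
        # The buffers stay marked as in-use only until the next run() call,
        # so copy anything that has to outlive this iteration.
        results.append(copy_outputs(outputs))
    return results
```

The copy step is what makes holding the buffers until the next call redundant: once the data lives in caller-owned storage, nothing needs the DALI buffers any more.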
The second API, which consists of nvidia.dali.Pipeline.schedule_run(), nvidia.dali.Pipeline.share_outputs(), and nvidia.dali.Pipeline.release_outputs(), allows you to explicitly manage the lifetime of the output buffers. The nvidia.dali.Pipeline.schedule_run() method instructs DALI to prepare the next batch of data and, if necessary, to prefetch. If the execution mode is set to asynchronous, this call returns immediately, without waiting for the results, so another task can be executed at the same time. To request the data batch from DALI, call nvidia.dali.Pipeline.share_outputs(), which returns the result buffers. If the data batch is not yet ready, DALI waits for it; the data is ready as soon as nvidia.dali.Pipeline.share_outputs() returns. When the DALI buffers are no longer needed, because the data was copied or has already been consumed, call nvidia.dali.Pipeline.release_outputs() to return the DALI buffers for reuse in subsequent iterations.
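The call order described above can be sketched for a single batch as follows. The helper `fetch_batch` and its parameters `copy_outputs` and `do_other_work` are hypothetical names introduced for illustration; `pipe` is assumed to be an already constructed nvidia.dali.Pipeline. The three `pipe` methods are the ones named in the text.

```python
def fetch_batch(pipe, copy_outputs, do_other_work=None):
    """Fetch one batch with explicit control over the output buffer lifetime."""
    # Ask DALI to prepare the next batch (and prefetch if necessary).
    # In asynchronous execution mode this returns without waiting.
    pipe.schedule_run()

    # Optional caller-side work that overlaps with DALI's processing.
    if do_other_work is not None:
        do_other_work()

    # Blocks until the batch is ready, then hands out the DALI buffers.
    outputs = pipe.share_outputs()
    try:
        # Copy the data into caller-owned storage while the buffers are ours.
        return copy_outputs(outputs)
    finally:
        # Return the buffers to DALI for reuse, even if the copy failed.
        pipe.release_outputs()
```

Releasing in a `finally` clause is a design choice of this sketch, so that a failing copy does not leave the buffers held.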
Built-in iterators use the second API to provide convenient wrappers for immediate use in deep learning frameworks. The data is returned in the framework’s native buffers. The iterator’s implementation copies the data internally from the DALI buffers and recycles those buffers by calling nvidia.dali.Pipeline.release_outputs().
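To make the wrapper idea concrete, here is a simplified sketch of an iterator built on the second API. It is not DALI's actual iterator implementation: the class name `ScheduledPipelineIterator`, the fixed `num_batches` limit, and the `copy_outputs` callable (standing in for the copy into framework-native buffers) are all assumptions of this example.

```python
class ScheduledPipelineIterator:
    """Hypothetical iterator that wraps a pipeline with the scheduled API."""

    def __init__(self, pipe, copy_outputs, num_batches):
        self._pipe = pipe
        self._copy_outputs = copy_outputs
        self._remaining = num_batches
        if self._remaining > 0:
            # Start work on the first batch before it is requested.
            self._pipe.schedule_run()

    def __iter__(self):
        return self

    def __next__(self):
        if self._remaining <= 0:
            raise StopIteration
        # Wait for the scheduled batch and borrow the DALI buffers.
        outputs = self._pipe.share_outputs()
        try:
            # Copy into storage owned by the caller (framework tensors, etc.).
            batch = self._copy_outputs(outputs)
        finally:
            # Recycle the DALI buffers as soon as the copy is done.
            self._pipe.release_outputs()
        self._remaining -= 1
        if self._remaining > 0:
            # Schedule the next batch so it is prepared while the caller
            # works on the current one.
            self._pipe.schedule_run()
        return batch
```

In practice the built-in iterators should be preferred; the sketch only shows why the caller of an iterator never has to release buffers manually.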
We recommend that you do not mix the APIs. Each API follows its own logic for managing the lifetime of the output buffers, and the details of the process are subject to change without notice. Mixing the APIs might result in undefined behavior, such as a deadlock or an attempt to access an invalid buffer.