TensorFlow Plugin API reference
- class nvidia.dali.plugin.tf.DALIDataset(pipeline, output_dtypes=None, output_shapes=None, fail_on_device_mismatch=True, *, input_datasets=None, batch_size=1, num_threads=4, device_id=0, exec_separated=False, prefetch_queue_depth=2, cpu_prefetch_queue_depth=2, gpu_prefetch_queue_depth=2, dtypes=None, shapes=None)
Creates a `DALIDataset` compatible with `tf.data.Dataset` from a DALI pipeline. It supports TensorFlow 1.15 and the 2.x family. `DALIDataset` can be placed on the CPU or the GPU.

Please keep in mind that TensorFlow allocates almost all available device memory by default. This might cause errors in DALI due to insufficient memory. For instructions on how to change this behaviour, refer to the TensorFlow documentation, as the right approach may differ based on your use case.
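For example, in TensorFlow 2.x one way to keep TensorFlow from reserving all GPU memory up front is to enable memory growth (a minimal sketch; whether this fits depends on your setup):

```python
import tensorflow as tf

# Ask TensorFlow to allocate GPU memory on demand instead of
# grabbing almost all of it at startup, leaving room for DALI.
for gpu in tf.config.list_physical_devices('GPU'):
    tf.config.experimental.set_memory_growth(gpu, True)
```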
Warning

Most TensorFlow Datasets have only a CPU variant. To process a GPU-placed `DALIDataset` with other TensorFlow datasets, you need to first copy it back to the CPU using an explicit `tf.data.experimental.copy_to_device` - a round trip from CPU to GPU and back to CPU would most likely degrade performance a lot and is thus discouraged.

Additionally, it is advised not to use datasets like `repeat()` or similar after `DALIDataset`, as they may interfere with DALI memory allocations and prefetching.

- Parameters:
- pipeline (`nvidia.dali.Pipeline`) – the pipeline defining the data processing to be performed.
- output_dtypes (tf.DType or tuple of tf.DType, default = None) – expected output types.
- output_shapes (tuple of shapes, optional, default = None) – expected output shapes. If provided, must match the arity of `output_dtypes`. When set to None, DALI will infer the shapes on its own. Individual shapes can also be set to None or contain None to indicate unknown dimensions. If specified, they must be compatible with the shapes returned from the DALI Pipeline and with the `batch_size` argument, which will be the outermost dimension of the returned tensors. In the case of `batch_size = 1`, the batch dimension can be omitted in the shape; `DALIDataset` will try to match the requested shape by squeezing 1-sized dimensions from the shape obtained from the Pipeline.
- fail_on_device_mismatch (bool, optional, default = True) – when set to True, a runtime check is performed to ensure that the DALI device and the TF device are both CPU or both GPU. In some contexts this check might be inaccurate. When set to False, the check is skipped, but additional logs are printed so you can verify the devices yourself. Keep in mind that this may allow hidden GPU-to-CPU copies in the workflow and impact performance.
- batch_size (int, optional, default = 1) – batch size of the pipeline.
- num_threads (int, optional, default = 4) – number of CPU threads used by the pipeline.
- device_id (int, optional, default = 0) – id of the GPU used by the pipeline. A None value for this parameter means that DALI should not use the GPU nor the CUDA runtime. This limits the pipeline to CPU-only operators, but allows it to run on any CPU-capable machine.
- exec_separated (bool, optional, default = False) – whether to execute the pipeline in a way that enables overlapping CPU and GPU computation, typically resulting in faster execution speed, but larger memory consumption.
- prefetch_queue_depth (int, optional, default = 2) – depth of the executor queue. A deeper queue makes DALI more resistant to uneven execution times of batches, but it also consumes more memory for internal buffers. This value is used when `exec_separated` is set to False.
- cpu_prefetch_queue_depth (int, optional, default = 2) – depth of the executor CPU queue. A deeper queue makes DALI more resistant to uneven execution times of batches, but it also consumes more memory for internal buffers. This value is used when `exec_separated` is set to True.
- gpu_prefetch_queue_depth (int, optional, default = 2) – depth of the executor GPU queue. A deeper queue makes DALI more resistant to uneven execution times of batches, but it also consumes more memory for internal buffers. This value is used when `exec_separated` is set to True.
- Return type:
`DALIDataset` object based on the DALI pipeline and compatible with the `tf.data.Dataset` API.
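To make the above concrete, here is a minimal usage sketch. The pipeline, file path, and shapes below are illustrative assumptions, not part of the API:

```python
import tensorflow as tf
import nvidia.dali.fn as fn
import nvidia.dali.plugin.tf as dali_tf
from nvidia.dali import pipeline_def

# Hypothetical image pipeline; '/data/images' is a placeholder path.
@pipeline_def(batch_size=32, num_threads=4, device_id=0)
def image_pipeline():
    jpegs, labels = fn.readers.file(file_root='/data/images')
    images = fn.decoders.image(jpegs, device='mixed')
    images = fn.resize(images, resize_x=224, resize_y=224)
    return images, labels

with tf.device('/gpu:0'):
    dataset = dali_tf.DALIDataset(
        pipeline=image_pipeline(),
        batch_size=32,
        output_dtypes=(tf.uint8, tf.int32),
        output_shapes=((32, 224, 224, 3), (32, 1)),
        device_id=0)

for images, labels in dataset.take(2):
    print(images.shape, labels.shape)
```

Note how `batch_size` appears as the outermost dimension of every entry in `output_shapes`.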
- nvidia.dali.plugin.tf.DALIIterator()
TF Plugin Wrapper

This operator works in the same way as the DALI TensorFlow plugin, with the exception that it also accepts Pipeline objects as an input, which are serialized internally. For more information, see `nvidia.dali.plugin.tf.DALIRawIterator()`.
- nvidia.dali.plugin.tf.DALIIteratorWrapper(pipeline=None, serialized_pipeline=None, sparse=[], shapes=[], dtypes=[], batch_size=-1, prefetch_queue_depth=2, **kwargs)
TF Plugin Wrapper

This operator works in the same way as the DALI TensorFlow plugin, with the exception that it also accepts Pipeline objects as an input, which are serialized internally. For more information, see `nvidia.dali.plugin.tf.DALIRawIterator()`.
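A sketch of this op-based API in graph mode (here `pipe` is assumed to be an already-built `nvidia.dali.Pipeline` with two outputs; the shapes and dtypes are illustrative):

```python
import tensorflow as tf
import nvidia.dali.plugin.tf as dali_tf

daliop = dali_tf.DALIIterator()
with tf.device('/gpu:0'):
    # The Pipeline object is serialized internally by the wrapper.
    images, labels = daliop(
        pipeline=pipe,
        shapes=[(32, 224, 224, 3), (32, 1)],
        dtypes=[tf.uint8, tf.int32])
```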
- nvidia.dali.plugin.tf.DALIRawIterator()
DALI TensorFlow plugin

Creates a DALI pipeline from a serialized pipeline, obtained from the serialized_pipeline argument. shapes must match the shapes of the corresponding DALI Pipeline output tensors. dtypes must match the types of the corresponding DALI Pipeline output tensors.
- Parameters:
- serialized_pipeline – A string.
- shapes – A list of shapes (each a tf.TensorShape or list of ints) that has length >= 1.
- dtypes – A list of tf.DTypes from: tf.half, tf.float32, tf.uint8, tf.int16, tf.int32, tf.int64 that has length >= 1.
- num_threads – An optional int. Defaults to -1.
- device_id – An optional int. Defaults to -1.
- exec_separated – An optional bool. Defaults to False.
- gpu_prefetch_queue_depth – An optional int. Defaults to 2.
- cpu_prefetch_queue_depth – An optional int. Defaults to 2.
- sparse – An optional list of bools. Defaults to [].
- batch_size – An optional int. Defaults to -1.
- enable_memory_stats – An optional bool. Defaults to False.
- name – A name for the operation (optional).
- Returns:
A list of Tensor objects of type dtypes.

Please keep in mind that TensorFlow allocates almost all available device memory by default. This might cause errors in DALI due to insufficient memory. For instructions on how to change this behaviour, refer to the TensorFlow documentation, as the right approach may differ based on your use case.
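For illustration, the raw op could be driven with a manually serialized pipeline along these lines (a sketch under the assumption that `pipe` is an existing `nvidia.dali.Pipeline` with two outputs; shapes and dtypes are placeholders):

```python
import tensorflow as tf
import nvidia.dali.plugin.tf as dali_tf

# Serialize the pipeline first, then feed it to the raw op.
serialized = dali_tf.serialize_pipeline(pipe)
images, labels = dali_tf.DALIRawIterator()(
    serialized_pipeline=serialized,
    shapes=[(32, 224, 224, 3), (32, 1)],
    dtypes=[tf.uint8, tf.int32],
    batch_size=32)
```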
- nvidia.dali.plugin.tf.dataset_compatible_tensorflow()
Returns True if the current TensorFlow version is compatible with DALIDataset.
- nvidia.dali.plugin.tf.dataset_distributed_compatible_tensorflow()
Returns True if the tf.distribute APIs of the current TensorFlow version are compatible with DALIDataset.
- nvidia.dali.plugin.tf.dataset_inputs_compatible_tensorflow()
Returns True if the current TensorFlow version is compatible with experimental.DALIDatasetWithInputs and input Datasets can be used with DALI.
- nvidia.dali.plugin.tf.dataset_options()
- nvidia.dali.plugin.tf.serialize_pipeline(pipeline)
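The compatibility helpers above are handy as guards before choosing an integration path. A minimal sketch (assuming `pipe` is an existing `nvidia.dali.Pipeline`):

```python
import tensorflow as tf
import nvidia.dali.plugin.tf as dali_tf

if dali_tf.dataset_compatible_tensorflow():
    # Preferred path: the tf.data integration.
    dataset = dali_tf.DALIDataset(pipeline=pipe, batch_size=32,
                                  output_dtypes=tf.float32)
else:
    # Fallback: serialize the pipeline for the op-based API.
    serialized = dali_tf.serialize_pipeline(pipe)
```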
Experimental
- nvidia.dali.plugin.tf.experimental.DALIDatasetWithInputs(pipeline, output_dtypes=None, output_shapes=None, fail_on_device_mismatch=True, *, input_datasets=None, batch_size=1, num_threads=4, device_id=0, exec_separated=False, prefetch_queue_depth=2, cpu_prefetch_queue_depth=2, gpu_prefetch_queue_depth=2, dtypes=None, shapes=None)
Experimental variant of `DALIDataset`. This dataset adds support for input tf.data.Datasets, which is available only for TensorFlow 2.4.1 and newer.

Input dataset specification

Each of the input datasets must be mapped to an `external_source()` operator that will represent the input to the DALI pipeline. In the pipeline, the input is represented by the `name` parameter of `external_source()`. Input datasets must be provided as a mapping from that `name` to the dataset object via the `input_datasets` dictionary argument of DALIDatasetWithInputs.

Per-sample and batch mode
The input datasets can operate in per-sample mode or in batch mode.

In per-sample mode, the values produced by the source dataset are interpreted as individual samples. The batch dimension is absent. For example, a 640x480 RGB image would have the shape `[480, 640, 3]`.

In batch mode, the tensors produced by the source dataset are interpreted as batches, with an additional outer dimension denoting the samples in the batch. For example, a batch of ten 640x480 RGB images would have the shape `[10, 480, 640, 3]`.

In both cases (per-sample and batch mode), the layout of those inputs should be denoted as "HWC".

In per-sample mode, DALIDataset queries the input dataset `batch_size` times to build a batch that is fed into the DALI Pipeline. In per-sample mode, each sample produced by the input dataset can have a different shape, but the number of dimensions and the layout must remain constant.

External Source with `source` parameter

This experimental DALIDataset accepts pipelines with `external_source()` nodes that have the `source` parameter specified. In that case, the `source` will be converted automatically into an appropriate `tf.data.Dataset.from_generator` dataset with the correct placement and `tf.data.experimental.copy_to_device` directives. Those nodes can also work in per-sample or in batch mode. The data in batch mode must be a dense, uniform tensor (each sample has the same dimensions). Only CPU data is accepted.

This allows TensorFlow DALIDataset to work with most Pipelines that already have an External Source `source` specified.

Warning

This class is experimental and its API might change without notice.
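As a sketch of the automatic `source` conversion described above (the callback, shapes, and CPU-only placement are illustrative assumptions):

```python
import numpy as np
import tensorflow as tf
import nvidia.dali.fn as fn
import nvidia.dali.types as types
import nvidia.dali.plugin.tf as dali_tf
from nvidia.dali import pipeline_def

def random_samples():
    # Per-sample generator returning CPU data.
    while True:
        yield np.random.rand(3, 3).astype(np.float32)

@pipeline_def(batch_size=8, num_threads=4, device_id=None)
def source_pipe():
    return fn.external_source(source=random_samples(), batch=False,
                              dtype=types.FLOAT)

# No input_datasets entry is needed here: the `source` above is
# converted internally into a tf.data.Dataset.from_generator input.
dataset = dali_tf.experimental.DALIDatasetWithInputs(
    pipeline=source_pipe(),
    batch_size=8,
    output_dtypes=tf.float32,
    device_id=None)
```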
Note

External source nodes with `num_outputs` specified to any number are not supported - this means that callbacks with multiple (tuple) outputs are not supported.

Note

The External source `cycle` policy `'raise'` is not supported - the dataset is not restartable.

Note

The External source `cuda_stream` parameter is ignored - `source` is supposed to return CPU data and tf.data.Dataset inputs are handled internally.

Note

The External source `use_copy_kernel` and `blocking` parameters are ignored.

Note

Setting `no_copy` on the external source nodes when defining the pipeline is considered a no-op when used with DALIDataset. The `no_copy` option is handled internally and enabled automatically if possible.

Note

Parallel execution of the external source callback provided via `source` is not supported. The callback is executed via TensorFlow's `tf.data.Dataset.from_generator` - the `parallel` and `prefetch_queue_depth` parameters are ignored.

The operator adds the following parameters to the ones supported by `DALIDataset`:

- Parameters:
- input_datasets (dict[str, tf.data.Dataset] or dict[str, nvidia.dali.plugin.tf.experimental.Input]) – input datasets to the DALI Pipeline. It must be provided as a dictionary mapping from the names of the External Source nodes to the dataset objects or to the `Input()` wrapper.

For example:

```python
{
    'tensor_input': tf.data.Dataset.from_tensors(tensor).repeat(),
    'generator_input': tf.data.Dataset.from_generator(some_generator)
}
```

can be passed as `input_datasets` for a Pipeline like:

```python
@pipeline_def
def external_source_pipe():
    input_0 = fn.external_source(name='tensor_input')
    input_1 = fn.external_source(name='generator_input')
    return fn.resize(input_1, resize_x=input_0)
```

Entries that use `tf.data.Dataset` directly, like:

```python
{ 'input': tf.data.Dataset.from_tensors(tensor) }
```

are equivalent to the following specification using `nvidia.dali.plugin.tf.experimental.Input`:

```python
{
    'input': nvidia.dali.plugin.tf.experimental.Input(
        dataset=tf.data.Dataset.from_tensors(tensor),
        layout=None,
        batch=False)
}
```

This means that inputs specified as `tf.data.Dataset` directly are considered sample inputs.

Warning

The input dataset must be placed on the same device as `DALIDatasetWithInputs`. If the input has a different placement (for instance, the input is placed on the CPU, while `DALIDatasetWithInputs` is placed on the GPU), `tf.data.experimental.copy_to_device` with a GPU argument must first be applied to the input.
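Putting it all together, an end-to-end sketch might look as follows (the pipeline, names, and shapes are illustrative assumptions, with everything placed on the CPU):

```python
import tensorflow as tf
import nvidia.dali.fn as fn
import nvidia.dali.plugin.tf as dali_tf
from nvidia.dali import pipeline_def

@pipeline_def(batch_size=8, num_threads=4, device_id=None)
def doubling_pipe():
    # 'data_input' is matched against the input_datasets keys below.
    data = fn.external_source(name='data_input')
    return data * 2

# Per-sample input: each element is a single [3, 3] sample.
input_ds = tf.data.Dataset.from_tensors(
    tf.ones([3, 3], dtype=tf.float32)).repeat()

with tf.device('/cpu:0'):
    dataset = dali_tf.experimental.DALIDatasetWithInputs(
        pipeline=doubling_pipe(),
        input_datasets={'data_input': input_ds},
        batch_size=8,
        output_dtypes=tf.float32,
        device_id=None)
```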
- nvidia.dali.plugin.tf.experimental.Input(dataset, *, layout=None, batch=False)
Wrapper for an input passed to DALIDataset. It allows passing additional options that can override some of the options specified in the External Source node in the Python Pipeline object. Passing None indicates that the value should be looked up in the pipeline definition.
- Parameters:
- dataset (tf.data.Dataset) – the dataset used as an input.
- layout (str, optional, default = None) – layout of the input. If None, the layout will be taken from the corresponding External Source node in the Python Pipeline object. If both are provided, the layouts must be the same. If neither is provided, an empty layout will be used.
- batch (bool, optional, default = False) – batch mode of the given input. If None, the batch mode will be taken from the corresponding External Source node in the Python Pipeline object. If `batch = False`, the input dataset is considered a sample input. If `batch = True`, the input dataset is expected to return batches.
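For instance, a batched input could be declared like this (a sketch; 'input' is assumed to match an `external_source(name='input')` node in the pipeline):

```python
import tensorflow as tf
import nvidia.dali.plugin.tf as dali_tf

# Each element is a whole batch of 8 samples, shape [8, 3, 3];
# batch=True tells DALI to treat the outer dimension as the batch.
batched_ds = tf.data.Dataset.from_tensors(
    tf.zeros([8, 3, 3], dtype=tf.float32)).repeat()

input_datasets = {
    'input': dali_tf.experimental.Input(
        dataset=batched_ds,
        layout=None,
        batch=True)
}
```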