TorchData Integration Reference#
DALI Dynamic provides integration with torchdata.nodes to build composable
data loading pipelines. The following node classes can be composed with standard
torchdata.nodes building blocks such as Prefetcher
and Loader.
Reader#
- class nvidia.dali.experimental.dynamic.pytorch.nodes.Reader(reader_type, *, batch_size, output_names=None, **kwargs)#
Wraps a reader as a node, yielding dictionaries.
- Parameters:
- reader_type (reader subclass) – The type of the reader to construct.
- batch_size (int, optional) – The batch size to pass to next_epoch(). If None, the iterator returns tensors.
- output_names (iterable of str, optional) – Names of the outputs, used as keys in the output dict. If the reader has exactly two outputs, this can be omitted and defaults to ("data", "label").
- **kwargs – Additional keyword arguments to pass to the reader constructor.
- get_metadata()#
Returns the metadata of the underlying reader operator.
- get_state()#
Subclasses must implement this method instead of state_dict(). Should only be called by BaseNode.
- Returns:
Dict[str, Any] – a state dict that may be passed to reset() at some point in the future.
- next()#
Subclasses must implement this method instead of __next__. Should only be called by BaseNode.
- Returns:
T – the next value in the sequence, or raise StopIteration.
- reset(initial_state=None)#
Resets the iterator to the beginning, or to the state passed in via initial_state.
Reset is a good place to put expensive initialization, as it is lazily invoked when next() or state_dict() is called. Subclasses must call super().reset(initial_state).
- Parameters:
initial_state (Optional[dict]) – a state dict to pass to the node. If None, reset to the beginning.
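The get_state()/reset() pair gives a node checkpoint-and-restore semantics: get_state() captures where the iterator is, and reset(state) rewinds to that point. A minimal pure-Python sketch of this protocol (an illustrative stand-in, not the DALI implementation) might look like:

```python
# Minimal sketch of the get_state()/reset()/next() protocol described
# above. CountingNode is a hypothetical stand-in for a reader node.

class CountingNode:
    """Yields consecutive integers; supports checkpointing via get_state()."""

    def __init__(self, limit):
        self.limit = limit
        self.reset()

    def reset(self, initial_state=None):
        # Restore from a state dict, or start from the beginning.
        self._i = initial_state["i"] if initial_state else 0

    def get_state(self):
        # A state dict that may be passed to reset() later.
        return {"i": self._i}

    def next(self):
        if self._i >= self.limit:
            raise StopIteration
        value = self._i
        self._i += 1
        return value


node = CountingNode(limit=5)
assert node.next() == 0 and node.next() == 1
state = node.get_state()      # checkpoint after two items
assert node.next() == 2
node.reset(state)             # rewind to the checkpoint
assert node.next() == 2       # the third item is replayed
```

In the real nodes these methods are called by BaseNode (via state_dict() and __next__), not by user code directly.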
DictMapper#
- class nvidia.dali.experimental.dynamic.pytorch.nodes.DictMapper(source, map_fn, key='data')#
Applies a transform to a single key in the dict yielded by a source node.
- Parameters:
- source (torchdata.nodes.BaseNode) – The source node to pull from. Yields dictionaries of tensors or batches.
- map_fn (callable) – The function to apply to the specified key. Must return a tensor or batch.
- key (str, optional) – The key to apply the function to. Defaults to "data".
- get_state()#
Subclasses must implement this method instead of state_dict(). Should only be called by BaseNode.
- Returns:
Dict[str, Any] – a state dict that may be passed to reset() at some point in the future.
- next()#
Subclasses must implement this method instead of __next__. Should only be called by BaseNode.
- Returns:
T – the next value in the sequence, or raise StopIteration.
- reset(initial_state=None)#
Resets the iterator to the beginning, or to the state passed in via initial_state.
Reset is a good place to put expensive initialization, as it is lazily invoked when next() or state_dict() is called. Subclasses must call super().reset(initial_state).
- Parameters:
initial_state (Optional[dict]) – a state dict to pass to the node. If None, reset to the beginning.
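Conceptually, DictMapper passes each dictionary through unchanged except for the one selected key. A rough pure-Python equivalent of that behavior (the real node also participates in the get_state()/reset() protocol and handles DALI batches):

```python
# Illustrative sketch of DictMapper semantics: apply map_fn to a single
# key of each dict, leaving the other keys untouched. Not the DALI code.

def dict_map(source, map_fn, key="data"):
    for batch in source:
        batch = dict(batch)          # shallow copy; don't mutate upstream data
        batch[key] = map_fn(batch[key])
        yield batch


batches = [{"data": [1, 2], "label": [0, 1]}]
out = list(dict_map(iter(batches), map_fn=lambda xs: [x * 10 for x in xs]))
assert out == [{"data": [10, 20], "label": [0, 1]}]
```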
ToTorch#
- class nvidia.dali.experimental.dynamic.pytorch.nodes.ToTorch(source, output_stream=None)#
Converts dictionaries of tensors or batches to tuples of torch.Tensor.
- Parameters:
- source (torchdata.nodes.BaseNode) – The source node to pull data from. Yields dictionaries of tensors or batches.
- output_stream (a compatible stream object, optional) – The CUDA stream on which the output tensors will be used. If provided, ensure that work on this stream waits for any pending GPU operations before the tensors are consumed. Defaults to the current CUDA stream at the time of construction.
- get_state()#
Subclasses must implement this method instead of state_dict(). Should only be called by BaseNode.
- Returns:
Dict[str, Any] – a state dict that may be passed to reset() at some point in the future.
- next()#
Subclasses must implement this method instead of __next__. Should only be called by BaseNode.
- Returns:
T – the next value in the sequence, or raise StopIteration.
- reset(initial_state=None)#
Resets the iterator to the beginning, or to the state passed in via initial_state.
Reset is a good place to put expensive initialization, as it is lazily invoked when next() or state_dict() is called. Subclasses must call super().reset(initial_state).
- Parameters:
initial_state (Optional[dict]) – a state dict to pass to the node. If None, reset to the beginning.
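Unpacking each dictionary into a tuple is what allows a pipeline ending in ToTorch to be consumed as `for images, labels in loader`. A rough pure-Python analogy of that step (the actual node additionally converts DALI batches to torch.Tensor and synchronizes with the output stream; tuple order following the dict's key order is an assumption here):

```python
# Illustrative sketch of the dict-to-tuple step performed by ToTorch.
# Real ToTorch also produces torch.Tensor objects; this stand-in only
# shows the unpacking, assuming tuple order matches dict key order.

def to_tuple(source):
    for batch in source:
        yield tuple(batch.values())


batches = [{"data": "img0", "label": 0}]
assert list(to_tuple(iter(batches))) == [("img0", 0)]
```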
Usage Pattern#
A typical pipeline composes these nodes with torchdata.nodes utilities:
```python
import nvidia.dali.experimental.dynamic as ndd
import torchdata.nodes as tn

reader_node = ndd.pytorch.nodes.Reader(
    ndd.readers.File,
    batch_size=batch_size,
    file_root=data_dir,
    random_shuffle=True,
)
mapper_node = ndd.pytorch.nodes.DictMapper(
    source=reader_node,
    map_fn=my_processing_function,
)
torch_node = ndd.pytorch.nodes.ToTorch(mapper_node)
prefetch_node = tn.Prefetcher(torch_node, prefetch_factor=2)
loader = tn.Loader(prefetch_node)

for images, labels in loader:
    # images, labels are torch.Tensors on the GPU
    ...
```
The above snippet defines the following simple graph:
![Reader → DictMapper → ToTorch → Prefetcher](../_images/graphviz-90f146553c731a17e5bb0a2d6d0eb5e06f1d9f05.png)
torchdata.nodes allows composing nodes to define more complex graphs. For instance, two DictMapper nodes can each pull from the same Reader node, and torchdata.nodes.Mapper() can be used to combine their outputs before conversion:
![Reader → DictMapper ×2 → Mapper → ToTorch → Prefetcher](../_images/graphviz-9bc1370b6ae0dd00246668e82c18bf8abee7f99a.png)