MXNet Plugin API reference
- class nvidia.dali.plugin.mxnet.DALIClassificationIterator(pipelines, size=-1, reader_name=None, data_name='data', label_name='softmax_label', data_layout='NCHW', fill_last_batch=None, auto_reset=False, squeeze_labels=True, dynamic_shape=False, last_batch_padded=False, last_batch_policy=<LastBatchPolicy.FILL: 0>, prepare_first_batch=True)
DALI iterator for classification tasks for MXNet. It returns 2 outputs (data and label) in the form of MXNet’s DataBatch of NDArrays.
Calling
DALIClassificationIterator(pipelines, reader_name, data_name, label_name, data_layout)
is equivalent to calling
DALIGenericIterator(pipelines, [(data_name, DALIClassificationIterator.DATA_TAG), (label_name, DALIClassificationIterator.LABEL_TAG)], reader_name, data_layout)
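The equivalence above amounts to building a two-entry output_map from data_name and label_name. A minimal plain-Python sketch of that mapping follows; the tag strings are assumed stand-ins for DALIClassificationIterator.DATA_TAG and LABEL_TAG (their real values are not stated here), and classification_output_map is a hypothetical helper, not part of the plugin.

```python
# Stand-ins for DALIClassificationIterator.DATA_TAG / LABEL_TAG.
# The real constants are attributes of the iterator class; these values are assumed.
DATA_TAG = "data"
LABEL_TAG = "label"


def classification_output_map(data_name="data", label_name="softmax_label"):
    """Build the (output_name, tag) pairs described in the equivalence above.

    Hypothetical helper for illustration only.
    """
    return [(data_name, DATA_TAG), (label_name, LABEL_TAG)]


# The first pipeline output is tagged as data, the second as label.
print(classification_output_map())
```

With the real plugin, the resulting list would be passed as the output_map argument of DALIGenericIterator.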
- Parameters
pipelines (list of nvidia.dali.Pipeline) – List of pipelines to use
size (int, default = -1) – Number of samples in the shard for the wrapped pipeline (if there is more than one pipeline, it is the sum). Providing -1 means that the iterator will work until StopIteration is raised from inside iter_setup(). The options last_batch_policy and last_batch_padded don’t work in that case. It works with only one pipeline inside the iterator. Mutually exclusive with the reader_name argument.
reader_name (str, default = None) – Name of the reader which will be queried for the shard size, number of shards and all other properties necessary to properly count the number of relevant and padded samples that the iterator needs to deal with. It automatically sets last_batch_policy to PARTIAL when FILL is used, and sets last_batch_padded to match the reader’s configuration.
data_name (str, optional, default = 'data') – Data name for provided symbols.
label_name (str, optional, default = 'softmax_label') – Label name for provided symbols.
data_layout (str, optional, default = 'NCHW') – Either ‘NHWC’ or ‘NCHW’ - layout of the pipeline outputs.
auto_reset (string or bool, optional, default = False) –
Whether the iterator resets itself for the next epoch or it requires reset() to be called explicitly.
It can be one of the following values:
"no", False or None - at the end of epoch StopIteration is raised and reset() needs to be called
"yes" or "True" - at the end of epoch StopIteration is raised but reset() is called internally automatically
squeeze_labels ((DEPRECATED) bool, optional, default = True) – Whether the iterator should squeeze the labels before copying them to the ndarray. This argument is deprecated and will be removed from future releases.
dynamic_shape (any, optional) – Parameter used only for backward compatibility.
fill_last_batch (bool, optional, default = None) –
Deprecated. Please use last_batch_policy instead.
Whether to fill the last batch with data up to ‘self.batch_size’. The iterator would return the first integer multiple of self._num_gpus * self.batch_size entries which exceeds ‘size’. Setting this flag to False will cause the iterator to return exactly ‘size’ entries.
last_batch_policy (optional, default = LastBatchPolicy.FILL) – What to do with the last batch when there are not enough samples in the epoch to fully fill it. See nvidia.dali.plugin.base_iterator.LastBatchPolicy(). Both FILL and PARTIAL would return a full batch, but the pad property value of the returned array would differ.
last_batch_padded (bool, optional, default = False) – Whether the last batch provided by DALI is padded with the last sample or it just wraps up. In conjunction with last_batch_policy, it tells whether the iterator, when returning a last batch only partially filled with data from the current epoch, is dropping padding samples or samples from the next epoch (it doesn’t literally drop them but sets the pad field of the ndarray so the following code could use it to drop the data). If set to False, the next epoch will end sooner, as data from it was consumed but dropped. If set to True, the next epoch would be the same length as the first one. For this to happen, the option pad_last_batch in the reader needs to be set to True as well. It is overwritten when the reader_name argument is provided.
prepare_first_batch (bool, optional, default = True) – Whether DALI should buffer the first batch right after the creation of the iterator, so one batch is already prepared when the iterator is prompted for the data.
Example
With the data set [1,2,3,4,5,6,7] and the batch size 2:
last_batch_policy = LastBatchPolicy.PARTIAL, last_batch_padded = True -> last batch = [7, 7] and MXNet array property .pad=1, next iteration will return [1, 2]
last_batch_policy = LastBatchPolicy.PARTIAL, last_batch_padded = False -> last batch = [7, 1] and MXNet array property .pad=1, next iteration will return [2, 3]
last_batch_policy = LastBatchPolicy.FILL, last_batch_padded = True -> last batch = [7, 7] and MXNet array property .pad=0, next iteration will return [1, 2]
last_batch_policy = LastBatchPolicy.FILL, last_batch_padded = False -> last batch = [7, 1] and MXNet array property .pad=0, next iteration will return [2, 3]
last_batch_policy = LastBatchPolicy.DROP, last_batch_padded = True -> last batch = [5, 6], next iteration will return [1, 2]
last_batch_policy = LastBatchPolicy.DROP, last_batch_padded = False -> last batch = [5, 6], next iteration will return [2, 3]
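The six cases above can be reproduced with a small plain-Python model. This is only a sketch of the documented outcomes for a single shard, not DALI's implementation; last_batch_model and its string policy names are hypothetical and stand in for the LastBatchPolicy values.

```python
def last_batch_model(data, batch_size, policy, last_batch_padded):
    """Return (last_batch, pad, first_batch_of_next_epoch) for one shard.

    policy is "FILL", "PARTIAL" or "DROP" (stand-ins for LastBatchPolicy).
    pad is None for DROP, where the incomplete batch is never returned.
    """
    if policy not in ("FILL", "PARTIAL", "DROP"):
        raise ValueError("unknown policy: %r" % (policy,))

    n = len(data)
    remainder = n % batch_size
    # Number of samples needed to complete the final batch.
    missing = (batch_size - remainder) % batch_size
    full = [data[i:i + batch_size] for i in range(0, n - remainder, batch_size)]
    tail = data[n - remainder:] if remainder else []

    if last_batch_padded:
        # The reader repeats the last sample; the next epoch starts from the beginning.
        filler = [data[-1]] * missing
        next_start = 0
    else:
        # The reader wraps around; those samples are consumed from the next epoch.
        filler = data[:missing]
        next_start = missing

    if policy == "DROP" or remainder == 0:
        last = full[-1] if full else []
        pad = None if policy == "DROP" else 0
    else:
        last = tail + filler
        pad = missing if policy == "PARTIAL" else 0

    next_batch = [data[(next_start + i) % n] for i in range(batch_size)]
    return last, pad, next_batch


data = [1, 2, 3, 4, 5, 6, 7]
for policy in ("PARTIAL", "FILL", "DROP"):
    for padded in (True, False):
        print(policy, padded, last_batch_model(data, 2, policy, padded))
```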
- getdata()
Get data of current batch.
- Returns
The data of the current batch.
- Return type
list of NDArray
- getindex()
Get index of the current batch.
- Returns
index – The indices of examples in the current batch.
- Return type
numpy.array
- getlabel()
Get label of the current batch.
- Returns
The label of the current batch.
- Return type
list of NDArray
- getpad()
Get the number of padding examples in the current batch.
- Returns
Number of padding examples in the current batch.
- Return type
int
- iter_next()
Move to the next batch.
- Returns
Whether the move is successful.
- Return type
boolean
- next()
Returns the next batch of data.
- reset()
Resets the iterator after the full epoch. DALI iterators do not support resetting before the end of the epoch and will ignore such a request.
- property size
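The interaction between auto_reset and reset() described for this class can be illustrated with a toy iterator. EpochIteratorModel is a hypothetical stand-in written for this sketch, not the DALI class, and treating the bool True like the documented string values is an assumption.

```python
class EpochIteratorModel:
    """Toy model of the documented reset behaviour (not the DALI iterator)."""

    def __init__(self, batches, auto_reset=False):
        self._batches = list(batches)
        # Assumption: the bool True behaves like the documented "yes" / "True".
        self._auto_reset = auto_reset in (True, "yes", "True")
        self._pos = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self._pos >= len(self._batches):
            # StopIteration is raised in both modes; auto_reset only rewinds first.
            if self._auto_reset:
                self.reset()
            raise StopIteration
        batch = self._batches[self._pos]
        self._pos += 1
        return batch

    def reset(self):
        # A reset requested before the end of the epoch is ignored.
        if self._pos >= len(self._batches):
            self._pos = 0


manual = EpochIteratorModel([[1, 2], [3, 4]])
first_epoch = list(manual)
without_reset = list(manual)   # empty until reset() is called
manual.reset()
second_epoch = list(manual)

auto = EpochIteratorModel([[1, 2], [3, 4]], auto_reset="yes")
auto_epochs = [list(auto), list(auto)]
print(first_epoch, without_reset, second_epoch, auto_epochs)
```

In a training loop this corresponds to calling reset() at the end of every epoch unless auto_reset is enabled.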
- class nvidia.dali.plugin.mxnet.DALIGenericIterator(pipelines, output_map, size=-1, reader_name=None, data_layout='NCHW', fill_last_batch=None, auto_reset=False, squeeze_labels=True, dynamic_shape=False, last_batch_padded=False, last_batch_policy=<LastBatchPolicy.FILL: 0>, prepare_first_batch=True)
General DALI iterator for MXNet. It can return any number of outputs from the DALI pipeline in the form of MXNet’s DataBatch of NDArrays.
- Parameters
pipelines (list of nvidia.dali.Pipeline) – List of pipelines to use
output_map (list of (str, str)) – List of pairs (output_name, tag) which map consecutive outputs of DALI pipelines to the proper field in MXNet’s DataBatch. tag is one of DALIGenericIterator.DATA_TAG and DALIGenericIterator.LABEL_TAG, mapping the given output to data or label respectively. The output_name values should be distinct.
size (int, default = -1) – Number of samples in the shard for the wrapped pipeline (if there is more than one pipeline, it is the sum). Providing -1 means that the iterator will work until StopIteration is raised from inside iter_setup(). The options last_batch_policy and last_batch_padded don’t work in that case. It works with only one pipeline inside the iterator. Mutually exclusive with the reader_name argument.
reader_name (str, default = None) – Name of the reader which will be queried for the shard size, number of shards and all other properties necessary to properly count the number of relevant and padded samples that the iterator needs to deal with. It automatically sets last_batch_policy to PARTIAL when FILL is used, and sets last_batch_padded to match the reader’s configuration.
data_layout (str, optional, default = 'NCHW') – Either ‘NHWC’ or ‘NCHW’ - layout of the pipeline outputs.
auto_reset (string or bool, optional, default = False) –
Whether the iterator resets itself for the next epoch or it requires reset() to be called explicitly.
It can be one of the following values:
"no", False or None - at the end of epoch StopIteration is raised and reset() needs to be called
"yes" or "True" - at the end of epoch StopIteration is raised but reset() is called internally automatically
squeeze_labels ((DEPRECATED) bool, optional, default = True) – Whether the iterator should squeeze the labels before copying them to the ndarray. This argument is deprecated and will be removed from future releases.
dynamic_shape (any, optional) – Parameter used only for backward compatibility.
fill_last_batch (bool, optional, default = None) –
Deprecated. Please use last_batch_policy instead.
Whether to fill the last batch with data up to ‘self.batch_size’. The iterator would return the first integer multiple of self._num_gpus * self.batch_size entries which exceeds ‘size’. Setting this flag to False will cause the iterator to return exactly ‘size’ entries.
last_batch_policy (optional, default = LastBatchPolicy.FILL) – What to do with the last batch when there are not enough samples in the epoch to fully fill it. See nvidia.dali.plugin.base_iterator.LastBatchPolicy(). Both FILL and PARTIAL would return a full batch, but the pad property value of the returned array would differ.
last_batch_padded (bool, optional, default = False) – Whether the last batch provided by DALI is padded with the last sample or it just wraps up. In conjunction with last_batch_policy, it tells whether the iterator, when returning a last batch only partially filled with data from the current epoch, is dropping padding samples or samples from the next epoch (it doesn’t literally drop them but sets the pad field of the ndarray so the following code could use it to drop the data). If set to False, the next epoch will end sooner, as data from it was consumed but dropped. If set to True, the next epoch would be the same length as the first one. For this to happen, the option pad_last_batch in the reader needs to be set to True as well. It is overwritten when the reader_name argument is provided.
prepare_first_batch (bool, optional, default = True) – Whether DALI should buffer the first batch right after the creation of the iterator, so one batch is already prepared when the iterator is prompted for the data.
Example
With the data set [1,2,3,4,5,6,7] and the batch size 2:
last_batch_policy = LastBatchPolicy.PARTIAL, last_batch_padded = True -> last batch = [7, 7] and MXNet array property .pad=1, next iteration will return [1, 2]
last_batch_policy = LastBatchPolicy.PARTIAL, last_batch_padded = False -> last batch = [7, 1] and MXNet array property .pad=1, next iteration will return [2, 3]
last_batch_policy = LastBatchPolicy.FILL, last_batch_padded = True -> last batch = [7, 7] and MXNet array property .pad=0, next iteration will return [1, 2]
last_batch_policy = LastBatchPolicy.FILL, last_batch_padded = False -> last batch = [7, 1] and MXNet array property .pad=0, next iteration will return [2, 3]
last_batch_policy = LastBatchPolicy.DROP, last_batch_padded = True -> last batch = [5, 6], next iteration will return [1, 2]
last_batch_policy = LastBatchPolicy.DROP, last_batch_padded = False -> last batch = [5, 6], next iteration will return [2, 3]
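Because the iterator only reports padding through the pad value rather than removing samples, the consuming code decides what to do with them. A minimal sketch of that step, using plain lists in place of NDArrays; drop_padding is a hypothetical helper, and in real code the pad count would come from the batch (for example via getpad()).

```python
def drop_padding(samples, pad):
    """Return only the samples that belong to the current epoch.

    samples is one batch; pad is the number of trailing padding entries.
    """
    if pad < 0 or pad > len(samples):
        raise ValueError("pad must be between 0 and the batch length")
    return samples[:len(samples) - pad]


# Last batches from the PARTIAL rows of the example above, with pad=1.
print(drop_padding([7, 7], pad=1))  # [7]
print(drop_padding([7, 1], pad=1))  # [7]
# A full batch has pad=0 and is returned unchanged.
print(drop_padding([5, 6], pad=0))  # [5, 6]
```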
- getdata()
Get data of current batch.
- Returns
The data of the current batch.
- Return type
list of NDArray
- getindex()
Get index of the current batch.
- Returns
index – The indices of examples in the current batch.
- Return type
numpy.array
- getlabel()
Get label of the current batch.
- Returns
The label of the current batch.
- Return type
list of NDArray
- getpad()
Get the number of padding examples in the current batch.
- Returns
Number of padding examples in the current batch.
- Return type
int
- iter_next()
Move to the next batch.
- Returns
Whether the move is successful.
- Return type
boolean
- next()
Returns the next batch of data.
- reset()
Resets the iterator after the full epoch. DALI iterators do not support resetting before the end of the epoch and will ignore such a request.
- property size
- class nvidia.dali.plugin.mxnet.DALIGluonIterator(pipelines, size=-1, reader_name=None, output_types=None, auto_reset=False, fill_last_batch=None, last_batch_padded=False, last_batch_policy=<LastBatchPolicy.FILL: 0>, prepare_first_batch=True)
General DALI iterator for MXNet with Gluon API. It can return any number of outputs from the DALI pipeline in the form of per-GPU tuples. These tuples consist of NDArrays (for outputs marked as DALIGluonIterator.DENSE_TAG) and lists of NDArrays (for outputs marked as DALIGluonIterator.SPARSE_TAG).
- Parameters
pipelines (list of nvidia.dali.Pipeline) – List of pipelines to use
size (int, default = -1) – Number of samples in the shard for the wrapped pipeline (if there is more than one pipeline, it is the sum). Providing -1 means that the iterator will work until StopIteration is raised from inside iter_setup(). The options last_batch_policy and last_batch_padded don’t work in that case. It works with only one pipeline inside the iterator. Mutually exclusive with the reader_name argument.
reader_name (str, default = None) – Name of the reader which will be queried for the shard size, number of shards and all other properties necessary to properly count the number of relevant and padded samples that the iterator needs to deal with. It automatically sets last_batch_policy to PARTIAL when FILL is used, and sets last_batch_padded to match the reader’s configuration.
output_types (list of str, optional, default = None) – List of tags indicating whether each pipeline output batch is uniform (all the samples have the same size) or not. A batch output marked as uniform will be returned as a single NDArray; a non-uniform one will be returned as a list of NDArrays. Each tag must be either DALIGluonIterator.DENSE_TAG or DALIGluonIterator.SPARSE_TAG. The length of output_types must match the number of outputs of the pipeline(s). If not set, all outputs are considered to be marked with DALIGluonIterator.DENSE_TAG.
auto_reset (string or bool, optional, default = False) –
Whether the iterator resets itself for the next epoch or it requires reset() to be called explicitly.
It can be one of the following values:
"no", False or None - at the end of epoch StopIteration is raised and reset() needs to be called
"yes" or "True" - at the end of epoch StopIteration is raised but reset() is called internally automatically
fill_last_batch (bool, optional, default = None) –
Deprecated. Please use last_batch_policy instead.
Whether to fill the last batch with data up to ‘self.batch_size’. The iterator would return the first integer multiple of self._num_gpus * self.batch_size entries which exceeds ‘size’. Setting this flag to False will cause the iterator to return exactly ‘size’ entries.
last_batch_policy (optional, default = LastBatchPolicy.FILL) – What to do with the last batch when there are not enough samples in the epoch to fully fill it. See nvidia.dali.plugin.base_iterator.LastBatchPolicy().
last_batch_padded (bool, optional, default = False) – Whether the last batch provided by DALI is padded with the last sample or it just wraps up. In conjunction with last_batch_policy, it tells whether the iterator, when returning a last batch only partially filled with data from the current epoch, is dropping padding samples or samples from the next epoch (it doesn’t literally drop them but sets the pad field of the ndarray so the following code could use it to drop the data). If set to False, the next epoch will end sooner, as data from it was consumed but dropped. If set to True, the next epoch would be the same length as the first one. For this to happen, the option pad_last_batch in the reader needs to be set to True as well. It is overwritten when the reader_name argument is provided.
prepare_first_batch (bool, optional, default = True) – Whether DALI should buffer the first batch right after the creation of the iterator, so one batch is already prepared when the iterator is prompted for the data.
Example
With the data set [1,2,3,4,5,6,7] and the batch size 2:
last_batch_policy = LastBatchPolicy.PARTIAL, last_batch_padded = True -> last batch = [7], next iteration will return [1, 2]
last_batch_policy = LastBatchPolicy.PARTIAL, last_batch_padded = False -> last batch = [7], next iteration will return [2, 3]
last_batch_policy = LastBatchPolicy.FILL, last_batch_padded = True -> last batch = [7, 7], next iteration will return [1, 2]
last_batch_policy = LastBatchPolicy.FILL, last_batch_padded = False -> last batch = [7, 1], next iteration will return [2, 3]
last_batch_policy = LastBatchPolicy.DROP, last_batch_padded = True -> last batch = [5, 6], next iteration will return [1, 2]
last_batch_policy = LastBatchPolicy.DROP, last_batch_padded = False -> last batch = [5, 6], next iteration will return [2, 3]
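For the Gluon iterator, the example shows PARTIAL returning a shorter batch instead of a full batch with a pad value. The sketch below models only these documented outcomes for a single shard, assuming the data set holds at least one full batch; gluon_last_batch_model and its string policy names are hypothetical and do not reflect DALI's internals.

```python
def gluon_last_batch_model(data, batch_size, policy, last_batch_padded):
    """Return (last_batch, first_batch_of_next_epoch) for one shard.

    policy is "FILL", "PARTIAL" or "DROP" (stand-ins for LastBatchPolicy).
    Assumes len(data) >= batch_size.
    """
    if policy not in ("FILL", "PARTIAL", "DROP"):
        raise ValueError("unknown policy: %r" % (policy,))

    n = len(data)
    remainder = n % batch_size
    # Number of samples needed to complete the final batch.
    missing = (batch_size - remainder) % batch_size

    if remainder == 0:
        # The epoch divides evenly, so every policy returns the same full batch.
        last = data[n - batch_size:]
    elif policy == "DROP":
        # The incomplete batch is skipped; the previous full batch is the last one.
        last = data[n - remainder - batch_size:n - remainder]
    elif policy == "PARTIAL":
        # Only samples from the current epoch are returned, so the batch is shorter.
        last = data[n - remainder:]
    else:
        filler = [data[-1]] * missing if last_batch_padded else data[:missing]
        last = data[n - remainder:] + filler

    # Without padding in the reader, the wrapped samples are consumed from the next epoch.
    next_start = 0 if last_batch_padded else missing
    next_batch = [data[(next_start + i) % n] for i in range(batch_size)]
    return last, next_batch


data = [1, 2, 3, 4, 5, 6, 7]
for policy in ("PARTIAL", "FILL", "DROP"):
    for padded in (True, False):
        print(policy, padded, gluon_last_batch_model(data, 2, policy, padded))
```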
- getdata()
Get data of current batch.
- Returns
The data of the current batch.
- Return type
list of NDArray
- getindex()
Get index of the current batch.
- Returns
index – The indices of examples in the current batch.
- Return type
numpy.array
- getlabel()
Get label of the current batch.
- Returns
The label of the current batch.
- Return type
list of NDArray
- getpad()
Get the number of padding examples in the current batch.
- Returns
Number of padding examples in the current batch.
- Return type
int
- iter_next()
Move to the next batch.
- Returns
Whether the move is successful.
- Return type
boolean
- next()
Returns the next batch of data.
- reset()
Resets the iterator after the full epoch. DALI iterators do not support resetting before the end of the epoch and will ignore such a request.
- property size
- nvidia.dali.plugin.mxnet.feed_ndarray(dali_tensor, arr, cuda_stream=None)
Copy contents of DALI tensor to MXNet’s NDArray.
- Parameters
dali_tensor (nvidia.dali.backend.TensorCPU or nvidia.dali.backend.TensorGPU) – Tensor from which to copy
arr (mxnet.nd.NDArray) – Destination of the copy
cuda_stream (cudaStream_t handle or any value that can be cast to cudaStream_t) – CUDA stream to be used for the copy (if not provided, an internal user stream will be selected). In most cases, using the default internal user stream or stream 0 is expected.
- nvidia.dali.plugin.mxnet.get_mx_array(shape, ctx=None, dtype=None)