nvidia.dali.fn.python_function#

nvidia.dali.fn.python_function(*input, function, batch_processing=False, bytes_per_sample_hint=[0], num_outputs=1, output_layouts=None, preserve=False, seed=-1, device=None, name=None)#

Executes a Python function.

This operator can be used to execute custom Python code in the DALI pipeline. The function receives the data from DALI as NumPy arrays for CPU operators or as CuPy arrays for GPU operators, and it is expected to return the results in the same format. For a more universal data format, see nvidia.dali.fn.dl_tensor_python_function(). The function should not modify its input tensors.
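As a sketch of the callback contract described above, the function below receives one NumPy array per sample (CPU backend) and returns a new array instead of modifying its input in place. The function name and the normalization itself are illustrative, not part of the DALI API:

```python
import numpy as np

# A per-sample callback for fn.python_function on the CPU backend:
# it receives one NumPy array per input and must return new arrays
# rather than modifying its arguments in place.
def normalize(sample):
    sample = sample.astype(np.float32)   # makes a copy; input stays untouched
    return (sample - sample.mean()) / (sample.std() + 1e-7)

# Inside a pipeline this callback would be wired up roughly as:
#   out = fn.python_function(images, function=normalize)
# Note that with the classic executor the pipeline must be built with
# exec_async=False and exec_pipelined=False for this operator to run.
img = np.arange(12, dtype=np.uint8).reshape(3, 4)
out = normalize(img)
```

Because the callback copies its input via astype, the original array is left unchanged, as the contract requires.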

Warning

This operator is not compatible with TensorFlow integration.

Warning

When the pipeline has conditional execution enabled, additional steps must be taken to prevent the function from being rewritten by AutoGraph. There are two ways to achieve this:

  1. Define the function at global scope (i.e. outside of the pipeline_def scope).

  2. If the function is the result of another “factory” function, then the factory function must be defined outside the pipeline definition function and decorated with @do_not_convert.

More details can be found in @do_not_convert documentation.
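The two patterns above can be sketched as follows. A no-op stand-in replaces the real @do_not_convert decorator (normally imported from DALI) so the sketch runs without DALI installed; the callback names are illustrative:

```python
import numpy as np

# Stand-in so this sketch runs without DALI installed; in a real
# pipeline, use DALI's own do_not_convert decorator instead.
def do_not_convert(func):
    return func

# Pattern 1: a plain callback defined at global scope, outside the
# pipeline definition function, is not rewritten by AutoGraph.
def invert(sample):
    return 255 - sample

# Pattern 2: a callback produced by a "factory" function. The factory
# is defined at global scope and decorated with @do_not_convert, so
# the closures it returns are also left alone by AutoGraph.
@do_not_convert
def add_offset_factory(offset):
    def add_offset(sample):
        return sample + offset
    return add_offset

callback = add_offset_factory(10)
```

Either `invert` or `callback` could then be passed as the `function` argument of fn.python_function inside a conditionally-executed pipeline.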

This operator allows sequence inputs and supports volumetric data.

This operator will not be optimized out of the graph.

Supported backends
  • ‘cpu’

  • ‘gpu’

Parameters:

__input_[0..255] (TensorList, optional) – This function accepts up to 256 optional positional inputs

Keyword Arguments:
  • function (object) –

    A callable object that defines the function of the operator.

    Warning

    The function must not hold a reference to the pipeline in which it is used. If it does, a circular reference to the pipeline will form and the pipeline will never be freed.

  • batch_processing (bool, optional, default = False) –

    Determines whether the function is invoked once per batch or separately for every sample in the batch.

    If set to True, the function will receive its arguments as lists of NumPy or CuPy arrays, for CPU and GPU backend, respectively.

  • bytes_per_sample_hint (int or list of int, optional, default = [0]) –

    Output size hint, in bytes per sample.

    If specified, the operator’s outputs residing in GPU or page-locked host memory will be preallocated to accommodate a batch of samples of this size.

  • num_outputs (int, optional, default = 1) – Number of outputs.

  • output_layouts (layout str or list of layout str, optional) –

    Tensor data layouts for the outputs.

    This argument can be a list that contains a distinct layout for each output. If the list has fewer than num_outputs elements, only the leading outputs receive a layout and the remaining outputs have no layout assigned.

  • preserve (bool, optional, default = False) – Prevents the operator from being removed from the graph even if its outputs are not used.

  • seed (int, optional, default = -1) –

    Random seed.

    If not provided, it will be populated based on the global seed of the pipeline.
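The batch_processing argument changes the shape of the data the callback receives. The sketch below contrasts the two callback forms using plain NumPy, outside of any pipeline; the function names are illustrative:

```python
import numpy as np

# With batch_processing=False (the default), the callback is invoked
# once per sample and receives a single array per input:
def per_sample(sample):
    return sample * 2

# With batch_processing=True, the callback is invoked once per batch
# and receives each input as a list of arrays; it must return lists
# of the same length:
def per_batch(batch):
    return [sample * 2 for sample in batch]

batch = [np.full((2, 2), i) for i in range(4)]
doubled = per_batch(batch)
```

Per-batch processing is useful when the work can be vectorized across samples; otherwise the default per-sample form keeps the callback simpler.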