{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Parallel External Source\n", "\n", "In this tutorial we will show you how to enable parallel mode in the `external_source` operator, allowing the `source` to be executed concurrently by Python worker processes. Doing so can reduce the execution time of each iteration by allowing the `source` to run in the background, unblocking the main Python thread. Not every usage of `external_source` can be parallelized straight away: the `source` argument is subject to a set of restrictions, and additional configuration is required." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Accepted `source`\n", "Depending on the `batch` parameter, the `source` provided to parallel External Source is expected to return either batches or single samples. In the sample mode, DALI takes care of combining samples into batches and has more opportunities to parallelize computation of the input data. Therefore, the sample mode is the preferred way to run parallel External Source. Parallel External Source in the sample mode places the following requirements on the `source` parameter:\n", "\n", "1. `source` must be a callable: a function or a callable object. \n", "2. The `source` callback must accept one argument: `nvidia.dali.types.SampleInfo`, indicating the index of the requested sample.\n", "3. Data returned by the callback must be a CPU array (or a tuple/list of them).\n", "\n", "The following paragraphs explain the reasoning behind those requirements and show examples of using the parallel External Source.\n", " " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Principle of Operation\n", "\n", "Before every iteration, the DALI External Source operator queries its `source` parameter for new data to pass on for processing in the pipeline. 
The time necessary to obtain data from `source` by calling it (when the `source` is callable) or calling `next(source)` (in the case of an iterable) can be significant and can increase the time to process an iteration, especially as it is a blocking operation in the main Python thread.\n", "\n", "Setting `parallel=True` for an `external_source` node tells the pipeline to run the `source` in Python worker processes started by DALI. The worker processes are bound to the pipeline and shared by all parallel external sources in that pipeline.\n", "Each of those workers keeps a **copy** of the `source` callback/object. The workers are separate processes, so keep in mind that they do not share any global state, only a copy of what was specified before starting them.\n", "\n", "In the sample mode, each process can request a particular sample from its copy of the `source` callback by invoking it with a [SampleInfo](https://docs.nvidia.com/deeplearning/dali/user-guide/docs/data_types.html#sampleinfo) object containing the requested sample's index. \n", "The DALI pipeline splits computation of samples needed for the next batches between the workers ahead of time and collects the data back for use in the current iteration.\n", "\n", "In the batch mode, DALI cannot request a particular sample to be returned, thus the benefits of parallelization are limited compared to the sample mode. If the `source` is a callable that accepts [BatchInfo](https://docs.nvidia.com/deeplearning/dali/user-guide/docs/data_types.html#nvidia.dali.types.BatchInfo), a few batches can be prefetched in parallel. In the case of iterables, the only benefit is running the iterable in a separate process.\n", "\n", "Because the parallel sample mode can provide the biggest speedup, we present how to adapt an iterable source to run in parallel sample mode." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "