> For clean Markdown of any page, append .md to the page URL.
> For a complete documentation index, see https://docs.nvidia.com/aistore/llms.txt.
> For full documentation content, see https://docs.nvidia.com/aistore/llms-full.txt.
> For AI client integration (Claude Code, Cursor, etc.), connect to the MCP server at https://docs.nvidia.com/aistore/_mcp/server.

# aistore.sdk.obj.content_iterator.parallel

Parallel object download using forked workers and shared memory.

Provides two download modes:

* `read_all()`: workers write directly into a single full-size
  `ParallelBuffer` (shared memory).  The caller receives a memoryview
  with no additional copy on the return path.

* `create_iter()`: workers write into a fixed-size ring buffer in
  shared memory; the main process yields chunks in order.  See the
  `create_iter` docstring for the detailed data flow.

## Module Contents

### Classes

| Name                                                                                                    | Description                                                                 |
| ------------------------------------------------------------------------------------------------------- | --------------------------------------------------------------------------- |
| [`ParallelContentIterProvider`](#aistore-sdk-obj-content_iterator-parallel-ParallelContentIterProvider) | Parallel content iterator backed by a shared-memory ring buffer.            |
| [`WorkerState`](#aistore-sdk-obj-content_iterator-parallel-WorkerState)                                 | Per-worker state populated by \_init\_worker before any task is dispatched. |

### Functions

| Name                                                                      | Description                                                                       |
| ------------------------------------------------------------------------- | --------------------------------------------------------------------------------- |
| [`_fetch_chunk`](#aistore-sdk-obj-content_iterator-parallel-_fetch_chunk) | Download byte range `[start, end)` and write into `worker_state.shm` at `offset`. |
| [`_init_worker`](#aistore-sdk-obj-content_iterator-parallel-_init_worker) | ProcessPoolExecutor initializer hook — runs once per worker after forking.        |

### Data

[`worker_state`](#aistore-sdk-obj-content_iterator-parallel-worker_state)

### API

```python
class aistore.sdk.obj.content_iterator.parallel.ParallelContentIterProvider(
    client: aistore.sdk.obj.object_client.ObjectClient,
    chunk_size: typing.Optional[int],
    num_workers: int
)
```

**Bases:** [BaseContentIterProvider](/python/aistore/sdk/obj/content_iterator/base#aistore-sdk-obj-content_iterator-base-BaseContentIterProvider)

Parallel content iterator backed by a shared-memory ring buffer.

Splits the object into fixed-size byte ranges and fetches them in parallel
using `ProcessPoolExecutor`.  Workers write directly into a bounded ring
buffer of `num_workers` slots. Chunks are yielded to the caller in order.

**Parameters:**

Client for accessing contents of an individual object.

Size of each chunk of data yielded. If None,
will attempt to use optimal chunk size from HeadObjectV2 response.

Number of concurrent workers for fetching chunks.

Get the number of concurrent workers.

```python
aistore.sdk.obj.content_iterator.parallel.ParallelContentIterProvider._build_chunk_ranges(
    offset: int
) -> typing.List[typing.Tuple[int, int]]
```

Return a list of `(start, end)` byte ranges covering the object from `offset`.

```python
aistore.sdk.obj.content_iterator.parallel.ParallelContentIterProvider._fill_shm(
    dst: aistore.sdk.obj.content_iterator.buffer.ParallelBuffer
) -> None
```

Download the entire object into *dst* (direct-to-destination mode).

Each worker owns a disjoint byte range within *dst* — no
synchronization beyond waiting for all futures.

**Parameters:**

A `ParallelBuffer` of at least `_object_size` bytes,
typically created and owned by `read_all()`.

```python
aistore.sdk.obj.content_iterator.parallel.ParallelContentIterProvider._get_fork_context()
```

staticmethod

Return the 'fork' multiprocessing context, or raise RuntimeError on unsupported platforms.

```python
aistore.sdk.obj.content_iterator.parallel.ParallelContentIterProvider.create_iter(
    offset: int = 0
) -> typing.Generator[bytes, None, None]
```

Yield object content in order using a sliding-window ring buffer.

Keeps at most `num_workers` range-reads in flight at any time.
The ring buffer lives in `/dev/shm`::

/dev/shm  ┌──────────┬──────────┬──────────┐
│ \[slot 0] │ \[slot 1] │ \[slot 2] │  num\_slots x slot\_size bytes
└──────────┴──────────┴──────────┘
chunks:   \[ chunk 0 ]\[ chunk 1 ]\[ chunk 2 ]\[ chunk 3 ]..
slot = chunk\_idx % num\_slots

Main Process                       Forked Workers
─────────────                      ──────────────────────────────
\[slot 0]      \[slot 1]      \[slot 2]
submit chunk 0 ───────────────> write slot 0
submit chunk 1 ─────────────────────────────> write slot 1
submit chunk 2 ───────────────────────────────────────────> write slot 2
future\[0].result()  ← done
wait\_slot(0)        ← event set
yield read\_slot(0)
submit chunk 3 ───────────────> write slot 0  (reuse)
...

**Parameters:**

Starting byte offset. Defaults to 0.

```python
aistore.sdk.obj.content_iterator.parallel.ParallelContentIterProvider.read_all() -> aistore.sdk.obj.content_iterator.buffer.ParallelBuffer
```

Download the entire object into shared memory and return a :class:`ParallelBuffer`.

Workers write directly into a `SharedMemory` segment; the caller
receives a memoryview of that segment with no extra allocation or copy.
The caller **must** call `result.close()` (or use it as a context
manager) to release the shared memory when done.

```python
class aistore.sdk.obj.content_iterator.parallel.WorkerState(
    client: typing.Optional[aistore.sdk.obj.object_client.ObjectClient] = None,
    shm: typing.Optional[multiprocessing.shared_memory.SharedMemory] = None,
    slot_ready: typing.List[multiprocessing.Event] = list()
)
```

Dataclass

Per-worker state populated by \_init\_worker before any task is dispatched.

This instance lives at module scope so each forked worker inherits it via
fork's memory copy. \_init\_worker populates it once per worker; \_fetch\_chunk
reads it on every invocation without re-serialization.

```python
aistore.sdk.obj.content_iterator.parallel._fetch_chunk(
    start: int,
    end: int,
    offset: int,
    slot_idx: int = -1
) -> int
```

Download byte range `[start, end)` and write into `worker_state.shm` at `offset`.

**Parameters:**

First byte of the range to fetch from the object.

One past the last byte (exclusive).

Byte offset in the shm segment to write into. Ring-buffer
callers pass `slot * chunk_size`; direct callers pass `start`.

If >= 0, signals `slot_ready[slot_idx]` on completion
(ring-buffer mode). Pass -1 to skip (direct mode).

```python
aistore.sdk.obj.content_iterator.parallel._init_worker(
    client: aistore.sdk.obj.object_client.ObjectClient,
    shm_name: str,
    slot_ready: typing.Optional[typing.List[multiprocessing.Event]] = None
) -> None
```

ProcessPoolExecutor initializer hook — runs once per worker after forking.

Populates `worker_state` before any `_fetch_chunk` task is dispatched so the
worker is fully initialized before it does any work.

**Parameters:**

Client for the worker to use to download chunks.

Name of the shared memory segment to attach to.

Per-slot events for ring-buffer
mode. Omit for direct-to-destination callers.

```python
aistore.sdk.obj.content_iterator.parallel.worker_state = WorkerState()
```