aistore.sdk.obj.content_iterator.parallel
Parallel object download using forked workers and shared memory.
Provides two download modes:
- read_all(): workers write directly into a single full-size ParallelBuffer (shared memory). The caller receives a memoryview with no additional copy on the return path.
- create_iter(): workers write into a fixed-size ring buffer in shared memory; the main process yields chunks in order. See the create_iter docstring for the detailed data flow.
API
Bases: BaseContentIterProvider
Parallel content iterator backed by a shared-memory ring buffer.
Splits the object into fixed-size byte ranges and fetches them in parallel
using ProcessPoolExecutor. Workers write directly into a bounded ring
buffer of num_workers slots. Chunks are yielded to the caller in order.
Parameters:
Client for accessing contents of an individual object.
Size of each chunk of data yielded. If None, the provider attempts to use the optimal chunk size reported in the HeadObjectV2 response.
Number of concurrent workers for fetching chunks.
Get the number of concurrent workers.
Return a list of (start, end) byte ranges covering the object from offset.
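The range computation can be illustrated with a minimal sketch. The function name `byte_ranges` and its parameters are hypothetical, and the end of each range is assumed to be exclusive, matching the `[start, end)` convention used by the chunk-fetching function documented further down.

```python
def byte_ranges(object_size: int, chunk_size: int, offset: int = 0):
    """Return (start, end) pairs covering [offset, object_size).

    Hypothetical helper for illustration; `end` is exclusive.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [
        (start, min(start + chunk_size, object_size))  # last range may be short
        for start in range(offset, object_size, chunk_size)
    ]


print(byte_ranges(10, 4))            # [(0, 4), (4, 8), (8, 10)]
print(byte_ranges(10, 4, offset=6))  # [(6, 10)]
```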
Download the entire object into dst (direct-to-destination mode).
Each worker owns a disjoint byte range within dst, so no synchronization is needed beyond waiting for all futures.
Parameters:
A ParallelBuffer of at least _object_size bytes,
typically created and owned by read_all().
Return the ‘fork’ multiprocessing context, or raise RuntimeError on unsupported platforms.
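One plausible way to obtain the context is sketched below. The function name `get_fork_context` and the error message are assumptions; the only library behavior relied on is that `multiprocessing.get_context` raises `ValueError` when the requested start method is unavailable.

```python
import multiprocessing


def get_fork_context():
    """Return the 'fork' context, or raise RuntimeError where fork is unavailable.

    Hypothetical helper; not the SDK's actual implementation.
    """
    try:
        return multiprocessing.get_context("fork")
    except ValueError as err:
        # e.g. Windows, where only the 'spawn' start method exists
        raise RuntimeError(
            "parallel download requires the 'fork' start method"
        ) from err
```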
Yield object content in order using a sliding-window ring buffer.
Keeps at most num_workers range-reads in flight at any time.
The ring buffer lives in /dev/shm::
    /dev/shm
    ┌──────────┬──────────┬──────────┐
    │ [slot 0] │ [slot 1] │ [slot 2] │   num_slots x slot_size bytes
    └──────────┴──────────┴──────────┘

    chunks: [ chunk 0 ][ chunk 1 ][ chunk 2 ][ chunk 3 ]..
    slot = chunk_idx % num_slots

    Main Process                    Forked Workers
    ─────────────                   ──────────────────────────────
                                    [slot 0]      [slot 1]      [slot 2]
    submit chunk 0 ───────────────> write slot 0
    submit chunk 1 ─────────────────────────────> write slot 1
    submit chunk 2 ───────────────────────────────────────────> write slot 2
    future[0].result()  ← done
    wait_slot(0)        ← event set
    yield read_slot(0)
    submit chunk 3 ───────────────> write slot 0 (reuse)
    …
Parameters:
Starting byte offset. Defaults to 0.
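The sliding-window flow of the ring buffer can be sketched as follows. This is a simplified model, not the module's code: it swaps in threads and a `bytearray` for forked workers and shared memory so that it runs anywhere, and the names `iter_in_order`, `fetch`, and `source` are hypothetical.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def iter_in_order(source: bytes, chunk_size: int, num_slots: int):
    """Yield `source` in order through a ring of `num_slots` fixed-size slots."""
    size = len(source)
    ranges = [(s, min(s + chunk_size, size)) for s in range(0, size, chunk_size)]
    ring = bytearray(num_slots * chunk_size)  # stand-in for the shm segment
    slot_ready = [threading.Event() for _ in range(num_slots)]

    def fetch(start, end, slot):
        # Stand-in for a range read: copy bytes into the slot, then signal it.
        offset = slot * chunk_size
        ring[offset:offset + (end - start)] = source[start:end]
        slot_ready[slot].set()

    with ThreadPoolExecutor(max_workers=num_slots) as pool:
        futures = {}
        next_submit = 0
        # Prime the window: at most num_slots reads in flight.
        while next_submit < min(num_slots, len(ranges)):
            s, e = ranges[next_submit]
            futures[next_submit] = pool.submit(fetch, s, e, next_submit % num_slots)
            next_submit += 1

        for idx, (s, e) in enumerate(ranges):
            slot = idx % num_slots
            futures.pop(idx).result()  # re-raises any worker error
            slot_ready[slot].wait()
            offset = slot * chunk_size
            data = bytes(ring[offset:offset + (e - s)])  # copy out before reuse
            slot_ready[slot].clear()
            # The slot is free again, so the next pending chunk can reuse it.
            if next_submit < len(ranges):
                s2, e2 = ranges[next_submit]
                futures[next_submit] = pool.submit(fetch, s2, e2, next_submit % num_slots)
                next_submit += 1
            yield data
```

Because a new chunk is submitted only after the chunk occupying the same slot has been copied out, chunk `idx + num_slots` always lands in the slot just freed by chunk `idx`.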
Download the entire object into shared memory and return a ParallelBuffer.
Workers write directly into a SharedMemory segment; the caller
receives a memoryview of that segment with no extra allocation or copy.
The caller must call result.close() (or use it as a context
manager) to release the shared memory when done.
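The ownership rule can be illustrated with a small stand-in class. `OwnedSharedBuffer` and its `view()` method are hypothetical and only mimic the close-or-context-manager contract described above; they are not the ParallelBuffer API.

```python
from multiprocessing import shared_memory


class OwnedSharedBuffer:
    """Hypothetical stand-in for a buffer that owns a shared-memory segment."""

    def __init__(self, size: int):
        self._shm = shared_memory.SharedMemory(create=True, size=size)
        self._size = size
        self.closed = False

    def view(self) -> memoryview:
        # Slice to the requested size; the OS may round the segment up.
        return self._shm.buf[: self._size]

    def close(self) -> None:
        if not self.closed:
            self._shm.close()   # fails if memoryviews are still exported
            self._shm.unlink()  # remove the segment from the system
            self.closed = True

    def __enter__(self):
        return self

    def __exit__(self, *exc):
        self.close()


with OwnedSharedBuffer(8) as buf:
    view = buf.view()
    view[:3] = b"abc"
    copied = bytes(view[:3])
    view.release()  # release views before the segment is closed
```

In this sketch, any memoryview still alive when `close()` runs makes the underlying `SharedMemory.close()` raise `BufferError`, which is why the view is released inside the `with` block.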
Per-worker state populated by _init_worker before any task is dispatched.
This instance lives at module scope so each forked worker inherits it via fork’s memory copy. _init_worker populates it once per worker; _fetch_chunk reads it on every invocation without re-serialization.
Download byte range [start, end) and write into worker_state.shm at offset.
Parameters:
First byte of the range to fetch from the object.
One past the last byte (exclusive).
Byte offset in the shm segment to write into. Ring-buffer
callers pass slot * chunk_size; direct callers pass start.
If >= 0, signals slot_ready[slot_idx] on completion
(ring-buffer mode). Pass -1 to skip (direct mode).
ProcessPoolExecutor initializer hook; runs once per worker after forking.
Populates worker_state before any _fetch_chunk task is dispatched, so every
task finds the worker fully initialized.
Parameters:
Client for the worker to use to download chunks.
Name of the shared memory segment to attach to.
Per-slot events for ring-buffer mode. Omit for direct-to-destination callers.
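The initializer pattern, combined with direct-mode offsets, might look like the sketch below. It assumes a platform where the 'fork' start method is available and that the code runs as a regular script or module; `init_worker`, `fetch_chunk`, `download_direct`, and the in-memory `source` standing in for the object client are all hypothetical names.

```python
import multiprocessing
from concurrent.futures import ProcessPoolExecutor
from multiprocessing import shared_memory
from types import SimpleNamespace

# Module-level state: each forked worker inherits this object, and the
# initializer fills it in once per worker process.
worker_state = SimpleNamespace(source=None, shm=None)


def init_worker(source, shm_name):
    worker_state.source = source  # stand-in for the object client
    worker_state.shm = shared_memory.SharedMemory(name=shm_name)  # attach


def fetch_chunk(start, end, offset):
    # Reads the per-worker state; nothing heavy is passed with each task.
    data = worker_state.source[start:end]
    worker_state.shm.buf[offset:offset + len(data)] = data
    return len(data)


def download_direct(source: bytes, chunk_size: int, num_workers: int) -> bytes:
    size = len(source)
    shm = shared_memory.SharedMemory(create=True, size=max(size, 1))
    try:
        ranges = [(s, min(s + chunk_size, size)) for s in range(0, size, chunk_size)]
        ctx = multiprocessing.get_context("fork")
        with ProcessPoolExecutor(
            max_workers=num_workers,
            mp_context=ctx,
            initializer=init_worker,
            initargs=(source, shm.name),
        ) as pool:
            # Direct mode: each range is written at its own start offset.
            futures = [pool.submit(fetch_chunk, s, e, s) for s, e in ranges]
            for future in futures:
                future.result()  # wait for all writes; re-raise worker errors
        # Copied out here only to keep the sketch simple to clean up.
        return bytes(shm.buf[:size])
    finally:
        shm.close()
        shm.unlink()
```

Since the ranges are disjoint, the workers never write to the same bytes, and waiting on the futures is the only coordination the parent needs.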