Per-Stage Runtime Environments
Run pipeline stages with different Python package versions in the same pipeline. Each stage can declare a runtime_env that tells Ray to create an isolated virtualenv for that stage’s workers, so incompatible library versions coexist without conflicts.
Overview
Some curation pipelines require stages that depend on different versions of the same library. For example, one stage might need transformers==4.40.0 for a specific model checkpoint, while another stage needs transformers==4.45.0 for a newer model. Without isolation, these stages cannot coexist in the same pipeline.
Per-stage runtime environments solve this by using Ray’s native runtime_env support. When a stage declares a runtime_env, Ray creates and caches an isolated virtualenv under /tmp/ray/session_latest/runtime_resources/pip/<hash>/virtualenv. Each unique dependency set gets its own cached virtualenv, reused for the lifetime of the Ray session. No driver-side virtualenv creation or PYTHONPATH manipulation is needed.
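The "one cached virtualenv per unique dependency set" behavior can be pictured as a lookup keyed on the runtime_env contents. The sketch below is illustrative only: env_cache_key is a hypothetical helper, and Ray computes the <hash> path component with its own internal scheme, which this code does not reproduce.

```python
import hashlib
import json


def env_cache_key(runtime_env: dict) -> str:
    """Return a stable digest for a runtime_env spec (illustrative stand-in).

    Ray derives its own hash internally; this only demonstrates that
    identical dependency sets map to the same key.
    """
    canonical = json.dumps(runtime_env, sort_keys=True)
    return hashlib.sha1(canonical.encode("utf-8")).hexdigest()


old_env = {"pip": ["transformers==4.40.0"]}
new_env = {"pip": ["transformers==4.45.0"]}

# Same spec, same key: the virtualenv would be built once and reused.
same_key = env_cache_key(old_env) == env_cache_key({"pip": ["transformers==4.40.0"]})
# Different spec, different key: each stage would get its own virtualenv.
distinct_key = env_cache_key(old_env) != env_cache_key(new_env)
print(same_key, distinct_key)
```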
Usage
Declare dependencies on a stage class
Set runtime_env as a class variable on your ProcessingStage subclass:
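A minimal sketch of the pattern follows. The ProcessingStage class here is a local stand-in so the snippet runs on its own; it is not NeMo Curator's real base class, and LegacyModelStage is a hypothetical stage name. Only the runtime_env class variable and the "pip" key come from the text above.

```python
from typing import Any, ClassVar, Optional


class ProcessingStage:
    """Local stand-in for the real ProcessingStage base class."""

    # None means "run in the base environment" (the default).
    runtime_env: ClassVar[Optional[dict[str, Any]]] = None

    def process(self, task: Any) -> Any:
        raise NotImplementedError


class LegacyModelStage(ProcessingStage):
    """Hypothetical stage pinned to an older transformers release."""

    # Declared once on the class, so every instance inherits it.
    runtime_env = {"pip": ["transformers==4.40.0"]}

    def process(self, task: Any) -> Any:
        # In an isolated worker, `import transformers` would resolve to 4.40.0.
        return task
```

With the real base class, you would import ProcessingStage from NeMo Curator instead of defining the stand-in.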
Override at instantiation time
Use with_() to change the runtime environment for a specific pipeline without modifying the stage class:
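The sketch below shows the intent of an instance-level override. The with_() implementation is a simplified stand-in written for this example; the real method may accept additional arguments and behave differently. TokenizeStage is a hypothetical stage name.

```python
import copy
from typing import Optional


class ProcessingStage:
    """Local stand-in; the real with_() signature is an assumption here."""

    runtime_env: Optional[dict] = None

    def with_(self, runtime_env: Optional[dict] = None) -> "ProcessingStage":
        # Work on a copy so the class default and the original instance
        # are left untouched.
        new_stage = copy.copy(self)
        if runtime_env is not None:
            new_stage.runtime_env = runtime_env
        return new_stage


class TokenizeStage(ProcessingStage):
    runtime_env = {"pip": ["transformers==4.40.0"]}


default_stage = TokenizeStage()
# Override for one pipeline only; TokenizeStage itself still pins 4.40.0.
newer_stage = default_stage.with_(runtime_env={"pip": ["transformers==4.45.0"]})
```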
Use uv as the package installer
Ray also supports uv as the package installer inside worker virtualenvs. Use the "uv" key instead of "pip" for faster installs:
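As a sketch, the two forms differ only in the top-level key. The use_uv helper below is hypothetical and exists just to make the relationship explicit; in practice you would write the "uv" dictionary directly.

```python
def use_uv(runtime_env: dict) -> dict:
    """Return a copy of runtime_env with the "pip" key renamed to "uv"."""
    converted = dict(runtime_env)
    if "pip" in converted:
        converted["uv"] = converted.pop("pip")
    return converted


pip_env = {"pip": ["transformers==4.45.0"]}
# Same package list, installed by uv inside the worker virtualenv.
uv_env = use_uv(pip_env)
```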
Both "pip" and "uv" keys work regardless of which package manager your local environment uses. The key only controls which installer Ray uses inside the worker virtualenv.
Backend Support
Per-stage runtime environments work with all three execution backends:
Behavior
- Additive isolation: Isolated virtualenvs are cloned from the base environment. Packages installed in the base environment (such as NeMo Curator and its dependencies) remain importable in isolated workers unless explicitly overridden.
- Caching: Ray caches each unique runtime_env specification. The first task dispatched to a new runtime_env triggers virtualenv creation; subsequent tasks reuse the cached environment.
- No runtime_env: Stages that do not set runtime_env (the default) run in the base Python environment with no isolation overhead.
Container Setup
The NeMo Curator container image creates its virtualenv with uv venv --seed, which ensures that pip is available inside the venv. This is required because Ray’s pip-based runtime_env plugin clones the current virtualenv and needs pip to install stage-specific packages in the clone.
If you are running outside the official container, make sure your Python environment has pip available:
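One way to check this from Python itself is sketched below. pip_available is a hypothetical helper name; it only tests whether the running interpreter can import pip. The ensurepip suggestion in the message is a general Python mechanism, not something specific to NeMo Curator, and some distributions ship Python without it.

```python
import importlib.util
import sys


def pip_available() -> bool:
    """Return True if the running interpreter can import pip."""
    return importlib.util.find_spec("pip") is not None


if pip_available():
    print(f"pip is importable from {sys.prefix}")
else:
    # ensurepip is part of the standard library on most Python builds.
    print(f"pip is missing from {sys.prefix}; "
          "`python -m ensurepip --upgrade` can usually add it.")
```

Run this with the same interpreter that launches the pipeline, since that is the environment Ray clones.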
Example: Multi-Version Pipeline
This example runs three stages in a single pipeline, each seeing a different version of the packaging library. It uses RecordPackagingVersionStage, a test stage from PR #1623 that records the packaging library version visible to each worker:
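The implementation of RecordPackagingVersionStage is not reproduced here. As a rough sketch of the idea, a worker can read the packaging version visible to its interpreter with importlib.metadata. The visible_version helper, the stage names, and the pinned versions below are all assumptions chosen for illustration, not values taken from the PR.

```python
from importlib import metadata
from typing import Optional


def visible_version(distribution: str = "packaging") -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


# Hypothetical per-stage environments: two pinned, one left on the base env.
stage_envs = {
    "stage_a": {"pip": ["packaging==23.2"]},
    "stage_b": {"pip": ["packaging==24.1"]},
    "stage_c": None,  # no runtime_env, so the base environment is used
}

# Run in the driver, this reports the base environment's version. Inside
# each isolated worker, the same call would report that stage's pin.
print("packaging visible here:", visible_version())
```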
Limitations
- First-task latency: The first task dispatched to a stage with a new runtime_env incurs virtualenv creation time. Subsequent tasks reuse the cached environment.
- Disk usage: Each unique runtime_env creates a separate virtualenv on each worker node. Monitor disk space under /tmp/ray/ for large clusters with many distinct environments.
- Session scope: Cached virtualenvs are tied to the Ray session. Restarting Ray clears the cache.
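To keep an eye on the disk usage noted above, a small script can total the size of the cached virtualenvs on a node. This is a sketch using only the standard library; dir_size_bytes is a hypothetical helper, and the path is the cache location given in the Overview, which may differ if Ray is configured with a non-default temp directory.

```python
import os


def dir_size_bytes(root: str) -> int:
    """Sum the sizes of regular files under root; 0 if root is missing."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks so linked interpreters are not counted.
            if os.path.islink(path):
                continue
            try:
                total += os.path.getsize(path)
            except OSError:
                # The file vanished or is unreadable; ignore it.
                pass
    return total


PIP_CACHE = "/tmp/ray/session_latest/runtime_resources/pip"
print(f"{dir_size_bytes(PIP_CACHE) / 1e6:.1f} MB under {PIP_CACHE}")
```

Each worker node holds its own cache, so the script needs to run on every node you want to measure.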