Ecosystem
NeMo Relay is the agent execution runtime layer in the NVIDIA NeMo ecosystem. It does not replace an agent framework, model provider, guardrail authoring system, or deployment platform. Instead, it gives those systems one shared way to model execution scopes, lifecycle events, middleware, plugins, adaptive behavior, and observability around tool and LLM calls.
Use this page to understand where NeMo Relay fits:
- Inside the NVIDIA NeMo software stack
- Inside agent frameworks, harnesses, and provider adapters
- Across the Rust, Python, Node.js, Go, WebAssembly, and C FFI surfaces in this repository
How NeMo Relay Fits In The NVIDIA NeMo Ecosystem
The NVIDIA NeMo ecosystem spans model development, agent construction, guardrailing, inference, optimization, and runtime operations. NeMo Relay has a narrower responsibility: it is the portable execution substrate that agent systems can call when actual work crosses a scope, tool, or model boundary.
In practical terms, NeMo Relay answers a different question than higher-level agent products do. A framework asks, “What should the agent do next?” NeMo Relay asks, “When the agent does work, which scope owns it, which middleware applies, what events are emitted, and which subscribers can consume the result?”
The dotted path matters. An application or custom harness can call NeMo Relay directly without adopting a higher-level framework. A framework integration can also call NeMo Relay on behalf of application code when the framework owns the tool or provider boundary.
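The questions above (which scope owns the work, which middleware applies, which events are emitted, and which subscribers consume them) can be illustrated with a toy model. This is a minimal sketch under stated assumptions, not NeMo Relay's API: `ToyRuntime`, `emit`, `call_tool`, and the event field names are all hypothetical and exist only to show the shape of the idea.

```python
from dataclasses import dataclass, field
from typing import Any, Callable

# Hypothetical toy model; none of these names come from the NeMo Relay API.
Event = dict[str, Any]
Middleware = Callable[[dict[str, Any]], dict[str, Any]]
Subscriber = Callable[[Event], None]


@dataclass
class ToyRuntime:
    scopes: list[str] = field(default_factory=list)  # stack of active scope names
    middleware: list[Middleware] = field(default_factory=list)
    subscribers: list[Subscriber] = field(default_factory=list)

    def emit(self, kind: str, **data: Any) -> None:
        # Every event records which scope owned the work.
        event = {"kind": kind, "scope": "/".join(self.scopes), **data}
        for subscriber in self.subscribers:
            subscriber(event)

    def call_tool(
        self, name: str, fn: Callable[[dict[str, Any]], Any], args: dict[str, Any]
    ) -> Any:
        # Middleware runs first and may rewrite the request.
        for mw in self.middleware:
            args = mw(args)
        self.emit("tool_start", tool=name, args=args)
        result = fn(args)
        self.emit("tool_end", tool=name, result=result)
        return result


runtime = ToyRuntime()
events: list[Event] = []
runtime.subscribers.append(events.append)  # a subscriber that collects events
runtime.middleware.append(lambda args: {**args, "query": args["query"].strip()})

# Open two nested scopes, do one tool call, then close them.
runtime.scopes.append("agent")
runtime.scopes.append("search-step")
answer = runtime.call_tool("search", lambda a: a["query"].upper(), {"query": "  relay "})
runtime.scopes.pop()
runtime.scopes.pop()
```

In this sketch the caller decides what to do (call `search`), while the toy runtime decides how that call is scoped, transformed, and reported. That division of labor is the point of the paragraph above.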
How NeMo Relay Fits Agent Frameworks And Harnesses
The agent framework and harness landscape is intentionally mixed. A team might use NeMo Agent Toolkit, LangChain, LangGraph, an internal orchestration layer, a provider SDK, or direct application code. NeMo Relay is designed to meet those systems at stable execution boundaries instead of requiring one framework shape.
Prefer a managed execution wrapper when a framework exposes a stable callback that NeMo Relay can own. Use explicit lifecycle calls or standalone helpers when the framework owns the callback internally but exposes reliable start, finish, or request transformation hooks.
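The difference between the two integration styles can be sketched as follows. The names `managed_call`, `on_start`, and `on_finish` are hypothetical stand-ins chosen for this illustration. They are not NeMo Relay functions, and a real integration would use whatever the relevant language binding exposes.

```python
from typing import Any, Callable

# Hypothetical helpers for illustration; not the NeMo Relay API.
log: list[tuple[str, str]] = []


def managed_call(name: str, callback: Callable[..., Any], *args: Any) -> Any:
    """Managed style: the wrapper owns the callback and brackets it itself."""
    log.append(("start", name))
    try:
        return callback(*args)
    finally:
        log.append(("finish", name))  # recorded even if the callback raises


def on_start(name: str) -> None:
    """Explicit style: the framework reports the start from its own hook."""
    log.append(("start", name))


def on_finish(name: str) -> None:
    """Explicit style: the framework reports the finish from its own hook."""
    log.append(("finish", name))


# Managed: the integration hands the callable to the wrapper.
total = managed_call("add", lambda a, b: a + b, 2, 3)

# Explicit: a framework that owns the call invokes the hooks around it.
on_start("lookup")
value = {"k": 1}["k"]  # stands in for work the framework runs internally
on_finish("lookup")
```

The managed form cannot forget the finish step because the `finally` clause always runs. The explicit form relies on the framework's hooks firing reliably, which is why it suits frameworks that expose dependable start and finish points.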
Together, these two integration styles let NeMo Relay provide consistent runtime semantics without forcing a framework migration:
- Applications keep their existing agent orchestration model
- Framework adapters preserve public behavior and callback signatures
- Non-serializable provider objects stay in framework-owned storage
- NeMo Relay receives JSON-compatible payloads for middleware and events
- Subscribers see a consistent scope, tool, and LLM event stream across integrations
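One way to picture the split between framework-owned storage and JSON-compatible payloads is an opaque handle scheme. The sketch below is an assumption about how an adapter might do this, not a description of how NeMo Relay or any existing adapter does it. `framework_storage`, `to_payload`, `from_payload`, and `ProviderClient` are hypothetical names.

```python
import json
import uuid
from typing import Any

# Hypothetical adapter-side registry; the names are illustrative only.
framework_storage: dict[str, Any] = {}


class ProviderClient:
    """Stands in for a non-serializable provider object (sessions, sockets)."""


def to_payload(request: dict[str, Any]) -> dict[str, Any]:
    """Replace non-JSON values with opaque handles kept on the framework side."""
    payload: dict[str, Any] = {}
    for key, value in request.items():
        try:
            json.dumps(value)  # probe: is this value JSON-compatible?
            payload[key] = value
        except (TypeError, ValueError):
            handle = f"handle:{uuid.uuid4().hex}"
            framework_storage[handle] = value
            payload[key] = handle
    return payload


def from_payload(payload: dict[str, Any]) -> dict[str, Any]:
    """Swap handles back for the stored objects after middleware has run."""
    return {
        key: framework_storage.get(value, value) if isinstance(value, str) else value
        for key, value in payload.items()
    }


client = ProviderClient()
request = {"model": "example-model", "temperature": 0.2, "client": client}

payload = to_payload(request)      # safe to pass to middleware and events
encoded = json.dumps(payload)      # serializes without error
restored = from_payload(payload)   # the framework gets its object back
```

Middleware and subscribers in this sketch only ever see strings, numbers, and other JSON values, while the original `client` object never leaves the adapter.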