Custom LLM Frameworks for the NVIDIA NeMo Guardrails library
The NVIDIA NeMo Guardrails library has two layers of LLM extensibility: providers and frameworks. Most users only need the provider layer. This guide is for the smaller set of cases that need to replace the framework layer itself.
The Two-Layer Model
A provider is the name a user sets as the engine: value in config.yml, a label your framework dispatches on. In DefaultFramework, openai, nim, and ollama are provider names that all dispatch to the same OpenAIChatModel runtime; they differ only in default base URLs and small per-provider conventions. In LangChainFramework, each provider name dispatches to its own LangChain class. Your framework decides whether multiple provider names share one runtime or each name has its own. Adding a provider is the right move when you want to plug in one new backend and the surrounding framework’s behavior is fine. For details, refer to Custom LLM Providers and Custom LLM Model.
A framework owns the entire LLM stack: how models are constructed, how providers are looked up, and how resources are released at shutdown. Adding a framework is the right move when you want to replace the entire stack (for example, route everything through LiteLLM, a proprietary in-house orchestrator, or a service mesh).
In practice, almost every customization is a provider. A custom framework is reserved for cases where you are replacing more than one engine and need shared lifecycle management across them.
The LLMFramework Contract
The protocol is nemoguardrails.types.LLMFramework and is @runtime_checkable, so callers can verify a framework with isinstance(instance, LLMFramework). As a Python Protocol, it expresses a contract. Nothing prevents you from passing an object that duck-types most of it, but the rest of the NVIDIA NeMo Guardrails library assumes both invariants below hold:
- The registered object structurally matches the LLMFramework protocol (the four methods and their signatures listed below).
- Its reset attribute is an async coroutine function. The registry awaits it directly during test teardown.
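The two invariants above can be checked before registration. The sketch below is a minimal illustration, not the library's own validation: FrameworkLike is a local stand-in for the LLMFramework protocol, and its method signatures are assumptions inferred from the descriptions in this guide. It also shows why the second invariant is stated separately: an isinstance check against a runtime-checkable protocol only confirms that the methods exist, not that reset is a coroutine function.

```python
import inspect
from typing import Any, List, Protocol, runtime_checkable


@runtime_checkable
class FrameworkLike(Protocol):
    """Local stand-in for LLMFramework; the signatures here are assumed."""

    def create_model(self, model_name: str, provider_name: str, model_kwargs: dict) -> Any: ...
    def register_provider(self, name: str, provider_cls: type) -> None: ...
    def get_provider_names(self) -> List[str]: ...
    async def reset(self) -> None: ...


class GoodFramework:
    def create_model(self, model_name, provider_name, model_kwargs):
        return object()

    def register_provider(self, name, provider_cls):
        pass

    def get_provider_names(self):
        return []

    async def reset(self):
        pass


class SyncResetFramework(GoodFramework):
    def reset(self):  # plain function: passes isinstance, breaks "await"
        pass


def check_framework(obj: Any) -> List[str]:
    """Return a list of problems; an empty list means both invariants hold."""
    problems = []
    # Invariant 1: structural match (method presence only).
    if not isinstance(obj, FrameworkLike):
        problems.append("missing one or more protocol methods")
    # Invariant 2: reset must be awaitable as a coroutine function.
    if not inspect.iscoroutinefunction(getattr(obj, "reset", None)):
        problems.append("reset is not an async coroutine function")
    return problems
```

Running a check like this in a unit test catches a synchronous reset early, before the registry tries to await it.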
A custom framework implements four methods.
create_model
Called once per models: entry in config.yml when LLMRails builds its task models. model_name is the value of model:, provider_name is the value of engine:, and model_kwargs carries everything from the entry’s parameters block plus a few platform keys like mode. Your framework decides what provider_name means. Typically, you use it to dispatch to a specific LLMModel class or to pick provider-specific defaults. Return any object that implements LLMModel. For details, refer to Custom LLM Model.
The framework owns construction. It can cache and reuse expensive resources, such as HTTP clients, gRPC channels, and auth tokens. It can also inject defaults for headers, timeouts, and retries, or short-circuit on a registered custom provider. Review DefaultFramework and LangChainFramework for two contrasting implementations.
register_provider
Called by user code (usually from a config.py) to add a custom class your framework should dispatch to. Implementations typically just record the class in an in-memory dict; create_model then checks that dict before falling back to its built-in dispatch.
get_provider_names
Returns the list of provider names this framework knows about, including built-ins and anything registered at runtime. Used by tooling (nemoguardrails find_providers) and for debugging.
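The three methods described so far fit together as a small dispatch table. The sketch below is one possible arrangement under stated assumptions: the provider names alpha and beta, the example.invalid URLs, and the BuiltinModel and InHouseModel classes are all hypothetical, and the method signatures are inferred from this guide rather than copied from the library. reset is omitted here to keep the focus on dispatch.

```python
from typing import Any, Dict, List


class BuiltinModel:
    """Hypothetical runtime shared by several built-in provider names."""

    def __init__(self, model_name: str, base_url: str, **kwargs: Any) -> None:
        self.model_name = model_name
        self.base_url = base_url
        self.kwargs = kwargs


class InHouseModel:
    """Hypothetical custom class registered at runtime."""

    def __init__(self, model_name: str, **kwargs: Any) -> None:
        self.model_name = model_name
        self.kwargs = kwargs


class DispatchingFramework:
    # Hypothetical built-ins: provider name -> default base URL.
    _BUILTIN_BASE_URLS = {
        "alpha": "https://alpha.example.invalid/v1",
        "beta": "https://beta.example.invalid/v1",
    }

    def __init__(self) -> None:
        self._custom: Dict[str, type] = {}

    def register_provider(self, name: str, provider_cls: type) -> None:
        # Record the class; create_model consults this dict first.
        self._custom[name] = provider_cls

    def get_provider_names(self) -> List[str]:
        # Built-ins plus anything registered at runtime.
        return sorted(set(self._BUILTIN_BASE_URLS) | set(self._custom))

    def create_model(self, model_name: str, provider_name: str, model_kwargs: dict) -> Any:
        kwargs = dict(model_kwargs or {})
        if provider_name in self._custom:
            # Short-circuit on a registered custom provider.
            return self._custom[provider_name](model_name, **kwargs)
        if provider_name in self._BUILTIN_BASE_URLS:
            # Built-in names share one runtime and differ only in defaults.
            kwargs.setdefault("base_url", self._BUILTIN_BASE_URLS[provider_name])
            return BuiltinModel(model_name, **kwargs)
        raise ValueError(
            f"Unknown provider {provider_name!r}; known: {self.get_provider_names()}"
        )


fw = DispatchingFramework()
fw.register_provider("inhouse", InHouseModel)
custom = fw.create_model("m1", "inhouse", {"temperature": 0})
builtin = fw.create_model("m2", "alpha", {})
```

Checking the registered dict first means a custom registration can also shadow a built-in name, which may or may not be desirable in your framework.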
reset
reset is called at process or test boundaries to release framework-owned resources. It must:
- Close any pooled HTTP clients, gRPC channels, file handles, or database connections.
- Clear any registered-provider state if you want a clean slate (some frameworks like DefaultFramework separate aclose from clear_providers and call both from reset; others may want to keep registrations).
- Be idempotent: calling reset twice in a row must not raise.
- Be safe to call from a running event loop. The registry awaits it directly with _areset_frameworks.
After reset, the instance must remain usable. New resources are constructed lazily on the next create_model call.
Today reset is invoked only by the test suite; the runtime does not call it on nemoguardrails server shutdown. Implement it for test isolation, not for production cleanup.
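The reset requirements above can be sketched as follows. This is an illustration under assumptions, not DefaultFramework's code: FakeClient stands in for a pooled HTTP client, and the aclose / clear_providers split mirrors the division this guide attributes to DefaultFramework without reproducing its implementation.

```python
import asyncio
from typing import Any, Dict, Optional


class FakeClient:
    """Hypothetical stand-in for a pooled HTTP client."""

    def __init__(self) -> None:
        self.closed = False

    async def aclose(self) -> None:
        self.closed = True


class ResettableFramework:
    def __init__(self) -> None:
        self._client: Optional[FakeClient] = None
        self._providers: Dict[str, type] = {}

    def _get_client(self) -> FakeClient:
        # Lazy construction: the client is rebuilt on first use after reset.
        if self._client is None:
            self._client = FakeClient()
        return self._client

    def register_provider(self, name: str, provider_cls: type) -> None:
        self._providers[name] = provider_cls

    def create_model(self, model_name: str, provider_name: str, model_kwargs: dict) -> Any:
        return {"model_name": model_name, "client": self._get_client()}

    async def aclose(self) -> None:
        # Detach first so a second call finds nothing left to close.
        client, self._client = self._client, None
        if client is not None:
            await client.aclose()

    def clear_providers(self) -> None:
        self._providers.clear()

    async def reset(self) -> None:
        await self.aclose()
        self.clear_providers()


async def demo() -> tuple:
    fw = ResettableFramework()
    first = fw.create_model("m", "p", {})["client"]
    await fw.reset()
    await fw.reset()  # idempotent: the second call is a no-op
    second = fw.create_model("m", "p", {})["client"]  # instance still usable
    return first.closed, second.closed, first is second
```

Awaiting demo() from a test returns whether the first client was closed, whether the replacement is still open, and whether the two are the same object.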
Minimal Working Example
The example below is fully self-contained and runs end-to-end without any
external dependencies. The model is an “echo” implementation that returns a
fixed string for every prompt. Swap in real HTTP calls or SDK invocations after
you verify that the registration and dispatch path works. Refer to
custom-llm-model.md for the canonical httpx-based pattern.
Create a config directory my_config/ next to your smoke-test script with
two files:
my_config/config.py:
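A minimal sketch of what this file might contain, written so that it also runs on its own. The class names EchoLLMModel and MyFramework and the string "echo from echo" come from this guide; the method signatures of generate_async and stream_async, the provider name echo, and the import path in the trailing comment are assumptions to verify against your installed version.

```python
import asyncio
from typing import Any, AsyncIterator, Dict, List


class EchoLLMModel:
    """Returns a fixed string for every prompt (signatures are assumed)."""

    def __init__(self, model_name: str, **kwargs: Any) -> None:
        self.model_name = model_name

    async def generate_async(self, prompt: Any, **kwargs: Any) -> str:
        return f"echo from {self.model_name}"

    async def stream_async(self, prompt: Any, **kwargs: Any) -> AsyncIterator[str]:
        # A real backend would yield chunks as they arrive.
        yield await self.generate_async(prompt, **kwargs)


class MyFramework:
    def __init__(self) -> None:
        # Hypothetical built-in provider name for this sketch.
        self._providers: Dict[str, type] = {"echo": EchoLLMModel}

    def create_model(self, model_name: str, provider_name: str, model_kwargs: dict) -> Any:
        if provider_name not in self._providers:
            raise ValueError(f"Unknown provider: {provider_name!r}")
        return self._providers[provider_name](model_name, **(model_kwargs or {}))

    def register_provider(self, name: str, provider_cls: type) -> None:
        self._providers[name] = provider_cls

    def get_provider_names(self) -> List[str]:
        return sorted(self._providers)

    async def reset(self) -> None:
        # Nothing is pooled in this sketch, so there is nothing to release.
        return None


# In a real config.py, the two registration calls would follow here.
# The import path is an assumption based on the registry module listed
# under Reference Implementations:
#
#   from nemoguardrails.llm.frameworks.registry import (
#       register_framework,
#       set_default_framework,
#   )
#   register_framework("my", MyFramework())
#   set_default_framework("my")

if __name__ == "__main__":
    # Standalone check of the dispatch path, without the library.
    model = MyFramework().create_model("echo", "echo", {"mode": "chat"})
    print(asyncio.run(model.generate_async("hello")))
```

Running the file directly exercises only the framework and model classes; the registration lines stay commented out until the import path is confirmed.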
my_config/config.yml:
Trying it out
Run a smoke test from the parent directory of my_config/. LLMRails
imports config.py automatically, which triggers the register_framework
and set_default_framework calls at the bottom of that file:
If the smoke test prints echo from echo, the framework is wired up. From
there, replace EchoLLMModel.generate_async and stream_async with real
backend calls. Refer to custom-llm-model.md.
After register_framework("my", MyFramework()), the framework is selectable in three ways:
- Process-wide default at import time. Set the NEMOGUARDRAILS_LLM_FRAMEWORK environment variable before importing the NVIDIA NeMo Guardrails library. The registry reads it at module load and uses it as the active framework name.
- Programmatic flip in config.py. Call set_default_framework("my") after registering. All subsequent LLMRails constructions use it.
- Targeted dispatch. If you want different frameworks for different model entries, route directly with framework.create_model in your own initialization code (advanced; not the standard path).
config.yml entries do not name the framework; they name a provider. The framework is implicit in whichever one is active.
Reference Implementations
Review these production-grade frameworks:
- nemoguardrails/llm/frameworks/default.py: DefaultFramework. Pools OpenAICompatibleClient instances keyed on (base_url, api_key, timeouts, headers, query). Splits lifecycle into aclose (HTTP teardown), clear_providers (registry teardown), and reset (both, used in tests).
- nemoguardrails/integrations/langchain/llm_adapter.py: LangChainFramework. Defers to nemoguardrails.integrations.langchain.providers for registration, calls init_langchain_model for construction, and wraps the result in LangChainLLMAdapter. Has a no-op reset because the LangChain side has no pooled state of its own.
- nemoguardrails/llm/frameworks/registry.py: register_framework, get_framework, set_default_framework, get_default_framework, _areset_frameworks. Read this to understand the environment variable, lazy lookup, and registration behavior.
Failure Modes
Registering a provider before any framework is active
register_provider from nemoguardrails.llm.providers resolves the active framework with get_default_framework() and calls framework.register_provider on it. The registry has a built-in default framework that is constructed lazily on first access, so this almost always works without explicit setup. The failure mode appears only when the user sets NEMOGUARDRAILS_LLM_FRAMEWORK to a name that has not been registered yet:
The fix is simple: register the framework before any provider, or keep NEMOGUARDRAILS_LLM_FRAMEWORK unset until after register_framework has run.
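The ordering problem can be reproduced with a toy model of the behavior described above. ToyRegistry is not the library's registry: the built-in name default, the error wording, and the environ parameter (used here so the example does not touch the real process environment) are all assumptions made for illustration.

```python
import os
from typing import Any, Callable, Dict, List, Mapping, Optional

ENV_VAR = "NEMOGUARDRAILS_LLM_FRAMEWORK"


class ToyRegistry:
    """Toy model of env-var default plus lazy built-in construction."""

    def __init__(
        self,
        builtin_factories: Mapping[str, Callable[[], Any]],
        fallback: str,
        environ: Optional[Mapping[str, str]] = None,
    ) -> None:
        env = os.environ if environ is None else environ
        self._factories = dict(builtin_factories)
        self._instances: Dict[str, Any] = {}
        # Read once, mirroring "read at module load".
        self._default_name = env.get(ENV_VAR, fallback)

    def _known(self) -> List[str]:
        return sorted(set(self._factories) | set(self._instances))

    def register_framework(self, name: str, framework: Any) -> None:
        self._instances[name] = framework

    def set_default_framework(self, name: str) -> None:
        if name not in self._known():
            raise LookupError(f"unknown framework {name!r}; known: {self._known()}")
        self._default_name = name

    def get_default_framework(self) -> Any:
        name = self._default_name
        if name not in self._instances:
            if name not in self._factories:
                raise LookupError(f"framework {name!r} is not registered; known: {self._known()}")
            # Built-ins are constructed lazily on first access.
            self._instances[name] = self._factories[name]()
        return self._instances[name]


def resolves(registry: ToyRegistry) -> bool:
    """True if the active framework can be resolved right now."""
    try:
        registry.get_default_framework()
    except LookupError:
        return False
    return True


# The environment names "my" before anything called "my" is registered.
registry = ToyRegistry({"default": object}, fallback="default", environ={ENV_VAR: "my"})
before = resolves(registry)  # lookup fails: "my" is not registered yet
registry.register_framework("my", object())
after = resolves(registry)  # lookup succeeds once registration has run
```

With the variable unset, the same toy registry falls back to its lazily built default, which matches the "almost always works" case described above.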
Unknown framework on activation
The two built-in names always appear in this hint because the registry knows them by default. If you are working with only your own framework, register it first, then call set_default_framework.
Best Practices
- Treat reset as a hard contract, not a hint. Test it. Pooled HTTP connections that survive across tests cause surprising flakes elsewhere.
- Prefer composition over inheritance. MyFramework does not need to subclass DefaultFramework. The protocol is small enough to implement from scratch.
- Pool HTTP clients on the framework when multiple models: entries share a backend. create_model runs once per entry at LLMRails startup, so a model can safely build its own client. When two entries point at the same backend, only the framework can deduplicate them. DefaultFramework._get_or_create_client keys clients by (base_url, api_key, ...) for exactly this case.
- Do not import LangChain in a default-framework-style implementation. The whole point of swapping the framework layer is to avoid pulling in dependencies you do not need. Keep your imports tight.
- Document your framework’s provider taxonomy. get_provider_names is what nemoguardrails find_providers shows users.
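The pooling practice above can be sketched as a framework-level cache keyed on connection settings. This is an illustration only: FakeHTTPClient, PooledModel, the two-field key, and the example.invalid URLs are hypothetical, and a real pool would likely key on more fields, as the DefaultFramework description suggests.

```python
from typing import Any, Dict, Optional, Tuple

ClientKey = Tuple[str, Optional[str]]


class FakeHTTPClient:
    """Hypothetical stand-in for an HTTP client with a connection pool."""

    def __init__(self, base_url: str, api_key: Optional[str]) -> None:
        self.base_url = base_url
        self.api_key = api_key


class PooledModel:
    """Hypothetical model that borrows a client owned by the framework."""

    def __init__(self, model_name: str, client: FakeHTTPClient) -> None:
        self.model_name = model_name
        self.client = client


class PoolingFramework:
    def __init__(self) -> None:
        self._clients: Dict[ClientKey, FakeHTTPClient] = {}

    def _client_for(self, base_url: str, api_key: Optional[str]) -> FakeHTTPClient:
        # Entries with identical connection settings share one client.
        key = (base_url, api_key)
        if key not in self._clients:
            self._clients[key] = FakeHTTPClient(base_url, api_key)
        return self._clients[key]

    def create_model(self, model_name: str, provider_name: str, model_kwargs: dict) -> Any:
        kwargs = dict(model_kwargs or {})
        client = self._client_for(kwargs["base_url"], kwargs.get("api_key"))
        return PooledModel(model_name, client)


pool_fw = PoolingFramework()
backend = {"base_url": "https://llm.example.invalid/v1", "api_key": "k"}
main = pool_fw.create_model("big", "p", backend)
helper = pool_fw.create_model("small", "p", backend)
other = pool_fw.create_model("big", "p", {"base_url": "https://other.example.invalid/v1"})
```

Here main and helper share one client because their keys match, while other gets its own. Neither model could have made that decision alone, since each sees only its own entry.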
Related Topics
- Custom LLM Model - Implement the LLMModel protocol that your framework constructs.
- Custom LLM Providers - LangChain BaseLLM/BaseChatModel providers (uses engine: langchain).
- Init Function - Where register_framework and set_default_framework calls usually go.
- Configuration Reference - config.yml schema and the engine, model, and parameters fields.