# Release Notes

> Review new features, breaking changes, and fixed issues for each release.

The following sections summarize the key changes in each release.
For a complete record of changes in a release, refer to the
[CHANGELOG.md](https://github.com/NVIDIA-NeMo/Guardrails/blob/develop/CHANGELOG.md) in the GitHub repository.

***

<a id="v0-22-0" />

## 0.22.0

<a id="v0-22-0-features" />

### Key Features

* LangChain is now optional. `pip install nemoguardrails` no longer pulls
  LangChain or any provider-specific `langchain-*` packages. The NVIDIA NeMo
  Guardrails library ships with a built-in client that talks to
  OpenAI-compatible endpoints directly over `httpx`. Engines whose API isn't
  OpenAI-compatible (Anthropic, Cohere, Vertex AI, Google Generative AI,
  in-process Hugging Face, TensorRT-LLM, and others) keep working through
  LangChain when you opt in with `NEMOGUARDRAILS_LLM_FRAMEWORK=langchain` and
  install the matching provider package. Most 0.21 configurations keep working
  unchanged, but some configuration shapes need a YAML rewrite. For migration
  recipes, refer to
  [Migrating to v0.22.0](/reference/0-22), the
  [Supported LLMs](/about-nemo-guardrails-library/supported-llms) matrix, and
  [Model Configuration](/configure-guardrails/yaml-schema/model-configuration).

* OpenAI-compatible service support is improved in the default framework.
  The default framework now supports OpenAI-compatible providers directly,
  includes native Azure OpenAI support through `engine: azure` and
  `engine: azure_openai`, and documents how to migrate provider-specific
  LangChain parameters to the new `base_url`-based configuration shape. For
  more information, refer to
  [Migrating to v0.22.0](/reference/0-22),
  [Model Configuration](/configure-guardrails/yaml-schema/model-configuration),
  [Configuration Reference](/configure-guardrails/configuration-reference), and
  [Using Docker](/more-deployment-options/using-docker).

* `IORails` adds support for streaming, reasoning models, and speculative
  generation. The optimized input and output rails engine now supports
  streaming output rails, `stream_async()` integration in chat and server flows,
  non-streaming and streaming reasoning-model responses, and speculative
  generation for non-streaming `generate_async()` calls. For more information,
  refer to
  [Parallel Rails](/configure-guardrails/yaml-schema/guardrails-configuration#iorails-engine),
  [Streaming](/run-guardrailed-inference/using-python-apis/streaming), and
  [Speculative Generation](/configure-guardrails/yaml-schema/guardrails-configuration#speculative-generation).

* `IORails` adds OpenTelemetry observability with logging, tracing, and
  metrics support. The documentation covers OTLP setup, Prometheus client
  installation, request-level and token-level metrics, and the recommended
  `Guardrails` entry point for the optimized input and output rails engine. For
  more information, refer to
  [Observability](/observability/observability),
  [OpenTelemetry Logs](/observability/tracing/opentelemetry-logs),
  [OpenTelemetry Tracing](/observability/tracing/opentelemetry-integration),
  [OpenTelemetry Metrics](/observability/metrics/opentelemetry-integration),
  [Enable Metrics](/observability/metrics/enable-metrics), and the
  [Metrics Reference](/observability/metrics/reference).

* Anonymous usage reporting is documented with clear privacy boundaries and
  opt-out controls. The telemetry reference explains what fields are collected,
  what data is excluded, how local audit files work, and how to opt out with
  `NEMO_GUARDRAILS_NO_USAGE_STATS=1`, `DO_NOT_TRACK=1`, or the
  `~/.config/nemoguardrails/do_not_track` file. For more information, refer to
  [Telemetry](/resources/telemetry).
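
The opt-out controls listed in the last item can be checked from a deployment script. The following is a minimal sketch, not the library's own detection logic: the helper name `usage_reporting_opted_out` is hypothetical, and treating only the literal value `1` as an opt-out is an assumption based on the values shown above.

```python
import os
from pathlib import Path

# Opt-out environment variables named in the release note above.
OPT_OUT_ENV_VARS = ("NEMO_GUARDRAILS_NO_USAGE_STATS", "DO_NOT_TRACK")

# Opt-out marker file named in the release note above.
OPT_OUT_FILE = Path("~/.config/nemoguardrails/do_not_track")


def usage_reporting_opted_out(env=None, marker_file=OPT_OUT_FILE):
    """Return True if any documented opt-out control appears to be set.

    Hypothetical helper for illustration only. Assumption: only the
    literal value "1" counts for the environment variables, because that
    is the only value the release note shows.
    """
    env = os.environ if env is None else env
    if any(env.get(name) == "1" for name in OPT_OUT_ENV_VARS):
        return True
    # Fall back to the marker file; expand "~" to the user's home directory.
    return Path(marker_file).expanduser().exists()


if __name__ == "__main__":
    print("opted out:", usage_reporting_opted_out())
```

Passing `env` and `marker_file` explicitly keeps the check easy to exercise in tests without touching the real environment or home directory.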

<a id="v0-22-0-breaking-changes" />

### Breaking Changes

* `AsyncWorkQueue` moved from the top-level `Guardrails` object to
  `IORails`. As a result, non-streaming `LLMRails` requests are no longer
  buffered when you use the top-level `Guardrails` object. This change affects
  only existing implementations that set `NEMO_GUARDRAILS_IORAILS_ENGINE=1` or
  instantiate `Guardrails` directly.
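
To check whether a running environment meets the first condition, a small script can inspect the environment variable. This is a minimal sketch under stated assumptions: the helper name `iorails_engine_flag_set` is hypothetical, and comparing against the literal value `1` is an assumption based on the value shown above. It does not detect code that instantiates `Guardrails` directly.

```python
import os


def iorails_engine_flag_set(env=None):
    """Return True if NEMO_GUARDRAILS_IORAILS_ENGINE is set to "1".

    Hypothetical helper for illustration only. Assumption: only the
    literal value "1" enables the engine, because that is the value the
    release note shows.
    """
    env = os.environ if env is None else env
    return env.get("NEMO_GUARDRAILS_IORAILS_ENGINE") == "1"


if __name__ == "__main__":
    if iorails_engine_flag_set():
        print("NEMO_GUARDRAILS_IORAILS_ENGINE=1 is set; review the AsyncWorkQueue change.")
    else:
        print("NEMO_GUARDRAILS_IORAILS_ENGINE=1 is not set in this environment.")
```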

<a id="v0-22-0-enhancements" />

### Enhancements

* The GLiNER PII connector documentation and notebook are updated for the new
  GLiNER PII NIM. The examples cover both remote and local deployment modes
  and API key configuration for the connector. For more information, refer to
  [GLiNER](/configure-guardrails/guardrail-catalog/third-party/gliner) and
  [PII Detection](/configure-guardrails/guardrail-catalog/pii-detection).

* Public extension points are available for LLM integration. Two new
  protocols, `LLMModel` and `LLMFramework` in `nemoguardrails.types`, let you
  plug in a custom backend or an entire alternative framework without
  modifying library internals. For more
  information, refer to
  [Custom LLM Models](/configure-guardrails/custom-initialization/custom-llm-model)
  and
  [Custom LLM Frameworks](/configure-guardrails/custom-initialization/custom-llm-framework).

* A public testing surface is available. The `nemoguardrails.testing` module
  exposes `FakeLLMModel`, `TestChat`, and pytest fixtures for writing tests
  against a guardrails configuration without calling a real model.

<a id="v0-22-0-doc-and-behavior-fixes" />

### Documentation and Behavior Fixes

* Fixed the example query and expected output in the Guardrails Agent
  Middleware integration guide so the example matches the configured blocked
  response behavior. For more information, refer to
  [Guardrails Agent Middleware](/integration-with-third-party-libraries/langchain/agent-middleware).
* A warning about a missing main LLM is now emitted only when generation is
  actually attempted and the generation path needs the main LLM. Check-only
  configurations no longer emit the warning during initialization. For more
  information, refer to
  [Check Messages](/run-guardrailed-inference/using-python-apis/check-messages).
* Fixed issues in the [Colang 1.0 Hello World tutorial](/configure-guardrails/colang/colang-1/tutorials/1-hello-world) and companion notebook.

***

## Previous Release Notes

* [0.21.0](https://docs.nvidia.com/nemo/guardrails/0.21.0/release-notes.html)
* [0.20.0](https://docs.nvidia.com/nemo/guardrails/0.20.0/release-notes.html)
* [0.19.0](https://docs.nvidia.com/nemo/guardrails/0.19.0/release-notes.html)
* [0.18.0](https://docs.nvidia.com/nemo/guardrails/0.18.0/release-notes.html)
* [0.17.0](https://docs.nvidia.com/nemo/guardrails/0.17.0/release-notes.html)
* [0.16.0](https://docs.nvidia.com/nemo/guardrails/0.16.0/release-notes.html)
* [0.15.0](https://docs.nvidia.com/nemo/guardrails/0.15.0/release-notes.html)
* [0.14.1](https://docs.nvidia.com/nemo/guardrails/0.14.1/release-notes.html)
* [0.14.0](https://docs.nvidia.com/nemo/guardrails/0.14.0/release-notes.html)