Tool Calling

The IORails engine can sit in front of a tool-using chat model and validate tool traffic in both directions. It forwards your tool definitions to the model unchanged and applies two structural rails:

  • Tool call validation checks every tool call the model emits against the tools you declared. A call must name an allowed tool and supply arguments that satisfy that tool’s JSON Schema.
  • Tool result validation checks the tool results your application returns to the model. Each result must link to a tool call the model previously made, name a consistent tool, and carry well-formed content.

Both rails are local checks. They do not make extra LLM or API calls. They run on every request and fail closed: a violation, a parsing error, or a malformed payload blocks the request rather than passing it through.
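
The fail-closed behavior described above can be pictured with a small sketch. This is not the IORails implementation: `RailViolation`, `run_fail_closed`, and `require_tool_name` are hypothetical names chosen for illustration. The point is only that any exception, expected or not, results in a block.

```python
from typing import Any, Callable, Tuple


class RailViolation(Exception):
    """Raised by a check when the payload breaks a rule (hypothetical name)."""


def run_fail_closed(check: Callable[[Any], None], payload: Any) -> Tuple[bool, str]:
    """Return (allowed, reason). Any exception at all counts as a block."""
    try:
        check(payload)
    except RailViolation as exc:
        return False, f"violation: {exc}"
    except Exception as exc:
        # Parsing errors and malformed payloads land here and still block.
        return False, f"error: {type(exc).__name__}"
    return True, "ok"


def require_tool_name(call: dict) -> None:
    """Example check: the call must carry a non-empty function name."""
    name = call["function"]["name"]  # KeyError or TypeError on a malformed payload
    if not name:
        raise RailViolation("tool call has no name")
```

A well-formed call passes, while an empty name, a missing key, or a payload of the wrong type all return `False` instead of slipping through.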

The IORails engine does not execute tools. Your application or agent harness executes a tool after the model requests it, then sends the result back on the next request. These rails validate the request and response boundary; they do not call tools themselves.

Tool-calling rails run only on the IORails engine, which is opt-in and experimental. They are not available on the default LLMRails engine. See Enable the IORails engine below. This feature supports only the OpenAI Chat Completions wire format (the openai and nim engines).

How Function Calling Works

Your application and the model use function calling as a multi-step loop. Your application, not the model, runs the function. The flow follows the OpenAI function-calling guide. During that flow, the IORails rails validate the tool traffic at two points.

A function is a tool you define with a JSON Schema. "Tool" is the broader category, which also covers built-in and custom tools. When the model decides to use a tool, it returns a tool call identified by a tool_call_id, and your application answers it with a tool result that carries the same tool_call_id.

The end-to-end flow has five steps:

  1. Make a request to the model with tools it could call. You send the conversation along with your tool definitions.
  2. Receive a tool call from the model. The model responds with one or more tool calls instead of, or alongside, a text answer. IORails runs tool call validation here, before the tool calls reach your application.
  3. Execute code on the application side with input from the tool call. Your application runs the function and produces a result.
  4. Make a second request to the model with the tool output. You resend the conversation, including the assistant turn that made the calls and one tool result for each tool_call_id. IORails runs tool result validation here, before the model sees the results.
  5. Receive a final response from the model (or more tool calls). The model returns a final text answer, or more tool calls that repeat the loop from step 2.
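
The five steps above can be sketched as an application-side loop. This is a minimal illustration, not library code: `fake_model` is a hypothetical stand-in for the guarded model call (in a real application that would be the rails' generate call), and `run_tool_loop` and `max_rounds` are names invented for this example.

```python
import json


def get_weather(city: str) -> dict:
    """Local function the application runs when the model asks for it."""
    return {"city": city, "temperature_c": 18, "condition": "cloudy"}


def fake_model(messages: list) -> dict:
    """Hypothetical stand-in for the model: asks for a tool once, then answers."""
    if messages[-1]["role"] == "tool":
        return {"role": "assistant", "content": "It is 18 C and cloudy in Paris."}
    return {
        "role": "assistant",
        "content": None,
        "tool_calls": [{
            "id": "call_1",
            "type": "function",
            "function": {"name": "get_weather", "arguments": json.dumps({"city": "Paris"})},
        }],
    }


def run_tool_loop(messages: list, model, tools_by_name: dict, max_rounds: int = 5) -> list:
    """Repeat request -> tool calls -> execute -> resend until a final answer."""
    messages = list(messages)
    for _ in range(max_rounds):
        reply = model(messages)            # steps 1-2, or steps 4-5 on later rounds
        messages.append(reply)
        calls = reply.get("tool_calls") or []
        if not calls:
            return messages                # step 5: final text answer
        for call in calls:                 # step 3: the application runs the function
            name = call["function"]["name"]
            args = json.loads(call["function"]["arguments"])
            result = tools_by_name[name](**args)
            messages.append({
                "role": "tool",
                "tool_call_id": call["id"],  # must match the call being answered
                "name": name,
                "content": json.dumps(result),
            })
    raise RuntimeError("tool loop did not finish within max_rounds")
```

The `max_rounds` cap is a design choice for the sketch: since step 5 may return more tool calls, an unbounded loop could run indefinitely.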

How Tool-Calling Rails Fit the Pipeline

Tool-calling rails mirror the input and output rails, with one rail per direction:

| Rail | Config section | Flow name | Direction | What it checks |
| --- | --- | --- | --- | --- |
| Tool call validation | rails.tool_output | tool call validation | Model output | Tool calls the model emits: allowed name plus schema-valid arguments. |
| Tool result validation | rails.tool_input | tool result validation | Request input | Tool results sent back to the model: linkage to a prior call, name consistency, well-formed content. |

For each request, IORails runs the rails in the following order:

  1. Tool result validation on the incoming messages (alongside input rails).
  2. Input rails, then the main LLM call.
  3. Tool call validation on the tool calls the model returned.
  4. Output rails on the text response. A response that contains only tool calls and no text skips the text output rails.

When any rail blocks, IORails returns the refusal message "I'm sorry, I can't respond to that." for a non-streaming request and a guardrails_violation error payload for a streaming request.

Enable the IORails Engine

Tool-calling rails require the IORails engine. Enable it in one of two ways.

Set the NEMO_GUARDRAILS_IORAILS_ENGINE environment variable so the top-level LLMRails import resolves to the IORails-backed engine. This is the least invasive option for an existing application or the server:

$ export NEMO_GUARDRAILS_IORAILS_ENGINE=1

from nemoguardrails import LLMRails, RailsConfig

config = RailsConfig.from_path("./config")
rails = LLMRails(config)

Alternatively, construct the Guardrails facade directly, which uses the IORails engine by default:

from nemoguardrails import Guardrails, RailsConfig

config = RailsConfig.from_path("./config")
rails = Guardrails(config)

If the configuration is not IORails-compatible, the Guardrails facade falls back to LLMRails and logs a warning. Pass require_iorails=True to raise instead of falling back. A configuration is IORails-compatible only when it uses Colang version 1.0, declares no custom llm argument, and uses only IORails-supported rails and flows.

Configure the Rails

Add the tool-calling flows to your config.yml. Each tool_* section accepts only its own flow name:

models:
  - type: main
    engine: nim
    model: meta/llama-3.3-70b-instruct

rails:
  tool_output:
    flows:
      - tool call validation
  tool_input:
    flows:
      - tool result validation

You can enable either rail independently. To validate only the model’s tool calls, configure rails.tool_output and leave rails.tool_input empty.

The flow names are fixed and direction-specific: rails.tool_output accepts only tool call validation, and rails.tool_input accepts only tool result validation. A misdirected, unknown, or duplicated tool flow name makes the configuration ineligible for IORails. The Guardrails facade then falls back to LLMRails and none of the IORails tool rails run, so a single typo in a flow name disables the tool rails you configured, with only a log warning to signal it. Pass require_iorails=True to the Guardrails constructor to raise an error instead of falling back.

IORails accepts the parallel field on the tool_output and tool_input sections for symmetry with other rails, but does not honor it for tool rails and emits a warning if you set it to true. Tool rails are local checks with no I/O to overlap, so they always run sequentially.

Declare Tools

The rails validate against the tools you declare on the request. Tool definitions use the provider-native OpenAI Chat Completions shape, and IORails forwards them to the model unchanged.

Declare tools per request in options.llm_params:

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    },
]

response = await rails.generate_async(
    messages=[{"role": "user", "content": "What's the weather in Paris?"}],
    options={"llm_params": {"tools": tools, "tool_choice": "auto"}},
)

When the model requests a tool, generate_async returns an assistant message that carries the tool calls in OpenAI shape:

{
    "role": "assistant",
    "content": None,
    "tool_calls": [
        {
            "id": "call_1",
            "type": "function",
            "function": {"name": "get_weather", "arguments": "{\"city\": \"Paris\"}"},
        }
    ],
}

After your application executes the tool, send the result back to the model. Resend the conversation with the original user message, the assistant turn that carried the tool call, and a role: "tool" message for each tool_call_id, then call generate_async again:

messages = [
    {"role": "user", "content": "What's the weather in Paris?"},
    {
        "role": "assistant",
        "content": None,
        "tool_calls": [
            {
                "id": "call_1",
                "type": "function",
                "function": {"name": "get_weather", "arguments": "{\"city\": \"Paris\"}"},
            }
        ],
    },
    {
        "role": "tool",
        "tool_call_id": "call_1",
        "name": "get_weather",
        "content": "{\"temperature_c\": 18, \"condition\": \"cloudy\"}",
    },
]

response = await rails.generate_async(
    messages=messages,
    options={"llm_params": {"tools": tools}},
)

The tool result validation rail runs on this request before the model sees the result, and the model then returns a final text answer or more tool calls.

You can also declare tools statically on the model in config.yml. IORails merges these into the set of tools the rails validate against, so the rails honor a config-declared tool even when a request carries no llm_params:

models:
  - type: main
    engine: nim
    model: meta/llama-3.3-70b-instruct
    parameters:
      tools:
        - type: function
          function:
            name: get_weather
            description: Get the weather for a city.
            parameters:
              type: object
              properties:
                city:
                  type: string
              required:
                - city

Tool Call Validation

The tool call validation rail inspects every tool call in the model’s response and blocks the request on the first violation.

  • Allowlist. The call must name a tool you declared. IORails blocks a call to a tool that is not in the declared set, for example: "tool call 'delete_database' is not an allowed tool".
  • Argument schema. The call’s arguments must validate against the tool’s declared JSON Schema (the function.parameters block). IORails blocks arguments that violate the schema, for example: "arguments for tool 'get_weather' do not match its schema: 'city' is a required property".
  • No-argument tools. A function tool that declares no parameters must receive no arguments. IORails blocks any supplied argument.
  • Hosted tools. A hosted or server-side tool that is identified only by its type (for example, a built-in web search) is allowlisted by type. IORails does not schema-validate its arguments because the provider owns the call shape.

If the declared schema itself is not valid JSON Schema, the rail blocks the request rather than letting an unvalidated call through.
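
The allowlist, argument-schema, and no-argument checks above can be approximated in a few lines. This is a simplified sketch, not the rail itself: `validate_tool_call` and `check_arguments` are hypothetical names, the schema check covers only `required` and primitive `type` keywords rather than full JSON Schema, hosted tools are left out, and the messages only imitate the examples quoted above.

```python
import json
from typing import Optional

# Tiny subset of JSON Schema types; a real validator covers far more keywords.
_JSON_TYPES = {
    "string": str, "integer": int, "number": (int, float),
    "boolean": bool, "object": dict, "array": list,
}


def _matches(value, expected: Optional[str]) -> bool:
    py_type = _JSON_TYPES.get(expected)
    if py_type is None:
        return True  # unknown or missing type keyword: nothing to check here
    if isinstance(value, bool) and expected in ("integer", "number"):
        return False  # bool is an int subclass in Python, but not a JSON number
    return isinstance(value, py_type)


def check_arguments(schema: dict, args: dict) -> Optional[str]:
    """Return the first schema problem found, or None."""
    for key in schema.get("required", []):
        if key not in args:
            return f"'{key}' is a required property"
    for key, value in args.items():
        expected = schema.get("properties", {}).get(key, {}).get("type")
        if not _matches(value, expected):
            return f"'{key}' is not of type '{expected}'"
    return None


def validate_tool_call(call: dict, tools: list) -> Optional[str]:
    """Return a block reason, or None when the call passes every check."""
    declared = {t["function"]["name"]: t["function"]
                for t in tools if t.get("type") == "function"}
    function = call.get("function") or {}
    name = function.get("name")
    if name not in declared:                           # allowlist
        return f"tool call {name!r} is not an allowed tool"
    try:
        args = json.loads(function.get("arguments") or "{}")
    except json.JSONDecodeError:
        return f"arguments for tool {name!r} are not valid JSON"
    if not isinstance(args, dict):
        return f"arguments for tool {name!r} are not a JSON object"
    schema = declared[name].get("parameters")
    if schema is None:                                 # no-argument tool
        return f"tool {name!r} takes no arguments" if args else None
    problem = check_arguments(schema, args)            # argument schema
    if problem:
        return f"arguments for tool {name!r} do not match its schema: {problem}"
    return None
```

Returning a reason string instead of raising keeps the sketch easy to test. A caller would treat any non-None result as a block.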

Tool Result Validation

The tool result validation rail checks the tool results your application sends back to the model. Because the OpenAI Chat Completions API is stateless, your application resends the conversation history, including the assistant turn that made the tool calls and the role: "tool" messages that answer them. The rail validates each turn’s results against that same turn’s calls.

The rail blocks the request when a tool result:

  • Is missing a tool_call_id, or that id does not correspond to a tool call earlier in the conversation.
  • Reuses a tool_call_id that another result in the same turn already used. Each call must have exactly one result.
  • Names a different tool than the call it links to, or omits a name when the call’s tool name is known.
  • Carries content that is not a string or a list of content-block objects.

This rail validates structural well-formedness only. It confirms that the results are internally consistent with the calls in the request. It does not yet enforce a declared response schema, and it does not run a content-safety check on the tool result. Because the client resends the conversation, the linkage check verifies intra-request consistency, not server-verified provenance.
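
The four blocking conditions listed above can be sketched as a single pass over the resent conversation. This is an illustration under assumptions, not the rail's code: `validate_tool_results` is a hypothetical name, the reason strings are invented, and the sketch treats each assistant message as the start of a new turn.

```python
from typing import Optional


def validate_tool_results(messages: list) -> Optional[str]:
    """Return a block reason for the first bad tool result, or None."""
    open_calls: dict = {}   # tool_call_id -> tool name, for the current turn only
    answered: set = set()   # ids that already received a result in this turn
    for msg in messages:
        role = msg.get("role")
        if role == "assistant":
            # Each assistant turn defines the calls its tool results may answer.
            open_calls = {
                call["id"]: (call.get("function") or {}).get("name")
                for call in msg.get("tool_calls") or []
            }
            answered = set()
        elif role == "tool":
            call_id = msg.get("tool_call_id")
            if not call_id or call_id not in open_calls:
                return f"tool result has missing or unknown tool_call_id {call_id!r}"
            if call_id in answered:
                return f"tool_call_id {call_id!r} already has a result"
            answered.add(call_id)
            expected = open_calls[call_id]
            if expected is not None and msg.get("name") != expected:
                return f"tool result for {call_id!r} does not name tool {expected!r}"
            content = msg.get("content")
            is_blocks = isinstance(content, list) and all(
                isinstance(block, dict) for block in content
            )
            if not (isinstance(content, str) or is_blocks):
                return f"tool result for {call_id!r} has malformed content"
    return None
```

Because the function sees only the messages in the request, it can confirm that results and calls agree with each other but cannot prove that the model really issued those calls, which matches the intra-request scope described above.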

Control Rails per Request

Use options.rails to enable or disable each tool rail for a single request. Each toggle accepts true, false, or a list of flow names, and defaults to true:

response = await rails.generate_async(
    messages=messages,
    options={
        "llm_params": {"tools": tools},
        "rails": {"tool_output": True, "tool_input": False},
    },
)

Streaming

Tool-calling rails work with stream_async. IORails streams the text response as it arrives, accumulates the tool-call fragments, and runs tool call validation once the stream completes:

  • On a clean stream, IORails emits the assembled tool calls as a final chunk after the text and output rails finish.
  • When the rail blocks, IORails emits a guardrails_violation error payload and suppresses the tool-call chunk, so a consumer never receives a tool call after a block.
  • A stream that contains only tool calls and no text skips the output rails.
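
To show why validation waits for the end of the stream, here is a sketch of fragment accumulation. It assumes the OpenAI Chat Completions streaming shape, in which each delta may carry `tool_calls` fragments keyed by `index`. The deltas are plain dicts here, and `assemble_tool_calls` and `arguments_complete` are hypothetical helpers, not IORails functions.

```python
import json


def assemble_tool_calls(deltas: list) -> list:
    """Merge streamed tool-call fragments into complete calls, keyed by index."""
    calls: dict = {}
    for delta in deltas:
        for frag in delta.get("tool_calls") or []:
            slot = calls.setdefault(
                frag["index"],
                {"id": None, "type": "function",
                 "function": {"name": "", "arguments": ""}},
            )
            if frag.get("id"):
                slot["id"] = frag["id"]  # the id usually arrives on the first fragment
            function = frag.get("function") or {}
            slot["function"]["name"] += function.get("name") or ""
            slot["function"]["arguments"] += function.get("arguments") or ""
    return [calls[index] for index in sorted(calls)]


def arguments_complete(call: dict) -> bool:
    """True once the accumulated arguments string parses as JSON."""
    try:
        json.loads(call["function"]["arguments"])
    except json.JSONDecodeError:
        return False
    return True
```

Partway through a stream the arguments string is usually an incomplete JSON fragment, so a schema check is only meaningful on the assembled call.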

Limitations

  • Wire format. The tool-calling rails support only the OpenAI Chat Completions shape, through the openai and nim engines. They do not yet support the OpenAI Responses API, Anthropic, Gemini, or Bedrock.
  • Structural result validation. Tool result validation checks structure and linkage only. It does not enforce response schemas or apply content-safety checks to tool results.
  • Intra-request consistency. The linkage check verifies consistency within a single request, not cross-turn provenance.
  • Schema dialect. Argument validation uses JSON Schema. Tool schemas written in a different dialect are not validated against that dialect.
  • Engine requirement. Tool-calling rails require the IORails engine and a Colang 1.0 configuration.

The NVIDIA NeMo Guardrails library includes other tool-related capabilities that are distinct from the IORails tool-calling rails described here:

  • Tools Integration covers LangChain tool passthrough and output-rail validation on the LLMRails engine.
  • Rail types describes Colang execution rails, which run actions before and after execution within the LLMRails event-driven pipeline.

The rails on this page operate at the request and response boundary of the IORails engine and do not execute tools.