nemoguardrails.actions.llm.utils | NVIDIA NeMo Guardrails Library Developer Guide

Module Contents

Functions

Name	Description
`_ensure_chat_messages`	-
`_extract_and_remove_think_tags`	Extract reasoning from <think> tags and remove them from `response.content`.
`_extract_chunk_metadata`	-
`_extract_content`	Extract text content from response.
`_extract_user_text_from_event`	Flatten a multimodal user-message payload into a string for colang history.
`_has_unclosed_quote`	Check if a string has an unclosed double quote (ignoring escaped quotes).
`_log_completion`	-
`_log_prompt`	Log the prompt to LLM call info.
`_raise_llm_call_exception`	-
`_setup_llm_call_info`	Initialize or update LLM call info in context.
`_store_reasoning_traces`	-
`_store_response_metadata`	-
`_store_tool_calls`	-
`_stream_llm_call`	-
`_update_token_stats`	-
`_update_token_stats_from_chunk`	-
`escape_flow_name`	Escape invalid keywords in flow names.
`events_to_dialog_history`	Create the dialog history based on provided events.
`extract_bot_thinking_from_events`	-
`extract_tool_calls_from_events`	Extract tool_calls from runtime events.
`flow_to_colang`	Converts a flow to colang format.
`from_log_event_to_identifier`	Converts a log message to a prompt interaction identifier.
`get_and_clear_reasoning_trace_contextvar`	Get the current reasoning trace and clear it from the context.
`get_and_clear_response_metadata_contextvar`	Get the current response metadata and clear it from the context.
`get_and_clear_tool_calls_contextvar`	Get the current tool calls and clear them from the context.
`get_colang_history`	Creates a history of user messages and bot responses in colang format.
`get_first_bot_action`	Returns first bot action.
`get_first_bot_intent`	Returns first bot intent.
`get_first_nonempty_line`	Helper that returns the first non-empty line from a string.
`get_first_user_intent`	Returns first user intent.
`get_initial_actions`	Returns the first action before an empty line.
`get_last_bot_intent_event`	Returns the last bot intent from the events.
`get_last_bot_utterance_event`	Returns the last bot utterance from the events.
`get_last_user_intent_event`	Returns the last user intent from the events.
`get_last_user_utterance`	Returns the last user utterance from the events.
`get_last_user_utterance_event`	Returns the last user utterance from the events.
`get_last_user_utterance_event_v2_x`	Returns the last user utterance from the events.
`get_multiline_response`	Helper that extracts multi-line responses from the LLM.
`get_retrieved_relevant_chunks`	Returns the retrieved chunks for current user utterance from the events.
`get_top_k_nonempty_lines`	Helper that returns a list with the top k non-empty lines from a string.
`llm_call`	-
`remove_action_intent_identifiers`	Removes the action/intent identifiers.
`remove_text_messages_from_history`	Helper that given a history in colang format, removes all texts.
`strip_quotes`	Helper that removes quotes from a string if the entire string is between quotes.
`warn_if_truncated`	Return True and emit a warning if the LLM produced no visible content because it hit the max_tokens budget.

Data

_MAX_QUOTE_CONTINUATION_LINES

logger

API

nemoguardrails.actions.llm.utils._ensure_chat_messages(
    prompt: typing.Union[str, list]
) -> typing.Union[str, typing.List[nemoguardrails.types.ChatMessage]]

nemoguardrails.actions.llm.utils._extract_and_remove_think_tags(
    response: nemoguardrails.types.LLMResponse
) -> typing.Optional[str]

Extract reasoning from <think> tags and remove them from response.content.

This function looks for <think>…</think> tags in the response content, and if found, extracts the reasoning content inside the tags. It has a side-effect: it removes the full reasoning trace and tags from response.content.

Parameters:

response

LLMResponse

The LLM response object

Returns: Optional[str]

The extracted reasoning content, or None if no <think> tags found
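The extract-and-mutate behavior described above can be illustrated with a minimal sketch. It is not the library's implementation: `FakeResponse` is a hypothetical stand-in for `LLMResponse` that models only the `content` attribute, and the regular expression and whitespace trimming are assumptions.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class FakeResponse:
    """Hypothetical stand-in for LLMResponse; only `content` is modeled."""

    content: str


# Non-greedy match so only the first <think>...</think> pair is captured.
_THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def extract_and_remove_think_tags_sketch(response: FakeResponse) -> Optional[str]:
    match = _THINK_RE.search(response.content)
    if match is None:
        return None
    reasoning = match.group(1).strip()
    # Side effect: drop the tags and the trace from the visible content.
    response.content = _THINK_RE.sub("", response.content, count=1).strip()
    return reasoning


response = FakeResponse("<think>The user greets me.</think>Hello there!")
reasoning = extract_and_remove_think_tags_sketch(response)
```

After the call, `reasoning` holds the trace and `response.content` holds only the visible answer, so callers must read the return value if they still need the reasoning.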

nemoguardrails.actions.llm.utils._extract_chunk_metadata(
    chunk: nemoguardrails.types.LLMResponseChunk
) -> typing.Optional[typing.Dict[str, typing.Any]]

nemoguardrails.actions.llm.utils._extract_content(
    response: nemoguardrails.types.LLMResponse
) -> str

Extract text content from response.

nemoguardrails.actions.llm.utils._extract_user_text_from_event(
    event_text: typing.Union[str, typing.List[typing.Dict[str, typing.Any]]]
) -> str

Flatten a multimodal user-message payload into a string for colang history.

Multimodal user events carry event_text as a list of OpenAI-style content parts ([{"type": "text", "text": "..."}, {"type": "image_url", "image_url": {...}}, ...]). Including the full list in the colang history bloats the context with raw base64 data; this helper extracts the visible text parts and appends a [+ image] marker when one or more image parts were present.

Non-string text fields (None or other types) inside a content part are skipped so the " ".join(...) step cannot crash. If the message is image-only, the result is just "[+ image]" without a leading space.

Parameters:

event_text

Union[str, List[Dict[str, Any]]]

Either a string (already flat) or a list of multimodal content parts.

Returns: str

The flattened text. A list input always produces a string; a string
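The flattening rules described above can be sketched as follows. This is an illustration under stated assumptions, not the library's code: the function name is hypothetical, string input is assumed to pass through unchanged, and any part type other than `text` and `image_url` is assumed to be ignored.

```python
from typing import Any, Dict, List, Union


def flatten_user_text_sketch(event_text: Union[str, List[Dict[str, Any]]]) -> str:
    if isinstance(event_text, str):
        # Assumption: an already-flat string is returned as-is.
        return event_text

    texts: List[str] = []
    has_image = False
    for part in event_text:
        if not isinstance(part, dict):
            continue
        if part.get("type") == "text":
            text = part.get("text")
            # Skip None or non-string text fields so the join cannot fail.
            if isinstance(text, str):
                texts.append(text)
        elif part.get("type") == "image_url":
            has_image = True

    flat = " ".join(texts)
    if has_image:
        # No leading space when the message is image-only.
        flat = f"{flat} [+ image]" if flat else "[+ image]"
    return flat


parts = [
    {"type": "text", "text": "What is in this picture?"},
    {"type": "image_url", "image_url": {"url": "data:image/png;base64,AAAA"}},
]
flattened = flatten_user_text_sketch(parts)
```

The base64 payload never reaches the history string; only the marker records that an image was attached.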

nemoguardrails.actions.llm.utils._has_unclosed_quote(
    s: str
) -> bool

Check if a string has an unclosed double quote (ignoring escaped quotes).
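One way to implement such a check is to count double quotes that are not preceded by a backslash and test whether the count is odd. The sketch below uses a hypothetical function name and is an assumption about the approach, not the library's code.

```python
def has_unclosed_quote_sketch(s: str) -> bool:
    count = 0
    i = 0
    while i < len(s):
        if s[i] == "\\":
            # Skip the backslash and the character it escapes.
            i += 2
            continue
        if s[i] == '"':
            count += 1
        i += 1
    # An odd number of unescaped quotes means one is still open.
    return count % 2 == 1
```

For example, `bot say "hello` is reported as unclosed, while a string whose only extra quotes are escaped (`\"`) is not.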

nemoguardrails.actions.llm.utils._log_completion(
    response: nemoguardrails.types.LLMResponse
) -> None

nemoguardrails.actions.llm.utils._log_prompt(
    prompt: typing.Union[str, typing.List[dict]]
) -> None

Log the prompt to LLM call info.

nemoguardrails.actions.llm.utils._raise_llm_call_exception(
    exception: Exception,
    model: nemoguardrails.types.LLMModel
) -> typing.NoReturn

nemoguardrails.actions.llm.utils._setup_llm_call_info(
    model: nemoguardrails.types.LLMModel,
    model_name: typing.Optional[str],
    model_provider: typing.Optional[str]
) -> None

Initialize or update LLM call info in context.

nemoguardrails.actions.llm.utils._store_reasoning_traces(
    response: nemoguardrails.types.LLMResponse
) -> None

nemoguardrails.actions.llm.utils._store_response_metadata(
    response: nemoguardrails.types.LLMResponse
) -> None

nemoguardrails.actions.llm.utils._store_tool_calls(
    response: nemoguardrails.types.LLMResponse
) -> None

nemoguardrails.actions.llm.utils._stream_llm_call(
    model: nemoguardrails.types.LLMModel,
    prompt: typing.Union[str, typing.List[nemoguardrails.types.ChatMessage]],
    handler: nemoguardrails.streaming.StreamingHandler,
    stop: typing.Optional[typing.List[str]],
    llm_params: typing.Optional[dict] = None
) -> nemoguardrails.types.LLMResponse

async

nemoguardrails.actions.llm.utils._update_token_stats(
    response: nemoguardrails.types.LLMResponse
) -> None

nemoguardrails.actions.llm.utils._update_token_stats_from_chunk(
    chunk: nemoguardrails.types.LLMResponseChunk
) -> None

nemoguardrails.actions.llm.utils.escape_flow_name(
    name: str
) -> str

Escape invalid keywords in flow names.

nemoguardrails.actions.llm.utils.events_to_dialog_history(
    events: typing.List[nemoguardrails.colang.v2_x.runtime.flows.InternalEvent]
) -> str

Create the dialog history based on provided events.

nemoguardrails.actions.llm.utils.extract_bot_thinking_from_events(
    events: list
)

nemoguardrails.actions.llm.utils.extract_tool_calls_from_events(
    events: list
) -> typing.Optional[list]

Extract tool_calls from runtime events.

StartToolCallBotAction carries the tool calls that passed tool-output rails and should be returned to the caller. BotToolCalls is used as a fallback for paths that do not emit the post-rail action event.
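The preference order described above (post-rail event first, `BotToolCalls` as fallback) might look like the sketch below. The event shape is an assumption: each event is taken to be a dict with a `type` key and a `tool_calls` key, and the most recent matching event is assumed to win. Neither detail is confirmed by this page.

```python
from typing import List, Optional


def extract_tool_calls_sketch(events: List[dict]) -> Optional[list]:
    fallback = None
    # Walk from newest to oldest so the most recent event wins.
    for event in reversed(events):
        event_type = event.get("type")
        tool_calls = event.get("tool_calls")  # assumed key name
        if not tool_calls:
            continue
        if event_type == "StartToolCallBotAction":
            # Post-rail tool calls take priority.
            return tool_calls
        if event_type == "BotToolCalls" and fallback is None:
            fallback = tool_calls
    return fallback


events = [
    {"type": "BotToolCalls", "tool_calls": [{"name": "raw"}]},
    {"type": "StartToolCallBotAction", "tool_calls": [{"name": "checked"}]},
]
selected = extract_tool_calls_sketch(events)
```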

nemoguardrails.actions.llm.utils.flow_to_colang(
    flow: typing.Union[dict, nemoguardrails.colang.v2_x.lang.colang_ast.Flow]
) -> str

Converts a flow to colang format.

Example flow:

  - user: ask capabilities
  - bot: inform capabilities

to colang:

user ask capabilities
bot inform capabilities
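The conversion in the example above can be sketched for the simplest case. The input shape here is an assumption: the steps are modeled as a list of single-key mappings mirroring the YAML example, whereas the real function accepts a flow dict or a Colang 2.x `Flow` object whose structure is not shown on this page. The function name is hypothetical.

```python
from typing import Dict, List


def flow_steps_to_colang_sketch(steps: List[Dict[str, str]]) -> str:
    lines = []
    for step in steps:
        # Each step maps a speaker ("user" or "bot") to an intent name.
        for speaker, intent in step.items():
            lines.append(f"{speaker} {intent}")
    return "\n".join(lines)


steps = [{"user": "ask capabilities"}, {"bot": "inform capabilities"}]
colang = flow_steps_to_colang_sketch(steps)
```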

nemoguardrails.actions.llm.utils.from_log_event_to_identifier(
    event_name: str
) -> str

Converts a log message to a prompt interaction identifier.

nemoguardrails.actions.llm.utils.get_and_clear_reasoning_trace_contextvar() -> typing.Optional[str]

Get the current reasoning trace and clear it from the context.

Returns: Optional[str]

The reasoning trace if one exists, None otherwise.

nemoguardrails.actions.llm.utils.get_and_clear_response_metadata_contextvar() -> typing.Optional[dict]

Get the current response metadata and clear it from the context.

Returns: Optional[dict]

The response metadata if it exists, None otherwise.

nemoguardrails.actions.llm.utils.get_and_clear_tool_calls_contextvar() -> typing.Optional[list]

Get the current tool calls and clear them from the context.

Returns: Optional[list]

The tool calls if they exist, None otherwise.
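The three `get_and_clear_*` helpers share a read-then-reset pattern over a context variable. A minimal sketch of that pattern with the standard `contextvars` module follows; the variable and function names are hypothetical and the library's own context variables are not shown on this page.

```python
import contextvars
from typing import Optional

# Hypothetical context variable; None means "nothing stored".
reasoning_trace_var: contextvars.ContextVar[Optional[str]] = contextvars.ContextVar(
    "reasoning_trace_sketch", default=None
)


def get_and_clear_reasoning_trace_sketch() -> Optional[str]:
    trace = reasoning_trace_var.get()
    if trace is not None:
        # Clear the value so a later call does not see a stale trace.
        reasoning_trace_var.set(None)
    return trace


reasoning_trace_var.set("compare both options first")
first_read = get_and_clear_reasoning_trace_sketch()
second_read = get_and_clear_reasoning_trace_sketch()
```

The second read returns `None`, which is what makes the value safe to consume exactly once per LLM call.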

nemoguardrails.actions.llm.utils.get_colang_history(
    events: typing.List[dict],
    include_texts: bool = True,
    remove_retrieval_events: bool = False
) -> str

Creates a history of user messages and bot responses in colang format.

Example:

user "Hi, how are you today?"
  express greeting
bot express greeting
  "Greetings! I am the official NVIDIA Benefits Ambassador AI bot and I'm here to assist you."
user "What can you help me with?"
  ask capabilities
bot inform capabilities
  "As an AI, I can provide you with a wide range of services, such as …"

nemoguardrails.actions.llm.utils.get_first_bot_action(
    strings: typing.List[str]
) -> typing.Optional[str]

Returns first bot action.

nemoguardrails.actions.llm.utils.get_first_bot_intent(
    strings: typing.List[str]
) -> typing.Optional[str]

Returns first bot intent.

nemoguardrails.actions.llm.utils.get_first_nonempty_line(
    s: str
) -> typing.Optional[str]

Helper that returns the first non-empty line from a string.

nemoguardrails.actions.llm.utils.get_first_user_intent(
    strings: typing.List[str]
) -> typing.Optional[str]

Returns first user intent.

nemoguardrails.actions.llm.utils.get_initial_actions(
    strings: typing.List[str]
) -> typing.List[str]

Returns the first action before an empty line.

nemoguardrails.actions.llm.utils.get_last_bot_intent_event(
    events: typing.List[dict]
) -> typing.Optional[dict]

Returns the last bot intent from the events.

nemoguardrails.actions.llm.utils.get_last_bot_utterance_event(
    events: typing.List[dict]
) -> typing.Optional[dict]

Returns the last bot utterance from the events.

nemoguardrails.actions.llm.utils.get_last_user_intent_event(
    events: typing.List[dict]
) -> typing.Optional[dict]

Returns the last user intent from the events.

nemoguardrails.actions.llm.utils.get_last_user_utterance(
    events: typing.List[dict]
) -> typing.Optional[str]

Returns the last user utterance from the events.

nemoguardrails.actions.llm.utils.get_last_user_utterance_event(
    events: typing.List[dict]
) -> typing.Optional[dict]

Returns the last user utterance from the events.

nemoguardrails.actions.llm.utils.get_last_user_utterance_event_v2_x(
    events: typing.List[dict]
) -> typing.Optional[dict]

Returns the last user utterance from the events.

nemoguardrails.actions.llm.utils.get_multiline_response(
    s: str
) -> str

Helper that extracts multi-line responses from the LLM. Stopping conditions: when a non-empty line ends with a quote or when the token "user" appears after a newline. Empty lines at the beginning of the string are skipped.
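The stopping conditions above can be sketched as follows. This is an illustrative reading of the description with a hypothetical function name, not the library's code. It assumes that "user after a newline" means the literal substring `"\nuser"`, that lines are stripped, and that empty lines are dropped.

```python
def multiline_response_sketch(s: str) -> str:
    # A "user" token right after a newline marks a new dialogue turn:
    # keep only what precedes it.
    s = s.split("\nuser")[0]

    collected = []
    for line in s.split("\n"):
        line = line.strip()
        if not line:
            # Skip empty lines (this covers leading empty lines).
            continue
        collected.append(line)
        if line.endswith('"'):
            # A closing quote ends the message.
            break
    return "\n".join(collected)


raw = '\n\n"Hello,\nhow are you?"\nuser "next turn"'
message = multiline_response_sketch(raw)
```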

nemoguardrails.actions.llm.utils.get_retrieved_relevant_chunks(
    events: typing.List[dict],
    skip_user_message: typing.Optional[bool] = False
) -> typing.Optional[str]

Returns the retrieved chunks for current user utterance from the events.

nemoguardrails.actions.llm.utils.get_top_k_nonempty_lines(
    s: str,
    k: int = 1
) -> typing.Optional[typing.List[str]]

Helper that returns a list with the top k non-empty lines from a string.

If there are fewer than k non-empty lines, it returns a smaller number of lines.
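A minimal sketch of this behavior is shown below. The function name is hypothetical, and two details are assumptions: lines are stripped of surrounding whitespace, and `None` is returned when the input has no non-empty lines (the signature allows `None`, but this page does not say when it occurs).

```python
from typing import List, Optional


def top_k_nonempty_lines_sketch(s: str, k: int = 1) -> Optional[List[str]]:
    lines = [line.strip() for line in s.split("\n")]
    lines = [line for line in lines if line]
    if not lines:
        # Assumption: no usable lines is signaled with None.
        return None
    # Slicing returns fewer than k items when fewer are available.
    return lines[:k]


top_two = top_k_nonempty_lines_sketch("\nfirst\n\nsecond\nthird", k=2)
```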

nemoguardrails.actions.llm.utils.llm_call(
    llm: typing.Optional[typing.Any],
    prompt: typing.Union[str, typing.List[dict]],
    model_name: typing.Optional[str] = None,
    model_provider: typing.Optional[str] = None,
    stop: typing.Optional[typing.List[str]] = None,
    llm_params: typing.Optional[dict] = None,
    streaming_handler: typing.Optional[nemoguardrails.streaming.StreamingHandler] = None
) -> nemoguardrails.types.LLMResponse

async

nemoguardrails.actions.llm.utils.remove_action_intent_identifiers(
    lines: typing.List[str]
) -> typing.List[str]

Removes the action/intent identifiers.

nemoguardrails.actions.llm.utils.remove_text_messages_from_history(
    history: str
) -> str

Helper that given a history in colang format, removes all texts.

nemoguardrails.actions.llm.utils.strip_quotes(
    s: str
) -> str

Helper that removes quotes from a string if the entire string is between quotes.
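The "entire string is between quotes" condition can be illustrated with a short sketch. The function name is hypothetical, and treating both double and single quotes as valid delimiters is an assumption; this page does not specify which quote characters the library handles.

```python
def strip_quotes_sketch(s: str) -> str:
    # Only strip when the same quote character opens and closes the string.
    if len(s) >= 2 and s[0] == s[-1] and s[0] in ('"', "'"):
        return s[1:-1]
    return s


unquoted = strip_quotes_sketch('"Hello there!"')
untouched = strip_quotes_sketch('say "hi" now')
```

Quotes in the middle of a string are left alone, so `untouched` is identical to its input.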

nemoguardrails.actions.llm.utils.warn_if_truncated(
    response: nemoguardrails.types.LLMResponse,
    task: str
) -> bool

Return True and emit a warning if the LLM produced no visible content because it hit the max_tokens budget.

Reasoning models (OpenAI o-series, gpt-5, DeepSeek-R1, Gemini 2.5, Qwen QwQ, etc.) spend output tokens on internal reasoning before emitting visible text. A small max_tokens budget can be fully consumed by the reasoning phase, leaving empty content and finish_reason="length". The call succeeds silently, and callers that only inspect response.content see nothing. Callers whose downstream parser does not fail safely on empty input (e.g. self_check_facts, whose parser inverts the result) should use the return value to take an explicit fail-safe branch.
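The check and the fail-safe branch described above can be sketched as follows. This is not the library's implementation: `FakeResponse` is a hypothetical stand-in for `LLMResponse`, and exposing `finish_reason` as a direct attribute is an assumption about where that value lives.

```python
import logging
from dataclasses import dataclass
from typing import Optional

log = logging.getLogger(__name__)


@dataclass
class FakeResponse:
    """Hypothetical stand-in for LLMResponse."""

    content: str
    finish_reason: Optional[str] = None  # assumed attribute name


def warn_if_truncated_sketch(response: FakeResponse, task: str) -> bool:
    truncated = response.finish_reason == "length" and not response.content.strip()
    if truncated:
        log.warning(
            "Task %s: empty output with finish_reason='length'; "
            "max_tokens was likely consumed by reasoning.",
            task,
        )
    return truncated


def fact_check_passed_sketch(response: FakeResponse) -> bool:
    # Fail safe: treat a truncated, empty answer as "not verified"
    # instead of handing an empty string to the parser.
    if warn_if_truncated_sketch(response, task="self_check_facts"):
        return False
    return response.content.strip().lower().startswith("yes")


blocked = fact_check_passed_sketch(FakeResponse("", finish_reason="length"))
```

The design point is that the boolean return value lets the caller choose the safe outcome explicitly rather than relying on how its parser treats an empty string.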

nemoguardrails.actions.llm.utils._MAX_QUOTE_CONTINUATION_LINES = 50

nemoguardrails.actions.llm.utils.logger = logging.getLogger(__name__)