nat.plugins.strands.llm#

LLM provider wrappers for AWS Strands integration with NVIDIA NeMo Agent toolkit.

This module provides Strands-compatible LLM client wrappers for the providers listed below.

Supported Providers#

  • OpenAI: Direct OpenAI API integration through OpenAIModelConfig

  • NVIDIA NIM: OpenAI-compatible endpoints for NVIDIA models through NIMModelConfig

  • AWS Bedrock: Amazon Bedrock models (such as Claude) through AWSBedrockModelConfig

Each wrapper:

  • Validates that Responses API features are disabled (Strands manages tool execution)

  • Patches clients with NeMo Agent toolkit retry logic from RetryMixin

  • Injects chain-of-thought prompts when ThinkingMixin is configured

  • Removes NeMo Agent toolkit-specific config keys before instantiating Strands clients
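Taken together, those steps give every wrapper the same shape. A condensed, runnable sketch is below; the stand-in class, helper behaviors, and config key names are illustrative assumptions, not the module's literal code:

    from collections.abc import AsyncGenerator
    from typing import Any

    class FakeStrandsModel:
        """Stand-in for a concrete Strands client such as OpenAIModel."""

        def __init__(self, **params: Any) -> None:
            self.params = params

    def validate_no_responses_api(config: dict[str, Any]) -> None:
        # The real check raises when Responses API features are enabled,
        # because Strands manages tool execution itself.
        if config.get("use_responses_api"):
            raise ValueError("Responses API features must be disabled")

    def patch_for_nat(client: Any, config: dict[str, Any]) -> Any:
        # Placeholder for the retry/thinking patching described above.
        return client

    async def example_wrapper(config: dict[str, Any]) -> AsyncGenerator[Any, None]:
        validate_no_responses_api(config)
        # Strip NAT-only keys before instantiating the Strands client.
        nat_only = {"_type", "thinking", "num_retries"}  # illustrative names
        client_kwargs = {k: v for k, v in config.items() if k not in nat_only}
        yield patch_for_nat(FakeStrandsModel(**client_kwargs), config)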

Future Provider Support#

The following providers are not yet supported but could be contributed:

  • Azure OpenAI: Would require a Strands Azure OpenAI client wrapper similar to the existing OpenAI integration. Contributors should follow the pattern established in openai_strands and ensure Azure-specific authentication (endpoint, API version, deployment name) is properly handled.

  • LiteLLM: The wrapper would need to handle LiteLLM’s unified interface across multiple providers while preserving Strands’ tool execution semantics.

See the Strands documentation at https://strandsagents.com for model provider details.

Attributes#

ModelType

Functions#

_patch_llm_based_on_config(→ ModelType)

Patch a Strands client per NAT config (retries/thinking) and return it.

openai_strands(→ collections.abc.AsyncGenerator[Any, None])

Build a Strands OpenAI client from an NVIDIA NeMo Agent toolkit configuration.

nim_strands(→ collections.abc.AsyncGenerator[Any, None])

Build a Strands OpenAI-compatible client for NVIDIA NIM endpoints.

bedrock_strands(→ collections.abc.AsyncGenerator[Any, None])

Build a Strands Bedrock client from an NVIDIA NeMo Agent toolkit configuration.

Module Contents#

ModelType#
_patch_llm_based_on_config(
    client: ModelType,
    llm_config: nat.data_models.llm.LLMBaseConfig,
) → ModelType#

Patch a Strands client per NAT config (retries/thinking) and return it.

Args:

client: Concrete Strands model client instance.

llm_config: NAT LLM config with Retry/Thinking mixins.

Returns:

The patched client instance.
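For intuition, a retry patch can wrap the client's streaming call in place, along these lines (a minimal sketch; the stream method name and backoff policy are assumptions, and the real RetryMixin behavior also honors status-code filters):

    import asyncio
    from typing import Any, TypeVar

    ModelT = TypeVar("ModelT")

    def patch_with_simple_retry(client: ModelT, num_retries: int) -> ModelT:
        original = client.stream  # assumes the Strands client exposes `stream`

        async def stream_with_retry(*args: Any, **kwargs: Any):
            for attempt in range(num_retries + 1):
                try:
                    async for event in original(*args, **kwargs):
                        yield event
                    return
                except Exception:
                    # A failure mid-stream would replay earlier events here;
                    # a production patch must account for that.
                    if attempt == num_retries:
                        raise
                    await asyncio.sleep(2 ** attempt)  # exponential backoff

        client.stream = stream_with_retry  # monkey-patch the instance
        return client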

async openai_strands(
    llm_config: nat.llm.openai_llm.OpenAIModelConfig,
    _builder: nat.builder.builder.Builder,
) → collections.abc.AsyncGenerator[Any, None]#

Build a Strands OpenAI client from an NVIDIA NeMo Agent toolkit configuration.

The wrapper requires the nvidia-nat[strands] extra and a valid OpenAI-compatible API key. When llm_config.api_key is empty, the integration falls back to the OPENAI_API_KEY environment variable. Responses API features are disabled through validate_no_responses_api because Strands handles tool execution inside the framework runtime. The yielded client is patched with NeMo Agent toolkit retry and thinking hooks so that framework-level policies remain consistent.

Args:

llm_config: OpenAI configuration declared in the workflow.

_builder: Builder instance provided by the workflow factory (unused).

Yields:

Strands OpenAIModel objects ready to stream responses with NeMo Agent toolkit retry/thinking behaviors applied.
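Callers normally obtain these clients through the workflow builder rather than by invoking the generator directly, but a standalone sketch of the behavior described above could look like this (the model name is a placeholder, and passing None for the unused builder is an illustration-only shortcut):

    import asyncio
    from nat.llm.openai_llm import OpenAIModelConfig
    from nat.plugins.strands.llm import openai_strands

    async def main() -> None:
        # api_key omitted: the wrapper falls back to OPENAI_API_KEY.
        config = OpenAIModelConfig(model_name="gpt-4o")
        async for client in openai_strands(config, None):  # _builder is unused
            print(type(client).__name__)  # a patched Strands OpenAIModel

    asyncio.run(main())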

async nim_strands(
    llm_config: nat.llm.nim_llm.NIMModelConfig,
    _builder: nat.builder.builder.Builder,
) → collections.abc.AsyncGenerator[Any, None]#

Build a Strands OpenAI-compatible client for NVIDIA NIM endpoints.

Install the nvidia-nat[strands] extra and provide a NIM API key either through llm_config.api_key or the NVIDIA_API_KEY environment variable. The wrapper uses the OpenAI-compatible Strands client so Strands can route tool calls while the NeMo Agent toolkit continues to manage retries, timeouts, and optional thinking prompts. Responses API options are blocked to avoid conflicting execution models.

Args:

llm_config: Configuration for calling NVIDIA NIM by way of the OpenAI protocol.

_builder: Builder instance supplied during workflow construction (unused).

Yields:

Patched Strands clients that stream responses using the NVIDIA NIM endpoint configured in llm_config.
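A sketch for NIM follows the same pattern (the endpoint shown is NVIDIA's hosted API and the model name is a placeholder; substitute values for your deployment):

    import asyncio
    from nat.llm.nim_llm import NIMModelConfig
    from nat.plugins.strands.llm import nim_strands

    async def main() -> None:
        # api_key omitted: the wrapper falls back to NVIDIA_API_KEY.
        config = NIMModelConfig(
            model_name="meta/llama-3.1-70b-instruct",  # placeholder model
            base_url="https://integrate.api.nvidia.com/v1",
        )
        async for client in nim_strands(config, None):  # _builder is unused
            ...  # hand the patched client to a Strands Agent

    asyncio.run(main())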

async bedrock_strands(
    llm_config: nat.llm.aws_bedrock_llm.AWSBedrockModelConfig,
    _builder: nat.builder.builder.Builder,
) → collections.abc.AsyncGenerator[Any, None]#

Build a Strands Bedrock client from an NVIDIA NeMo Agent toolkit configuration.

The integration expects the nvidia-nat[strands] extra plus AWS credentials that can be resolved by boto3. Credentials are loaded in the following priority:

  1. Explicit values embedded in the active AWS profile referenced by llm_config.credentials_profile_name.

  2. Standard environment variables such as AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, and AWS_SESSION_TOKEN.

  3. Ambient credentials provided by the compute environment (for example, an IAM role attached to the container or instance).

When llm_config.region_name is "None" or None, Strands falls back to the default region configured in the AWS environment. Responses API options remain unsupported so that Strands can own tool execution. Retry and thinking hooks are added automatically before the Bedrock client is yielded.
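The credential chain can be verified outside the toolkit with boto3 directly; a small diagnostic sketch (the profile name is a placeholder):

    import boto3

    # boto3 walks roughly the chain above: explicit profile, then
    # environment variables, then ambient credentials (IAM role, etc.).
    session = boto3.Session(profile_name="my-bedrock-profile")
    creds = session.get_credentials()
    print("credential source:", creds.method if creds else "none found")
    # With region_name unset, boto3 reports the environment's default region.
    print("region:", session.region_name)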

Args:

llm_config: AWS Bedrock configuration declared in the workflow.

_builder: Builder reference supplied by the workflow factory (unused).

Yields:

Strands BedrockModel instances configured for the selected Bedrock model_name and patched with NeMo Agent toolkit retry/thinking helpers.