Memory in NVIDIA NeMo Agent Toolkit#
The NeMo Agent toolkit Memory subsystem is designed to store and retrieve a user’s conversation history, preferences, and other “long-term memory.” This is especially useful for building stateful LLM-based applications that recall user-specific data or interactions across multiple steps.
The memory module is designed to be extensible, allowing developers to create custom memory back-ends, called providers in NeMo Agent toolkit terminology.
Included Memory Modules#
The NeMo Agent toolkit includes three memory module providers, all of which are available as plugins:
Mem0, which is provided by the nvidia-nat-mem0ai plugin.
Redis, which is provided by the nvidia-nat-redis plugin.
Zep, which is provided by the nvidia-nat-zep-cloud plugin (Zep NVIDIA NeMo documentation).
Automatic Memory Wrapper Agent#
The NeMo Agent toolkit provides an auto_memory_agent wrapper that adds automatic memory capture and retrieval to any agent without requiring the LLM to invoke memory tools explicitly.
Why Use Automatic Memory?#
Traditional tool-based memory:
LLMs may forget to call memory tools
Memory capture is inconsistent
Requires explicit memory tool configuration
Automatic memory wrapper agent:
Guaranteed capture: User messages and agent responses are automatically stored
Automatic retrieval: Relevant context is injected before each agent call
Memory backend agnostic: Works with Zep, Mem0, Redis, or any MemoryEditor
Universal compatibility: Wraps any agent type (ReAct, ReWOO, Tool Calling, etc.)
Quick Start#
To use automatic memory, wrap any agent with the auto_memory_agent workflow type:
memory:
  zep_memory:
    _type: nat.plugins.zep_cloud/zep_memory

functions:
  my_react_agent:
    _type: react_agent
    llm_name: nim_llm
    tool_names: [calculator]

workflow:
  _type: auto_memory_agent
  inner_agent_name: my_react_agent
  memory_name: zep_memory
  llm_name: nim_llm
Configuration Options#
The automatic memory wrapper agent supports several configuration parameters:
Required Parameters:
inner_agent_name: Name of the agent to wrap with automatic memory
memory_name: Name of the memory backend (from the memory: section)
llm_name: LLM to use (required by AgentBaseConfig)
Optional Feature Flags (all default to true):
save_user_messages_to_memory: Automatically save user messages before agent processing
retrieve_memory_for_every_response: Automatically retrieve and inject memory context
save_ai_messages_to_memory: Automatically save agent responses after generation
Memory Backend Parameters:
search_params: Passed to memory_editor.search() (e.g., mode, top_k)
add_params: Passed to memory_editor.add_items() (e.g., ignore_roles)
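To make the three feature flags concrete, here is a minimal Python sketch of how they could gate one conversational turn. It is not the toolkit's implementation: ToyMemoryEditor, AutoMemoryFlags, and run_turn are hypothetical names, the store is a synchronous in-memory list with simplified add_items and search signatures, and retrieval is a naive keyword match. Running retrieval before saving the user message is a choice made for this toy store so that a message does not match itself.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Callable


@dataclass
class AutoMemoryFlags:
    # Field names mirror the documented feature flags; all default to true.
    save_user_messages_to_memory: bool = True
    retrieve_memory_for_every_response: bool = True
    save_ai_messages_to_memory: bool = True


@dataclass
class ToyMemoryEditor:
    """Hypothetical in-memory stand-in; NOT the toolkit's MemoryEditor."""

    # Each entry is (user_id, role, text).
    items: list[tuple[str, str, str]] = field(default_factory=list)

    def add_items(self, user_id: str, role: str, text: str) -> None:
        self.items.append((user_id, role, text))

    def search(self, user_id: str, query: str, top_k: int = 3) -> list[str]:
        # Naive keyword overlap, scoped to a single user ID.
        words = set(query.lower().split())
        hits = [
            text
            for uid, _role, text in self.items
            if uid == user_id and words & set(text.lower().split())
        ]
        return hits[-top_k:]


def run_turn(
    user_id: str,
    message: str,
    inner_agent: Callable[[str, list[str]], str],
    memory: ToyMemoryEditor,
    flags: AutoMemoryFlags | None = None,
) -> str:
    """Run one turn of a wrapped agent, applying each enabled memory step."""
    flags = flags or AutoMemoryFlags()
    context: list[str] = []
    if flags.retrieve_memory_for_every_response:
        context = memory.search(user_id, message)
    if flags.save_user_messages_to_memory:
        memory.add_items(user_id, "user", message)
    reply = inner_agent(message, context)
    if flags.save_ai_messages_to_memory:
        memory.add_items(user_id, "assistant", reply)
    return reply
```

Because the inner agent is just a callable here, the wrapper never depends on the agent choosing to call a memory tool, which is the property the automatic memory wrapper is described as providing.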
Multi-Tenant Memory Isolation#
User ID is automatically extracted at runtime for memory isolation via:
user_manager.get_id() - For production with custom auth middleware (recommended)
X-User-ID HTTP header - For testing without middleware
"default_user" - Fallback for local development
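The three sources above can be read as a fallback chain. The Python sketch below illustrates that reading only. It assumes the sources are tried in the order listed, which this page does not state explicitly, and resolve_user_id, UserManager, and StaticUserManager are hypothetical names rather than toolkit APIs.

```python
from __future__ import annotations

from typing import Mapping, Optional, Protocol


class UserManager(Protocol):
    """Hypothetical shape of an auth-backed user manager."""

    def get_id(self) -> Optional[str]: ...


class StaticUserManager:
    """Toy user manager that always reports the same user ID."""

    def __init__(self, user_id: str) -> None:
        self._user_id = user_id

    def get_id(self) -> Optional[str]:
        return self._user_id


def resolve_user_id(
    user_manager: Optional[UserManager],
    headers: Mapping[str, str],
) -> str:
    # 1. Prefer the ID supplied by custom auth middleware.
    if user_manager is not None:
        user_id = user_manager.get_id()
        if user_id:
            return user_id
    # 2. Otherwise use the X-User-ID header (HTTP header names are case-insensitive).
    for name, value in headers.items():
        if name.lower() == "x-user-id" and value:
            return value
    # 3. Fall back to a shared ID, suitable only for local development.
    return "default_user"
```

Under this reading, a request that carries both an authenticated user and an X-User-ID header resolves to the authenticated user, so the header cannot be used to read another tenant's memory once middleware is in place.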
For detailed configuration and usage examples, refer to the examples/agents/auto_memory_wrapper/README.md guide.
Examples#
The following examples in the repository demonstrate how to use the memory module in the NeMo Agent toolkit:
examples/agents/auto_memory_wrapper - Automatic memory wrapper agent for any agent
examples/memory/redis - Basic long-term memory using Redis
examples/frameworks/semantic_kernel_demo - Multi-agent system with long-term memory
examples/RAG/simple_rag - RAG system with Mem0 memory
Additional Resources#
Information on how to write a new memory module provider can be found in the Adding a Memory Provider document.