Agent Blueprint Patterns
Blueprints are repeatable workflow templates that ride on top of the workspace. Each blueprint is explicitly configured with its goal, required tools, allowed services, data scope, write permissions, review gate and logging expectations.
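As a minimal sketch of what "explicitly configured" could look like, the declaration below models the seven attributes named above as an immutable record. The class name, field names, and example values are illustrative assumptions, not the platform's actual schema.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class Blueprint:
    """Hypothetical blueprint declaration; all field names are assumptions."""
    name: str
    goal: str
    required_tools: FrozenSet[str]
    allowed_services: FrozenSet[str]
    data_scope: FrozenSet[str]
    write_permissions: FrozenSet[str] = frozenset()  # empty means read-only
    review_gate: str = "human-review"
    logging: str = "ocsf"

    def is_read_only(self) -> bool:
        # A blueprint with no declared write permissions cannot write at all.
        return not self.write_permissions


# Example values are made up for illustration.
docs_assistant = Blueprint(
    name="documentation-assistant",
    goal="Search docs, summarize systems, draft updates",
    required_tools=frozenset({"search", "read_doc", "draft_doc"}),
    allowed_services=frozenset({"docs-system", "enterprise-search"}),
    data_scope=frozenset({"docs:read"}),
    write_permissions=frozenset({"docs:draft"}),
    review_gate="review-before-publish",
)
```

Leaving `write_permissions` empty by default mirrors the least-privilege posture: a blueprint has to opt in to each write capability.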
LLMs are nondeterministic and can be jailbroken or hijacked by untrusted input. A prompt is therefore not a security boundary: autonomous behavior must be constrained by the environment and the release controls around the prompt / skill, not by assuming the instructions themselves are sufficient.
Posture Rules
Start narrow. Default tools are least-privilege: read-only scopes where the service supports them, and no write-capable tools attached unless the blueprint explicitly requires one. Direct writes are the exception.
Writes to systems of record require explicit human review and approval, and are logged as OCSF events to the enterprise SIEM.
Static policy. Allowed services, data scopes, and write boundaries are declared up front in the signed policy bundle. Neither agent nor user widens scope at runtime; changes require a bundle re-issue.
Defense in depth. Hard (deterministic) controls form the security boundary — workspace perimeter (network allowlist, per-service authentication, human review of writes) and runtime sandbox (kernel-level enforcement, credential proxy, in-runtime egress). Soft controls (LLM-as-judge, prompt hardening) may complement but never replace the hard layer. The agent cannot reach services or data outside its declared scope, regardless of agent prompt.
Lifecycle ownership. Each blueprint has an owner, a review cadence, an incident contact, and a deprecation path — managed as versioned products, not ad-hoc scripts.
Always-on by default. Blueprints assume the workspace stays running while the user is offline; the human-review gate is asynchronous, not in-session.
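The static-policy and hard-control rules above can be sketched as a deterministic scope check. This is an illustration only: `PolicyBundle`, `authorize`, and `ScopeError` are hypothetical names, signature verification of the bundle is assumed to have happened elsewhere, and real enforcement would sit in the network allowlist and sandbox rather than in application code.

```python
from dataclasses import dataclass, FrozenInstanceError
from typing import FrozenSet, Optional


class ScopeError(PermissionError):
    """Raised when a request falls outside the declared scope."""


@dataclass(frozen=True)
class PolicyBundle:
    # Hypothetical shape of a policy bundle after signature verification.
    version: str
    allowed_services: FrozenSet[str]
    write_targets: FrozenSet[str] = frozenset()  # empty means read-only


def authorize(bundle: PolicyBundle, service: str,
              write_target: Optional[str] = None) -> None:
    """Deterministic check; the agent prompt is never consulted."""
    if service not in bundle.allowed_services:
        raise ScopeError(f"service {service!r} is not in bundle {bundle.version}")
    if write_target is not None and write_target not in bundle.write_targets:
        raise ScopeError(f"write to {write_target!r} is not declared")


bundle = PolicyBundle(version="v1",
                      allowed_services=frozenset({"docs", "ticketing"}))

authorize(bundle, "docs")  # in scope: returns without error

try:
    authorize(bundle, "mailbox")  # not declared: refused
except ScopeError as exc:
    print(exc)

try:
    # Attempted runtime widening fails because the record is frozen.
    bundle.allowed_services = frozenset({"docs", "mailbox"})
except FrozenInstanceError:
    print("scope is fixed; re-issue the bundle to change it")
```

The check takes no prompt or model output as input, which is the property that makes it a hard control rather than a soft one.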
Writes are conservatively gated for three convergent reasons, not one:
Last line of defense. The exfiltration path runs through two channels: outbound traffic and writes. Outbound traffic — including parameter smuggling under prompt injection — is constrained upstream by the network allowlist and credential proxy. Writes are constrained by the human-review gate, the last line of defense after the upstream controls (sandbox, allowlist, signed policy, credential proxy). Writes are also the hardest to roll back: reads are repeatable; approved writes often aren’t (un-merging a commit, retracting a published doc).
Accountability and provenance. The reviewer becomes the principal of record/point of accountability for the action; the four-identity attribution chain bottoms out at a real person, not at “the agent did it.” Human-attributed writes stay distinguishable from agent-generated writes — preserving data lineage for future input-trust policies. For autonomous or service-account agents with no human reviewer, accountability resolves to the agent’s registered owner / sponsor, not to the agent itself.
Deployable under stricter governance. Some environments require human sign-off on consequential actions as an external mandate rather than an architectural choice. The conservative write posture lets the same blueprint catalog deploy unchanged whether or not such a mandate applies — without claiming conformance to any specific regime.
Any one of these reasons would justify the posture. Together they apply the same defense-in-depth logic the architecture uses for network reach, identity, and credential issuance to writes.
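The accountability rule in the second reason can be sketched as a small resolution function. The function name and parameters are hypothetical, and the sketch covers only the final "principal of record" step, not the full attribution chain.

```python
from typing import Optional


def principal_of_record(reviewer: Optional[str],
                        registered_owner: Optional[str]) -> str:
    """Return the person accountable for a write (illustrative only)."""
    if reviewer:
        # A human approved the write, so the reviewer is accountable.
        return reviewer
    if registered_owner:
        # Autonomous or service-account agent: fall back to its owner / sponsor.
        return registered_owner
    # Accountability never resolves to the agent itself.
    raise ValueError("no accountable person; the write must not proceed")
```

For example, `principal_of_record("alice", "bob")` returns `"alice"`, while `principal_of_record(None, "bob")` returns `"bob"`.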
Realization pattern. Agents write to a private staging surface and propose the change for human approval rather than committing directly to a shared system of record — a branch + MR/PR for code, a draft or unpublished revision for documents, a proposed state change for tickets. The staged artifact is the human-review gate; agents never write directly to main or other shared sources of truth.
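The realization pattern above can be sketched with an in-memory store. `StagedStore`, `Proposal`, and their methods are hypothetical stand-ins for a branch plus MR/PR, a draft revision, or a proposed ticket change; a real deployment would rely on the system of record's own review mechanism.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List, Optional


class Status(Enum):
    PROPOSED = "proposed"
    APPROVED = "approved"


@dataclass
class Proposal:
    target: str
    content: str
    status: Status = Status.PROPOSED
    reviewer: Optional[str] = None


class StagedStore:
    """Toy system of record with a private staging surface."""

    def __init__(self) -> None:
        self.main: Dict[str, str] = {}    # shared source of truth
        self.staged: List[Proposal] = []  # agent-writable staging surface

    def propose(self, target: str, content: str) -> Proposal:
        # Agent path: stages the change and never touches self.main.
        proposal = Proposal(target, content)
        self.staged.append(proposal)
        return proposal

    def approve(self, proposal: Proposal, reviewer: str) -> None:
        # Human path: the only code path that writes to self.main.
        if not reviewer:
            raise PermissionError("approval requires a named human reviewer")
        if proposal.status is not Status.PROPOSED:
            raise ValueError("proposal was already decided")
        proposal.status = Status.APPROVED
        proposal.reviewer = reviewer
        self.main[proposal.target] = proposal.content


store = StagedStore()
draft = store.propose("README.md", "updated install steps")
store.approve(draft, reviewer="alice")
```

Until `approve` runs, the staged change is invisible in `main`; after it runs, the proposal records who approved it.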
Table 4: Agent Blueprint Patterns
| Blueprint | Goal | Typical Services | Permissions Gate |
|---|---|---|---|
| Coding assistant | Read code, explain, draft patches, run tests | Source control, package repos, CI logs, ticketing | Feature / single-owner branch commits automatable; review before protected-branch merge or write |
| Documentation assistant | Search docs, summarize systems, draft updates | Docs system, enterprise search, source-controlled docs | Review before publish |
| Issue triage assistant | Analyze tickets, cluster themes, propose actions | Ticketing, docs, source control, logs | Review before state change or assignment |
| Developer onboarding assistant | Help new engineers understand repos and workflows | Source control, docs, chat history, package repos | Read-only by default; reviewed drafts only |
| Research notebook assistant | Run notebooks, inspect data, draft analysis | Notebook environment, data store, package repos | Review before publishing or writing shared data |
| Operations assistant | Summarize alerts, correlate logs, draft remediation | Logs, tickets, runbooks, chat | Review before executing remediation |
| Knowledge worker assistant | Search mail, chat, docs, tickets; summarize context | Mailbox, chat, docs, ticketing | Review before send, forward, publish, or delete |
Note on reversibility. The review gate scales on two axes: blast radius (who the write affects) and reversibility (how easily it can be undone). Writes to personal or ephemeral targets — single-owner branches, scratch space, personal notes, workspace-local drafts — are self-limiting and may be automated without review. Writes to shared sources of truth — protected branches, published docs, tickets, shared data — affect other workflows and require human review; the gate tightens further when the shared write is also irreversible (protected-branch merge, send, publish, delete, state change). Reversible writes to shared-but-versioned targets (e.g. a wiki page with history) may be allowed directly, with revert as the safety net and the outcome as the user’s responsibility. Where the line falls is the enterprise’s policy call.
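The two-axis rule in the reversibility note can be sketched as a small decision function. The enum values, parameter names, and the `allow_versioned_direct` switch (standing in for the enterprise's policy call) are assumptions made for illustration.

```python
from enum import Enum


class Gate(Enum):
    AUTOMATE = "automate"                      # no review needed
    DIRECT_WITH_REVERT = "direct-with-revert"  # allowed; revert is the safety net
    REVIEW = "review"                          # human review required
    STRICT_REVIEW = "strict-review"            # shared and irreversible


def review_gate(shared: bool, reversible: bool, versioned: bool = False,
                allow_versioned_direct: bool = False) -> Gate:
    """Pick a gate from blast radius (shared) and reversibility."""
    if not shared:
        # Personal or ephemeral target: self-limiting.
        return Gate.AUTOMATE
    if not reversible:
        # Shared and irreversible, e.g. send, publish, delete.
        return Gate.STRICT_REVIEW
    if versioned and allow_versioned_direct:
        # Shared but versioned, and enterprise policy permits direct writes.
        return Gate.DIRECT_WITH_REVERT
    return Gate.REVIEW
```

Under these assumptions, a commit to a single-owner branch maps to `Gate.AUTOMATE`, a protected-branch merge to `Gate.STRICT_REVIEW`, and an edit to a wiki page with history maps to `Gate.REVIEW` unless policy enables the direct path.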
Note on code execution. Any blueprint that runs code — coding, research notebook, operations — executes it inside the runtime sandbox (OpenShell; per-session or per-call ephemeral, see §4 Three Timescales of Sandbox Lifecycle), never on the host or the endpoint.