Agent Blueprint Patterns

Blueprints are repeatable workflow templates that ride on top of the workspace. Each blueprint is explicitly configured with its goal, required tools, allowed services, data scope, write permissions, review gate and logging expectations.
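As a rough illustration of what "explicitly configured" could look like, the sketch below models a blueprint as an immutable record with one field per item listed above. The class name, field names, and example values are hypothetical and are not taken from any actual policy bundle schema.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class Blueprint:
    """Hypothetical record of the fields a blueprint declares up front."""
    name: str
    goal: str
    required_tools: FrozenSet[str]
    allowed_services: FrozenSet[str]
    data_scope: FrozenSet[str]
    # Empty by default: no write-capable scope unless explicitly declared.
    write_permissions: FrozenSet[str] = frozenset()
    review_gate: str = "review before any write"
    logging: str = "OCSF events to enterprise SIEM"

    def is_read_only(self) -> bool:
        """True when the blueprint declares no write permissions at all."""
        return not self.write_permissions


# Example values are illustrative assumptions, loosely based on Table 4.
docs_assistant = Blueprint(
    name="documentation-assistant",
    goal="Search docs, summarize systems, draft updates",
    required_tools=frozenset({"search", "read_doc", "draft_doc"}),
    allowed_services=frozenset({"docs-system", "enterprise-search"}),
    data_scope=frozenset({"docs:read"}),
    review_gate="review before publish",
)
```

Freezing the record mirrors the idea that a blueprint's configuration is declared rather than adjusted on the fly; a real deployment would carry these fields in whatever format the policy bundle uses.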

LLMs are nondeterministic and can be jailbroken or hijacked by untrusted input. A prompt is therefore not a security boundary: autonomous behavior must be constrained by the environment and the release controls around the prompt / skill, not by assuming the instructions themselves are sufficient.

Posture Rules

Start narrow. Default tools are least-privilege: read-only scopes where the service supports them, and no write-capable tools attached unless the blueprint explicitly requires one. Direct writes are the exception.
Writes to systems of record require explicit human review and approval, and are logged as OCSF events to the enterprise SIEM.
Static policy. Allowed services, data scopes, and write boundaries are declared up front in the signed policy bundle. Neither agent nor user widens scope at runtime; changes require a bundle re-issue.
Defense in depth. Hard (deterministic) controls form the security boundary — workspace perimeter (network allowlist, per-service authentication, human review of writes) and runtime sandbox (kernel-level enforcement, credential proxy, in-runtime egress). Soft controls (LLM-as-judge, prompt hardening) may complement but never replace the hard layer. The agent cannot reach services or data outside its declared scope, regardless of agent prompt.
Lifecycle ownership. Each blueprint has an owner, a review cadence, an incident contact, and a deprecation path — managed as versioned products, not ad-hoc scripts.
Always-on by default. Blueprints assume the workspace stays running while the user is offline; the human-review gate is asynchronous, not in-session.
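The "Start narrow" and "Static policy" rules above amount to a deny-by-default lookup against a scope that cannot change at runtime. The following is a minimal sketch of that logic only, assuming a hypothetical in-memory policy table; the service names and the `is_allowed` helper are illustrative, and bundle signing and verification are out of scope here.

```python
from types import MappingProxyType

# Hypothetical declared scope: service -> allowed actions.
# MappingProxyType gives a read-only view, so code holding POLICY
# cannot add services or widen actions after load.
POLICY = MappingProxyType({
    "source-control": frozenset({"read"}),
    "ticketing": frozenset({"read"}),
})


def is_allowed(service: str, action: str) -> bool:
    """Deny by default: only explicitly declared (service, action) pairs pass."""
    return action in POLICY.get(service, frozenset())


read_ok = is_allowed("source-control", "read")        # declared
write_ok = is_allowed("source-control", "write")      # not declared
undeclared_ok = is_allowed("mailbox", "read")         # service not in policy
```

In the posture described here, this check would be enforced by the hard layer (workspace perimeter and runtime sandbox) rather than by code the agent can influence; the sketch only shows the shape of the decision.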

Writes are conservatively gated for three convergent reasons, not one:

Last line of defense. The exfiltration path runs through two channels: outbound traffic and writes. Outbound traffic — including parameter smuggling under prompt injection — is constrained upstream by the network allowlist and credential proxy. Writes are constrained by the human-review gate, the last line of defense after the upstream controls (sandbox, allowlist, signed policy, credential proxy). Writes are also the hardest to roll back: reads are repeatable; approved writes often aren’t (un-merging a commit, retracting a published doc).
Accountability and provenance. The reviewer becomes the principal of record, the point of accountability for the action, so the four-identity attribution chain bottoms out at a real person, not at “the agent did it.” Human-attributed writes stay distinguishable from agent-generated writes, which preserves data lineage for future input-trust policies. For autonomous or service-account agents with no human reviewer, accountability resolves to the agent’s registered owner or sponsor, not to the agent itself.
Deployable under stricter governance. Some environments require human sign-off on consequential actions as an external mandate rather than an architectural choice. The conservative write posture lets the same blueprint catalog deploy unchanged whether or not such a mandate applies — without claiming conformance to any specific regime.

Any one of these reasons would justify the posture. Together they apply to writes the same defense-in-depth logic the architecture uses for network reach, identity, and credential issuance.

Realization pattern. Agents write to a private staging surface and propose the change for human approval rather than committing directly to a shared system of record — a branch + MR/PR for code, a draft or unpublished revision for documents, a proposed state change for tickets. The staged artifact is the human-review gate; agents never write directly to main or other shared sources of truth.
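The stage-then-approve flow can be sketched as a small state machine. This is an in-memory stand-in under stated assumptions: `Proposal`, `SystemOfRecord`, `approve`, and the identifiers used in the example are hypothetical names, not part of any real source-control or document API.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Optional


class Status(Enum):
    PROPOSED = "proposed"
    APPROVED = "approved"


@dataclass
class Proposal:
    """A staged change: the agent's write lands here, not in the shared record."""
    target: str
    content: str
    agent_id: str
    status: Status = Status.PROPOSED
    reviewer: Optional[str] = None


class SystemOfRecord:
    """Hypothetical shared source of truth that only accepts approved proposals."""

    def __init__(self) -> None:
        self._committed: Dict[str, str] = {}

    def read(self, target: str) -> Optional[str]:
        return self._committed.get(target)

    def commit(self, proposal: Proposal) -> None:
        if proposal.status is not Status.APPROVED or not proposal.reviewer:
            raise PermissionError("proposal has not been approved by a human reviewer")
        self._committed[proposal.target] = proposal.content


def approve(proposal: Proposal, reviewer: str) -> None:
    """Record the human reviewer as the accountable principal for the change."""
    proposal.status = Status.APPROVED
    proposal.reviewer = reviewer


record = SystemOfRecord()
draft = Proposal(target="docs/setup", content="Updated steps", agent_id="agent-1")

try:
    record.commit(draft)          # rejected: still only proposed
    blocked = False
except PermissionError:
    blocked = True

approve(draft, reviewer="reviewer-1")
record.commit(draft)              # accepted after human approval
```

The design point is that the refusal lives in the system of record (for code, the protected branch), so the gate holds even if the agent's instructions are subverted.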

Table 4: Agent Blueprint Patterns

Blueprint	Goal	Typical Services	Permissions Gate
Coding assistant	Read code, explain, draft patches, run tests	Source control, package repos, CI logs, ticketing	Commits to feature or single-owner branches may be automated; review before protected-branch merge or write
Documentation assistant	Search docs, summarize systems, draft updates	Docs system, enterprise search, source-controlled docs	Review before publish
Issue triage assistant	Analyze tickets, cluster themes, propose actions	Ticketing, docs, source control, logs	Review before state change or assignment
Developer onboarding assistant	Help new engineers understand repos and workflows	Source control, docs, chat history, package repos	Read-only by default; reviewed drafts only
Research notebook assistant	Run notebooks, inspect data, draft analysis	Notebook environment, data store, package repos	Review before publishing or writing shared data
Operations assistant	Summarize alerts, correlate logs, draft remediation	Logs, tickets, runbooks, chat	Review before executing remediation
Knowledge worker assistant	Search mail, chat, docs, tickets; summarize context	Mailbox, chat, docs, ticketing	Review before send, forward, publish, or delete

Note on reversibility. The review gate scales on two axes: blast radius (who the write affects) and reversibility (how easily it can be undone). Writes to personal or ephemeral targets — single-owner branches, scratch space, personal notes, workspace-local drafts — are self-limiting and may be automated without review. Writes to shared sources of truth — protected branches, published docs, tickets, shared data — affect other workflows and require human review; the gate tightens further when the shared write is also irreversible (protected-branch merge, send, publish, delete, state change). Reversible writes to shared-but-versioned targets (e.g. a wiki page with history) may be allowed directly, with revert as the safety net and the outcome as the user’s responsibility. Where the line falls is the enterprise’s policy call.
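The two-axis rule in the note above can be written down as a small decision function. This is a sketch of one possible reading, not a prescribed policy: the `Gate` values, the `review_gate` signature, and the `allow_direct_versioned` flag (standing in for the enterprise's policy call) are all assumptions made for illustration.

```python
from enum import Enum


class Gate(Enum):
    AUTOMATED = "automated, no review"
    DIRECT_WITH_REVERT = "direct write, revert as safety net"
    REVIEW = "human review"
    STRICT_REVIEW = "human review, tightened"


def review_gate(
    shared: bool,
    reversible: bool,
    versioned: bool = False,
    allow_direct_versioned: bool = False,  # hypothetical enterprise policy switch
) -> Gate:
    """Map blast radius (shared) and reversibility to a review requirement."""
    if not shared:
        # Personal or ephemeral targets are treated as self-limiting.
        return Gate.AUTOMATED
    if not reversible:
        # Shared and irreversible: merge, send, publish, delete, state change.
        return Gate.STRICT_REVIEW
    if versioned and allow_direct_versioned:
        # Shared but versioned, and policy permits relying on revert.
        return Gate.DIRECT_WITH_REVERT
    return Gate.REVIEW


scratch = review_gate(shared=False, reversible=True)
merge = review_gate(shared=True, reversible=False)
wiki_default = review_gate(shared=True, reversible=True, versioned=True)
wiki_relaxed = review_gate(
    shared=True, reversible=True, versioned=True, allow_direct_versioned=True
)
```

Defaulting `allow_direct_versioned` to `False` keeps the conservative posture unless the enterprise explicitly opts into direct writes on versioned targets.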

Note on code execution. Any blueprint that runs code — coding, research notebook, operations — executes it inside the runtime sandbox (OpenShell; per-session or per-call ephemeral, see §4 Three Timescales of Sandbox Lifecycle), never on the host or the endpoint.