Agent Blueprint Patterns

Blueprints are repeatable workflow templates that ride on top of the workspace. Each blueprint is explicitly configured with its goal, required tools, allowed services, data scope, write permissions, review gate and logging expectations.
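As a minimal sketch of what "explicitly configured" could look like, the record below holds one field per item listed above. The class name `Blueprint`, the `ReviewGate` enumeration, the field names, and the example values for tools and data scope are all hypothetical; the source does not define a schema.

```python
from dataclasses import dataclass
from enum import Enum


class ReviewGate(Enum):
    """Hypothetical gate labels; the source does not define an enumeration."""
    READ_ONLY = "read-only"
    REVIEW_BEFORE_WRITE = "review-before-write"


@dataclass(frozen=True)  # frozen: fields cannot be reassigned after creation
class Blueprint:
    """Illustrative record of the items a blueprint declares."""
    name: str
    goal: str
    required_tools: frozenset[str]
    allowed_services: frozenset[str]
    data_scope: frozenset[str]
    write_permissions: frozenset[str]  # empty set: no write-capable tools
    review_gate: ReviewGate
    logging: str

    def is_read_only(self) -> bool:
        """True when the blueprint declares no write permissions."""
        return not self.write_permissions


# Example values are assumptions, loosely based on the onboarding row of Table 4.
onboarding = Blueprint(
    name="developer-onboarding",
    goal="Help new engineers understand repos and workflows",
    required_tools=frozenset({"search", "read_file"}),
    allowed_services=frozenset(
        {"source-control", "docs", "chat-history", "package-repos"}
    ),
    data_scope=frozenset({"org/engineering"}),
    write_permissions=frozenset(),
    review_gate=ReviewGate.READ_ONLY,
    logging="OCSF events to enterprise SIEM",
)
```

Freezing the record mirrors the idea that a blueprint's configuration is declared, not adjusted while the agent runs. It is not a security control in itself.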

LLMs are nondeterministic and can be jailbroken or hijacked by untrusted input. A prompt is therefore not a security boundary: autonomous behavior must be constrained by the environment and the release controls around the prompt / skill, not by assuming the instructions themselves are sufficient.

Posture Rules

  • Start narrow. Default tools are least-privilege: read-only scopes where the service supports them, and no write-capable tools attached unless the blueprint explicitly requires one. Direct writes are the exception.

  • Writes to systems of record require explicit human review and approval, and are logged as OCSF events to the enterprise SIEM.

  • Static policy. Allowed services, data scopes, and write boundaries are declared up front in the signed policy bundle. Neither agent nor user widens scope at runtime; changes require a bundle re-issue.

  • Defense in depth. Hard (deterministic) controls form the security boundary: the workspace perimeter (network allowlist, per-service authentication, human review of writes) and the runtime sandbox (kernel-level enforcement, credential proxy, in-runtime egress). Soft controls (LLM-as-judge, prompt hardening) may complement but never replace the hard layer. The agent cannot reach services or data outside its declared scope, regardless of what its prompt says.

  • Lifecycle ownership. Each blueprint has an owner, a review cadence, an incident contact, and a deprecation path. Blueprints are managed as versioned products, not ad-hoc scripts.

  • Always-on by default. Blueprints assume the workspace stays running while the user is offline; the human-review gate is asynchronous, not in-session.
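The "start narrow" and "static policy" rules can be illustrated with a default-deny lookup against a read-only policy object. This is a sketch under stated assumptions: the `POLICY` contents and the `is_allowed` helper are hypothetical, and signature verification of the bundle is out of scope here. In the architecture described above, enforcement belongs to the workspace perimeter and runtime sandbox, not to code the agent can influence.

```python
from types import MappingProxyType

# Hypothetical policy contents. A real bundle would be signed and verified
# before loading; that step is not shown.
POLICY = MappingProxyType({
    "allowed_services": frozenset({"source-control", "ticketing"}),
    "write_targets": frozenset(),  # start narrow: no write targets declared
})


def is_allowed(service: str, action: str, target: str = "") -> bool:
    """Default-deny check against the statically declared policy."""
    if service not in POLICY["allowed_services"]:
        return False
    if action == "read":
        return True
    if action == "write":
        return target in POLICY["write_targets"]
    return False  # unknown actions are denied
```

`MappingProxyType` rejects item assignment, so nothing in the process can widen `POLICY` in place. Changing scope means constructing and loading a new policy, which parallels the bundle re-issue requirement.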

Writes are conservatively gated for three convergent reasons, not one:

  • Last line of defense. The exfiltration path runs through two channels: outbound traffic and writes. Outbound traffic — including parameter smuggling under prompt injection — is constrained upstream by the network allowlist and credential proxy. Writes are constrained by the human-review gate, the last line of defense after the upstream controls (sandbox, allowlist, signed policy, credential proxy). Writes are also the hardest to roll back: reads are repeatable; approved writes often aren’t (un-merging a commit, retracting a published doc).

  • Accountability and provenance. The reviewer becomes the principal of record/point of accountability for the action; the four-identity attribution chain bottoms out at a real person, not at “the agent did it.” Human-attributed writes stay distinguishable from agent-generated writes — preserving data lineage for future input-trust policies. For autonomous or service-account agents with no human reviewer, accountability resolves to the agent’s registered owner / sponsor, not to the agent itself.

  • Deployable under stricter governance. Some environments require human sign-off on consequential actions as an external mandate rather than an architectural choice. The conservative write posture lets the same blueprint catalog deploy unchanged whether or not such a mandate applies — without claiming conformance to any specific regime.

Any one of these reasons would justify the posture. Together they apply to writes the same defense-in-depth logic the architecture uses for network reach, identity, and credential issuance.
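The accountability rule in the second reason reduces to a small resolution step: a human reviewer, when present, is the principal of record, and otherwise accountability falls to the agent's registered owner. The sketch below assumes identities are plain strings and uses a hypothetical function name. It does not model the full four-identity attribution chain.

```python
from typing import Optional


def principal_of_record(reviewer: Optional[str], registered_owner: str) -> str:
    """Return the person accountable for a write (illustrative only)."""
    if reviewer:
        # A human approved the write, so the reviewer is accountable.
        return reviewer
    if not registered_owner:
        # Accountability must never resolve to the agent itself.
        raise ValueError("an agent without a reviewer needs a registered owner")
    return registered_owner
```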

Realization pattern. Agents write to a private staging surface and propose the change for human approval rather than committing directly to a shared system of record — a branch + MR/PR for code, a draft or unpublished revision for documents, a proposed state change for tickets. The staged artifact is the human-review gate; agents never write directly to main or other shared sources of truth.
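The propose-then-approve flow can be sketched with an in-memory model. All names here (`Proposal`, `SystemOfRecord`, `propose`, `approve_and_commit`) are hypothetical stand-ins for whatever the real system provides, such as a branch plus MR/PR or a draft revision.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Proposal:
    """A staged change awaiting human approval."""
    target: str
    change: str
    proposed_by: str
    approved_by: Optional[str] = None


@dataclass
class SystemOfRecord:
    """Toy shared store with a separate staging area."""
    committed: dict = field(default_factory=dict)
    staged: list = field(default_factory=list)

    def propose(self, target: str, change: str, agent_id: str) -> Proposal:
        """Agent path: stage the change; the shared state is not touched."""
        proposal = Proposal(target, change, agent_id)
        self.staged.append(proposal)
        return proposal

    def approve_and_commit(self, proposal: Proposal, reviewer: str) -> None:
        """Human path: record the reviewer, then apply the change."""
        if not reviewer:
            raise PermissionError("a human reviewer is required to commit")
        proposal.approved_by = reviewer
        self.committed[proposal.target] = proposal.change
        self.staged.remove(proposal)
```

The agent only ever calls `propose`. The commit path requires a reviewer identity, so the approved record carries both who proposed the change and who accepted it.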

Table 4: Agent Blueprint Patterns

| Blueprint | Goal | Typical Services | Permissions Gate |
| --- | --- | --- | --- |
| Coding assistant | Read code, explain, draft patches, run tests | Source control, package repos, CI logs, ticketing | Feature / single-owner branch commits automatable; review before protected-branch merge or write |
| Documentation assistant | Search docs, summarize systems, draft updates | Docs system, enterprise search, source-controlled docs | Review before publish |
| Issue triage assistant | Analyze tickets, cluster themes, propose actions | Ticketing, docs, source control, logs | Review before state change or assignment |
| Developer onboarding assistant | Help new engineers understand repos and workflows | Source control, docs, chat history, package repos | Read-only by default; reviewed drafts only |
| Research notebook assistant | Run notebooks, inspect data, draft analysis | Notebook environment, data store, package repos | Review before publishing or writing shared data |
| Operations assistant | Summarize alerts, correlate logs, draft remediation | Logs, tickets, runbooks, chat | Review before executing remediation |
| Knowledge worker assistant | Search mail, chat, docs, tickets; summarize context | Mailbox, chat, docs, ticketing | Review before send, forward, publish, or delete |

Note on reversibility. The review gate scales on two axes: blast radius (who the write affects) and reversibility (how easily it can be undone). Writes to personal or ephemeral targets — single-owner branches, scratch space, personal notes, workspace-local drafts — are self-limiting and may be automated without review. Writes to shared sources of truth — protected branches, published docs, tickets, shared data — affect other workflows and require human review; the gate tightens further when the shared write is also irreversible (protected-branch merge, send, publish, delete, state change). Reversible writes to shared-but-versioned targets (e.g. a wiki page with history) may be allowed directly, with revert as the safety net and the outcome as the user’s responsibility. Where the line falls is the enterprise’s policy call.
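The two-axis rule in this note can be written as a small decision function. This is an illustrative sketch: the `Gate` labels, the function name, and the `allow_reversible_shared` flag (standing in for the enterprise's policy call) are assumptions, not part of the source.

```python
from enum import Enum


class Gate(Enum):
    """Hypothetical labels for the outcomes described in the note."""
    AUTOMATED = "may be automated without review"
    REVIEW = "human review required"
    STRICT_REVIEW = "human review required, tightened"


def review_gate(
    shared: bool,
    reversible: bool,
    allow_reversible_shared: bool = False,
) -> Gate:
    """Map blast radius (shared) and reversibility to a review gate."""
    if not shared:
        # Personal or ephemeral targets are self-limiting.
        return Gate.AUTOMATED
    if not reversible:
        # Shared and irreversible: merge, send, publish, delete, state change.
        return Gate.STRICT_REVIEW
    # Shared but versioned: direct writes only if enterprise policy permits.
    return Gate.AUTOMATED if allow_reversible_shared else Gate.REVIEW
```

For example, a commit to a single-owner branch corresponds to `review_gate(shared=False, reversible=True)`, while an edit to a wiki page with history corresponds to `review_gate(shared=True, reversible=True)`, whose result depends on the policy flag.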

Note on code execution. Any blueprint that runs code — coding, research notebook, operations — executes it inside the runtime sandbox (OpenShell; per-session or per-call ephemeral, see §4 Three Timescales of Sandbox Lifecycle), never on the host or the endpoint.