Use Policy Advisor
Policy advisor lets a running sandboxed agent ask for a narrow network policy change after OpenShell denies a request. The agent submits a draft through policy.local, a developer approves or rejects it from outside the sandbox, and approved network policy hot-reloads into the same sandbox.
Policy advisor preserves OpenShell’s default-deny posture. The structured rule is the approval contract, and the agent’s rationale is supporting context. By default every proposal lands in the draft inbox for human review. Opt-in auto mode lets the gateway approve provably safe proposals — those whose prover delta is empty — without a reviewer in the loop; proposals with any prover finding still require human approval.
Enable Policy Advisor
Policy advisor is disabled by default. Enable it globally when you want every sandbox on the selected gateway to expose the agent proposal surface:
You can also enable it for one sandbox, unless the key is managed globally:
Check the effective setting for a sandbox:
The output shows whether agent_policy_proposals_enabled is global, sandbox, or unset. A global value overrides sandbox-scoped values. To return control to sandbox-scoped settings, delete the global key:
Set the value before creating a sandbox when you want the first denied request to include policy advisor guidance. Running sandboxes poll settings and can enable the surface after startup, but startup enablement gives the agent the clearest first-denial path.
Approval Modes
Every proposal — mechanistic or agent-authored — is routed through the policy prover. The proposal_approval_mode setting decides what happens when the prover finds nothing to flag.
manual is the default. Auto mode is an explicit opt-in; OpenShell’s default-deny posture is preserved unless you choose otherwise.
Enable auto mode at gateway scope when you want every sandbox on this gateway to auto-approve safe proposals:
Enable it for one sandbox when no global value is set:
The shorthand at create time writes the sandbox-scoped setting for you:
Only manual and auto are accepted; typos like autom are rejected at configure time. Stale or unknown values found in storage are still treated as manual at runtime as a defense-in-depth measure.
Precedence. Gateway scope wins over sandbox scope. A reviewer can pin manual for a fleet by setting it globally; per-sandbox overrides only apply when no global value is set.
Audit trail. Every auto-approval emits a CONFIG:APPROVED event with auto=true, source=<mechanistic|agent_authored>, prover_delta=empty, and resolved_from=<gateway|sandbox|default> so operators can reconstruct why a given approval ran without human review.
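The precedence and fallback rules above can be sketched as a small resolver. This is an illustrative model, not OpenShell's implementation: the function name `resolve_approval_mode` and its arguments are hypothetical, and it assumes that an unknown value at the winning scope falls back to manual rather than falling through to the next scope.

```python
from typing import Optional, Tuple

# Only these two values are accepted for proposal_approval_mode.
VALID_MODES = {"manual", "auto"}


def resolve_approval_mode(
    gateway_value: Optional[str],
    sandbox_value: Optional[str],
) -> Tuple[str, str]:
    """Return (mode, resolved_from) for one sandbox.

    Hypothetical helper: gateway scope wins over sandbox scope, and an
    unset setting resolves to the manual default.
    """
    if gateway_value is not None:
        raw, scope = gateway_value, "gateway"
    elif sandbox_value is not None:
        raw, scope = sandbox_value, "sandbox"
    else:
        return "manual", "default"

    # Defense in depth: stale or unknown stored values behave as manual.
    # Assumption: the scope that supplied the value is still reported.
    mode = raw if raw in VALID_MODES else "manual"
    return mode, scope
```

With this model, a globally pinned manual value beats a sandbox-scoped auto value, which is the fleet-wide pinning behavior described under Precedence.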
How It Works
When policy advisor is enabled, the sandbox supervisor turns on three agent-facing surfaces:
- It installs /etc/openshell/skills/policy_advisor.md inside the sandbox.
- It also installs /etc/openshell/skills/policy-advisor/SKILL.md as a short Codex/generic-agent pointer, and writes a root /AGENTS.md pointer only when the image does not already provide one.
- It serves http://policy.local from inside the sandbox.
- It adds agent_guidance and next_steps to L7 policy_denied response bodies so the agent can find the skill and local API.
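As a rough illustration of how an agent might detect that a denial carries policy advisor guidance, the sketch below inspects a response status and body. The function name `advisor_hints` is hypothetical, and the body is assumed to be a JSON object with agent_guidance and next_steps as top-level keys; the value types are not specified here, so the code passes them through unchanged.

```python
import json
from typing import Optional


def advisor_hints(status_code: int, body_text: str) -> Optional[dict]:
    """Return advisor guidance from a denial body, or None if absent.

    Hypothetical helper. Assumes a 403 whose body is a JSON object with
    agent_guidance and next_steps at the top level.
    """
    if status_code != 403:
        return None
    try:
        body = json.loads(body_text)
    except json.JSONDecodeError:
        # Not a structured denial (for example, an upstream HTML error page).
        return None
    if not isinstance(body, dict) or "agent_guidance" not in body:
        # Policy advisor may be disabled; such deny bodies carry no guidance.
        return None
    return {
        "agent_guidance": body["agent_guidance"],
        "next_steps": body.get("next_steps"),
    }
```

Returning None for bodies without agent_guidance keeps the agent from attempting the proposal flow when the feature is not advertised.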
The loop has seven steps:
- A sandboxed process attempts a network request that policy denies.
- For inspected REST traffic, OpenShell returns a structured 403 body with fields such as layer, host, port, binary, method, path, rule_missing, agent_guidance, and next_steps.
- The agent reads the policy advisor skill, inspects the current policy, and optionally reads recent denial log lines.
- The agent submits one or more addRule proposals to http://policy.local/v1/proposals.
- The gateway stores accepted proposals as pending draft chunks for the sandbox and runs the policy prover against the proposed delta.
- Under auto mode, proposals with an empty prover delta are approved immediately and skip human review. Under manual mode (the default), every proposal lands in the draft inbox for a developer to approve or reject; under auto mode, only proposals with a prover finding do.
- The agent waits on /v1/proposals/{chunk_id}/wait until a decision is available. Approved proposals hot-reload into the sandbox; rejected proposals return rejection_reason and validation_result so the agent can revise.
When a proposal is approved, /wait reports policy_reloaded: true only after the local sandbox policy covers the approved rule. At that point the agent can retry the original denied action once. If a proposal is rejected, /wait returns rejection_reason and validation_result so the agent can revise or stop. validation_result carries the categorical prover findings — link_local_reach, l7_bypass_credentialed, credential_reach_expansion, capability_expansion — so the agent can narrow the next attempt to the specific concern the prover flagged.
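The decision an agent makes after each /wait response can be sketched as a small pure function. This is a minimal sketch under stated assumptions: `next_action` and its return labels are hypothetical, and the response is modeled as a dict whose top-level keys are the field names mentioned above (policy_reloaded, rejection_reason, validation_result). The full response schema is not reproduced here.

```python
def next_action(wait_result: dict) -> str:
    """Map a /wait response onto the agent's next move.

    Hypothetical helper; the dict shape is an assumption based on the
    field names policy_reloaded and rejection_reason.
    """
    # Retry only once the local sandbox policy covers the approved rule.
    if wait_result.get("policy_reloaded") is True:
        return "retry_once"
    # A rejection carries a reason, plus validation_result with the
    # prover's categorical findings, to guide a narrower proposal.
    if wait_result.get("rejection_reason") is not None:
        return "revise_or_stop"
    # No decision yet, or approved but not yet reloaded locally.
    return "keep_waiting"
```

Treating an approved-but-not-reloaded response as "keep waiting" avoids retrying the denied request before the new policy is actually in effect.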
What Gets Proposed
OpenShell has two proposal paths:
For REST APIs, prefer L7 rules over broad L4 access. A good proposal allows one method and the smallest safe path:
The current policy.local JSON shape covers L4 endpoints and REST method or path rules. Use Customize Sandbox Policies or Policy Schema Reference for policy fields that are not part of the agent-authored proposal surface, such as WebSocket credential rewrite, GraphQL operation matching, endpoint path scoping, and provider-owned policy bundles.
Policy advisor proposals do not add allowed_ips automatically. If an advisor-proposed hostname resolves to an internal or private address, OpenShell’s SSRF protections still block the connection until a developer explicitly adds the required allowed_ips entry. Exact hostname trust for user-declared policy endpoints does not apply to advisor-generated proposal binaries.
What Auto-Approval Checks
The policy prover runs against every proposal — mechanistic and agent-authored alike — and asks four formal questions about the proposed change. Each “yes” is one categorical finding. Any finding blocks auto-approval; only an empty delta is eligible.
Findings are categorical — there is no severity tier. The reviewer reads the category and the structured evidence to decide. When the prover delta is empty, the proposal is provably safe under the model and auto-approval (if enabled) can fire.
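The gating rule can be modeled in a few lines. The sketch below is illustrative only: `decide` and its return shape are hypothetical, and the four category names are the ones reported in validation_result. It shows that an empty finding set is necessary but not sufficient, since auto mode must also be enabled.

```python
from typing import Iterable

# The four categorical prover findings; there is no severity tier.
FINDING_CATEGORIES = frozenset({
    "link_local_reach",
    "l7_bypass_credentialed",
    "credential_reach_expansion",
    "capability_expansion",
})


def decide(mode: str, findings: Iterable[str], source: str, resolved_from: str) -> dict:
    """Return the outcome for one proposal (hypothetical model).

    mode is "manual" or "auto"; source is "mechanistic" or
    "agent_authored"; resolved_from is "gateway", "sandbox", or "default".
    """
    found = set(findings)
    unknown = found - FINDING_CATEGORIES
    if unknown:
        raise ValueError(f"unknown finding categories: {sorted(unknown)}")

    # Only an empty prover delta under auto mode skips human review.
    if mode == "auto" and not found:
        return {
            "status": "approved",
            "audit": {
                "auto": True,
                "source": source,
                "prover_delta": "empty",
                "resolved_from": resolved_from,
            },
        }
    # Manual mode, or any finding at all: route to the draft inbox.
    return {"status": "pending", "findings": sorted(found)}
```

For example, `decide("auto", ["link_local_reach"], "agent_authored", "sandbox")` stays pending, while the same call with an empty finding list is approved and carries the audit fields that accompany a CONFIG:APPROVED event.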
The full reasoning model lives in crates/openshell-prover/README.md. Provider profiles composed in via Providers v2 are part of the effective policy the prover reasons over.
Review Proposals
Review pending chunks from the host:
Under auto mode, only proposals the prover flagged appear here; empty-delta proposals are already approved and visible under --status approved with the auto-approval audit fields described in Approval Modes. Under manual mode, every proposal — regardless of prover verdict — shows up as pending.
The output shows the chunk ID, status, rationale, binary, and endpoint summary. For L7 proposals, the endpoint summary includes the protocol, method, and path:
Approve only when the structured rule matches the access you intend to grant:
Reject with guidance when the rule is too broad or points at the wrong target:
The rejection reason is returned to the agent through policy.local. The agent can use it to draft a narrower proposal.
Agent API
policy.local is available only inside the sandbox and uses plain HTTP:
If policy advisor is disabled, every route returns 404 feature_disabled, the skill is not installed for new sandboxes, and L7 deny bodies do not advertise policy.local routes or include agent_guidance.
What to Expect
Approved network rules hot-reload without restarting the sandbox. HTTP L7 keep-alive connections are closed at the reload boundary so the next parsed request uses the new policy. Raw streams remain connection-scoped, as described in Customize Sandbox Policies.
Policy advisor emits audit events into the sandbox log. Use these lines to trace the full loop:
Look for HTTP:* DENIED, CONFIG:PROPOSED, CONFIG:APPROVED or CONFIG:REJECTED, CONFIG:LOADED, and the final allowed request if the agent retries successfully. Auto-approved chunks emit CONFIG:APPROVED with auto=true, source=<mechanistic|agent_authored>, prover_delta=empty, and resolved_from=<gateway|sandbox|default>.
Next Steps
- Use Customize Sandbox Policies for manual policy updates and L7 rule syntax.
- Use Policy Schema Reference for full YAML field details.
- Use Logging to interpret OCSF shorthand log entries.