Sandbox-spec
A Sandbox-spec is a named template.
It binds one Policy (the rules) to one Runtime (the recipe of Providers). Operators mint Bootstrap URLs from a Sandbox-spec; each bootstrap produces one Sandbox instance.
It is the unit that operators reuse: "customer-support-emailer" is a Sandbox-spec, and the fifty actual sandboxes provisioned from it across teams are instances.
At a glance
| | |
|---|---|
| Binds | one Policy + one Runtime + a few sandbox-level parameters |
| Authored by | operator (`tappass sandbox-spec create`) |
| Produces | Bootstrap URLs that host owners consume to provision Sandboxes |
| Reused | across many sandboxes; the template is the unit of standardization |
What it contains
```yaml
# tappass sandbox-spec create produces this artifact
sandbox_spec:
  name: collibra-steward
  description: "Catalog modifications via Collibra MCP"
  policy: policies/collibra-steward.rego   # ref to Policy
  runtime: claude-code-laptop              # ref to Runtime
                                           # (= claude-code + nono + monty + anthropic-gateway + mcp-broker)
  parameters:
    workspace_path: "$REPO_ROOT"
    trust_tier: worker
  forbidden_capabilities: [code_execution, sor_arbitrary_write]
  enabled_rings:
    - harness
    - kernel
    - interpreter
  enabled_cross_cutting:
    - llm_gateway
    - mcp_broker
  pre_deployment_eval:
    required: true                         # gate emit-bootstrap on eval pass
    probe_suites: [owasp-llm-top-10, eu-ai-act]
```

Sandbox-spec vs. Sandbox
| | Sandbox-spec | Sandbox |
|---|---|---|
| What it is | Template (named binding of Policy + Runtime) | One running instance |
| Per | Org / Project | One agent on one host |
| Lifecycle | Authored once, reused N times | Created per `tappass-host init`; torn down per stop/revoke |
| Has identity | No (just a name) | Yes: `sandbox_id`, signed tokens, audit stream |
| Mutable | Operator edits → next bootstrap uses new version | Immutable identity; Compiled Policy updates via Sync |
One spec produces many Sandboxes. Editing the spec does not retroactively change running Sandboxes: they continue under their original Compiled Policy until they are rebound or a Sync push delivers an updated one.
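The template-versus-instance relationship can be sketched in a few lines of Python. This is an illustrative model only, not TapPass code: the `SpecRegistry` class, the `version` counter, the `sbx-N` id format, and the `-v2.rego` policy path are all assumptions made for the example. It shows each Sandbox recording the spec version in force when it was bootstrapped, so a later edit affects only new bootstraps.

```python
from dataclasses import dataclass, replace
from itertools import count


@dataclass(frozen=True)
class SandboxSpec:
    name: str
    policy: str       # reference to a Policy
    runtime: str      # reference to a Runtime
    version: int = 1  # hypothetical version counter


@dataclass(frozen=True)
class Sandbox:
    sandbox_id: str
    spec_name: str
    spec_version: int  # spec version in force at bootstrap time


class SpecRegistry:
    """Hypothetical in-memory store of specs; not a TapPass API."""

    def __init__(self):
        self._specs = {}
        self._ids = count(1)

    def save(self, spec):
        self._specs[spec.name] = spec
        return spec

    def edit(self, name, **changes):
        # Editing stores a new version; existing Sandbox objects are untouched.
        current = self._specs[name]
        updated = replace(current, version=current.version + 1, **changes)
        self._specs[name] = updated
        return updated

    def bootstrap(self, name):
        # Each bootstrap yields one new instance with a fresh identity.
        spec = self._specs[name]
        return Sandbox(f"sbx-{next(self._ids)}", spec.name, spec.version)


registry = SpecRegistry()
registry.save(SandboxSpec("collibra-steward",
                          "policies/collibra-steward.rego",
                          "claude-code-laptop"))
first = registry.bootstrap("collibra-steward")
registry.edit("collibra-steward", policy="policies/collibra-steward-v2.rego")
second = registry.bootstrap("collibra-steward")
```

Here `first` stays pinned to version 1 while `second` picks up version 2, which mirrors the "Mutable" row of the table above.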
Quick-start sandbox-specs
TapPass ships curated sandbox-specs for common agent shapes:
| Sandbox-spec | Agent shape | Status |
|---|---|---|
| `customer-support-emailer` | gmail.send-driven support replies | planned |
| `refund-processor` | Stripe partial refunds with approval | planned |
| `code-reviewer` | GitHub PR comments via gh CLI | planned |
| `data-engineer-agent` | SQL against analytics DB | planned |
| `internal-kb-assistant` | RAG over internal docs | planned |
| `collibra-steward` | Catalog modifications (the demo agent) | concept |
| `custom` (always available) | Operator authors ring activation manually | n/a |
A quick-start sandbox-spec selects ring activation, default capability scoping, and (often) a paired Policy template — so applying one is one click.
Lifecycle
```text
[create]
  Operator: tappass sandbox-spec create --name X --policy Y --runtime Z
  → spec saved (template, no instances yet)
        │
        ▼
[evaluate] (optional, configurable)
  Operator: tappass eval run --sandbox-spec X
  → probe suite runs against the candidate Compiled Policy
  → fail blocks emit-bootstrap if `pre_deployment_eval.required: true`
        │
        ▼
[mint-bootstrap]
  Operator: tappass sandbox-spec emit-bootstrap X --count N
  → N single-use bootstrap URLs, 15-min TTL each
        │
        ▼  (operator hands URL to host owner)
[host enrolls]
  Host owner: tappass-host init <name> --enroll-url <url>
  → new Sandbox row created with a fresh sandbox_id
  → Policy compiled into Compiled Policy v1 for this sandbox
  → Runtime providers registered
  → mTLS exchange completes
  → Bootstrap URL burned
        │
        ▼
[sandbox runs]
  Per the Runtime's recipe of Providers; Compiled Policy live-pushed via Sync
        │
        ▼  (when spec evolves)
[edit]
  Operator: tappass sandbox-spec edit X
  → New version of spec saved
  → Existing Sandboxes from older spec keep running until rebooted or rebound
  → New bootstraps from this spec use the new version
```

Forbidden capabilities — the absolute floor
A sandbox-spec can declare `forbidden_capabilities`:

```yaml
forbidden_capabilities: [code_execution, sor_arbitrary_write]
```

These cannot be lifted by any Cascade override. Lifting them requires editing the spec itself, which is a separately audited Compliance action. This is what makes sandbox-specs safe by default: even if a downstream operator tries to override, the floor stays.
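One way to picture the floor is as a set subtraction applied after every override has been processed. The sketch below is a hedged illustration, not the actual Cascade resolver: the override shape (`grant` / `revoke` lists) and the capability names `catalog_read`, `catalog_write`, and `email_send` are assumptions invented for the example.

```python
def effective_capabilities(base, overrides, forbidden):
    """Apply overrides in order, then enforce the forbidden floor last.

    Because the subtraction happens after all overrides, no override
    can re-grant a forbidden capability. (Assumed override format.)
    """
    granted = set(base)
    for override in overrides:
        granted |= set(override.get("grant", ()))
        granted -= set(override.get("revoke", ()))
    return granted - set(forbidden)


forbidden = {"code_execution", "sor_arbitrary_write"}
base = {"catalog_read", "catalog_write"}           # hypothetical capabilities
overrides = [
    {"grant": ["code_execution", "email_send"]},   # tries to lift the floor
    {"revoke": ["catalog_write"]},
]

result = effective_capabilities(base, overrides, forbidden)
```

The first override's attempt to grant `code_execution` has no effect, while its unrelated `email_send` grant and the later revoke both apply as usual.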
Engines that operate on Sandbox-specs
| Engine | What it does | Status |
|---|---|---|
| Spec authoring CLI | `tappass sandbox-spec create` / `edit` / `list` | concept (within tappass-cli) |
| Spec evaluator | Runs probe suite against the bound Policy + Runtime | concept (Q4 2026) |
| Bootstrap minter | Issues N single-use URLs from a spec | concept (within tappass-cli) |
| Quick-start library | Curated specs for common agent shapes | concept (Q4 2026, demand-driven) |
| Spec versioner | Diff between versions; rollback | concept (Q1 2027) |
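The bootstrap minter's contract (single-use URLs, a 15-minute TTL, and an optional evaluation gate) can be modelled roughly as follows. This is a sketch under stated assumptions rather than the real engine: the `BootstrapMinter` class, its method names, and the use of opaque tokens in place of full URLs are all hypothetical, and the timestamp is arbitrary.

```python
import secrets
from datetime import datetime, timedelta, timezone

TTL = timedelta(minutes=15)  # per-URL lifetime stated in the lifecycle


class BootstrapMinter:
    """Hypothetical model of minting and redeeming bootstrap tokens."""

    def __init__(self, eval_required, eval_passed):
        self.eval_required = eval_required
        self.eval_passed = eval_passed
        self._expiry = {}  # token -> expiry time

    def mint(self, count, now):
        # Gate: a required evaluation that has not passed blocks minting.
        if self.eval_required and not self.eval_passed:
            raise PermissionError("pre-deployment eval has not passed")
        tokens = []
        for _ in range(count):
            token = secrets.token_urlsafe(16)
            self._expiry[token] = now + TTL
            tokens.append(token)
        return tokens

    def redeem(self, token, now):
        # pop() burns the token on first use, whether or not it is still valid.
        expiry = self._expiry.pop(token, None)
        return expiry is not None and now <= expiry


now = datetime(2026, 1, 1, 12, 0, tzinfo=timezone.utc)  # arbitrary example time
minter = BootstrapMinter(eval_required=True, eval_passed=True)
a, b = minter.mint(2, now)

first_use = minter.redeem(a, now + timedelta(minutes=5))   # within TTL
reuse = minter.redeem(a, now + timedelta(minutes=6))       # already burned
late = minter.redeem(b, now + timedelta(minutes=16))       # past TTL
```

Only `first_use` succeeds; the second redemption of the same token and the late redemption are both rejected.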
Surfaces
| Persona | Surface | What you do |
|---|---|---|
| Operator (CLI) | `tappass sandbox-spec create` / `edit` / `list` / `show` / `emit-bootstrap` | author and mint specs |
| Operator (dashboard) | "Sandbox templates" page | visual equivalent of CLI |
| Compliance | Spec audit view | inspect which specs apply which packs |
Related concepts
- binds → Policy (rules) + Runtime (recipe)
- emits → Bootstrap URLs
- produces → Sandbox instances
- gated by → Probe suites (pre-deployment evaluation)
Authoritative docs
| Topic | File |
|---|---|
| Vision | governed-agents.md §11 — enrollment flow |
| Quick-start library | Sandbox card §Quick-starts |
| Operator CLI surface | operator-cli |
Common confusions
- Sandbox-spec ≠ Sandbox. A spec is a template; a Sandbox is a running instance produced from one. One spec → many Sandboxes.
- Sandbox-spec ≠ Compiled Policy. A spec binds a Policy; the Compiled Policy is what comes out of the policy compiler when the spec is applied to a particular Sandbox identity. Different Sandboxes from the same spec get different Compiled Policies (different identity, different cascade context).
- Editing a spec doesn't retroactively change running Sandboxes. Existing Sandboxes keep their Compiled Policy until rebound. New bootstraps from the spec use the latest version. (To force-update existing sandboxes, the operator runs `tappass sandbox rebind` or rotates the spec via Sync.)
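The second point above, that one Policy yields a different Compiled Policy per Sandbox, can be illustrated with a stand-in compiler. Everything here is assumed for the sake of the example: the real policy compiler is not described in this page, and `compile_policy`, the rule text, and the cascade fields are placeholders. The sketch only shows that the output depends on identity and cascade context as well as on the rules.

```python
import hashlib
import json


def compile_policy(policy_source, sandbox_id, cascade_context):
    """Stand-in compiler: the result covers the rules AND the sandbox identity."""
    record = {
        "rules": policy_source,
        "sandbox_id": sandbox_id,
        "cascade": cascade_context,
    }
    canonical = json.dumps(record, sort_keys=True).encode("utf-8")
    return {**record, "digest": hashlib.sha256(canonical).hexdigest()}


source = "allow catalog.read"  # placeholder rule text, not real Rego
compiled_a = compile_policy(source, "sbx-1", {"team": "data-governance"})
compiled_b = compile_policy(source, "sbx-2", {"team": "finance"})
```

Both results carry the same `rules`, yet their digests differ because the identity and cascade context differ.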