Sandbox-spec

A Sandbox-spec is a named template.

It binds one Policy (the rules) to one Runtime (the recipe of Providers). Operators mint Bootstrap URLs from a Sandbox-spec; each bootstrap produces one Sandbox instance.

The unit operators reuse: "customer-support-emailer" is a Sandbox-spec. The fifty actual sandboxes provisioned from it across teams are instances.

| | |
|---|---|
| Binds | one Policy + one Runtime + a few sandbox-level parameters |
| Authored by | operator (`tappass sandbox-spec create`) |
| Produces | Bootstrap URLs that host owners consume to provision Sandboxes |
| Reused | across many sandboxes; the template is the unit of standardization |
# tappass sandbox-spec create produces this artifact
sandbox_spec:
  name: collibra-steward
  description: "Catalog modifications via Collibra MCP"
  policy: policies/collibra-steward.rego   # ref to Policy
  runtime: claude-code-laptop              # ref to Runtime
                                           # (= claude-code + nono + monty + anthropic-gateway + mcp-broker)
  parameters:
    workspace_path: "$REPO_ROOT"
    trust_tier: worker
    forbidden_capabilities: [code_execution, sor_arbitrary_write]
    enabled_rings:
      - harness
      - kernel
      - interpreter
    enabled_cross_cutting:
      - llm_gateway
      - mcp_broker
    pre_deployment_eval:
      required: true                       # gate emit-bootstrap on eval pass
      probe_suites: [owasp-llm-top-10, eu-ai-act]

| | Sandbox-spec | Sandbox |
|---|---|---|
| What it is | Template (named binding of Policy + Runtime) | One running instance |
| Per | Org / Project | One agent on one host |
| Lifecycle | Authored once, reused N times | Created per `tappass-host init`; torn down per stop/revoke |
| Has identity | (just a name) | Yes: sandbox_id, signed tokens, audit stream |
| Mutable | Operator edits → next bootstrap uses new version | Immutable identity; Compiled Policy updates via Sync |

One spec produces many Sandboxes. Editing the spec doesn't retroactively change running Sandboxes: they continue under their original Compiled Policy until the next Sync push or an explicit rebind.
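
The template/instance split can be sketched in a few lines of Python. This is a minimal illustration, not TapPass's data model: `SpecRegistry`, the `sbx-N` id format, the integer `version` field and the `-v2.rego` policy path are all assumptions made for the example. It shows that each provisioned Sandbox records the spec version current at creation time, so a later edit affects only new instances.

```python
from dataclasses import dataclass, replace
from itertools import count


@dataclass(frozen=True)
class SandboxSpec:
    """A named template: one Policy ref + one Runtime ref (hypothetical model)."""
    name: str
    policy: str
    runtime: str
    version: int = 1


@dataclass(frozen=True)
class Sandbox:
    """One running instance; remembers the spec version it was created from."""
    sandbox_id: str
    spec_name: str
    spec_version: int


class SpecRegistry:
    """In-memory stand-in for wherever specs are stored (assumption)."""

    def __init__(self) -> None:
        self._specs: dict[str, SandboxSpec] = {}
        self._ids = count(1)

    def create(self, name: str, policy: str, runtime: str) -> SandboxSpec:
        spec = SandboxSpec(name, policy, runtime)
        self._specs[name] = spec
        return spec

    def edit(self, name: str, **changes: str) -> SandboxSpec:
        # Editing saves a new version; existing Sandbox objects are untouched.
        old = self._specs[name]
        new = replace(old, version=old.version + 1, **changes)
        self._specs[name] = new
        return new

    def provision(self, name: str) -> Sandbox:
        # Each bootstrap yields a fresh instance pinned to the current version.
        spec = self._specs[name]
        return Sandbox(f"sbx-{next(self._ids)}", spec.name, spec.version)


registry = SpecRegistry()
registry.create("collibra-steward", "policies/collibra-steward.rego", "claude-code-laptop")
first = registry.provision("collibra-steward")
registry.edit("collibra-steward", policy="policies/collibra-steward-v2.rego")
second = registry.provision("collibra-steward")
print(first.spec_version, second.spec_version)  # prints: 1 2
```

The first Sandbox stays on version 1 after the edit; only the second one picks up version 2.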

TapPass ships curated sandbox-specs for common agent shapes:

| Sandbox-spec | Agent shape | Status |
|---|---|---|
| customer-support-emailer | gmail.send-driven support replies | planned |
| refund-processor | Stripe partial refunds with approval | planned |
| code-reviewer | GitHub PR comments via gh CLI | planned |
| data-engineer-agent | SQL against analytics DB | planned |
| internal-kb-assistant | RAG over internal docs | planned |
| collibra-steward | Catalog modifications (the demo agent) | concept |
| custom (always available) | Operator authors ring activation manually | n/a |

A quick-start sandbox-spec selects ring activation, default capability scoping, and (often) a paired Policy template, so applying one takes a single click.

[create]          Operator: tappass sandbox-spec create --name X --policy Y --runtime Z
                  → spec saved (template, no instances yet)
[evaluate]        (optional, configurable) Operator: tappass eval run --sandbox-spec X
                  → probe suite runs against the candidate Compiled Policy
                  → fail blocks emit-bootstrap if `pre_deployment_eval.required: true`
[mint-bootstrap]  Operator: tappass sandbox-spec emit-bootstrap X --count N
                  → N single-use bootstrap URLs, 15-min TTL each
                  ▼ (operator hands URL to host owner)
[host enrolls]    Host owner: tappass-host init <name> --enroll-url <url>
                  → new Sandbox row created with a fresh sandbox_id
                  → Policy compiled into Compiled Policy v1 for this sandbox
                  → Runtime providers registered
                  → mTLS exchange completes
                  → Bootstrap URL burned
[sandbox runs]    Per the Runtime's recipe of Providers; Compiled Policy live-pushed via Sync
                  ▼ (when spec evolves)
[edit]            Operator: tappass sandbox-spec edit X
                  → New version of spec saved
                  → Existing Sandboxes from older spec keep running until rebooted or rebound
                  → New bootstraps from this spec use the new version
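
The eval gate, the single-use property and the 15-minute TTL from the lifecycle above can be sketched as follows. Everything here is illustrative: `BootstrapMinter`, its method names and the in-memory token store are hypothetical, and because the page does not specify the Bootstrap URL format, the sketch issues opaque tokens rather than URLs.

```python
import secrets
import time
from dataclasses import dataclass

BOOTSTRAP_TTL_SECONDS = 15 * 60  # the 15-minute TTL from the lifecycle


@dataclass
class BootstrapToken:
    spec_name: str
    expires_at: float
    used: bool = False


class BootstrapMinter:
    """Hypothetical in-memory minter; not TapPass's actual implementation."""

    def __init__(self) -> None:
        self._tokens: dict[str, BootstrapToken] = {}

    def emit(self, spec_name, count, *, eval_required, eval_passed, now=None):
        # A failed pre-deployment eval blocks minting when the spec requires it.
        if eval_required and not eval_passed:
            raise PermissionError(f"eval gate failed for {spec_name!r}")
        now = time.time() if now is None else now
        issued = []
        for _ in range(count):
            token = secrets.token_urlsafe(32)  # unguessable opaque token
            self._tokens[token] = BootstrapToken(spec_name, now + BOOTSTRAP_TTL_SECONDS)
            issued.append(token)
        return issued

    def redeem(self, token, now=None):
        """Return the spec name once, then burn the token."""
        now = time.time() if now is None else now
        entry = self._tokens.get(token)
        if entry is None or entry.used or now >= entry.expires_at:
            raise ValueError("bootstrap token is unknown, already used, or expired")
        entry.used = True
        return entry.spec_name


minter = BootstrapMinter()
tokens = minter.emit("collibra-steward", 2, eval_required=True, eval_passed=True, now=0.0)
print(minter.redeem(tokens[0], now=60.0))  # prints: collibra-steward
```

Passing `now` explicitly keeps the sketch deterministic; a second `redeem` of the same token, or a redeem at or after 900 seconds, raises `ValueError`.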

Forbidden capabilities — the absolute floor

A sandbox-spec can declare forbidden_capabilities:

forbidden_capabilities: [code_execution, sor_arbitrary_write]

These cannot be lifted by any Cascade override. Lifting them requires editing the spec itself, which is a separately audited Compliance action. This is what makes sandbox-specs safe by default: even if a downstream operator tries to override, the floor stays.
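
One way to picture the floor is as a set subtraction applied last, after every override. The sketch below assumes Cascade overrides can be modeled as additional capability grants, which this page does not confirm; `effective_capabilities` and the `catalog_*` capability names are hypothetical.

```python
def effective_capabilities(granted, cascade_overrides, forbidden):
    """Merge the overrides in, then subtract the spec's forbidden floor last."""
    requested = set(granted) | set(cascade_overrides)
    return requested - set(forbidden)


FORBIDDEN = {"code_execution", "sor_arbitrary_write"}  # from the spec
base = {"catalog_read", "catalog_write"}               # hypothetical names
override = {"code_execution", "catalog_delete"}        # tries to lift the floor

result = effective_capabilities(base, override, FORBIDDEN)
print(sorted(result))  # prints: ['catalog_delete', 'catalog_read', 'catalog_write']
```

Because the subtraction runs after the merge, no override can reintroduce a forbidden capability; only changing `FORBIDDEN` itself, the equivalent of editing the spec, can.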

| Engine | What it does | Status |
|---|---|---|
| Spec authoring CLI | `tappass sandbox-spec create / edit / list` | concept (within tappass-cli) |
| Spec evaluator | Runs probe suite against the bound Policy + Runtime | concept (Q4 2026) |
| Bootstrap minter | Issues N single-use URLs from a spec | concept (within tappass-cli) |
| Quick-start library | Curated specs for common agent shapes | concept (Q4 2026, demand-driven) |
| Spec versioner | Diff between versions; rollback | concept (Q1 2027) |

| Persona | Surface | What you do |
|---|---|---|
| Operator (CLI) | `tappass sandbox-spec create / edit / list / show / emit-bootstrap` | author and mint specs |
| Operator (dashboard) | "Sandbox templates" page | visual equivalent of CLI |
| Compliance | Spec audit view | inspect which specs apply which packs |

| Topic | File |
|---|---|
| Vision | governed-agents.md §11 (enrollment flow) |
| Quick-start library | Sandbox card §Quick-starts |
| Operator CLI surface | operator-cli |

  • Sandbox-spec ≠ Sandbox. A spec is a template; a Sandbox is a running instance produced from one. One spec → many Sandboxes.
  • Sandbox-spec ≠ Compiled Policy. A spec binds a Policy; the Compiled Policy is what comes out of the policy compiler when the spec is applied to a particular Sandbox identity. Different Sandboxes from the same spec get different Compiled Policies (different identity, different cascade context).
  • Editing a spec doesn't retroactively change running Sandboxes. Existing Sandboxes keep their Compiled Policy until rebound. New bootstraps from the spec use the latest version. (To force-update existing sandboxes, the operator runs tappass sandbox rebind or rotates the spec via Sync.)
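
The second point, that Sandboxes from the same spec get different Compiled Policies, can be illustrated by treating compilation as a function of the spec plus the Sandbox's identity and cascade context. This is a toy stand-in, not the real policy compiler: `compile_policy`, its input fields and the SHA-256 fingerprint are assumptions chosen only to make the dependency on identity visible.

```python
import hashlib
import json


def compile_policy(policy_ref, spec_version, sandbox_id, cascade_context):
    """Toy compiler: the output depends on the spec AND the Sandbox identity."""
    inputs = {
        "policy": policy_ref,
        "spec_version": spec_version,
        "sandbox_id": sandbox_id,
        "cascade": cascade_context,
    }
    # Canonical JSON so identical inputs always yield the same fingerprint.
    canonical = json.dumps(inputs, sort_keys=True).encode()
    return {**inputs, "fingerprint": hashlib.sha256(canonical).hexdigest()}


a = compile_policy("policies/collibra-steward.rego", 1, "sbx-1", {"team": "data"})
b = compile_policy("policies/collibra-steward.rego", 1, "sbx-2", {"team": "finance"})
print(a["policy"] == b["policy"], a["fingerprint"] == b["fingerprint"])  # prints: True False
```

Both results reference the same Policy, yet their fingerprints differ because the identity and cascade inputs differ.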