Sandbox-spec

A Sandbox-spec is a named template.

It binds one Policy (the rules) to one Runtime (the recipe of Providers). Operators mint Bootstrap URLs from a Sandbox-spec; each bootstrap produces one Sandbox instance.

The Sandbox-spec is the unit operators reuse: "customer-support-emailer" is a Sandbox-spec, and the fifty sandboxes provisioned from it across teams are its instances.

At a glance


Binds	one Policy + one Runtime + a few sandbox-level parameters
Authored by	operator (`tappass sandbox-spec create`)
Produces	Bootstrap URLs that host owners consume to provision Sandboxes
Reused	across many sandboxes — the template is the unit of standardization

What it contains

# tappass sandbox-spec create produces this artifact
sandbox_spec:
  name:        collibra-steward
  description: "Catalog modifications via Collibra MCP"
  policy:      policies/collibra-steward.rego        # ref to Policy
  runtime:     claude-code-laptop                    # ref to Runtime
                                                     # (= claude-code + nono + monty + anthropic-gateway + mcp-broker)
  parameters:
    workspace_path: "$REPO_ROOT"
    trust_tier:     worker
    forbidden_capabilities: [code_execution, sor_arbitrary_write]
  enabled_rings:
    - harness
    - kernel
    - interpreter
  enabled_cross_cutting:
    - llm_gateway
    - mcp_broker
  pre_deployment_eval:
    required: true                                   # gate emit-bootstrap on eval pass
    probe_suites: [owasp-llm-top-10, eu-ai-act]
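
As a rough illustration of how the fields above relate, the following Python sketch models the artifact as a dataclass and checks a few invariants. The class name `SandboxSpec`, the `validate` method, and the specific checks are assumptions made for illustration; they are not TapPass's actual schema or validation rules.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class SandboxSpec:
    """Hypothetical in-memory model of the sandbox_spec artifact."""
    name: str
    policy: str                      # reference to a Policy
    runtime: str                     # reference to a Runtime
    parameters: dict = field(default_factory=dict)
    enabled_rings: tuple = ()
    enabled_cross_cutting: tuple = ()
    eval_required: bool = False
    probe_suites: tuple = ()

    def validate(self) -> list:
        """Return a list of problems; an empty list means the spec looks consistent."""
        problems = []
        if not self.name:
            problems.append("name is required")
        if not self.policy or not self.runtime:
            problems.append("a spec must bind exactly one Policy and one Runtime")
        if self.eval_required and not self.probe_suites:
            problems.append("pre_deployment_eval.required is set but no probe suites are listed")
        return problems


# Values copied from the sample artifact above.
spec = SandboxSpec(
    name="collibra-steward",
    policy="policies/collibra-steward.rego",
    runtime="claude-code-laptop",
    parameters={
        "workspace_path": "$REPO_ROOT",
        "trust_tier": "worker",
        "forbidden_capabilities": ["code_execution", "sor_arbitrary_write"],
    },
    enabled_rings=("harness", "kernel", "interpreter"),
    enabled_cross_cutting=("llm_gateway", "mcp_broker"),
    eval_required=True,
    probe_suites=("owasp-llm-top-10", "eu-ai-act"),
)
print(spec.validate())
```

The sample spec passes these checks, so the printed list is empty. A spec that sets `eval_required=True` without listing any probe suites would be reported as a problem under this sketch.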

Sandbox-spec vs. Sandbox

	Sandbox-spec	Sandbox
What it is	Template (named binding of Policy + Runtime)	One running instance
Per	Org / Project	One agent on one host
Lifecycle	Authored once, reused N times	Created per `tappass-host init`; torn down per stop/revoke
Has identity	No (just a name)	Yes: `sandbox_id`, signed tokens, audit stream
Mutable	Operator edits → next bootstrap uses new version	Immutable identity; Compiled Policy updates via Sync

One spec produces many Sandboxes. Editing the spec doesn't retroactively change running Sandboxes: they continue under their existing Compiled Policy until the operator rebinds them or pushes the new version via Sync.
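
The template-versus-instance relationship can be sketched in a few lines of Python. This is a toy model under stated assumptions: the `Spec` and `Sandbox` classes, the `provision` and `rebind` helpers, and the `sbx-N` id format are all hypothetical and only illustrate version pinning.

```python
from dataclasses import dataclass
from itertools import count


@dataclass
class Spec:
    name: str
    version: int = 1

    def edit(self) -> None:
        """Saving an edit produces a new version of the template."""
        self.version += 1


@dataclass
class Sandbox:
    sandbox_id: str      # identity is fixed for the life of the instance
    spec_name: str
    spec_version: int    # version the instance was provisioned from or last rebound to


_ids = count(1)


def provision(spec: Spec) -> Sandbox:
    """Create one instance pinned to the spec version current at bootstrap time."""
    return Sandbox(f"sbx-{next(_ids)}", spec.name, spec.version)


def rebind(sandbox: Sandbox, spec: Spec) -> None:
    """Explicitly move an existing instance to the latest spec version."""
    sandbox.spec_version = spec.version


spec = Spec("customer-support-emailer")
a = provision(spec)
spec.edit()                              # the spec is now at version 2
b = provision(spec)
print(a.spec_version, b.spec_version)    # prints "1 2": a stays pinned, b starts on the new version
rebind(a, spec)
print(a.spec_version)                    # prints "2" after the explicit rebind
```

The point of the sketch is that an edit touches only the template; an instance changes version only through an explicit step.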

Quick-start sandbox-specs

TapPass ships curated sandbox-specs for common agent shapes:

Sandbox-spec	Agent shape	Status
`customer-support-emailer`	gmail.send-driven support replies	planned
`refund-processor`	Stripe partial refunds with approval	planned
`code-reviewer`	GitHub PR comments via gh CLI	planned
`data-engineer-agent`	SQL against analytics DB	planned
`internal-kb-assistant`	RAG over internal docs	planned
`collibra-steward`	Catalog modifications (the demo agent)	concept
`custom` (always available)	Operator authors ring activation manually	n/a

A quick-start sandbox-spec selects ring activation, default capability scoping, and (often) a paired Policy template — so applying one is one click.

Lifecycle

[create]           Operator: tappass sandbox-spec create --name X --policy Y --runtime Z
                   → spec saved (template, no instances yet)
   │
   ▼
[evaluate]         (optional, configurable) Operator: tappass eval run --sandbox-spec X
                   → probe suite runs against the candidate Compiled Policy
                   → fail blocks emit-bootstrap if `pre_deployment_eval.required: true`
   │
   ▼
[mint-bootstrap]   Operator: tappass sandbox-spec emit-bootstrap X --count N
                   → N single-use bootstrap URLs, 15-min TTL each
   │
   ▼ (operator hands URL to host owner)
[host enrolls]     Host owner: tappass-host init <name> --enroll-url <url>
                   → new Sandbox row created with a fresh sandbox_id
                   → Policy compiled into Compiled Policy v1 for this sandbox
                   → Runtime providers registered
                   → mTLS exchange completes
                   → Bootstrap URL burned
   │
   ▼
[sandbox runs]     Per the Runtime's recipe of Providers; Compiled Policy live-pushed via Sync
   │
   ▼ (when spec evolves)
[edit]             Operator: tappass sandbox-spec edit X
                   → New version of spec saved
                   → Existing Sandboxes from older spec keep running until rebooted or rebound
                   → New bootstraps from this spec use the new version
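
The evaluate, mint-bootstrap, and enroll steps above can be approximated with a small in-memory model. This is a sketch only: the `BootstrapMinter` class, its method names, and the use of opaque tokens in place of full URLs are assumptions, and a real service would persist and sign what it issues. The 15-minute TTL, single-use behaviour, and eval gate are taken from the lifecycle text.

```python
import secrets
import time

TTL_SECONDS = 15 * 60   # 15-minute TTL per bootstrap, as stated in the lifecycle


class BootstrapMinter:
    """Toy in-memory minter (hypothetical; not the TapPass implementation)."""

    def __init__(self):
        self._pending = {}   # token -> (spec_name, expires_at)

    def emit(self, spec_name, count, eval_required, eval_passed, now=None):
        """Issue `count` tokens, refusing if a required eval has not passed."""
        if eval_required and not eval_passed:
            raise PermissionError("pre-deployment eval has not passed")
        now = time.time() if now is None else now
        tokens = [secrets.token_urlsafe(16) for _ in range(count)]
        for token in tokens:
            self._pending[token] = (spec_name, now + TTL_SECONDS)
        return tokens

    def redeem(self, token, now=None):
        """Burn the token and return its spec name, or raise if it is unusable."""
        now = time.time() if now is None else now
        entry = self._pending.pop(token, None)   # pop makes the token single-use
        if entry is None:
            raise KeyError("unknown or already used bootstrap token")
        spec_name, expires_at = entry
        if now > expires_at:
            raise TimeoutError("bootstrap token expired")
        return spec_name


minter = BootstrapMinter()
tokens = minter.emit("collibra-steward", count=2, eval_required=True, eval_passed=True)
print(minter.redeem(tokens[0]))   # first use succeeds and burns the token
```

Redeeming the same token a second time raises `KeyError` in this sketch, which mirrors the "Bootstrap URL burned" step.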

Forbidden capabilities — the absolute floor

A sandbox-spec can declare `forbidden_capabilities`:

forbidden_capabilities: [code_execution, sor_arbitrary_write]

These cannot be lifted by any Cascade override. Lifting them requires editing the spec itself, which is a separately audited Compliance action. This is what makes sandbox-specs safe by default: even if a downstream operator tries to override, the floor stays.
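
One way to picture the floor is as a set subtraction applied after every override has been processed. The Python sketch below assumes a simplified override format of `(action, capability)` pairs; the function name `effective_capabilities` and the capability names `mcp_read` and `mcp_write` are hypothetical, and the real Cascade semantics may differ.

```python
def effective_capabilities(base, overrides, forbidden):
    """Apply cascade overrides, then subtract the spec's forbidden floor.

    base:      capabilities granted by the Policy
    overrides: list of (action, capability), where action is "grant" or "revoke"
    forbidden: the spec's forbidden_capabilities
    """
    caps = set(base)
    for action, cap in overrides:
        if action == "grant":
            caps.add(cap)
        elif action == "revoke":
            caps.discard(cap)
        else:
            raise ValueError(f"unknown override action: {action}")
    # The floor is applied last so that no override can reintroduce a forbidden capability.
    return caps - set(forbidden)


result = effective_capabilities(
    base={"mcp_read"},
    overrides=[("grant", "code_execution"), ("grant", "mcp_write")],
    forbidden=["code_execution", "sor_arbitrary_write"],
)
print(sorted(result))   # prints ['mcp_read', 'mcp_write']
```

The attempted grant of `code_execution` has no effect because the subtraction runs after the overrides, which is the ordering that keeps the floor absolute.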

Engines that operate on Sandbox-specs

Engine	What it does	Status
Spec authoring CLI	`tappass sandbox-spec create / edit / list`	concept (within `tappass-cli`)
Spec evaluator	Runs probe suite against the bound Policy + Runtime	concept (Q4 2026)
Bootstrap minter	Issues N single-use URLs from a spec	concept (within `tappass-cli`)
Quick-start library	Curated specs for common agent shapes	concept (Q4 2026, demand-driven)
Spec versioner	Diff between versions; rollback	concept (Q1 2027)

Surfaces

Persona	Surface	What you do
Operator (CLI)	`tappass sandbox-spec create / edit / list / show / emit-bootstrap`	author and mint specs
Operator (dashboard)	"Sandbox templates" page	visual equivalent of CLI
Compliance	Spec audit view	inspect which specs apply which packs

Related concepts

binds → Policy (rules) + Runtime (recipe)
emits → Bootstrap URLs
produces → Sandbox instances
gated by → Probe suites (pre-deployment evaluation)

Authoritative docs

Topic	File
Vision	governed-agents.md §11 — enrollment flow
Quick-start library	Sandbox card §Quick-starts
Operator CLI surface	operator-cli

Common confusions

Sandbox-spec ≠ Sandbox. A spec is a template; a Sandbox is a running instance produced from one. One spec → many Sandboxes.
Sandbox-spec ≠ Compiled Policy. A spec binds a Policy; the Compiled Policy is what comes out of the policy compiler when the spec is applied to a particular Sandbox identity. Different Sandboxes from the same spec get different Compiled Policies (different identity, different cascade context).
Editing a spec doesn't retroactively change running Sandboxes. Existing Sandboxes keep their Compiled Policy until rebound. New bootstraps from the spec use the latest version. (To force-update existing sandboxes, the operator runs `tappass sandbox rebind` or rotates the spec via Sync.)
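
To make the second point concrete, the sketch below treats compilation as a function of the spec's Policy plus the Sandbox's own identity and cascade context. It is illustrative only: `compile_policy`, the shape of the returned record, and the SHA-256 digest standing in for compiler output are assumptions, not the actual policy compiler.

```python
import hashlib
import json


def compile_policy(policy_ref, spec_version, sandbox_id, cascade_context):
    """Return a toy 'Compiled Policy' record bound to one Sandbox identity."""
    inputs = {
        "policy": policy_ref,
        "spec_version": spec_version,
        "sandbox_id": sandbox_id,
        "cascade": cascade_context,
    }
    # A stable digest over the inputs stands in for the compiler's real output.
    digest = hashlib.sha256(json.dumps(inputs, sort_keys=True).encode()).hexdigest()
    return {**inputs, "digest": digest}


# Same spec and Policy, two different Sandbox identities (hypothetical ids and contexts).
first = compile_policy("policies/collibra-steward.rego", 1, "sbx-1", {"team": "data"})
second = compile_policy("policies/collibra-steward.rego", 1, "sbx-2", {"team": "finance"})
print(first["policy"] == second["policy"])   # prints True: both come from the same Policy
print(first["digest"] != second["digest"])   # prints True: the compiled results differ
```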
