
Architecture overview

The public architecture page is correct but it’s the customer-facing version. This page is the internal one — the mental model you need to debug, extend, or explain the system to a colleague.

TapPass is a governed proxy in front of AI agents: every request flows through a configurable pipeline that detects, blocks, redacts, and audits, and only then goes out to a real LLM provider.

If you understand that sentence, the rest is detail.

TapPass has three logical planes that run in the same process but should be understood separately:

┌────────────────────────────────────────────────────┐
│  DATA PLANE                                        │
│  /v1/chat/completions, /v1/messages                │
│  Runs the pipeline, calls providers, streams back  │
└────────────────────────────────────────────────────┘
┌────────────────────────────────────────────────────┐
│  CONTROL PLANE                                     │
│  /api/v1/admin/*, /audit/*, /health/*              │
│  Manages agents, policies, keys, capability tokens │
└────────────────────────────────────────────────────┘
┌────────────────────────────────────────────────────┐
│  OBSERVABILITY PLANE                               │
│  /export/*, webhooks, audit stream                 │
│  Ships events out to customer SIEMs                │
└────────────────────────────────────────────────────┘

They share the database but are scoped by different auth: data plane uses customer tp_ keys, control plane uses session auth (SSO), observability uses API tokens.
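To make the plane split concrete, here is a minimal sketch that maps a request path to its plane and to the credential type that plane expects. The route prefixes and credential names come from the diagram and paragraph above; the `Plane` enum, the two lookup tables, and `plane_for_path` are hypothetical names for illustration, not the actual routing code.

```python
from enum import Enum
from typing import Optional


class Plane(Enum):
    DATA = "data"
    CONTROL = "control"
    OBSERVABILITY = "observability"


# Route prefixes per plane, taken from the diagram above.
PLANE_PREFIXES = {
    Plane.DATA: ("/v1/",),
    Plane.CONTROL: ("/api/v1/admin/", "/audit/", "/health/"),
    Plane.OBSERVABILITY: ("/export/",),
}

# Credential type each plane accepts (simplified: one scheme per plane).
PLANE_AUTH = {
    Plane.DATA: "tp_ API key",
    Plane.CONTROL: "SSO session",
    Plane.OBSERVABILITY: "API token",
}


def plane_for_path(path: str) -> Optional[Plane]:
    """Return the plane that owns a path, or None if no prefix matches."""
    for plane, prefixes in PLANE_PREFIXES.items():
        if path.startswith(prefixes):  # str.startswith accepts a tuple
            return plane
    return None
```

When debugging an unexpected 401 or 403, working out which plane a path belongs to is usually the first step, because it tells you which credential type should have been presented.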

A single /v1/chat/completions call passes through these stages, in order:

1. Edge (Cloud Run) terminates TLS
2. Gateway authenticates the tp_ key (identity/api_key.py)
3. Request is matched to an agent + tenant (registry.py)
4. Pipeline context is built (pipeline/context.py)
5. Pre-LLM steps run in order (pipeline/engine.py)
   - each step returns Detection[]; policy decides what happens to them
6. Credentials resolved from vault (vault/*)
7. Provider client constructed (gateway/<provider>.py)
8. Streaming response from provider (httpx or SSE)
9. Post-LLM steps run on chunks (pipeline/engine.py)
10. Audit event written (hashed, signed) (audit/writer.py)
11. Response streams back to the agent

Steps 5-9 run in a PipelineContext that carries: the agent, the tenant, detected findings so far, the original + modified payload, and counters (tokens, cost).
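The context-plus-loop shape described above can be sketched roughly as follows. This is an illustration under assumptions, not the contents of pipeline/context.py or pipeline/engine.py: the field names, the `Detection` shape, the string actions ("allow", "redact", "block"), and the example step and policy are all hypothetical.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Detection:
    step: str    # name of the step that produced the finding
    kind: str    # e.g. "pii.email" (hypothetical taxonomy)
    match: str   # the offending text


@dataclass
class PipelineContext:
    agent_id: str
    tenant_id: str
    original_payload: str
    modified_payload: str
    findings: List[Detection] = field(default_factory=list)
    tokens: int = 0
    cost: float = 0.0
    blocked: bool = False


Step = Callable[[PipelineContext], List[Detection]]
Policy = Callable[[PipelineContext, Detection], str]


def run_pre_llm(ctx: PipelineContext, steps: List[Step], policy: Policy) -> PipelineContext:
    """Run steps in order; steps only detect, the policy picks the action."""
    for step in steps:
        for det in step(ctx):
            ctx.findings.append(det)
            action = policy(ctx, det)
            if action == "block":
                ctx.blocked = True
                return ctx
            if action == "redact":
                ctx.modified_payload = ctx.modified_payload.replace(det.match, "[REDACTED]")
    return ctx


EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")


def detect_email(ctx: PipelineContext) -> List[Detection]:
    return [Detection("detect_email", "pii.email", m.group())
            for m in EMAIL.finditer(ctx.modified_payload)]


def redact_pii(ctx: PipelineContext, det: Detection) -> str:
    return "redact" if det.kind.startswith("pii.") else "allow"
```

Keeping both `original_payload` and `modified_payload` on the context means later steps and the audit writer can see what was changed without recomputing it.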

  • Stateless per request. No shared mutable state between requests. This lets us scale horizontally on Cloud Run.
  • Pipeline is data, not code. Steps are registered in a config file, not a Python list. Customer-specific pipelines are YAML, not forks.
  • Policy is separate from detection. A step detects; policy decides. Same step set serves different customers with different policies.
  • Audit-first. Every decision must land in the audit trail. If you touch the request path and your change doesn’t emit an audit event, it’s incomplete.
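As a rough illustration of "pipeline is data" and "policy is separate from detection", the sketch below registers steps by name and builds a tenant's pipeline from a config mapping. A plain dict stands in for the parsed YAML so the example stays stdlib-only; the registry, the step names, and the `block_on` key are hypothetical and do not reflect the real config schema.

```python
from typing import Callable, Dict, List, Tuple

StepFn = Callable[[str], List[str]]

STEP_REGISTRY: Dict[str, StepFn] = {}


def register(name: str) -> Callable[[StepFn], StepFn]:
    """Decorator that makes a step addressable by name from config."""
    def deco(fn: StepFn) -> StepFn:
        STEP_REGISTRY[name] = fn
        return fn
    return deco


@register("detect_secret")
def detect_secret(text: str) -> List[str]:
    return ["secret"] if "sk-" in text else []


@register("detect_profanity")
def detect_profanity(text: str) -> List[str]:
    return ["profanity"] if "damn" in text.lower() else []


def build_pipeline(config: dict) -> List[StepFn]:
    """Resolve step names from config; unknown names fail loudly."""
    unknown = [n for n in config["steps"] if n not in STEP_REGISTRY]
    if unknown:
        raise ValueError(f"unknown steps: {unknown}")
    return [STEP_REGISTRY[n] for n in config["steps"]]


def run(text: str, config: dict) -> Tuple[List[str], bool]:
    """Steps detect; the tenant's policy (here: block_on) decides."""
    findings = [f for step in build_pipeline(config) for f in step(text)]
    blocked = any(f in config["block_on"] for f in findings)
    return findings, blocked


# Same step set, different policies per tenant.
TENANT_A = {"steps": ["detect_secret", "detect_profanity"], "block_on": ["secret"]}
TENANT_B = {"steps": ["detect_secret", "detect_profanity"], "block_on": ["secret", "profanity"]}
```

With this shape, onboarding a customer with stricter rules is a config change rather than a code change, which is the point of the two principles.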

Repo: tappass/tappass

src/tappass/
├── api/                # FastAPI routes: THIN. No business logic.
│   ├── v1/             # Customer-facing data plane
│   ├── admin/          # Control plane
│   └── health/         # Liveness, readiness, integrity
├── gateway/            # Provider clients (openai.py, anthropic.py, …)
├── pipeline/
│   ├── engine.py       # The loop that runs the 49 steps
│   ├── steps/          # One file per step
│   ├── backends/       # Detection backends (Llama Guard, LLM Guard, …)
│   └── context.py      # PipelineContext: per-request state
├── policy/
│   ├── engine.py       # Thin wrapper around OPA
│   └── rules/          # Embedded default rules (per-customer live in config/)
├── vault/
│   ├── protocol.py     # VaultProvider protocol
│   └── providers/      # Postgres, file, future: HashiCorp/AWS/Azure/GCP
├── identity/
│   ├── api_key.py      # tp_ authentication
│   ├── sso.py          # OIDC for humans
│   ├── saml.py         # SAML for humans
│   └── spiffe/         # Workload identity
├── audit/
│   ├── writer.py       # Hash chain + Ed25519 sign
│   ├── integrity.py    # Chain verification
│   └── archive.py      # Daily cold-storage export
├── observability/
│   ├── export/         # Splunk, Azure Sentinel, webhooks
│   └── webhooks.py     # Event-triggered webhooks
├── domain/             # Pure Python; no I/O
├── config/             # Pydantic Settings
└── main.py             # Composition root

When in doubt, trace a request top-to-bottom through the api → pipeline → gateway → audit chain.
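The last hop in that chain, the audit trail, is described in the tree as a hash chain with chain verification. The sketch below shows the general technique only: SHA-256, the canonical-JSON encoding, the record layout, and the function names are assumptions, and the Ed25519 signing that audit/writer.py performs is left out because it needs a third-party crypto library.

```python
import hashlib
import json
from typing import List

GENESIS = "0" * 64  # placeholder "previous hash" for the first record


def _digest(prev_hash: str, event: dict) -> str:
    # Canonical JSON so the same event always hashes to the same value.
    body = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_event(chain: List[dict], event: dict) -> dict:
    """Append an event whose hash covers the previous record's hash."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    record = {"event": event, "prev_hash": prev_hash, "hash": _digest(prev_hash, event)}
    chain.append(record)
    return record


def verify_chain(chain: List[dict]) -> bool:
    """Recompute every link; any edited or reordered record breaks the chain."""
    prev_hash = GENESIS
    for record in chain:
        if record["prev_hash"] != prev_hash:
            return False
        if record["hash"] != _digest(prev_hash, record["event"]):
            return False
        prev_hash = record["hash"]
    return True
```

Because each hash covers its predecessor, editing one historical event invalidates every record after it, which is what makes a periodic integrity check meaningful.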

The frontend is React 19 + TypeScript + Tailwind + Radix. It lives at tappass/frontend/ inside the tappass/ repo (not a separate repo) and talks to the control plane via /api/v1/*. See Frontend architecture.

  • No shared monolith. SDK is HTTP-only. Fracturing it would hurt customers.
  • No agent SDK state on the server. Multi-turn conversation lives client-side.
  • No blocking waits on external systems in the hot path. Detection backends and policy must have local-first paths or fail-open-to-audit.
  • No caching of customer data. Redaction findings are computed fresh every call.
  • No “fix in the API layer”. Business rules belong in domain/pipeline, not in routes.
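The "no blocking waits" rule, with its fail-open-to-audit fallback, could look roughly like the sketch below: a detection backend gets a small time budget, and if it misses it the request proceeds while the miss is recorded. The function name, the list standing in for the audit writer, the event fields, and the 50 ms default are all hypothetical.

```python
import asyncio
from typing import Awaitable, Callable, List

Backend = Callable[[str], Awaitable[List[str]]]


async def detect_with_budget(
    backend: Backend,
    payload: str,
    audit_log: List[dict],
    timeout_s: float = 0.05,
) -> List[str]:
    """Call a detection backend, but never wait longer than timeout_s."""
    try:
        return await asyncio.wait_for(backend(payload), timeout=timeout_s)
    except asyncio.TimeoutError:
        # Fail open, but only after leaving a trace in the audit trail.
        audit_log.append({
            "event": "backend_timeout",
            "backend": getattr(backend, "__name__", "unknown"),
            "action": "fail_open",
        })
        return []
```

The important property is the ordering: the audit record is written before the empty result is returned, so a request never passes unchecked without evidence that it did.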