Codebase tour

Skip the instinct to grep through the repo on day one. Read these four files in order. Together they give you the full shape of the request path; every other file in tappass/ either supports one of these or is off the hot path.

Before you start, open Flows → Life of a governed LLM call in another tab. That's the trace you'll be following line by line.

flowchart LR
  F1["1. api/main.py<br/>wiring"]
  F2["2. pipeline/context.py<br/>per-request state"]
  F3["3. pipeline/runner.py<br/>the 49-step loop"]
  F4["4. audit/writer.py<br/>hash-chained trail"]
  F1 --> F2 --> F3 --> F4

1. api/main.py — how a request even gets here (~8 min)

The question this file answers: what happens between a TCP connect and a route handler seeing the request?

Read top to bottom. Stop at every app.add_middleware(…) call and click through to the middleware implementation. The order matters — each middleware wraps the next, outermost-first. You'll see:

  • SecurityHeadersMiddleware — adds CSP / HSTS / X-Frame-Options.
  • RequestIDMiddleware — generates a request ID every subsequent log line correlates on.
  • RequestSizeLimitMiddleware — 413s giant bodies before they eat memory.
  • RateLimitMiddleware — Redis-backed counters per tp_ key or IP.
  • AuthMiddleware — resolves tp_ / session JWT / SPIFFE; sets request.state.account. If auth fails, everything else is skipped.
  • TelemetryContextMiddleware — binds the resolved user / tenant into Sentry + PostHog context for this request.
  • SignatureMiddleware — verifies signed payloads on a subset of routes.
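
To make "each middleware wraps the next" concrete, here is a minimal, framework-free sketch of onion-style wrapping. The `wrap` and `build_stack` helpers and the `route` handler are hypothetical and are not TapPass code; only the middleware names come from the list above.

```python
from typing import Callable

Handler = Callable[[dict], dict]


def wrap(name: str, inner: Handler, trace: list[str]) -> Handler:
    """Return a handler that records entry and exit around `inner` (hypothetical helper)."""
    def handler(request: dict) -> dict:
        trace.append(f"enter {name}")
        response = inner(request)
        trace.append(f"exit {name}")
        return response
    return handler


def build_stack(names: list[str], route: Handler, trace: list[str]) -> Handler:
    """Wrap `route` so that names[0] ends up as the outermost layer."""
    app = route
    for name in reversed(names):
        app = wrap(name, app, trace)
    return app


trace: list[str] = []


def route(request: dict) -> dict:
    # Stand-in for a route handler; in TapPass the pipeline runs at this level.
    trace.append("route handler")
    return {"status": 200}


app = build_stack(
    ["SecurityHeadersMiddleware", "RequestIDMiddleware", "AuthMiddleware"],
    route,
    trace,
)
response = app({"path": "/v1/messages"})
print("\n".join(trace))
```

The request enters the layers outermost-first and the response leaves them in reverse. If the app is built on Starlette or FastAPI (suggested by `app.include_router`, but an assumption here), note that `add_middleware` makes the most recently added middleware the outermost one, so confirm the real execution order against the running app rather than relying on reading order alone.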

At the bottom you'll see app.include_router(...) — that's the routing table. Every /api/* prefix maps to a module under api/routes/.

Stop and verify understanding: trace one real POST /v1/messages request — which middleware layers does it pass through, in what order, and at which layer does it get its Decision? (Answer: AuthMiddleware → route handler → gateway_router — the pipeline runs inside the route handler, not in middleware.)

2. pipeline/context.py — the thing everything reads and writes (~5 min)

The question this file answers: how does state flow through a request without globals?

PipelineContext is created once per governed call and passed through every step. Look at its @dataclass fields:

  • agent, session, org_id — identity.
  • payload — the original request body (and, as steps rewrite it, the modified one).
  • detections: list[Detection] — the running finding list.
  • tokens_consumed, cost_cents — running counters.
  • audit_buffer: list[AuditEvent] — events queued for writing.

The key insight: if you're tempted to add a module global or use request.state for cross-step state, add a field to PipelineContext instead. Every existing step is an example.
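
As a rough illustration of that pattern, the sketch below rebuilds a cut-down context and one step. The field names follow the list above, but the types, the `Detection` shape, the `DetectEmailStep` class and the regex are all assumptions made for the example; the real definitions live in pipeline/context.py and pipeline/steps/.

```python
import re
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Detection:
    # Hypothetical shape; the real Detection class may carry more fields.
    kind: str
    value: str


@dataclass
class PipelineContext:
    # Field names come from the tour; the types are guesses.
    agent: str
    session: str
    org_id: str
    payload: dict[str, Any]
    detections: list[Detection] = field(default_factory=list)
    tokens_consumed: int = 0
    cost_cents: int = 0
    # The tour lists list[AuditEvent]; plain dicts stand in for events here.
    audit_buffer: list[dict[str, Any]] = field(default_factory=list)


EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


class DetectEmailStep:
    """Hypothetical step: reads ctx.payload, appends to ctx.detections."""

    def execute(self, ctx: PipelineContext) -> None:
        text = ctx.payload.get("text", "")
        for match in EMAIL.finditer(text):
            ctx.detections.append(Detection(kind="email", value=match.group()))


ctx = PipelineContext(
    agent="agent-1",
    session="s-1",
    org_id="org-1",
    payload={"text": "mail bob@example.com today"},
)
DetectEmailStep().execute(ctx)
print(ctx.detections)
```

The step keeps no state of its own: everything it learns is written to the context, which is what lets later steps and the audit writer see it.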

Stop and verify: open pipeline/steps/detect_pii.py (or any other PII step). Find where it reads ctx.payload and where it appends to ctx.detections. Every step follows that pattern.

3. pipeline/runner.py — the heart of the platform (~10 min)

The question this file answers: how do those 49 steps actually run, and what do the phase hooks do?

Read this alongside Hooks. Focus on:

  • The main loop (around line 39) — walks the compiled step list, calls step.execute(ctx), appends results.
  • before_pipeline — the pre-scan step that runs the text scanner once and caches findings on ctx; without this every detection step would re-scan the same text.
  • after_classify — the retroactive escalation point (a CONFIDENTIAL classification discovered mid-pipeline can upgrade earlier continue decisions to block).
  • after_pipeline — session metrics + decision-tree build.

Then zoom out: why only three phase hooks, not pre/post per step? Because cross-step state goes on the context — a per-step hook would duplicate what PipelineContext already gives you.
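
The interplay between the loop and the three hooks can be sketched as follows. This is a simplified model, not the code in pipeline/runner.py: the `Ctx` class, both step classes, the whitespace "scanner" and the summary dict are invented for illustration, and the real hooks do considerably more.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Ctx:
    """Stand-in for PipelineContext with only what this sketch needs."""
    text: str
    scan_cache: Optional[list[str]] = None
    classification: str = "PUBLIC"
    decisions: list[tuple[str, str]] = field(default_factory=list)
    summary: dict = field(default_factory=dict)


class KeywordStep:
    name = "keyword"

    def execute(self, ctx: Ctx) -> str:
        # Reads the cached scan instead of re-scanning ctx.text.
        return "block" if "password" in ctx.scan_cache else "continue"


class ClassifyStep:
    name = "classify"

    def execute(self, ctx: Ctx) -> str:
        if "confidential" in ctx.scan_cache:
            ctx.classification = "CONFIDENTIAL"
        return "continue"


def before_pipeline(ctx: Ctx) -> None:
    # Scan once, cache on the context (a toy scanner: lowercase + split).
    ctx.scan_cache = ctx.text.lower().split()


def after_classify(ctx: Ctx) -> None:
    # Retroactive escalation: upgrade earlier "continue" decisions to "block".
    if ctx.classification == "CONFIDENTIAL":
        ctx.decisions = [
            (name, "block" if decision == "continue" else decision)
            for name, decision in ctx.decisions
        ]


def after_pipeline(ctx: Ctx) -> None:
    # Placeholder for session metrics and the decision-tree build.
    ctx.summary = {
        "steps": len(ctx.decisions),
        "blocked": any(d == "block" for _, d in ctx.decisions),
    }


def run(steps: list, ctx: Ctx) -> Ctx:
    before_pipeline(ctx)
    for step in steps:
        decision = step.execute(ctx)
        if step.name == "classify":
            after_classify(ctx)  # ctx.decisions holds only earlier steps here
        ctx.decisions.append((step.name, decision))
    after_pipeline(ctx)
    return ctx


result = run([KeywordStep(), ClassifyStep()], Ctx(text="This is CONFIDENTIAL"))
print(result.decisions, result.summary)
```

Neither step needs a hook of its own: the scan cache, the classification and the decision list all travel on the context, which is the argument the paragraph above makes against per-step hooks.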

Stop and verify: what happens if step.execute() raises? Trace the try/except (there is one — find it). The runner catches, records a step_failed audit event, and the policy decides whether to fail closed or continue.
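
One plausible shape for that failure path is sketched below. The `run_step` function, the `fail_closed` flag (standing in for whatever the policy actually decides) and the dict-based audit event are assumptions for illustration; only the `step_failed` event name and the catch-record-decide sequence come from the description above.

```python
class BrokenStep:
    """Hypothetical step that always fails."""
    name = "broken"

    def execute(self, ctx: dict) -> str:
        raise RuntimeError("scanner unavailable")


def run_step(step, ctx: dict, audit: list[dict], fail_closed: bool) -> str:
    """Execute one step; on error, record step_failed and apply the policy."""
    try:
        return step.execute(ctx)
    except Exception as exc:
        audit.append(
            {"type": "step_failed", "step": step.name, "error": repr(exc)}
        )
        # Fail closed: treat the failure as a block. Fail open: keep going.
        return "block" if fail_closed else "continue"


audit_log: list[dict] = []
closed = run_step(BrokenStep(), {}, audit_log, fail_closed=True)
opened = run_step(BrokenStep(), {}, audit_log, fail_closed=False)
print(closed, opened, len(audit_log))
```

Either way the failure leaves an audit record, so a step that silently stops working still shows up in the trail.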

4. audit/writer.py — the one place we don't fail silent (~7 min)

The question this file answers: what makes the audit trail tamper-evident, and why do we treat writes to this module differently from every other DB write?

Key things to notice:

  • Every AuditEvent is hash-chained: hash = H(prev_hash || payload). Tampering with any event breaks every subsequent hash.
  • Every event is Ed25519-signed with the audit chain key (in Secret Manager). You can verify the chain offline with the public key.
  • The writer retries on DB failure rather than swallowing errors. This is the one exception to the rule "best-effort telemetry never blocks the request" — losing audit events is worse than serving a slow request.
  • Chain integrity is re-verified every 4 hours by observability/background.py:integrity_check_worker.
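
The chaining rule hash = H(prev_hash || payload) can be illustrated with a few lines of standard-library Python. This sketch assumes SHA-256 for H, canonical JSON for the payload encoding and an all-zero genesis hash; none of those choices is confirmed by the tour, and Ed25519 signing is left out entirely.

```python
import hashlib
import json
from typing import Optional

GENESIS = "0" * 64  # assumed starting value for the first event's prev_hash


def event_hash(prev_hash: str, payload: dict) -> str:
    """H(prev_hash || payload), with SHA-256 and canonical JSON as assumptions."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":")).encode()
    return hashlib.sha256(prev_hash.encode() + body).hexdigest()


def append_event(chain: list[dict], payload: dict) -> None:
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    chain.append(
        {
            "prev_hash": prev_hash,
            "payload": payload,
            "hash": event_hash(prev_hash, payload),
        }
    )


def first_broken_index(chain: list[dict]) -> Optional[int]:
    """Return the index of the first event that fails verification, or None."""
    prev_hash = GENESIS
    for index, event in enumerate(chain):
        if event["prev_hash"] != prev_hash:
            return index
        if event["hash"] != event_hash(prev_hash, event["payload"]):
            return index
        prev_hash = event["hash"]
    return None


chain: list[dict] = []
for decision in ("continue", "block", "continue"):
    append_event(chain, {"decision": decision})

intact = first_broken_index(chain)
chain[1]["payload"]["decision"] = "continue"  # tamper with the middle event
broken_at = first_broken_index(chain)
print(intact, broken_at)
```

An attacker who recomputes the tampered event's hash only moves the break to the next event, because that event's stored prev_hash no longer matches. Rewriting the whole tail is what the signatures are there to prevent.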

Stop and verify: look at what happens in the retry path. How many retries? What's the backoff? Where does the 503 to the caller come from if retries exhaust? (That 503 is intentional — see Security → Fail-closed boundaries.)
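
For orientation while you read the retry path, here is a generic retry-with-exponential-backoff sketch that surfaces a 503 once attempts are exhausted. The retry count, the delays, the `AuditWriteFailed` exception and the `write_with_retry` signature are placeholders chosen for the example; the real values are the ones you are asked to find in audit/writer.py.

```python
import time
from typing import Callable


class AuditWriteFailed(Exception):
    """Hypothetical error that the API layer would map to HTTP 503."""
    status_code = 503


def write_with_retry(
    write: Callable[[dict], None],
    event: dict,
    retries: int = 3,          # placeholder, not the real setting
    base_delay: float = 0.1,   # placeholder, not the real setting
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Try `write`, backing off exponentially; raise instead of swallowing."""
    for attempt in range(retries + 1):
        try:
            write(event)
            return
        except ConnectionError as exc:
            if attempt == retries:
                raise AuditWriteFailed("audit write failed after retries") from exc
            sleep(base_delay * 2 ** attempt)


# Demo: a writer that fails twice, then succeeds. Delays are recorded, not slept.
calls = {"count": 0}
stored: list[dict] = []
delays: list[float] = []


def flaky_write(event: dict) -> None:
    calls["count"] += 1
    if calls["count"] <= 2:
        raise ConnectionError("db unavailable")
    stored.append(event)


write_with_retry(flaky_write, {"type": "decision"}, sleep=delays.append)
print(calls["count"], delays, stored)
```

The important property is the last branch: when retries run out the error propagates to the caller rather than being logged and dropped.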

Now that the request path makes sense, branch out by what you're working on:

| You're working on… | Start reading |
| --- | --- |
| A new detection step | pipeline/steps/ (pick one similar to yours, copy its shape) |
| Provider-specific bugs | gateway/anthropic.py or gateway/openai.py (one file per provider) |
| Auth / identity changes | identity/middleware.py + the sub-module for the auth type |
| OPA policy tweaks | config/policies/rego/ + policy/engine.py (thin wrapper) |
| Vault / key rotation | vault/crypto.py + vault/providers/postgres.py |
| Frontend | frontend/ (see Frontend architecture) |
| Anything OEM | config/providers/, vault/providers/, pipeline/backends/ |

# Every pipeline step class (grep lists them in file order; execution order lives in the registry)
grep -rn "^class .*Step" tappass/pipeline/steps/
# Every route prefix
grep -rn "^router = APIRouter\|include_router" tappass/api/
# Every domain object (see also Architecture → Domain objects)
grep -rn "^class Agent\|^class Pipeline\|^class Decision\|^class Mandate" tappass/
# Every place we write an AuditEvent
grep -rn "audit.record(AuditEvent(" tappass/
  • Thin routes, fat domain. Route handlers in api/routes/ should be 10–30 lines — unpack the request, call into domain/ or pipeline/, format the response. Business rules never live in a route.
  • PipelineContext for request state. Not request.state, not module globals, not closures.
  • Audit every decision. If your change touches the request path and doesn't emit an AuditEvent, your change is incomplete.
  • Fail closed by default. Explicit comments when we intentionally fail open.
  • No bypasses for "just this one test". If it's worth fixing, it's worth fixing under the same auth rules production enforces.