LLM Gateway (cross-cutting layer) — shipped pointer

What it does: The gateway is already shipped as tappass/gateway/. This file documents how it consumes the Compiled Policy.

Why a pointer file

The LLM Gateway is one of two cross-cutting layers in the enforcement plane (per ADR 0001, peer to the MCP Broker). It sees every model call before the call reaches the upstream provider, and every tool emission in the response. It is also the most mature TapPass component: already shipped, with OpenAI-compatible and Anthropic-native endpoints, an MCP server, LiteLLM routing to 100+ providers, a 32-step pipeline, and capability tokens.

There's no new component to build for the gateway itself. What's new is how the gateway consumes its slice of the Compiled Policy.

How the LLM Gateway consumes the Compiled Policy

The Compiled Policy is organized by aspect (network / filesystem / tools / interpreter / budget / compliance) per ADR 0003. The llm-gateway-* providers consume:

Compiled Policy aspect	What the gateway does with it
`tools.allow` / `tools.deny`	Per-call capability-token scope (which tool emissions the gateway accepts)
`network.allow_domains`	Per-call upstream allowlist (which model providers / domains may be reached)
`budget.tokens_per_day` / `budget.dollars_per_month`	Per-org rate / spend caps; the circuit breaker trips when a cap is exceeded
`compliance_tags`	Tags every call's audit row with the regulations in scope; selects the relevant pipeline detectors
`identity.tier`	Trust-tier-driven default scope (observer / worker / standard / full)

# Gateway runtime config — derived by an llm-gateway-* provider from the Compiled Policy
llm_gateway:
  base_url: https://api.tappass.ai/v1
  capability_token: tp_gw_…           # ES256, 5-min TTL, JWKS-verified
  upstream: anthropic                  # or openai-compat, vertex, litellm
  scope:
    tools_allowed:  [list_schemas, read_asset, add_table, add_column]
    tools_denied:   [delete_*]
    schemas_acl:    { customers: [read, write], pii_archive: [] }
  budget: { tokens_per_day: 500000, dollars_per_month: 200 }
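
As a rough illustration of how a provider might project the Compiled Policy into a config of this shape, here is a minimal Python sketch. The function name `build_gateway_config`, the in-memory dict representation of the Compiled Policy, and the `domains_allowed` and `compliance_tags` output keys are assumptions made for the example; the real llm-gateway-* providers may look quite different. The capability token is left out because the sketch does not model token minting, and `schemas_acl` is left out because its source aspect is not described here.

```python
from typing import Any


def build_gateway_config(
    policy: dict[str, Any], base_url: str, upstream: str
) -> dict[str, Any]:
    """Project the gateway-relevant aspects of a Compiled Policy into runtime config.

    Hypothetical helper: the key names mirror the table and YAML in this file,
    but the actual provider interface is not specified here.
    """
    tools = policy.get("tools", {})
    network = policy.get("network", {})
    budget = policy.get("budget", {})
    return {
        "llm_gateway": {
            "base_url": base_url,
            "upstream": upstream,
            "scope": {
                # tools.allow / tools.deny become the per-call token scope.
                "tools_allowed": list(tools.get("allow", [])),
                "tools_denied": list(tools.get("deny", [])),
                # Assumed key name for the upstream allowlist.
                "domains_allowed": list(network.get("allow_domains", [])),
            },
            "budget": {
                "tokens_per_day": budget.get("tokens_per_day"),
                "dollars_per_month": budget.get("dollars_per_month"),
            },
            # Assumed key name; tags are copied through for audit rows.
            "compliance_tags": list(policy.get("compliance_tags", [])),
        }
    }


# Illustrative input only; the domain and tag values are made up for the example.
policy = {
    "tools": {"allow": ["list_schemas", "read_asset"], "deny": ["delete_*"]},
    "network": {"allow_domains": ["api.anthropic.com"]},
    "budget": {"tokens_per_day": 500000, "dollars_per_month": 200},
    "compliance_tags": ["gdpr"],
    "filesystem": {"read_only": True},  # not a gateway aspect, so it is ignored
}
config = build_gateway_config(policy, "https://api.tappass.ai/v1", "anthropic")
```

Aspects the gateway does not consume (here `filesystem`) simply do not appear in the output, which keeps the gateway's slice independent of the rest of the policy.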

The agent's LLM client (configured via agent-client-sdk) reads base_url and capability_token and uses them in place of the upstream provider's base_url and api_key for every LLM call. The gateway enforces the scope server-side by checking the capability token on every chat completion.
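
To make the scope check concrete, the following Python sketch shows one way a tool name could be tested against `tools_allowed` and `tools_denied`, including glob patterns such as `delete_*`. It is an illustration under stated assumptions, not the gateway's actual code: the helper name `is_tool_allowed`, the deny-wins precedence, the default-deny behavior for unlisted tools, and the tool names `delete_table` and `drop_schema` are all hypothetical.

```python
from fnmatch import fnmatchcase


def is_tool_allowed(tool: str, allowed: list[str], denied: list[str]) -> bool:
    """Return True if `tool` matches an allow pattern and no deny pattern."""
    # Assumption: a deny match always wins over an allow match.
    if any(fnmatchcase(tool, pattern) for pattern in denied):
        return False
    # Assumption: tools that match no allow pattern are rejected (default deny).
    return any(fnmatchcase(tool, pattern) for pattern in allowed)


allowed = ["list_schemas", "read_asset", "add_table", "add_column"]
denied = ["delete_*"]

print(is_tool_allowed("read_asset", allowed, denied))    # True
print(is_tool_allowed("delete_table", allowed, denied))  # False (matches delete_*)
print(is_tool_allowed("drop_schema", allowed, denied))   # False (not in the allow list)
```

`fnmatchcase` is used instead of `fnmatch` so that matching stays case-sensitive on every platform.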

What the gateway already does

Capability	Source
OpenAI-compatible `POST /v1/chat/completions`	shipped
Anthropic-native `POST /v1/messages`	shipped
MCP server `gateway/mcp_server.py`	shipped
Tool execution `POST /v1/tools/execute`	shipped
Capability tokens (ES256) + JWKS verify	shipped
Provider routing via LiteLLM (100+ providers)	shipped
Streaming, circuit breaker	shipped
32-step pipeline (PII / secrets / exfil / scan_output / detect_prompt_injection / …)	shipped
Audit hash-chain + ES256 mandates	shipped

What this concept adds at the gateway

Addition	Component
Per-call capability scoping derived from the Compiled Policy (rather than per-tool-key)	covered by policy-to-sandbox-config-builder writing the gateway provider's slice
Token rotation on Compiled Policy change	covered by live-policy-push-channel and the gateway's existing token-revocation API
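
As a sketch of how a Compiled Policy change could be detected before asking for token rotation, the Python below fingerprints only the gateway's slice of the policy and compares fingerprints. Everything here is hypothetical: the `GATEWAY_ASPECTS` tuple, the function names, and the dict representation of the policy are assumptions for illustration, and the interfaces of live-policy-push-channel and the token-revocation API are not described in this file.

```python
import hashlib
import json
from typing import Any

# Assumed set of top-level keys the gateway consumes (taken from the table above).
GATEWAY_ASPECTS = ("tools", "network", "budget", "compliance_tags", "identity")


def gateway_slice_fingerprint(policy: dict[str, Any]) -> str:
    """Hash only the aspects the gateway consumes, using canonical JSON."""
    gateway_slice = {key: policy[key] for key in GATEWAY_ASPECTS if key in policy}
    canonical = json.dumps(gateway_slice, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def needs_rotation(old_policy: dict[str, Any], new_policy: dict[str, Any]) -> bool:
    """True when the gateway's slice changed, so outstanding tokens are stale."""
    return gateway_slice_fingerprint(old_policy) != gateway_slice_fingerprint(new_policy)


old = {"tools": {"allow": ["read_asset"], "deny": []}, "filesystem": {"read_only": True}}
unrelated_change = {"tools": {"allow": ["read_asset"], "deny": []}, "filesystem": {"read_only": False}}
scope_change = {"tools": {"allow": ["read_asset"], "deny": ["delete_*"]}, "filesystem": {"read_only": True}}

print(needs_rotation(old, unrelated_change))  # False: only a non-gateway aspect changed
print(needs_rotation(old, scope_change))      # True: tools.deny changed
```

Hashing only the gateway's slice avoids rotating tokens when an unrelated aspect of the policy changes.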

No new gateway code. Just new gateway configuration derived from the Compiled Policy.