LLM Gateway (cross-cutting layer) — shipped pointer
What it does: Already shipped as `tappass/gateway/`. This file documents how it consumes the Compiled Policy.
Why a pointer file
The LLM Gateway is one of two cross-cutting layers in the enforcement plane (per ADR 0001, peer to the MCP Broker). It sees every model call before it reaches the upstream provider, and every tool emission in the response. It is also the most mature TapPass component: it has already shipped with OpenAI-compatible and Anthropic-native endpoints, an MCP server, LiteLLM routing to 100+ providers, a 32-step pipeline, and capability tokens.
There's no new component to build for the gateway itself. What's new is how the gateway consumes its slice of the Compiled Policy.
How the LLM Gateway consumes the Compiled Policy
The Compiled Policy is organized by aspect (network / filesystem / tools / interpreter / budget / compliance) per ADR 0003. The `llm-gateway-*` providers consume:
| Compiled Policy aspect | What the gateway does with it |
|---|---|
| `tools.allow` / `tools.deny` | Per-call capability-token scope (which tool emissions the gateway accepts) |
| `network.allow_domains` | Per-call upstream allowlist (which model providers / domains may be reached) |
| `budget.tokens_per_day` / `dollars_per_month` | Per-org rate / spend caps; circuit-breaker on exceed |
| `compliance_tags` | Tags every call's audit row with the regulations in scope; selects the relevant pipeline detectors |
| `identity.tier` | Trust-tier-driven default scope (observer / worker / standard / full) |
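
To make the mapping in the table concrete, here is a minimal Python sketch of how a provider might pull the gateway's slice out of a Compiled Policy. The function name `derive_gateway_slice`, the dict layout of the policy, the output keys, and the sample values (`api.anthropic.com`, `gdpr`, `/workspace`) are all assumptions for illustration; the real `llm-gateway-*` providers may structure this differently.

```python
from typing import Any


def derive_gateway_slice(compiled_policy: dict[str, Any]) -> dict[str, Any]:
    """Select the aspects the gateway consumes and reshape them (hypothetical layout)."""
    tools = compiled_policy.get("tools", {})
    network = compiled_policy.get("network", {})
    budget = compiled_policy.get("budget", {})
    identity = compiled_policy.get("identity", {})
    return {
        # tools.allow / tools.deny -> capability-token scope
        "scope": {
            "tools_allowed": list(tools.get("allow", [])),
            "tools_denied": list(tools.get("deny", [])),
        },
        # network.allow_domains -> upstream allowlist
        "upstream_allowlist": list(network.get("allow_domains", [])),
        # budget.* -> rate / spend caps
        "budget": {
            "tokens_per_day": budget.get("tokens_per_day"),
            "dollars_per_month": budget.get("dollars_per_month"),
        },
        # compliance_tags -> tags written on every audit row
        "audit_tags": list(compiled_policy.get("compliance_tags", [])),
        # identity.tier -> default scope selector
        "trust_tier": identity.get("tier"),
    }


# Illustrative policy; all values are made up for the example.
policy = {
    "tools": {"allow": ["list_schemas", "read_asset"], "deny": ["delete_*"]},
    "network": {"allow_domains": ["api.anthropic.com"]},
    "budget": {"tokens_per_day": 500000, "dollars_per_month": 200},
    "compliance_tags": ["gdpr"],
    "identity": {"tier": "worker"},
    "filesystem": {"read": ["/workspace"]},  # an aspect the gateway ignores
}
gateway_slice = derive_gateway_slice(policy)
```

Aspects the gateway does not consume (here `filesystem`) are simply not copied, so a change to them would not alter the gateway's slice.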
```yaml
# Gateway runtime config, derived by an llm-gateway-* provider from the Compiled Policy
llm_gateway:
  base_url: https://api.tappass.ai/v1
  capability_token: tp_gw_…          # ES256, 5-min TTL, JWKS-verified
  upstream: anthropic                # or openai-compat, vertex, litellm
  scope:
    tools_allowed: [list_schemas, read_asset, add_table, add_column]
    tools_denied: [delete_*]
    schemas_acl: { customers: [read, write], pii_archive: [] }
  budget: { tokens_per_day: 500000, dollars_per_month: 200 }
```

The agent's LLM client (configured via `agent-client-sdk`) reads `base_url` and `capability_token` and passes them as the upstream's `base_url` and `api_key` on every LLM call. The gateway enforces the scope server-side by checking the capability token on every chat completion.
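
A minimal sketch of the scope check that `tools_allowed` / `tools_denied` imply. It assumes shell-style glob patterns (suggested by `delete_*`), deny taking precedence over allow, and default-deny for anything unlisted; none of these rules are stated in the config itself. `tool_in_scope` is a hypothetical helper, not the gateway's actual API.

```python
from fnmatch import fnmatchcase


def tool_in_scope(tool: str, allowed: list[str], denied: list[str]) -> bool:
    """Return True if a tool emission falls inside the token's scope.

    Assumptions: patterns are shell-style globs, a deny match always wins,
    and a tool that matches no allow pattern is rejected.
    """
    if any(fnmatchcase(tool, pattern) for pattern in denied):
        return False
    return any(fnmatchcase(tool, pattern) for pattern in allowed)


# Scope values copied from the runtime config above.
scope = {
    "tools_allowed": ["list_schemas", "read_asset", "add_table", "add_column"],
    "tools_denied": ["delete_*"],
}

read_ok = tool_in_scope("read_asset", scope["tools_allowed"], scope["tools_denied"])
delete_ok = tool_in_scope("delete_table", scope["tools_allowed"], scope["tools_denied"])
unknown_ok = tool_in_scope("drop_schema", scope["tools_allowed"], scope["tools_denied"])
```

Under these assumptions `read_asset` is accepted, while `delete_table` (denied by pattern) and `drop_schema` (not listed) are both rejected.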
What the gateway already does
| Capability | Source |
|---|---|
| OpenAI-compatible `POST /v1/chat/completions` | shipped |
| Anthropic-native `POST /v1/messages` | shipped |
| MCP server `gateway/mcp_server.py` | shipped |
| Tool execution `POST /v1/tools/execute` | shipped |
| Capability tokens (ES256) + JWKS verify | shipped |
| Provider routing via LiteLLM (100+ providers) | shipped |
| Streaming, circuit breaker | shipped |
| 32-step pipeline (PII / secrets / exfil / scan_output / detect_prompt_injection / …) | shipped |
| Audit hash-chain + ES256 mandates | shipped |
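
For readers unfamiliar with the audit hash-chain row, the general technique can be sketched as follows. This is a generic SHA-256 hash chain, not the gateway's implementation: the row layout, the all-zero genesis value, and the helper names `append_row` and `verify_chain` are assumptions, and the ES256 signing the table mentions is left out.

```python
import hashlib
import json
from typing import Any

GENESIS = "0" * 64  # assumed placeholder for "no previous row"


def _row_hash(prev_hash: str, event: dict[str, Any]) -> str:
    # Canonical JSON so the same event always hashes to the same value
    payload = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append_row(chain: list[dict[str, Any]], event: dict[str, Any]) -> None:
    """Append an audit row whose hash covers the previous row's hash."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    chain.append({"event": event, "prev": prev_hash, "hash": _row_hash(prev_hash, event)})


def verify_chain(chain: list[dict[str, Any]]) -> bool:
    """Recompute every hash; any edited or reordered row breaks the chain."""
    prev_hash = GENESIS
    for row in chain:
        if row["prev"] != prev_hash or row["hash"] != _row_hash(prev_hash, row["event"]):
            return False
        prev_hash = row["hash"]
    return True


audit: list[dict[str, Any]] = []
append_row(audit, {"call": "chat.completions", "tags": ["gdpr"]})
append_row(audit, {"call": "tools.execute", "tool": "read_asset"})
intact = verify_chain(audit)

audit[0]["event"]["tags"] = []  # tamper with an earlier row
tampered_detected = not verify_chain(audit)
```

Because each row's hash depends on the one before it, changing any earlier row invalidates every later hash.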
What this concept adds at the gateway
| Addition | Component |
|---|---|
| Per-call capability scoping derived from Compiled Policy (rather than per-tool-key) | covered by policy-to-sandbox-config-builder writing the gateway provider's slice |
| Token rotation on Compiled Policy change | covered by live-policy-push-channel and the gateway's existing token-revocation API |
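
The rotation-on-change row can be pictured with the sketch below: rotate only when the gateway's slice of the Compiled Policy actually changes. Everything here is hypothetical: `TokenRotator`, `policy_digest`, and the `revoke` / `issue` callables stand in for the live-policy-push-channel and the gateway's token-revocation API, whose real interfaces are not described in this file.

```python
import hashlib
import json
from typing import Any, Callable, Optional


def policy_digest(gateway_slice: dict[str, Any]) -> str:
    """Stable fingerprint of the gateway's slice (canonical JSON + SHA-256)."""
    canonical = json.dumps(gateway_slice, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class TokenRotator:
    """Revoke and reissue the capability token only when the slice changes."""

    def __init__(self, revoke: Callable[[str], Any], issue: Callable[[dict[str, Any]], str]):
        self._revoke = revoke  # stand-in for the token-revocation API
        self._issue = issue    # stand-in for capability-token minting
        self._digest: Optional[str] = None
        self._token: Optional[str] = None

    def on_policy(self, gateway_slice: dict[str, Any]) -> str:
        digest = policy_digest(gateway_slice)
        if self._token is None or digest != self._digest:
            if self._token is not None:
                self._revoke(self._token)
            self._token = self._issue(gateway_slice)
            self._digest = digest
        return self._token


# In-memory stand-ins so the sketch runs without a gateway.
revoked: list[str] = []
issued: list[str] = []


def fake_issue(scope: dict[str, Any]) -> str:
    token = f"token-{len(issued) + 1}"
    issued.append(token)
    return token


rotator = TokenRotator(revoke=revoked.append, issue=fake_issue)
first = rotator.on_policy({"tools_allowed": ["read_asset"]})
unchanged = rotator.on_policy({"tools_allowed": ["read_asset"]})
second = rotator.on_policy({"tools_allowed": ["read_asset", "add_table"]})
```

Comparing digests rather than timestamps means a re-push of an identical policy does not churn tokens; with a 5-minute TTL, normal expiry would still need separate handling.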
No new gateway code. Just new gateway configuration derived from the Compiled Policy.