Skip to content

LLM Gateway (cross-cutting layer) — shipped pointer

LLM Gateway (cross-cutting layer) — shipped pointer

Section titled “LLM Gateway (cross-cutting layer) — shipped pointer”

What it does: Already shipped as tappass/gateway/. This file documents how it consumes the Compiled Policy.

The LLM Gateway is one of two cross-cutting layers in the enforcement plane (per ADR 0001, peer to the MCP Broker). It sees every model call before it reaches the upstream provider, and every tool emission in the response. It's also the most mature TapPass component (already shipped, OpenAI-compat + Anthropic-native + MCP + LiteLLM 100+ providers, 32-step pipeline, capability tokens).

There's no new component to build for the gateway itself. What's new is how the gateway consumes its slice of the Compiled Policy.

How the LLM Gateway consumes the Compiled Policy

Section titled “How the LLM Gateway consumes the Compiled Policy”

The Compiled Policy is organized by aspect (network / filesystem / tools / interpreter / budget / compliance) per ADR 0003. The llm-gateway-* providers consume:

Compiled Policy aspectWhat the gateway does with it
tools.allow / tools.denyPer-call capability-token scope (which tool emissions the gateway accepts)
network.allow_domainsPer-call upstream allowlist (which model providers / domains may be reached)
budget.tokens_per_day / dollars_per_monthPer-org rate / spend caps; circuit-breaker on exceed
compliance_tagsTags every call's audit row with the regulations in scope; selects the relevant pipeline detectors
identity.tierTrust-tier-driven default scope (observer / worker / standard / full)
# Gateway runtime config — derived by an llm-gateway-* provider from the Compiled Policy
llm_gateway:
base_url: https://api.tappass.ai/v1
capability_token: tp_gw_… # ES256, 5-min TTL, JWKS-verified
upstream: anthropic # or openai-compat, vertex, litellm
scope:
tools_allowed: [list_schemas, read_asset, add_table, add_column]
tools_denied: [delete_*]
schemas_acl: { customers: [read, write], pii_archive: [] }
budget: { tokens_per_day: 500000, dollars_per_month: 200 }

The agent's LLM client (configured via agent-client-sdk) reads base_url + capability_token and uses them as the upstream's base_url + api_key for every LLM call. The gateway server-side enforces the scope via capability tokens at every chat completion.

CapabilitySource
OpenAI-compatible POST /v1/chat/completionsshipped
Anthropic-native POST /v1/messagesshipped
MCP server gateway/mcp_server.pyshipped
Tool execution POST /v1/tools/executeshipped
Capability tokens (ES256) + JWKS verifyshipped
Provider routing via LiteLLM (100+ providers)shipped
Streaming, circuit breakershipped
32-step pipeline (PII / secrets / exfil / scan_output / detect_prompt_injection / …)shipped
Audit hash-chain + ES256 mandatesshipped
AdditionComponent
Per-call capability scoping derived from Compiled Policy (rather than per-tool-key)covered by policy-to-sandbox-config-builder writing the gateway provider's slice
Token rotation on Compiled Policy changecovered by live-policy-push-channel and the gateway's existing token-revocation API

No new gateway code. Just new gateway configuration derived from the Compiled Policy.