LLM provider

An LLM provider is an HTTP client to one LLM API.

It speaks the API's wire protocol (Anthropic Messages, OpenAI Chat Completions, Bedrock InvokeModel, …), handles auth, streaming, retries, and surfaces a uniform call surface to the gateway pipeline.

An LLM provider executes. It does not enforce. Enforcement happens around it — in the Pipeline before/during/after each call.

Note on naming. The earlier Provider card conflated this concept (HTTP client) with the Policy provider concept (translator from Compiled Policy → runtime config). They are now split into two cards.

At a glance


Takes	a normalized chat/completion request from the gateway
Outputs	a normalized response stream (chunks, tool-use blocks, errors)
Speaks	one LLM vendor's wire protocol
Where it lives	`tappass/gateway/<vendor>.py` (anthropic.py, openai.py, …)
Status	shipped (Anthropic, OpenAI, LiteLLM); planned (Vertex / Gemini, direct Bedrock)

What each LLM provider does

LLM provider	API	Notes
`anthropic`	Anthropic Messages	streaming SSE; tool use; thinking blocks; cache_control
`openai`	OpenAI Chat Completions	streaming; function calling; reasoning models
`litellm`	100+ providers via LiteLLM proxy	covers long tail; already shipped
`vertex`	Google Vertex AI / Gemini	planned
`bedrock`	AWS Bedrock InvokeModel	planned (direct, beyond LiteLLM)
`azure-openai`	Azure OpenAI	via `openai` client with base-URL override

Adding a new vendor is mostly: implement the wire protocol, normalize the response shape, register the client.

What it is, concretely

When a customer agent calls TapPass with POST /v1/messages, the gateway:

Authenticates the tp_ key, identifies the agent + tenant.
Resolves the customer's chosen vendor (and credentials from the Vault).
Selects the matching LLM provider module (gateway/anthropic.py, …).
Runs before-the-call Pipeline steps.
Hands the request to the LLM provider, which calls the vendor's API and streams chunks back.
Runs after-the-call Pipeline steps on each chunk.
Returns the streamed response to the agent.

The LLM provider is the only component that talks to the vendor. Everything else operates on normalized request/response shapes.

Why this concept exists separately from Policy provider

Both concepts were originally collapsed into a single "Provider" card. They are structurally different and shipping cadences differ:

	Policy provider	LLM provider
Concept	translator (Compiled Policy → target config)	HTTP client (TapPass → LLM API)
Pure function?	yes — deterministic, no side effects in compile	no — makes outbound HTTP calls
Where it runs	host runtime / control plane on policy update	server-side, on every governed LLM call
Status	concept (most), partial (openshell, gateway)	shipped (Anthropic, OpenAI, LiteLLM)
Adding one	~2-week translator project	~1-week wire-protocol integration

The LLM gateway internally uses both:

A Policy provider (anthropic-gateway) translates the Compiled Policy into gateway configuration (budget caps, base-URL redirect, redaction policy).
An LLM provider (anthropic) executes the actual API calls under that configuration.

Lifecycle

[implement]   Wire-protocol client written; response normalized to TapPass shape
   ↓
[register]    Module registered under gateway/<vendor>.py; client picked by vendor name
   ↓
[credential]  Customer adds vendor credentials to Vault (BYOK or managed)
   ↓
[serve]       Gateway routes calls through this LLM provider for matching agents
   ↓
[evolve]      Vendor API versions bumped; client follows; old versions kept until customers migrate

Engines that operate on LLM providers

Engine	What it does	Status
Gateway pipeline	Runs before/during/after steps around the LLM call	shipped
Vault	Resolves vendor credentials per tenant	shipped
Provider router	Picks the right LLM provider per agent based on customer config	shipped
Cost telemetry	Records token + latency + cost per call into Metering	partial

Surfaces

Persona	Surface	What you do
Operator	Admin UI → Settings → LLM Providers	Configure which vendors are available + BYOK credentials
Operator	`tappass llm-provider list / test <id>`	Verify connectivity / credentials
Customer agent	`POST /v1/messages` (or `/v1/chat/completions`)	Issue a governed LLM call; gateway picks the right LLM provider

distinct from ↔ Policy provider — translator, not client
wraps ↑ Pipeline — every call runs through the engine
uses → Vault — for vendor credentials (BYOK or managed)
emits → Audit log — every call lands as an audit row
feeds → Metering — cost / token / latency rollups

Status snapshot

LLM provider	Status	Notes
`anthropic`	shipped	streaming, tool-use, thinking, prompt caching
`openai`	shipped	streaming, function calling, reasoning models
`litellm`	shipped	100+ vendors via LiteLLM proxy
`vertex` (Gemini)	planned	direct Vertex client
`bedrock`	planned	direct AWS Bedrock client (current path is via LiteLLM)
`azure-openai`	shipped	via `openai` client + base-URL override

LLM provider

LLM provider

At a glance

What each LLM provider does

What it is, concretely

Why this concept exists separately from Policy provider

Lifecycle

Engines that operate on LLM providers

Surfaces

Related concepts

Status snapshot